How Journaling Works

Journaling turns off forced writes to the database and instead writes updates synchronously to the journal file. Because journal file writes are sequential rather than random, disk head movement is minimal, which improves database performance.

To save changed pages in the database cache to the hard disk, you set up journaling checkpoints to occur automatically. A checkpoint specifies the time at which InterBase must save all the changed pages in the database cache to the database file. After the checkpoint has been reached, the data in the journal file is no longer needed, so the file can be reused. For best performance, place the journal files on a dedicated hard drive. The journal files must be on the database server machine.
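The relationship between journal writes and checkpoints can be illustrated with a toy model. This is only a sketch of the idea, not InterBase's implementation: the class ToyJournaledStore and its methods are hypothetical names, and plain Python containers stand in for the journal file, the database cache, and the database file.

```python
class ToyJournaledStore:
    """Toy model of journaling with checkpoints (illustrative only)."""

    def __init__(self):
        self.database_file = {}  # page number -> contents in the database file
        self.cache = {}          # page number -> contents in the database cache
        self.dirty = set()       # pages changed since the last checkpoint
        self.journal = []        # updates, appended in arrival order

    def write_page(self, page_no, data):
        # The update is appended to the journal (a sequential write)
        # and the changed page stays in the cache.
        self.journal.append((page_no, data))
        self.cache[page_no] = data
        self.dirty.add(page_no)

    def checkpoint(self):
        # Save every changed page in the cache to the database file.
        for page_no in sorted(self.dirty):
            self.database_file[page_no] = self.cache[page_no]
        self.dirty.clear()
        # The journal data is no longer needed, so the space can be reused.
        self.journal.clear()


store = ToyJournaledStore()
store.write_page(1, "a")
store.write_page(2, "b")
store.checkpoint()
```

Before the checkpoint, the two updates exist only in the journal and the cache; after it, they are in the database file and the journal is empty.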

Journaling guarantees that all changes are on disk before a transaction is marked committed, provided that O/S and hardware caching are disabled.

You do not need to use journal archiving to use journaling and journal files. However, journal archiving lets you recover from a disaster that completely destroys the database server.

When a database is configured for direct I/O, adding journaling does not automatically convert the database to asynchronous buffered I/O, as it does when the database is configured for synchronous buffered I/O. This ensures that a database set to direct I/O never falls back to buffered I/O.

InterBase uses buffered file I/O on all platforms to read and write database pages in the file on disk. The pages are delivered via the System File Cache, which acts as a duplicate store of the pages in RAM. Subsequent loads of the same pages are served quickly by the OS kernel if the pages exist in the System File Cache. On systems where there is high contention with other files for the System File Cache (a shared pool used by all processes for buffered file I/O), the performance of InterBase may not be optimal. If the available System File Cache is limited because RAM is scarce, the kernel must spend time reclaiming unused blocks of memory from other processes in addition to servicing each new block I/O request.

The performance problem is alleviated by using "direct I/O" (also known as non-buffered I/O) so blocks of pages are directly read from the disk into the process space and do not need to use the System File Cache.

Direct I/O is supported on Windows only. If you try to set it on a database on a non-Windows platform, the following error is returned:

feature is not supported
-direct I/O operation

If a database enabled with "direct" I/O is then copied to an older version of InterBase, the setting will not be used by the older InterBase server. The older server will employ the "sync" write mode in this case.

The following tools and APIs support the "direct" I/O setting:

The gfix command line tool has been modified to allow setting a database to be in "direct" I/O write mode.

# gfix [-write {async, sync, direct}] . . .

For example:

# gfix -write direct foo.ib -user sysdba -password masterkey

The gbak command line tool has a new, optional restore option that overrides the write mode of a database. Otherwise, the "write" mode is preserved across a backup/restore cycle.

# gbak [-write {async, sync, direct}] . . .

For example:

# gbak -write direct -r foo.ibk foo.ib -user sysdba -password masterkey
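When these commands are scripted, the argument lists can be assembled programmatically. The following is a minimal sketch that uses only the options shown in the examples above; the helper names gfix_write_args and gbak_restore_args are hypothetical, and the code builds the argument lists without invoking the tools.

```python
WRITE_MODES = ("async", "sync", "direct")


def _check_mode(mode):
    # Reject anything other than the three documented write modes.
    if mode not in WRITE_MODES:
        raise ValueError(f"unknown write mode: {mode!r}")


def gfix_write_args(mode, database, user, password):
    """Argument list for: gfix -write <mode> <database> -user ... -password ..."""
    _check_mode(mode)
    return ["gfix", "-write", mode, database,
            "-user", user, "-password", password]


def gbak_restore_args(mode, backup, database, user, password):
    """Argument list for: gbak -write <mode> -r <backup> <database> ..."""
    _check_mode(mode)
    return ["gbak", "-write", mode, "-r", backup, database,
            "-user", user, "-password", password]


gfix_cmd = gfix_write_args("direct", "foo.ib", "sysdba", "masterkey")
gbak_cmd = gbak_restore_args("direct", "foo.ibk", "foo.ib", "sysdba", "masterkey")
```

On a machine where the InterBase tools are installed, either list could be passed to subprocess.run.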

The Services API supports the new and updated gfix and gbak options.

You can find the new arguments and their respective values in ibase.h.

API Guide Table 12.5: Service API database restore arguments

Argument: isc_spb_res_write_mode

Purpose: Set the write mode of the database. The next byte must be one of isc_spb_res_wm_async, isc_spb_res_wm_sync, or isc_spb_res_wm_direct. Corresponds to gbak -write.

Argument Length: 1 byte

Argument Value: byte

API Guide Table 12.6: Service API database properties arguments

The value isc_spb_prp_wm_direct is added to the following argument: isc_spb_prp_write_mode

The gstat command line tool displays the setting, direct, in the "Attributes" line of its header output.

# gstat -h foo.ib -user sysdba -password masterkey
. . .
Database header page information:
Flags 0
Checksum 12345
Write timestamp Mar 3, 2011 13:36:31
Page size 8192
ODS version 15.0
. . .
Creation date Feb 23, 2011 14:58:27
Attributes force write, direct, no reserve
Variable header data:
Sweep interval: 20000
*END*
. . .
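A script can confirm the setting by scanning the header output for the "Attributes" line. The following is a minimal sketch that assumes the output is laid out like the sample above; parse_attributes and has_direct_io are hypothetical helpers, not part of InterBase.

```python
def parse_attributes(gstat_header_output):
    """Return the comma-separated values of the "Attributes" line as a list."""
    for line in gstat_header_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Attributes"):
            values = stripped[len("Attributes"):]
            return [v.strip() for v in values.split(",") if v.strip()]
    return []  # no "Attributes" line found


def has_direct_io(gstat_header_output):
    """True if "direct" is listed among the database attributes."""
    return "direct" in parse_attributes(gstat_header_output)


sample = """Page size 8192
Creation date Feb 23, 2011 14:58:27
Attributes force write, direct, no reserve
"""
direct_enabled = has_direct_io(sample)
```

Matching the whole attribute name rather than a substring avoids false positives from other attributes that might merely contain the word.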

Note that the database must be set with the "gfix -write direct" option and then reloaded by the database engine for the setting to take effect.

Also, because the System File Cache is not used when "direct" I/O is set, it is recommended that you set the database cache size and the database linger interval suitably. This allows the most frequently used pages to remain in memory, in the InterBase database cache, when new connections are serviced.

The "direct" I/O setting is only possible if the database page size is an exact multiple of the sector size of the disk that holds the database file. For decades, the standard sector size on hard disks was 512 bytes. Newer hard disks, however, are adopting the Advanced Format of 4096 bytes per sector. InterBase supports the following database page sizes: 1024, 2048, 4096, 8192, and 16384 bytes per page. Databases that have a page size of 1024 or 2048 bytes cannot be set to "direct" I/O on hard disks that only support 4096 bytes per sector; you need to restore such a database with a larger page size before enabling "direct" I/O on those disks.
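The multiple-of-sector-size rule can be expressed as a short check. The following is a minimal sketch of the arithmetic only; direct_io_compatible is a hypothetical helper, and it does not query the actual sector size of a device.

```python
SUPPORTED_PAGE_SIZES = (1024, 2048, 4096, 8192, 16384)


def direct_io_compatible(page_size, sector_size):
    """True if page_size is an exact multiple of sector_size."""
    if page_size not in SUPPORTED_PAGE_SIZES:
        raise ValueError(f"unsupported page size: {page_size}")
    return page_size % sector_size == 0


# Page sizes usable with "direct" I/O for each sector size.
usable = {
    sector: [p for p in SUPPORTED_PAGE_SIZES if direct_io_compatible(p, sector)]
    for sector in (512, 4096)
}
```

With 512-byte sectors every supported page size qualifies; with 4096-byte sectors only 4096, 8192, and 16384 do.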

If you try to enable "direct" I/O on an incompatible device, the following error message is returned stating the minimum required database page size. The following example shows an error message where the disk sector size is 4096 bytes.

Error: must backup and restore to DB page size
>= 4096 bytes to support direct I/O on this device.
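The minimum page size named in the message can be derived from the sector size. The following sketch assumes that the minimum is the smallest supported page size that is an exact multiple of the sector size; minimum_direct_io_page_size is a hypothetical helper, not an InterBase function.

```python
SUPPORTED_PAGE_SIZES = (1024, 2048, 4096, 8192, 16384)  # ascending order


def minimum_direct_io_page_size(sector_size):
    """Smallest supported page size that is a multiple of sector_size.

    Returns None if no supported page size qualifies.
    """
    for page_size in SUPPORTED_PAGE_SIZES:
        if page_size % sector_size == 0:
            return page_size
    return None


minimum_for_advanced_format = minimum_direct_io_page_size(4096)
```

For a 4096-byte sector this yields 4096, which matches the value in the example message.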
