How Journaling Works


Journaling turns off forced writes to the database file on the hard drive and writes the updates synchronously to the journal file instead. Because journal file writes are sequential rather than random, they require little or no disk head movement, which improves database performance.

To save changed pages in the database cache to the hard disk, you set up journaling checkpoints to occur automatically. A checkpoint specifies the time at which InterBase must save all the changed pages in the database cache to the database file. After the checkpoint has been reached, the data in the journal file is no longer needed, so the file can be reused. For best performance, place the journal files on a dedicated hard drive. The journal files must be on the database server machine.
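
The relationship between the journal, the database cache, and checkpoints can be illustrated with a toy model. The sketch below is not InterBase's implementation: the class ToyJournaledDatabase and its methods are hypothetical names, Python lists and dictionaries stand in for files on disk, and the crash-recovery step assumes the usual replay-the-journal approach.

```python
class ToyJournaledDatabase:
    """Toy model of journaling with checkpoints (illustrative names only)."""

    def __init__(self):
        self.database_file = {}   # page number -> contents "on disk"
        self.cache = {}           # changed pages not yet saved to the database file
        self.journal = []         # sequential, append-only record of updates

    def update(self, page, contents):
        # The update is appended to the journal first (a sequential write),
        # then kept in the cache; the database file is not touched yet.
        self.journal.append((page, contents))
        self.cache[page] = contents

    def checkpoint(self):
        # Save every changed page to the database file. The journal entries
        # are then no longer needed, so the journal space can be reused.
        self.database_file.update(self.cache)
        self.cache.clear()
        self.journal.clear()

    def recover_after_crash(self):
        # A crash loses the cache; replaying the journal restores the
        # updates made since the last checkpoint (assumed recovery model).
        self.cache.clear()
        for page, contents in self.journal:
            self.database_file[page] = contents
        self.journal.clear()


db = ToyJournaledDatabase()
db.update(1, "A")
db.checkpoint()            # page 1 is now in the database file
db.update(2, "B")          # page 2 exists only in the journal and the cache
db.recover_after_crash()   # replaying the journal restores page 2
print(db.database_file)    # {1: 'A', 2: 'B'}
```

In this model, frequent checkpoints keep the journal short, while infrequent checkpoints leave more work for recovery.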

As long as operating system and hardware caching are disabled, journaling guarantees that all changes are on disk before a transaction is marked committed.

You do not need to use journal archiving to use journaling and journal files. However, journal archiving lets you recover from a disaster that completely destroys the database server.

When a database is configured for direct I/O, adding journaling does not automatically convert the database to asynchronous buffered I/O, as it does when the database is configured for synchronous buffered I/O. This ensures that buffered I/O is never used on a database that is set to direct I/O.

InterBase uses buffered file I/O on all platforms to read and write database pages in the file on disk. The pages are delivered through the System File Cache, which acts as a duplicate store of the pages in RAM. Subsequent loads of the same pages are served quickly by the OS kernel if the pages exist in the System File Cache. On systems where there is high contention with other files for the System File Cache (a shared pool used by all processes for buffered file I/O), the performance of InterBase may not be optimal. If the available System File Cache is limited by the amount of RAM, the kernel must spend time cleaning up unused blocks of memory from other processes in addition to servicing each new block I/O request.

The performance problem is alleviated by using "direct I/O" (also known as non-buffered I/O) so blocks of pages are directly read from the disk into the process space and do not need to use the System File Cache.

Direct I/O is supported on Windows only. If you try to enable this setting for a database on any other platform, you will see the following error:

feature is not supported
-direct I/O operation

If a database enabled with "direct" I/O is then copied to an older version of InterBase, the setting will not be used by the older InterBase server. The older server will employ the "sync" write mode in this case.

The following tools and APIs support the "direct" I/O write mode:

  • The gfix command line tool has been modified to allow setting a database to be in "direct" I/O write mode.
# gfix [-write {async, sync, direct}] . . .
For example:
# gfix -write direct foo.ib -user sysdba -password masterkey
  • The gbak command line tool has a new, optional restore setting that overrides a database's write mode. The write mode is preserved across a backup/restore cycle.
# gbak [-write {async, sync, direct}] . . .
For example:
# gbak -write direct -r foo.ibk foo.ib -user sysdba -password masterkey
  • The Services API supports the new and updated gfix and gbak options. You can find the new arguments and their values in ibase.h.

Service API database restore arguments

Argument: isc_spb_res_write_mode
Purpose: Set the write mode of the database. The next byte must be one of isc_spb_res_wm_async, isc_spb_res_wm_sync, or isc_spb_res_wm_direct. Corresponds to gbak -write.
Argument length: 1 byte
Argument value: byte
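
As an illustration of the "argument followed by one byte" layout described above, the following sketch assembles that two-byte clause. The helper name restore_write_mode_clause is hypothetical, and the numeric values are placeholders chosen only for the example; the real values of isc_spb_res_write_mode and the isc_spb_res_wm_* constants must be taken from ibase.h.

```python
def restore_write_mode_clause(write_mode_tag: int, mode_value: int) -> bytes:
    """Return the two-byte clause: the argument tag followed by one mode byte."""
    for name, value in (("write_mode_tag", write_mode_tag),
                        ("mode_value", mode_value)):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} must fit in one byte, got {value}")
    return bytes([write_mode_tag, mode_value])


# PLACEHOLDER numbers for illustration only; the real values of
# isc_spb_res_write_mode and isc_spb_res_wm_direct are defined in ibase.h.
PLACEHOLDER_WRITE_MODE_TAG = 0x01
PLACEHOLDER_WM_DIRECT = 0x02

clause = restore_write_mode_clause(PLACEHOLDER_WRITE_MODE_TAG,
                                   PLACEHOLDER_WM_DIRECT)
print(clause.hex())  # 0102
```

In a real program the clause would be one part of a larger request buffer for a restore; the rest of that buffer is outside the scope of this sketch.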

Service API database properties arguments

The value isc_spb_prp_wm_direct is added to the following argument: isc_spb_prp_write_mode

  • The gstat command line tool shows the direct setting in its "Attributes" header line output.
# gstat -h foo.ib -user sysdba -password masterkey
. . .
Database header page information:
Flags 0
Checksum 12345
Write timestamp Mar 3, 2011 13:36:31
Page size 8192
ODS version 15.0
. . .
Creation date Feb 23, 2011 14:58:27
Attributes force write, direct, no reserve
Variable header data:
Sweep interval: 20000
*END*
. . .
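
A script can check for the direct setting by reading the "Attributes" line of the header output. The following is a minimal sketch that assumes the line layout shown in the sample above (the word Attributes followed by comma-separated values); database_attributes is a hypothetical helper, not part of any InterBase tool, and the sample text is an abbreviated copy of the output above.

```python
def database_attributes(gstat_header_output: str) -> list:
    """Return the comma-separated values of the 'Attributes' line, if any."""
    for line in gstat_header_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Attributes"):
            values = stripped[len("Attributes"):]
            return [v.strip() for v in values.split(",") if v.strip()]
    return []  # no Attributes line found


# Abbreviated copy of the sample header output shown above.
sample = """\
Database header page information:
Page size 8192
ODS version 15.0
Creation date Feb 23, 2011 14:58:27
Attributes force write, direct, no reserve
"""

attributes = database_attributes(sample)
print("direct" in attributes)  # True
```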

Note that the database must be set with the "gfix -write direct" option and then reloaded by the database engine for the setting to take effect.

Also, because the System File Cache is not used when "direct" I/O is set, it is recommended that you set the database cache size and the database linger interval appropriately. This keeps the most frequently used pages in memory, in the InterBase database cache, when new connections are serviced.

The "direct" I/O setting is only possible if the database page size is an exact multiple of the sector size of the disk that holds the database file. For decades, the standard has been 512 bytes per sector on hard disks. Newer hard disks, however, are adopting the Advanced Format of 4096 bytes per sector. InterBase supports the following database page sizes: 1024, 2048, 4096, 8192, and 16384 bytes per page. Databases that have a page size of 1024 or 2048 bytes cannot be set to "direct" I/O on hard disks that only support 4096 bytes per sector; you need to restore your database with a larger page size on such disks before enabling "direct" I/O.
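
The exact-multiple rule can be expressed as a short check. This is a minimal sketch of the arithmetic only; the function names are hypothetical, and the code does not query the actual sector size of a disk.

```python
SUPPORTED_PAGE_SIZES = (1024, 2048, 4096, 8192, 16384)  # from the text above


def supports_direct_io(page_size: int, sector_size: int) -> bool:
    """True if the page size is an exact multiple of the disk sector size."""
    return page_size % sector_size == 0


def minimum_page_size(sector_size: int):
    """Smallest supported page size usable with direct I/O, or None."""
    for page_size in SUPPORTED_PAGE_SIZES:
        if supports_direct_io(page_size, sector_size):
            return page_size
    return None


print(supports_direct_io(2048, 512))    # True
print(supports_direct_io(2048, 4096))   # False
print(minimum_page_size(4096))          # 4096
```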

If you try to enable "direct" I/O on an incompatible device, an error message is returned that states the minimum required database page size. The following example shows the message for a disk sector size of 4096 bytes:

Error: must backup and restore to DB page size
>= 4096 bytes to support direct I/O on this device.

How Journal Archiving Works

The purpose of journal archiving is to provide effective and efficient disaster recovery. As mentioned above, a journal archive is a directory that contains a full database dump and all of the completed journal files that have been archived since that dump. As such, a journal archive enables you to recover to the last committed transaction in the most recently archived and completed journal file.

Important:
For disaster recovery purposes, a journal archive should always be located on a different machine from the one that houses the database server, ideally in a remote location.

Only completed journal files are archived to the archive directory. This means that up-to-the-moment recovery is possible only when the hard drive that contains the current, active, unarchived journal file remains intact. If a disaster wipes out the hard drive that contains the active, incomplete journal file, the data in that file is also lost.

Note:
Before you can activate journal archiving, you must first enable journaling. For instructions on how to do so, see Enabling Journaling and Creating Journal Files. For instructions on how to activate journal archiving, see Using Journal Archiving.
