Creating a Sample Journal Archive

To get started, issue:

CREATE JOURNAL ARCHIVE <journal archive directory>

This activates journal archiving and performs the initial database dump.

Then copy the completed journal files to the archive directory, using the following syntax:

gbak -archive_journals <dbname>

In the archive directory listing below, the database dump EMPLOYEE.2006-08-21T15-48-17Z.1.DATABASE contains no database changes made after 2006-08-21 15:48:17. The dump is unaffected by updates that reach the main database while it is being dumped or after the dump finishes, including updates made by the checkpoint process.

        24 Aug 21 15:45 IB_JOURNAL
        24 Aug 21 15:45 IB_JOURNAL_ARCHIVE
 130399832 Aug 21 16:00 EMPLOYEE.2006-08-21T15-45-11Z.1.JOURNAL
 979562496 Aug 21 16:00 EMPLOYEE.2006-08-21T15-48-17Z.1.DATABASE
 130397262 Aug 21 16:00 EMPLOYEE.2006-08-21T15-51-51Z.2.JOURNAL
 130399932 Aug 22 18:13 EMPLOYEE.2006-08-21T15-57-03Z.3.JOURNAL
 130398336 Aug 22 18:13 EMPLOYEE.2006-08-22T18-06-19Z.4.JOURNAL
 130397418 Aug 22 18:14 EMPLOYEE.2006-08-22T18-10-52Z.5.JOURNAL
  35392721 Aug 23 00:27 EMPLOYEE.2006-08-22T18-14-47Z.6.JOURNAL
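
The file names in this listing appear to follow the pattern <database>.<timestamp>Z.<sequence>.<JOURNAL|DATABASE>. The Python sketch below splits a name into those parts, which may help when scripting against an archive directory. The pattern is inferred from the listing above rather than taken from a published specification, and parse_archive_name and ArchiveFile are illustrative names that are not part of InterBase.

```python
import re
from datetime import datetime
from typing import NamedTuple


class ArchiveFile(NamedTuple):
    """Parts of an archive file name (illustrative type, not an InterBase API)."""
    database: str
    created: datetime   # timestamp embedded in the name; no time zone is assumed
    sequence: int
    kind: str           # "JOURNAL" or "DATABASE"


# Assumed pattern, inferred from the sample listing:
#   <database>.<YYYY-MM-DD>T<HH-MM-SS>Z.<sequence>.<JOURNAL|DATABASE>
_NAME = re.compile(
    r"^(?P<db>.+)\.(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2})Z"
    r"\.(?P<seq>\d+)\.(?P<kind>JOURNAL|DATABASE)$"
)


def parse_archive_name(name: str) -> ArchiveFile:
    """Split an archive file name into database, timestamp, sequence and kind."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not an archive file name: {name!r}")
    created = datetime.strptime(match["ts"], "%Y-%m-%dT%H-%M-%S")
    return ArchiveFile(match["db"], created, int(match["seq"]), match["kind"])


if __name__ == "__main__":
    print(parse_archive_name("EMPLOYEE.2006-08-21T15-48-17Z.1.DATABASE"))
```

Control files such as IB_JOURNAL and IB_JOURNAL_ARCHIVE do not match the pattern, so the function raises ValueError for them; a directory scan would need to skip those names.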

Use the gstat -L EMPLOYEE.2006-08-21T15-48-17Z.1.DATABASE command to generate the following summary:

Database log page information:
 Creation date Aug 21, 2006 15:45:11
 Log flags: 1
 Recovery required
 Next log page: 0
 Variable log data:
 Control Point 1:
 File name: E:\EMPLOYEE_JOURNALS_AND_ARCHIVES\EMPLOYEE.2006-08-21T15-45-11Z.1.JOURNAL
 Partition offset: 0 Seqno: 1 Offset: 5694

This is what the log page of the main database looked like at precisely 2006-08-21 15:48:17. If you recover using this database dump, recovery starts with journal file EMPLOYEE.2006-08-21T15-45-11Z.1.JOURNAL at offset 5694 and continues through the last journal file, or up to the timestamp specified with the optional -UNTIL clause:

GBAK -ARCHIVE_R E:\EMPLOYEE_JOURNALS_AND_ARCHIVES\EMPLOYEE.2006-08-21T15-48-17Z.1.DATABASE E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB -UNTIL "2006-08-21 18:08:15"

The recovery is recorded in interbase.log:

IBSMP (Server) Tue Aug 22 22:49:08 2006
 Database: E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB
 Long term recovery until "2006-08-21 18:08:15" begin

IBSMP (Server) Tue Aug 22 22:49:09 2006
 Database: E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB
 Applying journal file: E:\EMPLOYEE_JOURNALS_AND_ARCHIVES\EMPLOYEE.2006-08-21T15-45-11Z.1.JOURNAL

IBSMP (Server) Tue Aug 22 22:51:38 2006
 Database: E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB
 Applying journal file: E:\EMPLOYEE_JOURNALS_AND_ARCHIVES\EMPLOYEE.2006-08-21T15-51-51Z.2.JOURNAL

IBSMP (Server) Tue Aug 22 22:53:24 2006
 Database: E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB
 Applying journal file: E:\EMPLOYEE_JOURNALS_AND_ARCHIVES\EMPLOYEE.2006-08-21T15-57-03Z.3.JOURNAL

IBSMP (Server) Tue Aug 22 22:55:44 2006
 Database: E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB
 Applying journal file: E:\EMPLOYEE_JOURNALS_AND_ARCHIVES\EMPLOYEE.2006-08-22T18-06-19Z.4.JOURNAL

IBSMP (Server) Tue Aug 22 22:55:57 2006
 Database: E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB
 Long term recovery end
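
To check which journal files a recovery applied, and when each one started, the log entries above can be scanned with a short script. The Python sketch below assumes the entry layout shown in this sample (a header line ending in a timestamp, followed by detail lines); applied_journals and SAMPLE_LOG are illustrative names, and the sample text is an excerpt of the log above.

```python
import re
from datetime import datetime

# Excerpt of the interbase.log entries shown above.
SAMPLE_LOG = r"""
IBSMP (Server) Tue Aug 22 22:49:08 2006
 Database: E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB
 Long term recovery until "2006-08-21 18:08:15" begin

IBSMP (Server) Tue Aug 22 22:49:09 2006
 Database: E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB
 Applying journal file: E:\EMPLOYEE_JOURNALS_AND_ARCHIVES\EMPLOYEE.2006-08-21T15-45-11Z.1.JOURNAL

IBSMP (Server) Tue Aug 22 22:51:38 2006
 Database: E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB
 Applying journal file: E:\EMPLOYEE_JOURNALS_AND_ARCHIVES\EMPLOYEE.2006-08-21T15-51-51Z.2.JOURNAL

IBSMP (Server) Tue Aug 22 22:55:57 2006
 Database: E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB
 Long term recovery end
"""

# Header line: "<host> (Server) <weekday> <month> <day> <time> <year>"
HEADER = re.compile(
    r"^\s*\S+ \(Server\)\s+(\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s*$",
    re.MULTILINE,
)
# \s* also matches a line break, so a path wrapped onto the next line is found.
APPLY = re.compile(r"Applying journal file:\s*(\S+)")


def applied_journals(log_text):
    """Return (entry time, journal path) pairs in the order they were logged."""
    # re.split with one capture group yields [preamble, ts1, body1, ts2, body2, ...]
    parts = HEADER.split(log_text)
    result = []
    for stamp, body in zip(parts[1::2], parts[2::2]):
        match = APPLY.search(body)
        if match:
            when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
            result.append((when, match.group(1)))
    return result


if __name__ == "__main__":
    for when, path in applied_journals(SAMPLE_LOG):
        print(when.isoformat(), path)
```

The difference between consecutive entry times gives a rough measure of how long each journal file took to apply.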

GBAK -ARCHIVE_DATABASE, which creates an archive database dump, never locks anything. The only archive management restriction is that archive operations are serialized: you cannot run multiple GBAK/GFIX operations against the archive at the same time. The important point is that the main database remains fully accessible at all times.

GBAK -ARCHIVE_JOURNALS <my_database> causes non-archived journal files to be copied to the archive (or marked as archived as above) when you do not want to dump the whole database. Again, a row is entered into RDB$JOURNAL_ARCHIVES for each archived journal file.

GFIX -ARCHIVE_SWEEP <sequence no.> <my_database> deletes all files in RDB$JOURNAL_ARCHIVES with RDB$ARCHIVE_SEQUENCE less than the requested sequence.
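
The selection rule for a sweep can be previewed before running it. The Python sketch below is an in-memory model only: it applies the "sequence less than the requested sequence" rule described above to a list of (sequence, file name) pairs taken from the sample listing. It does not query RDB$JOURNAL_ARCHIVES or delete anything, and split_for_sweep is an illustrative name.

```python
from typing import List, Tuple

# (RDB$ARCHIVE_SEQUENCE, archived file name); values copied from the sample listing
Row = Tuple[int, str]

JOURNALS: List[Row] = [
    (1, "EMPLOYEE.2006-08-21T15-45-11Z.1.JOURNAL"),
    (2, "EMPLOYEE.2006-08-21T15-51-51Z.2.JOURNAL"),
    (3, "EMPLOYEE.2006-08-21T15-57-03Z.3.JOURNAL"),
    (4, "EMPLOYEE.2006-08-22T18-06-19Z.4.JOURNAL"),
    (5, "EMPLOYEE.2006-08-22T18-10-52Z.5.JOURNAL"),
    (6, "EMPLOYEE.2006-08-22T18-14-47Z.6.JOURNAL"),
]


def split_for_sweep(rows: List[Row], sequence: int) -> Tuple[List[Row], List[Row]]:
    """Return (kept, deleted) for a sweep at the given sequence number.

    Rows with a sequence strictly less than `sequence` are the ones a sweep
    would delete; rows at or above it are kept.
    """
    deleted = [row for row in rows if row[0] < sequence]
    kept = [row for row in rows if row[0] >= sequence]
    return kept, deleted


if __name__ == "__main__":
    kept, deleted = split_for_sweep(JOURNALS, 4)
    print("would delete:", [name for _, name in deleted])
    print("would keep:  ", [name for _, name in kept])
```

Note that the comparison is strict, so the file whose sequence equals the requested number is kept.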

GFIX -ARCHIVE_DUMPS <number> <my_database> configures the maximum number of database dumps allowed in the archive. After issuing GBAK -ARCHIVE_DATABASE, archive management will automatically delete the oldest archive database dump and all earlier journal files if the dump limit has been exceeded by the addition of the new database dump.

GBAK -ARCHIVE_RECOVER <archive_directory/archive_database> <new_database> [-UNTIL <timestamp>] recovers a database from the archived journal files. Remember that <archive_directory> must be mounted for read access on the machine performing the recovery.
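
When recovery is scripted, it can help to assemble the command line in one place. The Python sketch below builds the argument list for the syntax described above without running anything. The function name archive_recover_args is illustrative, the timestamp format is copied from the -UNTIL example in this document, and any connection options your installation needs are omitted.

```python
from datetime import datetime
from typing import List, Optional


def archive_recover_args(
    archive_dump: str,
    new_database: str,
    until: Optional[datetime] = None,
) -> List[str]:
    """Build the argument list for GBAK -ARCHIVE_RECOVER.

    The list follows the order given in the syntax above:
    archive dump, new database, then the optional -UNTIL timestamp.
    """
    args = ["gbak", "-archive_recover", archive_dump, new_database]
    if until is not None:
        # Format assumed from the example in this document: "YYYY-MM-DD HH:MM:SS"
        args += ["-UNTIL", until.strftime("%Y-%m-%d %H:%M:%S")]
    return args


if __name__ == "__main__":
    args = archive_recover_args(
        r"E:\EMPLOYEE_JOURNALS_AND_ARCHIVES\EMPLOYEE.2006-08-21T15-48-17Z.1.DATABASE",
        r"E:\EMPLOYEE_RECOVER\EMPLOYEE.GDB",
        until=datetime(2006, 8, 21, 18, 8, 15),
    )
    print(args)
```

The resulting list can be passed to subprocess.run, which avoids shell quoting problems with the space inside the timestamp.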

Archive directories can be located on InterBase servers or on passive file servers and appliances. The archived files are opened directly by clients, not through an InterBase server. Archive database dumps are sealed, so you can simultaneously run database validation (which usually requires exclusive access), run a logical GBAK, and have multiple same-platform machines on the network attach the database for read-only queries. Attaching over the network implies high levels of page I/O over the network.

If the most current, non-archived journal files are accessible from the machine where the recovery is being executed, the recovery process "jumps" to those journal files to recover the most recently committed transactions, notwithstanding the optional -UNTIL clause. The recovered database is divorced from any journal or journal archive, so you must define them again if you want them.

However, it is more useful to leave the recovered database in a perpetual state of long term recovery. That is, every time after the first GBAK -ARCHIVE_RECOVER, subsequent GBAK -ARCHIVE_RECOVER statements apply the incremental journal changes. This provides perfect symmetry with the online dump feature:

rem Full online dump
gbak -dump employee.gdb dump.ib
rem Incremental dump
gbak -dump employee.gdb dump.ib 
rem Incremental dump
gbak -dump employee.gdb dump.ib 
rem Divorce from main DB, and make the dump database online for read-write operations
gfix -mode read_write dump.ib

rem Archive Database
gbak -archive_database employee.gdb 
rem Archive Journals
gbak -archive_journals employee.gdb
rem To recover, find the latest employee.gdb.*.database file in the archive folder, and recover from that. For example, if the latest full database archive file is employee.gdb.5.database, execute the following archive recover command:
gbak -archive_recover employee.gdb.5.database recover.ib
rem  Make the recovered database online for read-write operations. Note: the recovered database does not have any journal archive setup at this point. You will need to set this up again.
gfix -mode read_write recover.ib

Note: The sample above is a Windows batch script.

Keeping the recovered database in long term recovery is much more efficient, because a full archival recovery can take hours depending on the volume of journal changes.

If the recovered database were divorced automatically, you would save the second it takes to type GFIX -MODE READ_WRITE, but you would have to run another full recovery, which can take hours, whenever you wanted a more recent copy. Because you must run GFIX -MODE READ_WRITE yourself to divorce, you gain hours of efficiency: each subsequent GBAK -ARCHIVE_RECOVER applies only the journal changes made since the previous one. This also means that the recovered database can be deployed more quickly if the main database is lost. In the meantime, it can function as a more up-to-date READ_ONLY database for queries and reporting.

Lastly, the journal archive is never implicitly dropped as a side-effect of DROP DATABASE or DROP JOURNAL. It is necessary to explicitly issue a DROP JOURNAL ARCHIVE statement before DROP DATABASE. The journal archive potentially represents the last known source of the contents of a dropped database so it is intentionally difficult to delete.