Multi-Generational Architecture
Multi-generational architecture derives its name from the process by which InterBase updates a record. Each time a record is updated or deleted, a new copy (called a generation, or version) of the record is created. The main benefit of this approach is that writers do not block readers. This means you can run a single query for weeks while people are updating the database. The answer you get from the query will be consistent with the committed contents of the database when you started your query transaction. How does this work?
Every operation on the database, whether it is a read or a write, is stamped with a transaction number. Transaction numbers are assigned sequentially in ascending order. A user who has transaction number 20 started work with the database earlier in time than someone with transaction 21. How much earlier, one cannot tell; all you know is that transaction 20 started before transaction 21. You also know what state transaction 20 was in when you started transaction 21: it was either active, committed, rolled back, limbo, or dead. For this discussion, we are concerned only with active, committed, and rolled back.
Every record in the database is stamped with the transaction number that inserted, updated, or deleted the record. This number is embedded in the record header. When a record is changed, the old version of the record is kept with the old transaction number and the new version gets the transaction number that changed it. The new version of the record has a pointer to the old version of the record. The old version of the record has a pointer to the prior version of it, and so on. There is a mechanism in place to determine how many old versions need to be kept. If necessary, it will keep every version that has been created.
When you update a record in the database, the old version is compared to the new version to create a Back Difference Record (BDR). The BDR is moved to a new location and the new version is written in the same location where the original version was. Even though we keep old versions of records around, the BDR will never be larger than its ancestor. Usually, it will be very small unless you are changing the whole record. With deletions, the new version is even smaller: the version being deleted is kept intact as a BDR, and the new version carries only the current transaction number and a flag indicating that the record is deleted.
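The idea of a back difference can be illustrated with a minimal Python sketch. It does not reflect InterBase's on-disk format: records are modeled as dictionaries that share the same set of fields, and the function names make_back_difference and apply_back_difference are hypothetical, chosen only for this illustration.

```python
def make_back_difference(old, new):
    """Return only the old values of the fields that the update changed.

    Assumption: `old` and `new` are dicts with the same set of fields.
    """
    return {field: value for field, value in old.items() if new[field] != value}


def apply_back_difference(new, diff):
    """Rebuild the old version from the new version plus the back difference."""
    restored = dict(new)
    restored.update(diff)
    return restored


# Hypothetical record: only the "city" field changes in the update.
old_version = {"id": 1, "name": "Ann", "city": "Oslo"}
new_version = {"id": 1, "name": "Ann", "city": "Bergen"}
bdr = make_back_difference(old_version, new_version)
previous = apply_back_difference(new_version, bdr)
```

When only one field changes, the difference holds just that field's old value, which is why a BDR is usually much smaller than the record it describes.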
Now let's take a look at an added benefit: the ability to lock a record without taking out an explicit record lock. Assume that transaction 21 (t21) wants to update a record that you are viewing with transaction 20 (t20). If t21 updates the record before you can issue your update, then t21 has effectively locked the record, because the new version of the record is stamped with transaction number 21. If t20 tries to update the record later, even a split second later, the system immediately detects that there is a new version of the record and denies the update. There are simple rules for dealing with transactions and record versions:
- If your transaction number is less than the transaction number of the record, then you cannot see or update it.
- If your transaction number is equal to the transaction number of the record, then you can see and/or update it.
- If your transaction number is greater than the transaction number of the record AND that transaction was committed before you started your transaction, then you can see and/or update it.
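The three rules above can be sketched in Python as follows. This is a simplified model under stated assumptions, not InterBase's implementation: the State enum, the state_at_my_start mapping (the state of each transaction as it was when your transaction started), and the functions can_see, visible_version, and can_update are all hypothetical names introduced for this illustration.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ROLLED_BACK = "rolled back"


def can_see(my_txn, record_txn, state_at_my_start):
    """Apply the three rules to one record version."""
    if my_txn < record_txn:
        return False   # rule 1: the version was written by a later transaction
    if my_txn == record_txn:
        return True    # rule 2: the version is my own work
    # rule 3: an older transaction, visible only if it had committed before I started
    return state_at_my_start.get(record_txn) is State.COMMITTED


def visible_version(my_txn, chain, state_at_my_start):
    """chain is a list of (txn_number, data) pairs, newest version first."""
    for record_txn, data in chain:
        if can_see(my_txn, record_txn, state_at_my_start):
            return data
    return None


def can_update(my_txn, chain, state_at_my_start):
    """An update is allowed only if the newest version is one I can see."""
    newest_txn, _ = chain[0]
    return can_see(my_txn, newest_txn, state_at_my_start)
```

In the t20/t21 scenario, the chain would be [(21, "new"), (10, "old")] with transaction 10 committed. Under this model, t20 still reads "old" but cannot update the record, while t21 reads and may update "new".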
Garbage Collecting
Even with our efficiencies in keeping the BDRs few in number, the database can still accumulate a great many unnecessary record versions (garbage). There are two ways to clean the garbage out of the database. The first is called Cooperative Garbage Collection. It happens automatically every time a record is touched by a select, update, or delete operation. When the record is touched, the InterBase kernel follows the pointers to each BDR and compares its transaction number with what is called the Oldest Interesting Transaction (OIT). This number is kept in the header page for the database. If the BDR transaction number is less than the OIT, then the BDR can be purged from the database and the space reclaimed for new data. Cooperative garbage collection does not clean up deleted records and their BDRs.
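The purge rule described above can be sketched in Python. This is a simplified model that assumes a version chain is a list of (transaction number, data) pairs with the current version first; the function name purge_old_versions is hypothetical, and real page-level space reclamation is not modeled.

```python
def purge_old_versions(chain, oit):
    """Drop back versions whose transaction number is less than the OIT.

    chain: list of (txn_number, data) pairs, newest first; chain[0] is the
    current version and is always kept.
    Returns (new_chain, number_of_versions_purged).
    """
    current, back_versions = chain[0], chain[1:]
    kept = [version for version in back_versions if version[0] >= oit]
    return [current] + kept, len(back_versions) - len(kept)


# Hypothetical chain: versions written by transactions 30, 25, and 12.
chain_after, purged = purge_old_versions([(30, "c"), (25, "b"), (12, "a")], oit=20)
```

With an OIT of 20, only the version written by transaction 12 falls below the threshold, so one back version is purged and the other two versions remain.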
For more information on the OIT and sweeping, see Overview of Sweeping.
Sweeping
The second method is called sweeping the database. A sweep can be started either manually or automatically. By default, when the OIT is 20,001 transactions less than the Oldest Active Transaction number, the process that tried to start the transaction will sweep the entire database and remove as many BDRs as possible. While this is happening, other users can continue to use the database. This threshold can be changed.
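The automatic trigger can be expressed as a simple comparison. The sketch below assumes the default described above (a sweep starts once the gap reaches 20,001 transactions, that is, once it exceeds 20,000); the constant and function names are hypothetical and are not part of any InterBase API.

```python
DEFAULT_SWEEP_INTERVAL = 20_000  # assumed default: sweep when the gap exceeds this


def should_sweep(oit, oat, interval=DEFAULT_SWEEP_INTERVAL):
    """Return True when the OIT lags the Oldest Active Transaction by more
    than `interval` transactions."""
    return oat - oit > interval
```

For example, with an OIT of 100, an Oldest Active Transaction of 20,100 does not trigger a sweep (the gap is exactly 20,000), but 20,101 does. Passing a different interval models a changed threshold.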
If you are going to start the sweep manually, you should first make sure that no one is connected to the database. Under those conditions, the sweep not only cleans out the BDRs and the erased records, but also updates the OIT number on the header page to be one less than the Oldest Active Transaction number. It can do this because there are no other active transactions that might need to see any of the BDRs.
Backup and Restore as Database Maintenance
Periodically you will want to shut down the database, back it up, and restore it. Note that the backup is performed as a transaction, which means that it sees only a snapshot of the committed records in the database at the time the backup began. As a result, only the current committed version of each record is backed up, and the restore places all the data for each relation on contiguous pages in the database. The restore also rebuilds all the indices and resets the statistics for each. This usually increases performance significantly.
Now take this one step further. All of the metadata is stored in InterBase tables. This means that it is also multi-generational and has transaction numbers associated with it. If you change the metadata (for example, a field type), you are actually changing records in an InterBase table. The old versions are kept, and data stored using the structure specified by old metadata versions is not changed to match your alteration of the database structure. Because the metadata has old versions, it is possible for one record to be stored in the most current structure and the next record in a different, older structure. InterBase resolves these differences using the transaction numbers and the metadata. When you run a query that returns data stored in an old structure, InterBase retrieves the data and must dynamically convert it to the current structure. If the data is in a structure that is three generations old, it goes through three conversions before being returned to you. Do a backup and restore, and the data is finally physically converted to the current structure.
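The chained conversion described above can be pictured with a short Python sketch. Everything in it is an assumption made for illustration: the format numbers, the CONVERTERS table, the example fields, and the function upgrade_record are hypothetical and do not represent InterBase's internal record formats.

```python
# Hypothetical converters: CONVERTERS[n] turns a record in format n into format n + 1.
CONVERTERS = {
    1: lambda r: {**r, "qty": int(r["qty"])},     # format 1 -> 2: text column becomes integer
    2: lambda r: {**r, "unit": "each"},           # format 2 -> 3: new column with a default
    3: lambda r: {**r, "qty": float(r["qty"])},   # format 3 -> 4: integer becomes float
}


def upgrade_record(data, record_format, current_format, converters=CONVERTERS):
    """Convert a stored record one generation at a time until it matches the
    current structure. Returns (converted_data, number_of_conversions)."""
    steps = 0
    while record_format < current_format:
        data = converters[record_format](data)
        record_format += 1
        steps += 1
    return data, steps
```

A record stored in format 1 and read when the current format is 4 passes through three conversions on every read. After a backup and restore, the stored record is already in format 4 and needs none.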
All this is done in the name of performance. When you commit a transaction, the record versions created by operations during that transaction are already written in the database, so only the status of the transaction has to be updated. Thus, commit and rollback are fast. Similarly, when you make a metadata change, InterBase does not change all the data to match the new structure; there may be gigabytes of data to change. Thus, metadata changes are fast.
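The reason commit and rollback are cheap can be seen in a toy model. The TinyStore class below is hypothetical and greatly simplified: it assumes that versions are written as soon as the operation happens and that a separate table records each transaction's state.

```python
class TinyStore:
    """Toy model: versions are written immediately; commit only flips a state."""

    def __init__(self):
        self.next_txn = 1
        self.state = {}      # txn number -> "active", "committed", or "rolled back"
        self.versions = []   # (txn number, key, value), appended at write time

    def start(self):
        txn = self.next_txn
        self.next_txn += 1
        self.state[txn] = "active"
        return txn

    def write(self, txn, key, value):
        self.versions.append((txn, key, value))

    def commit(self, txn):
        self.state[txn] = "committed"     # no record versions are touched

    def rollback(self, txn):
        self.state[txn] = "rolled back"   # versions remain as garbage for later cleanup

    def committed_values(self, key):
        return [v for t, k, v in self.versions
                if k == key and self.state[t] == "committed"]
```

In this model, both commit and rollback change a single entry in the state table regardless of how many record versions the transaction wrote. Versions left behind by a rolled-back transaction are simply never treated as committed.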