Tablespace

From InterBase

An InterBase database maybe be contained in a single file or across multiple files. In order to support databases greater in size than 2GB it was necessary to allow the database to have multiple files. An InterBase database may contain a maximum of 2 billion database pages, which, depending on the page size, can span from 2TB to 32TB. Logical database pages seamlessly spillover to the next database file when the prior database file has filled up. Prior to InterBase 2020 there was no way to assign database objects to a particular file, many tables had fragments uncontrollably spread across all of the database files.

Tablespace allow a database object definition to specify a storage location in a particular logical file collection. This constitutes a physical schema independent of the logical schema of the database. A database administrator can leverage tablespace characteristics to optimize runtime database performance. Note that a suitably configured RAID subsystem can provide general performance gains in a less granular manner.

In this context, today's InterBase multifile database may be interpreted as occupying a single primary (unnamed) tablespace.

These are some tablespace characteristics and their exploits:

Placement: Indexes and tables may be placed on different devices for parallel I/O or faster I/O i.e. SSD drives. Similarly, commonly joined tables may be placed on different tablespace for the same purpose.

Archival: Historical or audit tables may be placed together to prevent contention with production (OLTP) tables on a different tablespace.

Management: It may be possible to employ different backup/restore schedules and techniques for tablespace with the aforementioned characteristics.

Resources: It may be possible to deploy tablespace with different page sizes and page buffer caches for more efficient access methods. Furthermore, each tablespace can specify encryption, journaling and write mode options appropriate for its contained database objects.

Growth: An InterBase database size is unconstrained via the definition of additional tablespace.

Automountable: It should not be necessary for any individual tablespace (except the primary) to be available for a database to be used. The tablespace can automount on database object access. If it is not available then only those affected queries will fail.

Transportable: If a tablespace is designed as self-describing with its own metadata, it may be possible to attach it to other databases in a dynamic fashion.

Temporal: Temporary tables (global or local (module-specific)), materialized views, time-series, versioned and temporal tables, and (TMP$) system monitor tables may be placed in their own tablespace.


Usage

This section explains the data definition language (DDL) to manage the lifecycle of a tablespace. There is no data manipulation language for developers as a tablespace is an artifact of the physical schema. For application development, a tablespace is invisible to the developer. However, database administrators need some knowledge of tablespace for backup and restore operations at the database and tablespace level. This is described by companion documentation "InterBase Tablespace - Backup and Restore Operations".

It should be noted that the tablespace concept is not part of the official SQL standard. It is in good company though, as neither luminary concepts such as database and index exist in the standard. It follows that the DDL described here is an InterBase-specific SQL dialect.

A tablespace is a targeted container for database objects. The main database file and its secondary files constitute the primary tablespace with a keyword name of PRIMARY. The primary tablespace comes into existence implicitly as a consequence of database creation. Thereafter secondary tablespace must have their life cycle explicitly managed using the appropriate CREATE, ALTER and DROP syntax. Tablespace syntax is similar to that of similar database level syntax.

First, a tablespace must be created. It can have storage attributes different from the primary tablespace, notably a page size that is most appropriate for the database objects allocated to the tablespace. If a tablespace page size is equal to the primary tablespace then it uses the same database page buffer cache; if it is a different size then a separate page buffer cache of the default size is created. The number of page buffers can be modified by altering the tablespace with the SET PAGE CACHE clause.

Like a database, tablespace can consist of multiple distinct files. This is mostly for backward compatibility; with 64-bit filesystems this capability is not as powerful of an option as it used to be.

A tablespace goes out of existence when it is dropped. As database objects contained by a tablespace can have dependents in other tablespace, the DROP statement has a RESTRICT behavior whereby the drop fails because of these dependencies. The error can only be resolved by dropping the dependents or altering the tablespace of the database objects that have the dependencies. This has the effect of migrating or recreating those database objects in the remaining tablespace.

Note: By default indices reside in the same tablespace as that of the table unless a different tablespace is specified.

CREATE TABLESPACE <tablespace_name> FILE 'filespec'
DROP TABLESPACE <tablespace_name>

After creating a tablespace, it is necessary to assign or allocate database objects to it. Only tables and their indexes can be targeted for residence in a tablespace. Tables and their indexes can exist in different tablespace; they do not have to be co-resident. By default, any index created for the table resides in the same tablespace as that of the table; this includes user defined indexes, and referential integrity constraint enforcing indexes such as PRIMARY KEY and FOREIGN KEY. Use the TABLESPACE clause to assign to a tablespace. As mentioned above, the unnamed primary tablespace is referenced by the keyword PRIMARY. If no TABLESPACE clause is given (as is the default or usual case) then the database object is created in the primary tablespace.

If a database object already exists, it can be reassigned from its current tablespace to a new tablespace; for tables, a table's content is moved to the new tablespace; for indexes, they are recreated in the new tablespace. During these operations the database objects must be locked exclusively until the reorganization has completed.

CREATE TABLE <table_name>
    <column list>
    [TABLESPACE {<tablespace_name> | PRIMARY}]

CREATE INDEX <index_name>
    <column list>
    [TABLESPACE {<tablespace_name> | PRIMARY}]

ALTER TABLE <table_name>
    <column list>
    [ALTER TABLESPACE {<tablespace_name> | PRIMARY}]

ALTER INDEX <index_name>
    <column list>
    [ALTER TABLESPACE {<tablespace_name> | PRIMARY}]

Note: By default indices reside in the same tablespace as that of the table unless a different tablespace is specified. For more information on table spaces refer to the Tablespace documentation.

ISQL support

The command line tool, ISQL, provides some helper commands to browse through the entities in a specific tablespace. Also, ISQL can now extract tablespace specific metadata information when ISQL -a is executed on the database.

SHOW TABLESPACE
SHOW TABLESPACE [<tablespace_name>]

Show tablespace specific properties such as FILE reference, and others such as page size, encryption key, page cache etc.

By default, providing no tablespace name reference, just lists the names of existing tablespaces in the database. This is similar to SHOW TABLESPACES.

SHOW TABLESPACES

List the names of existing tablespaces in the database.

SHOW TABLES IN TABLESPACE
SHOW TABLES IN TABLESPACE [<tablespace_name> | PRIMARY ]

List the names of tables in the tablespace provided. By default, providing no tablespace name reference, just lists the names of tables in the default PRIMARY tablespace

SHOW INDICES IN TABLESPACE
SHOW INDICES IN TABLESPACE [<tablespace_name> | PRIMARY ]

List the names of indices in the tablespace provided. By default, providing no tablespace name reference, just lists the names of indices in the default PRIMARY tablespace

Requirements and Constraints

  • A database may have 255 secondary named tablespaces in addition to the primary unnamed tablespace.
  • A tablespace does not have to be available to connect to the database it is contained by. A tablespace is automounted when a database object it contains is referenced. If the tablespace is not available at that point then a tablespace error is raised.
  • Support for tablespaces starts with ODS version 18.
  • By default, a blob or array is created in the primary tablespace. If the blob is assigned to a table contained by a secondary tablespace, it is necessary to move the blob from the primary tablespace to that tablespace. As this is sub-optimal, a blob parameter block (BPB) can be passed to the isc_create_blob2() API to specify the target table. If the isc_blob_gen_bpb2() API is used then be sure to specify the target table name in the ISB_BLOB_DESC to_desc parameter.

Otherwise, a BPB can be generated manually:

isc_bpb_version1, isc_bpb_target_relation_name, <name_length>, <comma delimited, byte-by-byte table_name)>

When the blob is populated by calls to isc_put_segment(), the blob data will be placed in the same tablespace as the table to which it will ultimately be assigned and avoid the inter-tablespace move.

  • Auto-generated indices in support of SQL Integrity and Referential Constraints are always created by the system in the PRIMARY tablespace. There is no tablespace clause for those DDL commands to indicate an alternative tablespace assignment. However, it is possible to use the system-generated names of such indices in an ALTER INDEX ... ALTER TABLESPACE statement to move them from the PRIMARY tablespace to another secondary tablespace.
  • ALTER TABLE ... ALTER TABLESPACE does not move a table's associated indices. Those indices stay put in their current tablespace, The converse is also true: When an index is moved, its table does not move with it.
  • The default location for a table is the PRIMARY tablespace. If a TABLESPACE clause is not provided in a CREATE statement, tables are created in the PRIMARY tablespace. When a tablespace clause is not provided in CREATE INDEX, the default location of that index is in the same tablespace where the table resides.

Backup, Restore and Recovery

Description

Backup, restore and recovery services are enhanced to support tablespaces. This document describes the functional specification of new capabilities provided by the GBAK command-line tool as manifested a new switch. It also covers the behavioral effects of the new switch, which is quite different from the behavior of a full database backup, restore or recovery. It is assumed that this switch and functionality are accessible through InterBase's services management facility.

A database can now be organized as a collection of tablespaces, which can contain a user-specified set of tables and indexes. In a related manner, individual tablepaces can be backed up and restored. A consequence of this flexibility is that data restored from individual tablespace backups may be from a different point in time than data in other tablespaces because of different transaction snapshots used to perform the backup. Therefore, it requires careful planning with respect to tablespace content and referential and integrity constraints, which may cross tablespace boundaries.


User interface/Usability

The gbak command-line tool introduces the {+|-} tablespace switch for specifying one or more named tablespaces. When this switch is not present, gbak will backup, restore or recover all tablespaces comprising the full database. Full database restoration or recovery of a database can only occur if the backup file was created with no tablespace switches. Such a backup file is called a database backup and includes all metadata and user data for the database. If a tablespace switch is given then the backup file is called a tablespace backup and only includes user data for the tables contained by the tablespace; It does not include any metadata except for what is needed to restore the tables contained by the tablespace backup file. A tablespace backup can include multiple named tablespaces.

In the case of restoring the PRIMARY tablespace from a tablespace backup, the primary tablespace file is not written over as doing so would destroy the database metadata. Instead, the user tables are truncated and then the data from the backup file is inserted. This is the behavior for a logical database or tablespace backup.

When individual tablespaces are restored either -c(reate) or -r(estore) must also be specified. The -c switch provides a "soft" restore of user tables, by performing a TRUNCATE operation the table. This allows tables and indices that are created after the backup file is made to be preserved. The -r switch deletes the tablespace files at the file system level and recreates the files at the same or different location. If the existing file has been corrupted, the -r switch may be the only choice.

<tablespace-switch> ::= {<tablespace-include> | tablespace-exclude>}
<tablespace-include> ::= +TABLESPACE <tablespace-specification> [<tablespace_switch> ...]
<tablespace-exclude> ::= -TABLESPACE <tablespace-specification> [<tablespace_switch> ...]
<tablespace-specification> ::= <tablespace-name> <optional-file-name>
<tablespace-name> := <SQL identifier (regular or delimited)>

The + or - switch designator is used to include or exclude one or more tablespaces from the tablespace backup.

As an example, assume the InterBase Employee database is stored at /databases/employee.idb It is composed of the default PRIMARY tablespace, the HR tablespace stored at /tablespaces/hr.its, and the EMPLOYEE tablespace stored at /tablespaces/employee.its. Here are some gbak command examples:

Logical Database Backup

The following two examples show that full database backup and restore operate as usual:

/* Backup entire database as usual to a single file */
gbak -b employee.idb employee.idbk
 
/* Restore entire database and implicitly restore tablespaces HR and EMPLOYEE to their original file locations. This overwrites the primary database file and both tablespace files. */
gbak -r employee.idbk employee.idb

This example shows that individual tablespaces can be extracted from a full database backup.

/* Restored to file location in backup file */
gbak -r employee.idbk employee.idb +tablespace EMPLOYEE

or restore the EMPLOYEE tablespace to a different file locaiton:

gbak -r employee.idbk employee.idb +tablespace EMPLOYEE /temp/employee.its

Tablespace backup files are created when the tablespace switch is used.

/* Backup HR and EMPLOYEE tablespaces to single file */
gbak -b employee.idb hr_eh.itbk +tablespace HR +tablespace EMPLOYEE

or equivalently:

gbak -b employee.idb hr_eh.itbk -tablespace PRIMARY

The -tablespace switch can be use to exclude individual tablespaces when using the +tablespace switch would be too lengthy or risk excluding new tablespaces in the future if the gbak command was not updated.

/* Restore all tablespaces in the backup file except the PRIMARY tablespace */
gbak -c employee.idbk employee.idb -tablespace PRIMARY

To restore individual tablespaces from a database or tablespace file use the -create_tablespace and -replace_tablespace switches. The -create_tablespace does not raise an error if the tablespace file already exists and does not delete an existing tablespace file. Instead it performs a TRUNCATE TABLE command on every table contained by the given tablespaces. It then restores the tables contained by each tablespace from the backup file. This preserves tables that were created after the backup was made.

/* Restore an individual tablespace by truncating its tables in place
   without deleting the existing tablespace file. */
gbak -create_tablespace employee.idbk employee.idb +tablespace EMPLOYEE


If the existing tablespace files are corrupt the TRUNCATE TABLE command cannot be performed then it may be necessary to delete the existing files and create new files by using the -replace_tablespace switch.

/* Restore an individual tablespace by deleting the existing tablespace
   file and creating a new file into which its tables are restored. */
gbak -replace_tablespace employee.idbk employee.idb +tablespace EMPLOYEE

Note: The trailing suffix "_tablespace" for these switches must specify at least the "_tablesp" substring to be recognized.

Backup files are either "database backup file" (where all schema information is stored in addition to the data) or "tablespace backup file" (where only data is stored for the selected tablespaces). You can use gbak to get a listing of metadata information from the backup file, by using the following command.

/* Provide a verbose listing of all information from the backup file.  The output will provide information about tablespaces, tables etc. that are   included in the backup file, and also state if the backup file    is a "database" or "tablespace" type. */
gbak -v -meta employee.idbk

Physical Database Dump

The following shows an example of a physical database dump:

/* Online dump entire database with tablespaces */
gbak -du employee.idb employee.idmp +tablespace FOO foo.tdmp +tablespace BAR bar.tdmp

Full and incremental online dump as well as distinguished dump are supported. The first file after the database file is the primary dump file. Its contents include the database metadata, the PRIMARY tablespace and all other secondary tablespaces that did not designate a separate file in their tablespace definition.

It is a constraint of online dump that all secondary tablespaces allocated to a file must be included in the gbak dump command line. It is an error to omit a tablespace or to name a tablespace name more than once. Each output file name must be unique among the file names given in the command.

Services API

There are five new service parameters for use with the Services API:

  • isc_spb_res_create_tablespace
  • isc_spb_res_replace_tablespace
  • isc_spb_tablespace_include
  • isc_spb_tablespace_exclude
  • isc_spb_tablespace_file

These are used as arguments to the Services API database backup/restore actions: isc_action_svc_backup and isc_action_svc_restore. The first two parameters are bit flag format and passed by OR'ing them with other bit flag parameters before passing them as an argument to isc_spb_options. They generate the -create_tablespace and -replace_tablespace switches, respectively.

The rest of the new spb parameters are shown below:

Services parameters sample Generates...
isc_spb_tablespace_include,2,0,'H','R' +tablespace HR
isc_spb_tablespace_exclude,8,0,'E','M','P','L','O','Y','E','E' -tablespace EMPLOYEE
isc_spb_tablespace_include,2,0,'H','R',isc_spb_tablespace_include,8,0,'E','M','P','L',O','Y','E','E' +tablespace HR +tablespace EMPLOYEE
isc_spb_tablespace_include,7,0,'P','R','I','M','A','R','Y', isc_spb_tablespace_file,9,0,'/','M','A','I','N','.','T','S','P' +tablespace PRIMARY /MAIN.TSP

For using the service API to dump a database with tablespaces, follow your isc_spb_dmp_file definition with a list of tablespaces each with their name after isc_spb_tablespace_include followed by destination path after isc_spb_tablespace_file.

System Metadata Additions

ODS 18.1 databases add the following columns to system table RDB$FILES:

RDB$FILE_TYPE VARCHAR(32)
RDB$NODE_NAME VARCHAR(128)
RDB$FILE_SET_ID INTEGER
RDB$FILE_TIMESTAMP TIMESTAMP

There is a system index defined on column RDB$FILE_SET_ID and a system generator named RDB$FILES for generating unique RDB$FILE_ID_SET values.

A logical database backup or online dump writes a row to RDB$FILES after successful completion. RDB$FILE_TYPE is assigned the string "BACKUP", "DUMP" or "SPLIT" depending on the type of backup file created. For the database files, secondary database files, shadow database files and secondary tablespace files, the file type is "DATABASE".


For an online dump a separate row is written to RDB$FILES for every tablespace file dumped. The node(host) name of the computer that outputs the files is written to RDB$NODE_NAME. For an online dump this is always the node name of the server. For logical backups it is the client node name for a non-services backup and the server node name for a services-oriented backup.


RDB$FILE_SET_ID is used to group related sets of files from a backup or dump. The RDB$FILES rows for the ordinary database files (RDB$FILE_TYPE of "DATABASE") are always assigned a value of 0 (zero) for RDB$FILE_TYPE. Each logical backup or physical dump assigns a the same system generated value to RDB$FILE_SET to every new file created by the executed command. This column can be GROUP(ed) BY in a query on RDB$FILES to see the output files of a single backup or dump.


RDB$FILE_TIMESTAMP represents the creation date of ordinary "DATABASE" files. For backups and dumps it represents the transaction timestamp of the data written to the files. It is not the timestamp of when the operation completes. If an operation takes 10 minutes to complete then the data in the backup or dump file is as it existed in the database 10 minutes ago.

The RDB$FILES rows for backups and dumps are not backed up or restored After a database restore the only rows in RDB$FILES with be the ordinary database files with RDB$FILE_TYPE of "DATABASE"