Expression Index

From InterBase

Go Up to Using CREATE INDEX


For: 2017 Update 1 and above only.

This section describes support for creating and using expression based index definition in databases to help improve performance of queries with expressions. Since index definitions or usage are not part of the SQL standard, database vendors enable various forms of index definitions to enhance and enforce various features in their products.

An index is typically defined in InterBase on a set of columns in a table. An index is later used by the query optimizer to identify candidate queries that can be sped up by accessing the index instead of doing a natural scan on the base table to service the query. InterBase also uses an index to enforce any referential integrity so declared via PRIMARY, UNIQUE or FOREIGN KEY definitions. InterBase 2017 and earlier versions only allow index definitions on a set of fields, as a simple index on one field, or, a compound (composite) index on a set of fields. Index definitions were not allowed on an expression entailing any fields from the same base table.

Support for Expression Indices (called EI in short henceforth) enhance the index definition and optimization of queries. Such an EI definition is open to any valid expressions containing any field(s) from the same base table with normal expression operators that InterBase users are used to provide either as part of a column result set or in WHERE expressions in SELECT statements.

The InterBase DDL syntax for EI allows the index definition to persist in the database through database backup/restore as well, and will be duly used to optimize queries that can benefit from a similar expression use semantically. Some expressions, like addition and multiplication, are commutative in nature; hence, the index will also be used if the expression used in the query is compatible with the index definition.

In addition to the above, this feature also enables defining an index on an existing COMPUTED BY column in the table. This allows developers to define a COMPUTED BY column once, but not have to reproduce the expression in the index definition again. Previous releases of InterBase did not allow this.

Usability

Index definitions for EI syntax closely follow the CREATE INDEX syntax that is currently available. In place of a set of columns in the table, one uses a single expression. You can further use ALTER INDEX to alter an index state to ACTIVE/INACTIVE, and, DROP INDEX to drop an existing index respectively. Refer to CREATE INDEX definition for more information.

syntax:

CREATE [UNIQUE] [ASC[ENDING] | DESC[ENDING]] INDEX <index>
ON <table> (<col> [, <col> ] | COMPUTED BY <expression>);
  
<col> ::= column in table to index

Index definition using a COMPUTED BY <expression>

/* Explicit Expression Index using COMPUTED BY clause */
CREATE TABLE t1 (
  dummy0 integer,
  f1 integer not null,
  f2 integer not null,
  f3 integer not null,
  f3_min_f2 computed by (f3 - f2),
  first_name VARCHAR(10) DEFAULT 'Give me ',
  last_name VARCHAR(10) DEFAULT 'a name'
);
  
CREATE INDEX idx_expr_t1_mul ON t1 COMPUTED BY (f1 * f2);
CREATE INDEX idx_expr_t1_udf ON t1 COMPUTED BY (UPPER(first_name) || UPPER(last_name));

Index definition on a COMPUTED BY column

/* Implicit Expression index via a COMPUTED BY column */
CREATE INDEX idx_expr_t1_comp_fld ON t1 (f3_min_f2);
  
/* PRIMARY KEY on COMPUTED BY column
 * Please note PRIMARY KEY by itself can only be on a set of base columns, or, on ONE COMPUTED BY column.
 * PRIMARY KEY definition can also not be on an explicit expression.
 * Also, you cannot mix and match base/computed columns together to form a composite index.
 */
CREATE TABLE t2 (
  f1 integer not null,
  f2 integer not null,
  f3 computed by (f1 * f2),
  PRIMARY KEY (f3)
);

Sample queries:

The query optimizer in InterBase will try to match up the index, where applicable, to expressions in the query that match the index definitions. Some of the expressions are commutative (like a single addition (a + b == b + a), or, multiplication (a * b == b * a); the optimizer will do a best effort to match such expressions with any compatible commutative index definition as well. Expressions in ORDER BY, GROUP BY, and JOIN conditions via ON clause, are also optimized to use an expression index, where applicable.

/* Following sample queries are optimized to use the expression index; SET PLAN ON to see the plan. */
 
/* Multiplication */
select * from t1 where f1 * f2 = 1;
select * from t1 where (f1 * f2) = 1;
/* Multiplication Commutative property; a*b=b*a */
select * from t1 where (f2 * f1) = 4;
 
/* UDF */
select * from t1 where (UPPER(first_name) || UPPER(last_name)) = 'THREE MUSKETEERS';
 
/* ORDER BY */
select f1, f2, f1 * f2 from t1 ORDER BY (f1 * f2);
select (f1 * f2), (f1 + f2) from t1 WHERE f1 * f2 = 100 ORDER BY 1, 2;
select f1 * f2 from t1 PLAN (T1 ORDER IDX_EXPR_T1_MUL) ORDER BY 1;
  
/* GROUP BY */
-- by ordinal position
select f1 * f2, count(*) from t1 GROUP BY 1;
-- by expression
select (f1 * f2), count(*) from t1 GROUP BY (f1 * f2);
  
/* JOIN with commutative expression */
select a.f1, a.f2, (a.f1 * a.f2), (b.f2 * b.f1)
  from t1 a JOIN t1 b ON (a.f1 * a.f2) = (b.f2 * b.f1);
select a.f1, a.f2, (a.f1 * a.f2), (b.f2 * b.f1)
  from t1 a LEFT OUTER JOIN t1 b ON (a.f1 * a.f2) = (b.f2 * b.f1);

Requirements and Constraints

  • An Expression index can only include column names from a single base table in an expression.
  • You cannot mix and match base/computed columns together to form a composite expression index.
  • An Expression Index can have only ONE key segment, by design. i.e. only ONE expression can be part of a single Expression Index.
  • A PRIMARY KEY definition using a COMPUTED BY column can only use ONE COMPUTED BY column. PRIMARY KEY definition can also not be on an explicit expression; it has to be using a defined column.

Migration issues

  • If a database has newly defined Expression Index entities, such databases cannot be used with InterBase versions 2017 (13.0.0) and earlier versions. Such index definitions have to be declared INACTIVE before the database is restored or used in older InterBase versions which can use such ODS versions.

Advance To: