Partitioning a Table

From ER/Studio Data Architect
Jump to: navigation, search

Go Up to Developing the Physical Model

The performance of almost all database applications is I/O bound. To improve I/O throughput, you should physically separate tables that are frequently joined together. You should also separate tables from their indexes. The objective is to have the database read or write data in parallel as much as possible.

  1. In the Data Model Window, double-click the table that you want to partition.
  2. From the Table Editor, click the Partitions tab and then select the index for which you want to specify storage options.
  3. Complete the Partitions options as required, click Add to launch the Table Partition Editor.
  4. Complete the Table Partition Editor options as required, click OK to exit the editor and then click OK again to exit the Table Editor.
  • The index storage options available on the Partitions tab and the Table Partition Editor depend on the database platform and version.

The following describe options that require additional explanation. For information on the storage options available for your database platform, click a link below:

Clustering Options for Snowflake Tables

Snowflake produces well-clustered data in tables; however, over time, particularly as DML occurs on very large tables (as defined by the amount of data in the table, not the number of rows), the data in some table rows may no longer cluster optimally on desired dimensions.

To improve the clustering of the underlying table micro-partitions, Snowflake supports automating optimization tasks by designating one or more table columns/expressions as a clustering key for the table. A table with a clustering key defined is considered to be clustered. The clustering key is simply a selection of columns, expressions of columns, or expressions on paths in VARIANT columns for that table.

You can set clustering keys for a table in two ways:

  1. Using the Clustering Keys column picker to choose a list of column names; OR
  2. If you want to use a column expression, select the Use text Column Expression checkbox, and then enter a free-form expression in the Column Expression textbox. Selecting the checkbox disables the column picker.
  • DDL: In the DDL tab is generated based on the selected option.

Table Partition Options for Sybase

  • Partition Table: ( PARTITION num ) Select if you want to partition the table and then enter the number of partitions desired.

Table Partition Options for Oracle 8.x, 9i, 10g, 11g and 12c

  • Type: Select the partition type.
    • Composite: Select to hash partition range partitions on a different key. When selected, specify the subpartitions by clicking Subpartition Type. More partitions increase the possibilities for parallelism and reduce contention. For example the following range partitions a table on invoiceDate and subpartitions these partitions by invoiceNo, creating 16 subpartitions.

CREATE TABLE invoices

(invoice_no NUMBER NOT NULL,

invoice_date DATE NOT NULL)

PARTITION BY RANGE (invoiceDate)

SUBPARTITION BY HASH (invoiceNo)

SUBPARTITIONS 4

(PARTITION invoices_Q1 VALUES LESS THAN (TO_DATE('01/04/2007', 'DD/MM/YYYY')),

PARTITION invoices_Q2 VALUES LESS THAN (TO_DATE('01/07/2007', 'DD/MM/YYYY')),

PARTITION invoices_Q3 VALUES LESS THAN (TO_DATE('01/09/2007', 'DD/MM/YYYY')),

PARTITION invoices_Q4 VALUES LESS THAN (TO_DATE('01/01/2008', 'DD/MM/YYYY'));

    • Range: ( PARTITION BY RANGE (column_name) ) Select to store distinct ranges of data together, such as dates. Searches can then be limited to partitions with data that is the correct age. Also, once historical data is no longer useful, the whole partition can be removed.
    • Hash: ( PARTITION BY HASH (column_name) PARTITIONS count STORE IN tablespace) ) Select if there is no obvious range key or if range partitioning might cause uneven data distribution.
    • List: ( PARTITION BY LIST ) Select to control how rows map to partitions. Specify a list of discrete values for the partitoning key in the description of each partition.
  • Subpartition Type button: Launches the Subpartition Type Editor where you can specify the columns to use in the subpartition. For a HASH subpartition, you can specify the number of subpartitions and where they are stored. For a LIST subpartition, you can specify the subpartitions to include.
  • Available Columns: Displays the columns available to add to a partition.
  • Selected Columns: Displays the columns that make up the partition key. This guides how data is divided along the partition boundaries. Typically, a partition key is date-based or based on a numerically ordered data grouping. Use the left and right arrows to move columns to and from this grid. Columns that are in this grid make up the partition.
  • Up / Down buttons: Let you reorder the columns on the partition. The partition order can affect the access speed. The most frequently accessed columns should be at the top of the partition list.
  • Add: Click to add table partitions. When clicked, launches the Table Partition Editor for Oracle 8.x, 9i, 10g, 11g and 12c.
  • Edit: Select a partition and click to modify it. When clicked, launches the Table Partition Editor for Oracle 8.x, 9i, 10g, 11g and 12c.

Table Partition Editor for Oracle 8.x, 9i, 10g, 11g and 12c

  • Max Value: ( VALUES LESS THAN (MAXVALUE) ) Select to create a catch-all partition for values that exceed all specified ranges. Otherwise, specify a High Value for the partition range.
  • High Value: ( VALUES LESS THAN (high value) ) Specifies the ending value for the partition range.
  • Tablespace: Specifies the name of the tablespace on which the table is stored.
    • No Logging: Select if you do not want the DLL operations to be logged in the redo file. This can reduce index creation and updates by up to 30%.
  • Pct Free: ( PCTFREE ) Specifies the maximum percentage of space in each data block to reserve for future updates. This reserved space helps to avoid row migration and chaining caused by an update operation that extends a data row's length. Tables that will not be updated should have this value set to 0.
  • Pct Used: ( PCTUSED ) Specifies the maximum percentage of available space in each data block before re-adding it to the list of available blocks. When deletes take place and the room available in a block falls below this value, the block is made available for new inserts to take place. Tables that will not be updated should have this value set to 99. The default value is 40% means that blocks are available for insertion when they are less than 40% full.
  • Initial Trans: ( INITTRANS ) Specifies the number of DML transactions for which space is initially reserved in the data block header. Oracle stores control information in the data block to indicate which rows in the block contain committed and uncommitted changes. The amount of history that is retained is controlled by this parameter. If too many transactions concurrently modify the same data block in a very short period, Oracle may not have sufficient history information to determine whether a row has been updated by a "too recent" transaction. Specify a higher value for indexes that may experience many transactions updating the same blocks.
  • Max Trans: ( MAXTRANS ) Once the space reserved by INITRANS is depleted, space for additional transaction entries is allocated out of any available free space. Once allocated, this space effectively becomes a permanent part of the block header. This parameter limits the number of transaction entries that can concurrently use data in a data block and therefore limits the amount of free space that can be allocated for transaction entries in a data block.
  • Edit Subpartition button: Launches the Subpartition Editor where you can specify how to subpartition the partition and where the subpartitions should be stored.
  • Initial Extent: ( INITEXTENT ) Specifies the initial number of data blocks that Oracle should reserve. Oracle will reserve the number of data blocks that correspond to the initial extent for that table's rows.
  • Next Extent: ( NEXT ) Specifies the size in kilobytes of the next extent. You should monitor this figure against the largest available chunk of free space in the tablespace. If a table cannot allocate its next extent, it will no longer be able to extend and, therefore, cannot accept additional data.
  • Pct Increase: ( PCTINCREASE ) Specifies the percentage by which the next extent should grow over the previous extent's size. Be careful when setting Percent Increase because it magnifies how an object grows and, therefore, can materially affect available free space in a tablespace.
  • Min Extents: ( MINEXTENTS ) Specifies the number of extents to allocate when the segment is created. Controls free space fragmentation by making sure that every used or free extent is at least as large as the value you specify.
  • Max Extents: ( MAXEXTENTS ) Specifies the maximum number of extents that Oracle can allocate to the index. Once this limit is reached, Oracle prevents further growth of the cluster and cannot accept additional data. Carefully monitor the number extents already allocated to the index with this limit.
  • Free Lists: ( FREELISTS ) Specifies the number of free lists to apply to the index. The default and minimum value is 1. Free lists can help manage the allocation of data blocks when concurrent processes are issued against the index. Oracle uses the free list to determine which data block to use when an INSERT operation occurs. Oracle allows table and indexes to be defined with multiple free lists. All tables and index free lists should be set to the high-water mark of concurrent INSERT or UPDATE activity. For example, if the index has up to 20 end users performing INSERTs at any time, then the index should have FREELISTS=20.Too low a value for free lists will cause poor Oracle performance. An increase in FREELISTS or FREELIST_GROUPS can alleviate segment header contention.
  • Free List Groups: ( FREELIST GROUPS ) Applicable only if you are using Oracle with the Parallel Server option in parallel mode. Specifies the number of free list groups which allow the index to have several segment headers. This enables multiple tasks to insert into the index; thereby alleviating segment header contention. Free list groups should be set to the number of Oracle Parallel Server instances that access the index. For partitioned objects and cases of segment header contention, freelist_groups my be set for non-RAC systems.
  • Buffer Pool: ( BUFFER_POOL ) Specifies the memory structure that is used for caching data blocks in memory, providing faster data access.
    DEFAULT: Caches data blocks in the default buffer pool.
    KEEP: Retains the object in memory to avoid I/O conflicts.
    RECYCLE: Removes data blocks from memory as soon as they are no longer in use, thereby saving cache space.

Table Partition Options for IBM DB2 for z/OS 8.x, 9.x, 10.x, and 11.x

  • Available Columns: Displays the columns available to add to a partition.
  • Selected Columns: Displays the columns that make up the partition key. This guides how data is divided along the partition boundaries. Typically, a partition key is date-based or based on a numerically ordered data grouping. Use the left and right arrows to move columns to and from this grid. Columns that are in this grid make up the partition.
  • Up / Down buttons: Let you reorder the columns on the partition. The partition order can affect the access speed. The most frequently accessed columns should be at the top of the partition list.
  • Add: Click to add table partitions. When clicked, launches the Table Partition Editor for IBM DB2 for z/OS 8.x, 9.x, 10.x, and 11.x.
  • Edit: Select a partition and click to modify Edit to modify it. When clicked, launches the Table Partition Editor for IBM DB2 for z/OS 8.x, 9.x, 10.x, and 11.x.

Table Partition Editor for IBM DB2 for z/OS 8.x, 9.x, 10.x, and 11.x

  • Partitions: Displays existing partitions. If more than one column is used to create the partition, the ending values of the partition keys (columns) are listed in the Ending At column in the same order as the columns in the partition definition. You can change the partition definition by selecting the partition and then changing the High Values in the Partition Definition area.
  • Partition Definition: Defines the high values of each partition key. You must enter a high value for each partition key listed.
  • Inclusive: If selected, includes the high value in the partition; otherwise the partition will end just before the high value is reached.

Table Partition Options for IBM DB2 for LUW 9.x and 10.x

  • Available Columns: Displays the columns available to add to a partition.
  • Selected Columns: Displays the columns that make up the partition key. This guides how data is divided along the partition boundaries. Typically, a partition key is date-based or based on a numerically ordered data grouping. Use the left and right arrows to move columns to and from this grid. Columns that are in this grid make up the partition.
  • Up / Down buttons: Let you reorder the columns on the partition. The partition order can affect the access speed. The most frequently accessed columns should be at the top of the partition list.
  • Add: Click to add table partitions. When clicked, launches the Table Partition Editor for IBM DB2 for LUW 9.x (and later versions).
  • Edit: Select a partition and click to modify Edit to modify it. When clicked, launches the Table Partition Editor for IBM DB2 for LUW 9.x (and later versions).

Table Partition Editor for IBM DB2 for LUW 9.x (and later versions)

Lets you create a PARTITION BY RANGE statement similar to the following:

CREATE TABLE orders (id INT, shipdate DATE,...)
PARTITION BY RANGE(shipdate)
(
STARTING '1/1/2006' ENDING '12/31/2006'
EVERY 3 MONTHS
)

  • Partitions: Displays existing partitions. If more than one column is used to create the partition, the ending values of the partition keys (columns) are listed in the End Value column in the same order as the columns in the partition definition. You can change the partition definition by selecting the partition and then changing the Start Value in the Partition Definition area.
  • Name: Lets you define the name of the partition. This field is not used when creating several partitions automatically. Use the Range Width and Duration Label fields to create a range partitioning statement such as EVERY 3 MONTHS.
  • Column Name: Displays the column names selected to be part of the partition on the Partitions tab.
  • Start Value: Lets you specify the earliest possible date or time, or the lowest integer to include in the partition. In the partitioning statement, the start value is the STARTING FROM value.
    A date string can be specified using one of the following formats:
    yyyy - mm - dd
    mm / dd / yyyy
    dd. mm. yyyy
    A timestamp string can specified as:
    yyyy - mm - dd hh. mm. ss. nnnnnn
    yyyy - mm - dd - hh. mm. ss. nnnnnn
    You can also choose MINVALUE or MAXVALUE from the list, the values of which depend upon the datatype.After specifying the Start Value, specify the End Value.
  • Start Value Inclusive: Select to include the specified start value in the partition. Keywords inserted are INCLUSIVE IN. If not selected, the partition will start at the value just greater than the specified start value, and the keywords inserted will be EXCLUSIVE IN.
  • End Value: Lets you specify the latest possible date/time or the largest integer to include in the partition. When working with a date-partitioned table, this would be the ENDING value. See Start Value above for acceptable formats for date and time stamp strings.
  • End Value Inclusive: Select to include the specified end value in the partition. Keywords inserted are INCLUSIVE IN. If not selected, the partition will end at the date/time or integer just greater than the specified end value, and the keywords inserted will be EXCLUSIVE IN.
  • Range Width: Use Range Width and Duration Label to automatically create multiple partitions. The Range Width is the date, time or integer interval used to partition the table. For example, in the EVERY 3 MONTHS clause, the Range Width is 3.
  • Duration Label: From the list, select the unit of measure to use to partition the table. In the EVERY 3 MONTHS clause, the Duration Label is MONTHS. For an integer-partitioned table, select (none) as the duration label.
  • Tablespace: Specifies the tablespace where the partition is stored.
  • LONG IN: Lets you specify a large tablespace where long data is to be stored. Large objects are by default stored in the same tablespace as the corresponding data objects, whether the partitioned table uses only one tablespace or multiple tablespaces. The LONG IN clause overrides this default behavior, causing long data to be stored in the tablespace specified, which can improve database performance.

Table Partition Columns for IBM DB2 for LUW 5.x, 6.x, 7.x, 8.x

  • Available Columns: Displays all the columns available to add to the partition. Select the column you want to add to the partition and move it to the Selected Columns box.
  • Selected Columns: Displays the columns that make up the partition key. This guides how data is divided along the partition boundaries. Typically, a partition key is date-based or based on a numerically ordered data grouping. Use the left and right arrows to move columns to and from this grid. Columns that are in this grid make up the partition.
  • Up / Down buttons: Let you reorder the columns on the partition. The partition order can affect the access speed. The most frequently accessed columns should be at the top of the partition list.
  • Use Hashing: If selected, ER/Studio Data Architect partitions the table according to a hash function, which can speed up table lookup or data comparison tasks.

See Also