Skewed Tables

From ER/Studio Data Architect


On the Skewed By tab of the Table Editor for the Hive platform, you can select the columns by which to skew a table. Once at least one of the available columns has been added to the selection, the ON clause box becomes available.

Use the Skewed By tab to improve performance for tables where one or more columns have skewed values. When you specify the values that appear very often (heavy skew), Hive splits those values out into separate files.
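The effect of splitting out heavy values can be pictured with a small conceptual sketch. This is only a model of the idea, not Hive's implementation: the function `split_by_skew`, the column name `customer_id`, and the sample rows are all hypothetical, and Hive's actual storage layout depends on how the table is defined.

```python
def split_by_skew(rows, column, skewed_values):
    """Conceptual model: one group per heavy value, one group for everything else."""
    per_value = {value: [] for value in skewed_values}
    remainder = []
    for row in rows:
        value = row[column]
        if value in per_value:
            # A heavy (skewed) value gets its own group.
            per_value[value].append(row)
        else:
            # All other values share a single group.
            remainder.append(row)
    return per_value, remainder


# Hypothetical sample data: customer_id 1 appears far more often than the rest.
rows = [
    {"customer_id": 1},
    {"customer_id": 1},
    {"customer_id": 7},
    {"customer_id": 1},
    {"customer_id": 42},
]
per_value, remainder = split_by_skew(rows, "customer_id", [1])
```

Here the three rows for the heavy value end up in their own group, while the two remaining rows stay together, so work on the common values does not have to scan the heavy one.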

[Figure: Skewed By tab]

The following options are available:

  • Available Columns. Displays all of the columns that can be added to the Skewed By clause. Select the column by which you want to skew the table and move it to the Selected Columns box. Use the left and right arrows to move columns to and from the Selected Columns box.
  • Selected Columns. Displays the columns by which the table is skewed.
  • Up and Down. Buttons that let you reorder the columns in the Selected Columns list. The column order can affect the access speed. The most frequently accessed columns should be at the top of the Selected Columns list.
  • ON. Box where you enter the skew values for the ON clause.
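The options above correspond to the two parts of Hive's SKEWED BY ... ON ... clause: the Selected Columns supply the column list and the ON box supplies the skew values. The following is a minimal sketch of that mapping, not the DDL generator that ER/Studio uses. The helper names `hive_literal` and `skewed_by_clause`, and the column names and values in the usage lines, are hypothetical. It assumes string values contain no quote characters, and the exact text ER/Studio emits may differ.

```python
def hive_literal(value):
    """Render a value as a HiveQL literal: strings quoted, everything else bare.

    Assumption: string values contain no quote characters (no escaping is done).
    """
    if isinstance(value, str):
        return "'" + value + "'"
    return str(value)


def skewed_by_clause(columns, skew_values):
    """Build a SKEWED BY clause.

    columns:     column names, in Selected Columns order.
    skew_values: list of tuples, one per skew value combination,
                 each with one entry per column, in the same order.
    """
    if not columns:
        raise ValueError("at least one column is required")
    for combo in skew_values:
        if len(combo) != len(columns):
            raise ValueError("each skew value needs one entry per column")

    if len(columns) == 1:
        # Single column: a flat list of values, e.g. ON (1, 5, 6)
        values = ", ".join(hive_literal(combo[0]) for combo in skew_values)
    else:
        # Several columns: one parenthesized tuple per combination
        values = ", ".join(
            "(" + ", ".join(hive_literal(v) for v in combo) + ")"
            for combo in skew_values
        )
    return "SKEWED BY (" + ", ".join(columns) + ") ON (" + values + ")"


single = skewed_by_clause(["customer_id"], [(1,), (5,), (6,)])
multi = skewed_by_clause(["region", "status"], [("EU", 1), ("US", 3)])
```

With these inputs, `single` is `SKEWED BY (customer_id) ON (1, 5, 6)` and `multi` is `SKEWED BY (region, status) ON (('EU', 1), ('US', 3))`. In the multi-column form, each value tuple follows the order of the column list, which is one reason the order set with the Up and Down buttons matters.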
