Creating and Editing Columns

From ER/Studio Data Architect
Jump to: navigation, search

Go Up to Creating and Editing Attributes and Columns

Attributes represent the relevant properties or characteristics of an entity. In the physical model, attributes are represented as table columns. There are two types of attributes and columns:

  • Identifiers. Helps to identify a entity/table instance because it is all or part of the entity's/table's primary key.
  • Descriptor. A non-key attribute/column. Following the rules of normalization, if an attribute/column is not part of the primary key, then its only purpose is to describe each entity instance.

Attribute definitions together with relationships can determine or enforce business rules. You can define a particular set of valid values for any attribute of any entity. For example, you can define a valid range of salaries for a subset of employees for an attribute called Salary. This set of values is known as a domain. For more information, see Ensuring Consistent Domain Definitions Using User Datatypes.

If you define your entities carefully, then defining attributes should be relatively straightforward. When you first add attributes to an entity, be sure to name your attributes appropriately. Attributes can be native to an entity or inherited via an identifying or non-identifying relationship. To promote consistency, ER/Studio Data Architect lets you edit only the properties of an entity's native attributes. To change the properties of a foreign key of a child entity, you must edit the attribute in the parent entity. For information on attribute naming rules and conventions, see Best Practices.

When determining column names, make sure you consider the naming rules of the target database. Some key considerations are name length, avoiding the use of reserved words, and prohibited characters. Most databases restrict name lengths to 30 or 18 characters. All have reserved key words that cannot be used in names unless surrounded by quotes. Certain characters, such as a space, *, +, and % are also prohibited from being used in names.

  1. In the Data Modeling window, double-click the table to which you want to add a column.
  2. On the Columns tab of the Table Editor, click Add.
  3. Define the attribute as required, and then click OK.

Main page

Field Description
Domain Name Lists all defined domains. You can bind any available domains to the column.
Attribute Name The attribute name is by default the column name. If you want a different column name, type a new one in the box. ER/Studio Data Architect uses this name when generating the physical model.
Default Column Name The attribute name is by default the column name. If you want a different column name, type a new one in the box. ER/Studio Data Architect uses this name when generating the physical model.
Logical Rolename This option is only available when there is an overlapping foreign key in a logical model and the Synchronize Column Rolename with Logical Rolename option is not selected. Use this when the name of the child attribute differs from the parent attribute. In the logical model, ER/Studio Data Architect distinguishes between the logical rolename and the column rolename. ER/Studio Data Architect uses the column rolename when generating the physical model.
Hide Key Attribute If selected, hides the key in the child table. Commonly used to support partial migration of PK columns when you do not want to migrate primary key columns into child tables. ER/Studio Data Architect will not generate a foreign key because most database systems require that FKs have the same number of columns as the child. By selecting this option, you will 'remove' the column from the child table so that it won't show up in any child tables further down the FK chain, in DDL generations or in comparisons to the database. It will display as unavailable when editing the child table, but you can make it available again later.
Physical Only If selected, the table will not be implemented as a table when generating a physical model, and will be ignored when comparing the logical model with a physical model.
Synchronize Column Rolename with Logical Rolename Available when the attribute selected is a foreign key. This option affects name modification in the entity or table editor, on-screen name editing, and name modification in the Data Model Explorer. Lets you change the role names of foreign attributes or columns to the same names as other attributes or columns in the same entity or table. Duplicates will be unified when the user ends the edit session.
Add to Primary Key Add the column as a primary key for indexing.

Datatype tab

Dataype drop-down list. Click the list to select the appropriate datatype. The list of datatypes presented is dependent on the database platform selected. This discussion describes datatypes that need further explanation, i.e. ROWVERSION, TIMESTAMP, INTERVAL, and NUMBER.

  • ROWVERSION. The ROWVERSION datatype exposes automatically generated, unique binary numbers in a database. It is an incrementing number, 8 bytes in size, that can be used to version-stamp table rows. A table can have only one rowversion column. Every time that a row with a ROWVERSION column is modified or inserted, the incremented database ROWVERSION value is inserted in the rowversion column. This property makes a ROWVERSION column a poor candidate for keys, especially primary keys. Any update made to the row changes the ROWVERSION value and, therefore, changes the key value.
    Notepad blue icon 2.pngNote: Rowversion is a synonym for timestamp. You can specify ROWVERSION as data type in ER/Studio Data Architect, but the created column on MS SQL Server 2008 is TIMESTAMP.
  • TIMESTAMP. For DB2 LUW 9.x, and DB2 z/OS 9.x, 10.x, and 11.x. A Generated By Default option is available for a TIMESTAMP datatype if the column does not allow nulls (NOT NULL), and you selected Generated Column.
    For Teradata 13.x and later, A PERIOD option is available for Timestamp, Time and Date.
    The following is an example of the SQL produced when these options are selected:
    HireDate TIMESTAMP NOT NULL GENERATED BY DEFAULT FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP
  • INTERVAL. Teradata databases only. The Interval datatype includes options to specify the units for the datatype.
    • Allow Nulls. The Allow Nulls control corresponds to the platform-specific nullability option/argument for a column specification.
  • NUMBER. NUMBER or NUMBER(*) you must set the width and scale to *. A width of 0 is converted to * when generating DDL.
    Against Oracle 8 and later platforms, if an Allow Nulls value of No is selected, providing a Constraint Name results in a column definition that specifies an inline named NOT NULL constraint.
    Against Oracle 12c and later platforms, the NUMBER datatype supports Identity columns.
    • Override Bound Data. Overrides datatypes on an attribute basis (for supported types: SQL Server, Sybase, IBM DB2 for LUW, and IBM DB2 for z/OS). Used to handle any exceptions while maintaining the binding to the same domain. For example, there is a VARCHAR domain on two separate DBMS platforms. The character domain's primary use is on Oracle. You specified it as VARCHAR, which is VARCHAR2 in Oracle. Using this option, you can have a DB2 model where it is implemented it as CHAR. You can customize the domain's implementation to accommodate the specific platform, without losing the bindings between DB2 columns and the domain.

Identity Property. For numeric datatypes for which nulls are not allowed and for whose datatype supports identity properties, you can specify these properties by selecting Identity Column, and then specifying Seed and Increment values. The seed and increment values are used to automatically generate sequential numbers for each row in the table beginning with the entered Seed value and incrementing by the amount specified by Increment value. The default seed and increment values are 1. When you check Identity Column, the Identity Properties area is enabled.

Virtual. For Oracle 11g and later only. The following describes the requirements for virtual columns as described by Oracle. Note, these restrictions are not enforced by ER/Studio.

  • Column restrictions. You can specify a column as virtual if the following criteria are met:
    • The table containing the column is not index-organized, external, object, cluster, or temporary, AND
    • The virtual column is not an Oracle-supplied datatype (e.g. Spatial Types, XML Types, Any Types and Media Types), a user-defined type, an object type, or a LONG, LOB, BLOB, RAW, REF, or BFILE.
  • Expression restrictions. To specify a column as virtual, check the Virtual option and specify an expression, such as (isnull('AW'+[dbo].[ufnLeadingZeros]([CustomerID]),)) in the adjacent expression field. The expression used on virtual columns must satisfy the following restrictions:
    • It cannot refer to another virtual column by name.
    • It can only refer to other columns in the same table.
    • It can refer to a deterministic user-defined function but then you cannot use the virtual column as a partitioning key column.
    • The result of the expression must be a scalar value.

When you check the Virtual option, you may specify that the datatype of the column is displayed by selecting Show Datatype. When Show Datatype is not selected, then data type list, Width, Scale, Allow Nulls, and Identity Property are disabled. These options are available when Show Datatype is selected, however, the Identity Property options may not be available if the requirements are not met.

  • Edit LOB Segment: If the specified datatype of the column is BLOB, CLOB, or NCLOB, you can click the Edit LOB Segment option to define LOB storage. For more information, see Oracle LOB Segment Storage Options.
  • Character Set: For the Teradata platform, if the specified datatype of the column is CHAR, VARCHAR or CLOB, you can then specify the character set for the datatype. Click the list to choose the appropriate datatype. For the CHAR and VARCHAR datatypes, you can choose DEFAULT, LATIN, UNICODE, GRAPHIC, KANJI1, KANJISIS. The CLOB datatype supports only LATIN and UNICODE character sets.
  • CASESPECIFIC: For the Teradata platform, if the specified datatype of the column is CHAR or VARCHAR, you can choose the CASESPECIFIC option, which indicates the system should differentiate between upper-case and lower-case characters.
  • Compress: For the Teradata platform, if the selected column is not a primary key and the appropriate datatype is selected, such as BIGINT, DATE or CHAR, you can choose to compress the columns to minimize the table size. Teradata does not support the compression of CHAR datatype with a length more than 256 characters. You can either enter the compress statement in the space provided or if the statement is longer, clicking More brings up the Compression Statement Editor where you can enter a more lengthy compression statement. You can compress up to 256 column values. For example, in a customer database you could compress the state names to reduce the table size. The Compression statement could look like,

COMPRESS ('New Hampshire', 'California', 'Alaska',...)

which would replace the state values in the table with a state code and store the state values in the table header.

Notepad blue icon 2.pngNote: Teradata does not support compress for a column with these data types: INTERVAL, TIME, TIMESTAMP, VARBYTE, VARCHAR, CLOB, BLOB, CHAR(n) with n > 255. Also if the column is a primary key, it cannot be compressed.

Collect Statistics: ( COLLECT STATISTICS) Collects demographic data for one or more columns of a table, computes a statistical profile of the collected data, and stores the profile in the data dictionary. Collecting full statistics involves scanning the base table, and sorting to determine the number of occurrences for each unique value. The Teradata Optimizer uses this profiling data when it generates its table access and join plans. Full statistics is generally recommended for relevant columns and indexes on tables with less than 100 rows per Access Module Processor (AMP).

This is the SQL generated when Collect Statistics has been selected in the Table Editor for a column in the table.

COLLECT STATISTICS ON table_1 COLUMN column_1;

This is the SQL generated when Collect Statistics has been selected in the Index Editor for an index in the table.

COLLECT STATISTICS ON table_1 INDEX unique_1;

Using Sample Collecting sampled full-table statistics is generally faster than collecting full table statistics, but may not produce the best table access and join plans. When you specify this option, Teradata will determine the appropriate percentage to the data to scan. The best choices for USING SAMPLE are columns or indexes that are unique or 95% unique.

This is an example of the SQL generated when Collect Statistics and Using Sample have been selected in the Tale Editor for a column in the table.

COLLECT STATISTICS USING SAMPLE ON table_1 COLUMN column_1;

This is the SQL generated when Collect Statistics and Using Sample have been selected in the Index Editor for an index in the table.

COLLECT STATISTICS USING SAMPLE ON table_1 INDEX unique_1;

  • RowGuidCol: Lets you return the row global unique identifier column. When using the ROWGUIDCOL property to define a globally unique identifier column, consider that a table can have only one ROWGUIDCOL column, and that column must be defined using the unique identifier data type.
  • Computed Column: If selected you can enter a formula to transform the column; for example, sale commission = 10% total sales.
  • Sparse: Specify sparse if the column can have many NULLS. Specifying sparse for such columns can improve retrieval response. A sparse column is stored nulls do not take up any space; however, a sparse column with data in it requires a little more space than usual. This option will be available only if the datatype selected supports the sparse property.
  • Collate: Lets you specify collations for each character string. ER/Studio Data Architect will use the COLLATE clause with the CREATE TABLE or ALTER TABLE statement.
  • For Bit Data: Lets you use binary SQL to specify a binary character array for character SQL types. IBM database platforms support columns with binary data types by defining CHAR, VARCHAR, LONG VARCHAR, columns with the FOR BIT DATA attribute.
  • LOB Unit: If the datatype is BLOB, CLOB, or DBCLOB, lets you select the LOB Unit measurement. Specify K - for Kilobytes, M for Megabytes or G for Gigabytes.

Datatype Properties. For Oracle 12c and later only. Activated when you choose Identity Column.

  • Default. Options include Generated Always, Generated By Default, and Generated By Default on Null.
  • MinValue and MaxValue. The minimum and maximum of the supplied IDENTITY values. You can leave one or both of these fields empty for NoMinValue.
  • Cache. By default, this field displays 1000 as the number of values stored in the local client cache to use for the next IDENTITY value. When you complete the set of values, another set of cvalues is requested from the server to store in your local cache.
  • No Cycle. By default, this checkbox is selected which guarantees that each IDENTITY value in the column is unique, but not necessarily sequential. Clear this box to reuse the set of all possible values for the IDENTITY datatype.

Default tab

Field Description
Default Binding If a default value is not already bound to the entity attribute (Data Dictionary > Defaults > Edit Default Definition > Binding Information), you can type a Declarative Default or select Override Bound Data, and then choose another value from the Default Binding list. The declarative default appears in the SQL produced when you select Generate Database from the physical model. If a default value attribute was previously bound to the entity, you can override the default by selecting another default from the Default Binding list. For more information, see Working with the Data Dictionary. If the attribute selected is a primary key of the entity, you have the option to select the option to Propagate Default to Child Keys.
Default Name Specifies a name for a declarative default when a declarative default is specified below, or when Override Bound Data is selected. If you are using a bound default from the Data Dictionary, you cannot specify a name for an inline default.
Declarative Default Specifies the default column value when a value is not explicitly specified during an insert. DEFAULT definitions can be applied to any columns except those defined as timestamp, or those with the IDENTITY property. Only a constant value, such as a character string; a system function, such as SYSTEM_USER() ; a NULL ; or a constraint name can be assigned to a DEFAULT.
Rules/Constraints Lets you select a rule binding from the Data Dictionary or manually type in a rule binding. The exact text in the declarative box gets inserted into the physical model DDL. Add, modify, or delete a constraint. Check constraints are used to ensure the validity of data in a database and to provide data integrity. For more information, see Enforcing Data Integrity Using Check Constraints. If the information is grayed out (read-only) it is because a domain is being used and the definition, datatype, rule, constraint or default is inherited from the domain.

Glossary tab

Displays a table consisting of all business terms related to the associated Column. Table columns include Term Name, Glossary (parent glossary of the term), and Description.

Available options include:

  • Add Term. Launches the Business Term Selector dialog box, allowing users to relate/unrelate terms from the Object.
  • Remove Term. Unrelates the selected term from the Object.
  • Refresh Term. Starts caching of the terms in the background with the status displayed next to the button.

Definition tab

Enter or edit a definition for the column. If the target database supports it, ER/Studio Data Architect adds this definition as a column comment when generating SQL code.

Notes tab

The notes added here are included as part of the HTML report when you generate a report for the model. You can format the notes using standard HTML tags.

Where Used tab

Shows which logical and physical models and submodels implement the column or attribute. Also shows the history of the column in terms of Denormalization Mappings. From the Where Used tab you can edit both the mappings and the user-defined mappings. For more information, see Associate Similar Objects Using User-Defined Mappings.

User-Defined Mappings tab

Shows any user-defined mappings (UDM) for this table. UDMs tie data together that can be conceptually related but not necessarily with concrete transactional relationships. UDMs are not used when comparing data models or when generating SQL. Use the Notes tab to describe the relationship between entities that is created using user-defined mappings. For more information, see Associate Similar Objects Using User-Defined Mappings.

Reference Values tab

Reference Values are attributes that define allowed data. They represent look-up table columns or code-value ranges or a rule or constraint applied to columns. For more information, see Defining Valid Attribute Data Using Reference Values.

Naming Standards tab

Controls User and Object Permissions to modify the affected item. When a Naming Standards Template is bound to an object, then that template is used instead of the Naming Standards Template selected in the Naming Standards Utility or the Generate Physical Model utility.

This allows you to apply different naming standards to different objects when portions of a model require different versions of a standard. For example, some objects already exist in the database and you do not want to apply a newer naming standard to them. The Freeze Names option when selected prevents any naming standards applied from changing the name of the column selected. For more information, see Enforcing Naming Standards Using Naming Standards Templates.

Compare Options tab

Select the differences you want the Compare and Merge Utility to ignore.

LOB Storage tab

HiRDB only. If the column datatype specified on the Datatype tab is BLOB, then this tab is available for you to specify a large object storage location.

Data Lineage tab

Maps the rules that describe data movement from source data to target entities and attributes in the model. At the attribute level, you add transformation logic for source/target attribute mapping. You can map multiple source/targets to one attribute in the physical model. For more information, see Documenting Data Extraction, Transformation, and Load.

Security Information tab

Displays the current security information as defined in the Data Dictionary. You can add or edit security reporting information in by double-clicking the field you want to change. To set the model-wide default value for a property, go to the Data Dictionary tab and expand the Data Security Information Node. For more information, see Enforcing Security Using Data Security Types and Properties.

Extended Properties tab

Allows you to manage SQL Server extended properties stored on the column in the database. You can add, edit, and delete property values and descriptions.

Attachment Bindings tab

Bind an external piece of information or attachment to the column. You can also remove an attachment from an object, override an attachment binding's default value, or change the position of a bound attachment. To override the value of the attachment you moved to the Selected Attachments grid, double-click the Value field of the target attachment. ER/Studio Data Architect opens the Value Override Editor or a list, depending on the attachment datatype. Attachments are created in the Attachments folder of the Data Dictionary and must be applied to the column before they will display on this tab. For more information, see Attaching External Documents to the Data Model.

Data Movement Rules tab

Displays the data movement rules that have be assigned to the attribute. Allows you to override the default rules assigned to the table/entity in the Data Movement Rules editor of the Data Dictionary. Here you can apply or remove data movement rules. The Data Movement rule could be something like 'Data for this table is updated after 30 days.' For more information, see Relating Source and Target Tables and Columns.

Notes

  • Choosing the correct datatype is a critical data integrity and storage consideration. Select a datatype that is efficient for its intended use but properly accommodates the data it stores. Your choice of database platform dictates your choice of column datatypes because each has its own set of datatypes. For example, some databases have binary datatypes; others do not. Some provide support for blobs and text, but many do not. Finally, only a few databases support user-defined datatypes. ER/Studio Data Architect can convert datatypes between different databases; however, some conversions can only be approximations.
  • Another important consideration in specifying datatypes is determining the appropriate width, precision and scale, as applicable. You want to store data efficiently, but not at the expense of completeness, accuracy or precision.
  • For information on datatype mapping, see Customizing Datatype Mappings. If you want to enforce datatypes across common columns, use Data Dictionary Domains. For more information, see Reusing Attribute Definitions Using Domains.
  • When determining attribute and column names, make sure you consider the naming rules of the target database. Some key considerations are name length, avoiding the use of reserved words, and prohibited characters. Most databases restrict name lengths to 30 or 18 characters. All have reserved key words that cannot be used in names unless surrounded by quotes. Certain characters, such as a space, *, +, and % are also prohibited from being used in names.
  • You can specify naming rules, the default datatype and whether NULL values are permitted for the attributes created with the default datatype on the Logical tab of the Options Editor. Choose Tools > Options > Logical.
  • You can specify the order in which the attributes appear on the Display tab of the Options Editor. Choose Tools > Options > Display.
  • You can specify the level of synchronization between the attribute and column names on the Name Handling tab of the Options Editor. Choose Tools > Options > Name Handling.
  • The status bar at the bottom right of the application window displays the number of entities in the selected model.

See Also