Creating Entity Relational or Dimensional Models

From ER/Studio Data Architect
Jump to: navigation, search

Go Up to Customizing the Data Model

You can generate a DM diagram by starting with a logical model, by reverse-engineering, by importing an SQL or an ERX file, or by adding a new physical model.The Model Notation controls whether the data model design technique used is Entity Relational (ER) or Dimensional (DM). You can use either technique depending on your requirements.

What is a Dimensional Model?

A Dimensional Model is a specific type of 'entity/relationship' design used for designing systems which help optimize queries used in decision support databases (e.g. Data Warehouses, Data Marts etc).

Dimensional modeling is often used for modeling data warehouses, providing for high-performance, end-user access to data. An extremely important factor is the identification and modeling of the different sources of data that are used to populate the data warehouse. This is referred to as Extract, Transform, and Load (ETL). ER/Studio's Data Lineage Modeling, which is another key differentiator, allows the architecture to be fully modeled and documented, including data sources, staging areas, the data warehouse, and ETL.

ER/Studio Data Architect supports Dimensional Modeling as a new 'type' of Physical Model. Designating a model as Dimensional Modeling occurs:

  • During Generate Physical Model time (as a choice in the Wizard)
  • As a notational choice in the Physical Model to toggle the Notation to 'Dimensional' which will update any existing Physical Model to Dimensional status

Where entity relationship modeling is used to define the structure and association of data (most often implemented in relational databases), dimensional modeling not only covers that aspect (i.e. structure) but moves deeper into designing the SEMANTICS of the data itself. What happens to the data in the decision support system (Is it fixed data or does it change? And if it changes, to what extent?) The purpose of capturing this level of metadata in the design is so that when you leverage software systems to query the 'measures' in the database, you will more easily know what is the 'truth' behind the lineage/history of the data.

Dimensional Model Table Type Determination

Tables must be assigned to a specific dimensional model table type. This is one of the most critical aspects of offering a dimensional modeling 'notation' in ER/Studio Data Architect. Initially, a choice of table types will be available from the Dimensional Modeling Toolbar when creating new tables. Users can choose a Fact, Dimension or Snowflake table to drop it on the workspace. Two things can influence the Dimensional Table Type:

  • User Over-ride (see Table Editor info below):
  • Automatic Assignment by ER/Studio: The dimensional modeling rules discussed in detail below will automatically be enforced by ER/Studio Data Architect.

Example:

1. User drops down a Dimension 'A'

2. User Drops down another Dimension 'B'

3. User draws relationship from DimA to DimB

4. DimB automatically changes its type to a 'Fact.'

Generally speaking, this dimensional metadata now managed in the DM1 is to be leveraged externally by other 'consumers' such as reporting tools like Business Objects, Cube Views, Cognos and so on in an effort to ensure that people querying data from these reporting engines know what type of data the table represents, i.e. a Facts, Dimensions, etc. Interfacing ER/Studio with these large B.I. and ETL products is helpful because ERStudio can pre-populate your Repositories with this important metadata.

Dimensional Model Table Types

  • Undefined: An Undefined Table is a table managed in the Dimensional Model that has been intentionally set by the designer (over riding the default type of 'Dimension') in order to flag it that a decision must be made about what it is to become (Dimension, Snowflake etc).
  • Fact: A Fact Table is the focus of a Dimensional Model's intended query to determine specific business performance measurement on a selected 'topic' of data. The fact table is an aggregate collection of Dimensional Table Foreign Keys (in addition to other measurement attributes) assembled to gather specific numeric or additive data the business is interested in reviewing historical performance of.
  • Dimension: A Dimension Table contains the very specific and descriptive data on a particular aspect of a query which the Fact Table assembles. An example would be a Dimension called 'Product' which is a table containing product specific information fed into the Fact. Other Dimensions provide other 'descriptors' such as 'Region' When fed into the Fact, queries can be derived for Product sold within a specific Region.
  • Snowflake: Outriggers (sometimes called Snowflakes due to the structure they visually look like in a Dimensional Model as extensions to Dimensions) act as 'qualifier' for Dimensions. Snowflaking essentially 'normalizes' a Dimension table by removing specific attributes in a Dimension (oftentimes these are 'types') and implementing them as separate tables acting as lookup parents of the Dimension. An example would be a Dimension called 'Marketing Brand' which at one time contained attributes called Finance Category and Marketing Category. In order to normalize this data, those two attributes were removed from the dimension and established as their own tables and provided with a PK which are then propagated via non identifying relationship to the child dimension, Marketing Brand.

These table types are further discussed below.

See Also