Normalization
Go Up to Logical Design Concepts
Normalization helps eliminate redundancy and streamline your logical design. A normalized database design is well-organized, even elegant. It is very important to complete the normalization process before transforming the logical design to a physical design; doing so allows a more efficient implementation of the physical design on an actual database.
Normalization is the process of separating data into multiple, related tables. A normalized database design is characterized by multiple entities with relatively few attributes in each; an unnormalized design is characterized by few entities with relatively many attributes in each. Normalizing the logical design helps eliminate certain types of redundancy and incompleteness in the model.
Part of the normalization procedure entails moving certain sets of repeating data from one entity to its own entity, then creating a relationship between the two. For example, consider the development of a data model for a university to store information about alumni. One entity can hold information about alumni, like their names, degrees they have earned, and the year they graduated. If some alumni have earned more than one degree from the university, then degree information becomes a set of repeating data. In the normalization process, the degree information is removed from the entity and placed in a separate entity. Next, a relationship is created between the first entity and the new entity so that information in each entity is associated with information in the other.
While working with logical data models, a data modeler should understand the rules (or forms) of normalization to produce a good data model.
The following defines the normalization forms:
- Eliminate Repeating Groups (1NF): A separate table exists for each set of related attributes and each table has a primary key.
- Eliminate Redundant Data (2NF): Attributes depending on only part of a multi-valued key is removed it to a separate table.
- Eliminate Columns Not Dependent On Key (3NF): Attributes not contributing to a description of the key, are removed to a separate table.
- BCNF. Boyce-Codd Normal Form (BCNF): Non-trivial dependencies between candidate key attributes are separated out into distinct tables.
- Isolate Independent Multiple Relationships (4NF): No table contains two or more 1:n or n:m relationships that are not directly related.
- Isolate Semantically Related Multiple Relationships (5NF): There may be practical constraints on information that justify separating logically related many-to-many relationships.
- Isolate Semantically Related Multiple Relationships (ONF): A model limited to only simple (elemental) facts, as expressed in Object Role Model notation.
- Domain-Key Normal Form (DKNF): A model free from all modification anomalies.
- All normal forms are additive, so if a model is in 3rd normal form, it is also in 2nd and 1st normal form.