Published on

ER Modeling and Normalization

Authors
Article Cover
Table of Contents

1. E-R Modelling

1.1. RDBMS Logical Data Modeling / E-R Logical Data Model

Entity - Relationship Model is a fundamental concept in database design that:

  • Is a type of flowchart that illustrates how "entities" such as people, objects or concepts relate to each other within a system
  • Defines a conceptual data schema
  • Defines the conceptual view of a database
  • Illustrates the logical structure of database

The ER model serves as a blueprint for database design, helping developers understand the structure and relationships within the data before implementing the physical database.

1.2. Entity

An entity represents a distinguishable object or concept in the real world:

  • An entity can be a real-world object, either animate or inanimate, that can be easily identifiable
  • All these entities have some attributes or properties that give them their identity
  • An entity set is a collection of similar types of entities
  • An entity set may contain entities with attributes sharing similar values
  • Entities can be thought of as Nouns
  • Typically shown as a rectangle in ER diagrams

Examples of entities:

  • Customer, Employee, Product, Order, Department

1.3. Relationships

Relationships define how entities are connected to each other:

  • Relationships represent logical associations between two or more entities
  • The association among entities is called a relationship
  • For example:
    • An employee works at a department
    • A student enrolls in a course
    • A customer places an order
  • Relationships are represented by diamond-shaped boxes in ER diagrams

1.4. Cardinality

Cardinality defines the numerical relationship between entities:

  • Cardinality defines the number of entities in one entity set that can be associated with the number of entities of another set via relationship
  • Cardinality is a couple of natural numbers that specifies the minimum and maximum number of instances of a relationship in which an entity instance can participate
  • It's possible to use the constant N to refer to a generic number greater than 1
  • In the ER diagram, the cardinality of the relationship is labeled on the arc that links the entity to the relationship
Cardinality Example

Example Analysis:

In the relation between CONSULTANT and PROJECT PLANNING:

  • 1 CONSULTANT is allocated to 1 or more Project Planning (e.g., MRO01 is planned to work on 1 or more projects) – one-to-many relation
  • 1 PROJECT PLANNING can have one and only 1 consultant planned on it – one-to-one relation

In the relation between PROJECT and PROJECT PLANNING:

  • 1 PROJECT can be managed with 1 or more Project Planning (e.g., project Fashion#1 is managed with two project planning) – one-to-many relation
  • 1 PROJECT PLANNING is referred to one and only 1 project – one-to-one relation

2. Normalization

In database design, normalization is a systematic approach to organizing data efficiently:

Mathematical Analogy: In mathematics, a set B of elements of a vector space V is called a basis if every element of V can be written in a unique way as a finite linear combination of elements of B. Similarly, Normal Form is a systematic way to express every single tuple without losing information.

Key Definitions:

  • Normal Form: A state at which relations are organized in accordance with specific requirements
  • Normalization: The process of structuring a database in accordance with specific Normal Forms

If a relation has several concepts that are independent of each other, it is decomposed into smaller relations, one for each concept, until the desired Normal Form is reached.

2.1. First Normal Form (1NF)

For each relation contained in the database, a relation is in 1NF if and only if:

  • Each attribute is defined over a domain with atomic (indivisible) values
  • Each attribute contains a single value from that domain

2.2. Second Normal Form (2NF)

A database is in 2NF (2nd normal form) when:

  1. It is in 1st normal form, and
  2. All non-key attributes functionally depend on the entire key (no partial dependencies)

2.3. Third Normal Form (3NF)

A database is in 3NF (3rd normal form) if it is:

  1. In 2nd normal form, and
  2. All non-key attributes depend on the key only (no transitive dependencies)

This means there should be no non-key attributes that depend on other non-key attributes. For example, to eliminate transitive dependencies, you should move attributes like City, State, and Zip into a separate Location table, so that each non-key attribute depends only on the primary key, not on other non-key attributes.

References

  • Lecture notes from Business Analytics course, Prof. Paolo Menna, University of Verona 2024-2025