- Published on
ER Modeling and Normalization
- Authors
- Name
- Mai TIEU Khoi
- @tieukhoimai

Table of Contents
1. E-R Modelling
1.1. RDBMS Logical Data Modeling / E-R Logical Data Model
Entity - Relationship Model is a fundamental concept in database design that:
- Is a type of flowchart that illustrates how "entities" such as people, objects or concepts relate to each other within a system
- Defines a conceptual data schema
- Defines the conceptual view of a database
- Illustrates the logical structure of database
The ER model serves as a blueprint for database design, helping developers understand the structure and relationships within the data before implementing the physical database.
1.2. Entity
An entity represents a distinguishable object or concept in the real world:
- An
entity
can be a real-world object, either animate or inanimate, that can be easily identifiable - All these entities have some attributes or properties that give them their identity
- An
entity set
is a collection of similar types ofentities
- An
entity set
may containentities
with attributes sharing similar values Entities
can be thought of as Nouns- Typically shown as a rectangle in ER diagrams
Examples of entities:
- Customer, Employee, Product, Order, Department
1.3. Relationships
Relationships define how entities are connected to each other:
Relationships
represent logical associations between two or moreentities
- The association among
entities
is called a relationship - For example:
- An employee works at a department
- A student enrolls in a course
- A customer places an order
Relationships
are represented by diamond-shaped boxes in ER diagrams
1.4. Cardinality
Cardinality defines the numerical relationship between entities:
- Cardinality defines the number of
entities
in oneentity set
that can be associated with the number of entities of another set via relationship - Cardinality is a couple of natural numbers that specifies the minimum and maximum number of instances of a relationship in which an entity instance can participate
- It's possible to use the constant N to refer to a generic number greater than 1
- In the ER diagram, the cardinality of the relationship is labeled on the arc that links the entity to the relationship

Example Analysis:
In the relation between CONSULTANT and PROJECT PLANNING:
- 1 CONSULTANT is allocated to 1 or more Project Planning (e.g., MRO01 is planned to work on 1 or more projects) – one-to-many relation
- 1 PROJECT PLANNING can have one and only 1 consultant planned on it – one-to-one relation
In the relation between PROJECT and PROJECT PLANNING:
- 1 PROJECT can be managed with 1 or more Project Planning (e.g., project Fashion#1 is managed with two project planning) – one-to-many relation
- 1 PROJECT PLANNING is referred to one and only 1 project – one-to-one relation
2. Normalization
In database design, normalization is a systematic approach to organizing data efficiently:
Mathematical Analogy: In mathematics, a set B of elements of a vector space V is called a basis if every element of V can be written in a unique way as a finite linear combination of elements of B. Similarly, Normal Form is a systematic way to express every single tuple without losing information.
Key Definitions:
- Normal Form: A state at which relations are organized in accordance with specific requirements
- Normalization: The process of structuring a database in accordance with specific Normal Forms
If a relation has several concepts that are independent of each other, it is decomposed into smaller relations, one for each concept, until the desired Normal Form is reached.
2.1. First Normal Form (1NF)
For each relation contained in the database, a relation is in 1NF if and only if:
- Each attribute is defined over a domain with atomic (indivisible) values
- Each attribute contains a single value from that domain
2.2. Second Normal Form (2NF)
A database is in 2NF (2nd normal form) when:
- It is in 1st normal form, and
- All non-key attributes functionally depend on the entire key (no partial dependencies)
2.3. Third Normal Form (3NF)
A database is in 3NF (3rd normal form) if it is:
- In 2nd normal form, and
- All non-key attributes depend on the key only (no transitive dependencies)
This means there should be no non-key attributes that depend on other non-key attributes. For example, to eliminate transitive dependencies, you should move attributes like City
, State
, and Zip
into a separate Location
table, so that each non-key attribute depends only on the primary key, not on other non-key attributes.
References
- Lecture notes from Business Analytics course, Prof. Paolo Menna, University of Verona 2024-2025