- Published on
Data Modeling
- Authors
- Name
- Mai TIEU Khoi
- @tieukhoimai

Table of Contents
Data Modeling
Data modeling is the process of creating a conceptual representation of data and its relationships, which helps us understand how data is stored, accessed, updated, and queried within the database. The output of that process is a formal description of the data that has been modeled, which is called a data model. A data model can be compared to a roadmap, an architect's blueprint, or any formal diagram that facilitates a deeper understanding of what is being designed IBM Data Modeling
- A conceptual data model consists of highly abstract levels of data elements called entities and relationships between entities. It presents a high-level overview of the database system through a visual representation of the entities it contains and their relationships to one another. The conceptual model provides the basis for the logical data model.
- A logical data model builds on the conceptual model by providing a more detailed overview of the entities in their respective relationships. It identifies the attributes of each entity, defines the primary keys, and specifies the foreign keys. The logical data model serves as an intermediary between the high-level conceptual model and the physical implementation. It describes how the data will be organized in the database and provides a blueprint for building the physical database.
- A physical data model is used to create the schema of the database, which is then implemented in the database management system. The physical data model defines the structure of the database, including the data types, constraints, and attributes. It specifies how the data will be stored in the specific database management system being used.
In conclusion, a conceptual data model is a high-level view of the database that is used to communicate with stakeholders about the requirements of the database. The conceptual data model can then be used to create a logical data model, which is a more detailed view of the database that includes the relationships between entities, and the constraints and rules that apply to the data. Finally, the logical data model can be used to create a physical data model, which is the actual database schema that is implemented in a database management system.
Data Warehouse Model (OLTP vs OLAP)
Understanding the difference between OLTP and OLAP models is crucial for effective data warehouse design.
CRITERIA | OLTP DATABASE | DATA WAREHOUSE |
---|---|---|
Purpose | Execute business process | Evaluate business process |
Transaction Type | Insert, update, delete, select | Select |
Transaction style | Predefined | Unpredictable |
Optimized for | Efficiency and consistency | Query performance/ease of use |
Update Frequency | Real Time – business event | Periodic – scheduled – near rt |
Update concurrency | High | Low |
Historical data access | Recent | Years of history |
Selection criteria | Precise, narrow | Fuzzy, broad |
Comparison | Infrequent | Frequent |
Query complexity | Low | High |
Joins | Few | Many |
Transaction per day | Millions | Thousands |
Data volumes | Tera | Peta |
Data | Raw – detailed data | Detailed and aggregated data |
Design technique | ER – 3rd Normal Form 3NF | Dimensional/other |
From Logical Data Model to Physical Data Model
The transition from a Logical Data Model (LDM) to a Physical Data Model (PDM) is a critical step in database implementation.
Physical Table Structure Should Include:
- Technical metadata (job_id, insert/update data, etc)
- Technical columns for managing the data quality
- Primary and/or secondary indexes of the table to improve performance
- Table partitioning functions
- Adding surrogate key to simplify join, improve performance, optimize space
References
- Lecture notes from Business Analytics course, Prof. Paolo Menna, University of Verona 2024-2025
- IBM. (n.d.). What is data modeling?