Data Modelling: What?
Data modelling is a core part of data engineering, and any data engineer needs a solid grasp of it to build a successful career.
Data modelling involves conceptualising data objects, the relationships between them, and the rules that govern how the data behaves. It aims to organise the elements of data and define how they relate to one another. The result is a design blueprint for a database or information system. Data models come in a variety of forms, such as conceptual, logical, and physical data models.
- Conceptual data models: These models give a high-level view of the data, highlighting the connections between entities without going into the specifics of how they will be implemented. They are typically made for stakeholders, to help them understand the data needs and how the entities connect. This stage of data modelling is mainly done on paper or during a giant whiteboarding session.
- Logical data models: A logical data model gives a detailed view of the entities, their properties, and the relationships between them. It is more precise than a conceptual model and emphasises the semantics and structure of the data. A physical data model is typically built on top of a logical model.
- Physical data models: A physical data model describes the physical implementation of the data, including tables, columns, and relationships. It is developed after the logical data model and is used to build the actual database with DDL (Data Definition Language) statements. At this stage, the logical data models from the previous stage are converted into schemas and tables.
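As a rough illustration of what a physical data model can look like, the sketch below writes DDL for a hypothetical "a customer places orders" model and runs it against an in-memory SQLite database. The table and column names (`customer`, `customer_order`, `total_cents`) and the helper functions are assumptions made for this example, not something prescribed by any particular project. It also shows how the rules that control the data's behaviour can live in the schema as constraints.

```python
import sqlite3

# Hypothetical physical model: names and columns are illustrative only.
DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    email       TEXT NOT NULL UNIQUE
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total_cents INTEGER NOT NULL CHECK (total_cents >= 0)
);
"""


def build_database() -> sqlite3.Connection:
    """Create an in-memory database from the DDL above."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


def try_insert(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_database()

# A valid customer row is accepted.
valid_customer = try_insert(
    conn,
    "INSERT INTO customer (customer_id, email) VALUES (?, ?)",
    (1, "ada@example.com"),
)

# An order pointing at a customer that does not exist is rejected.
orphan_order = try_insert(
    conn,
    "INSERT INTO customer_order (order_id, customer_id, total_cents) VALUES (?, ?, ?)",
    (10, 999, 500),
)

# A negative total violates the CHECK constraint and is rejected too.
negative_total = try_insert(
    conn,
    "INSERT INTO customer_order (order_id, customer_id, total_cents) VALUES (?, ?, ?)",
    (11, 1, -5),
)

print(valid_customer, orphan_order, negative_total)
```

Putting the rules into the schema means the database itself refuses inconsistent rows, rather than every application having to remember to check.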
There are various steps in data modelling, including:
- Requirements gathering: This is the initial phase, which involves getting input from users and stakeholders.
- Conceptual modelling: In this phase, a high-level data model that reflects the information needs is created.
- Logical modelling: This stage involves transforming the high-level data model into a more detailed data model that shows the entities, their characteristics, and the connections between them.
- Physical modelling: This stage involves converting the logical model into a physical model that includes the actual database tables, columns, and relationships.
- Implementation: This is the last step, where the database is built and the physical model is put into action.
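The later steps above can be sketched in a few lines of code: a logical model described as plain Python objects, a function that converts it into a physical `CREATE TABLE` statement, and an implementation step that runs the statement. The `Entity` and `Attribute` classes, the `to_ddl` function, and the type mapping are all hypothetical names invented for this sketch. Real projects usually rely on a modelling or migration tool instead of hand-rolled generation.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """One property of an entity in the (hypothetical) logical model."""
    name: str
    py_type: type
    required: bool = True


@dataclass(frozen=True)
class Entity:
    """An entity in the logical model, with the name of its key attribute."""
    name: str
    key: str
    attributes: tuple


# Assumed mapping from logical types to SQLite column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


def to_ddl(entity: Entity) -> str:
    """Physical modelling step: turn a logical entity into a CREATE TABLE statement.

    Names are interpolated directly, so this is only safe for trusted,
    developer-written models, never for user input.
    """
    columns = []
    for attr in entity.attributes:
        parts = [attr.name, SQL_TYPES[attr.py_type]]
        if attr.name == entity.key:
            parts.append("PRIMARY KEY")
        elif attr.required:
            parts.append("NOT NULL")
        columns.append(" ".join(parts))
    return f"CREATE TABLE {entity.name} ({', '.join(columns)})"


# Logical modelling step: describe the entity and its attributes.
customer = Entity(
    name="customer",
    key="customer_id",
    attributes=(
        Attribute("customer_id", int),
        Attribute("email", str),
        Attribute("nickname", str, required=False),
    ),
)

# Physical modelling step: generate the DDL.
ddl = to_ddl(customer)

# Implementation step: build the table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
column_names = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]

print(ddl)
print(column_names)
```

Keeping the logical description separate from the generated DDL is one way to let the two evolve together as requirements change.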
Data Modelling: Why?
Why is it important to do data modelling?
- Data organisation matters.
- How well data is organised determines whether it is easy to use later or practically useless. Imagine having to join four tables to get a customer's email address when it could have been added as a column on the fact table in the first place.
- Data modelling is an iterative process that continues throughout the lifetime of a project.
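To make the four-join scenario concrete, here is a minimal sketch that compares the two designs in an in-memory SQLite database. Every table and column name (`fact_order`, `account`, `profile`, `contact`, `email`, `fact_order_wide`) is hypothetical and chosen only to produce a chain of four joins. It is an illustration of the trade-off, not a recommended schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Heavily normalised design: the email sits four joins away from the orders.
CREATE TABLE email      (email_id   INTEGER PRIMARY KEY, address    TEXT NOT NULL);
CREATE TABLE contact    (contact_id INTEGER PRIMARY KEY, email_id   INTEGER NOT NULL);
CREATE TABLE profile    (profile_id INTEGER PRIMARY KEY, contact_id INTEGER NOT NULL);
CREATE TABLE account    (account_id INTEGER PRIMARY KEY, profile_id INTEGER NOT NULL);
CREATE TABLE fact_order (order_id   INTEGER PRIMARY KEY, account_id INTEGER NOT NULL);

-- Alternative design: the email is stored as a column on the fact table.
CREATE TABLE fact_order_wide (
    order_id       INTEGER PRIMARY KEY,
    customer_email TEXT NOT NULL
);

INSERT INTO email           VALUES (1, 'ada@example.com');
INSERT INTO contact         VALUES (1, 1);
INSERT INTO profile         VALUES (1, 1);
INSERT INTO account         VALUES (1, 1);
INSERT INTO fact_order      VALUES (100, 1);
INSERT INTO fact_order_wide VALUES (100, 'ada@example.com');
""")

# Four joins are needed to walk from the order to the email address.
FOUR_JOINS = """
SELECT e.address
FROM fact_order AS o
JOIN account AS a ON a.account_id = o.account_id
JOIN profile AS p ON p.profile_id = a.profile_id
JOIN contact AS c ON c.contact_id = p.contact_id
JOIN email   AS e ON e.email_id   = c.email_id
WHERE o.order_id = ?
"""

# With the email on the fact table, no join is needed at all.
NO_JOINS = "SELECT customer_email FROM fact_order_wide WHERE order_id = ?"

email_via_joins = conn.execute(FOUR_JOINS, (100,)).fetchone()[0]
email_direct = conn.execute(NO_JOINS, (100,)).fetchone()[0]

print(email_via_joins == email_direct)
```

Both queries return the same address, but the second is far simpler to write and read. The cost is that the email is now duplicated on every order row and has to be kept in sync if the customer changes it, which is exactly the kind of trade-off that gets revisited as the model is iterated on.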
