Data Warehouse

A Guide to Data Modeling and its Different Phases

Data modeling refers to the exploration of data-oriented structures. It is a process that involves documenting a complex software design in a comprehensive diagram, using symbols and texts, to represent the way your data needs to flow. This detail-oriented diagram then acts as a blueprint to re-engineer your legacy application or to construct new software.  A data model is applicable to any kind of software development that may involve creating database objects in order to store and manipulate data.

Just like all other modeling artifacts, a data model design can be utilized for a variety of purposes, from conceptual models (high-level) to physical models. Conceptually, data modeling is quite similar to class modeling. A class model is used to identify classes whereas data modeling helps recognize entity types.

Generally, data models were built during the design and analysis phases of a project, allowing users to understand the requirements of a new application completely. A data model design is like a flowchart, illustrating relationships among different data entities.

When you present well-prepared conceptual, logical and physical data models, they allow your stakeholders to catch errors and make relevant changes before the programming code is written. Data modeling emphasizes how your data should be organized and what data is required rather than what operations should be performed on the data.

It is like an architect’s building plan that assists in crafting more of a conceptual model while establishing relationships among data items.

A DBMS Comprises of Different Data Models

A complete Database Management System (DBMS) is based on a number of data models – the most commonly used ones are:

  • Network model and hierarchical model
  • Object-oriented model
  • Object-relational model
  • Relational data model

What Makes Data Modeling So Important?

Data models serve the following purposes:

  • To provide an accurate representation of all data objects which are required by the database. If not, the omission of any data may lead to faulty reports and incorrect results.
  • To design a database at different levels i.e. conceptual, physical and logical.
  • To define the stored procedures, relational tables, and primary/foreign keys.
  • To obtain a clear picture of the most relevant data that assists developers to design a physical database.
  • To identify the redundant and missing data.

Stages of Data Models

Majorly, there are three levels of data modeling, namely:

  1. Conceptual data model
  2. Logical data model
  3. Physical data model

Let us discuss the difference of all three levels in detail.

Conceptual Data Model

This data model identifies the most high-level relationships among different entities. It defines WHAT a system contains and is typically designed by data architects and business stakeholders. The primary goal is to define the scope and organize business rules and concepts.

It focuses on establishing entities, their relationships, and attributes. In conceptual data modeling, there are no details available related to the actual database structure. The three fundamental tenants of this data model include:

  1. Entity – a more real-world thing
  2. Attribute – the properties or characteristics of an entity
  3. Relationship – the association and dependency between two entities.

To give an example:

  • Product and Customer are two entities
  • Customer’s name and ID are Customer attributes. Similarly, Product’s price and name are Product’s attributes
  • The sale is the relationship between the Product and the Customer

Features of Conceptual Data Model

Conceptual data models or domain models establish a common vocabulary for every stakeholder by creating a basic scope and concepts.

  • Organization-wide coverage of all business concepts
  • Development and design to cater to the target audience
  • Its development is independent of any hardware specifications, such as technology or data storage capacity. The aim is to represent the data just as a user would see it in the real world.

Logical Data Model

This data model design defines HOW a system must be implemented, without factoring in how it would be physically implemented in the DBMS. Typically, it is created by business analysts and data architects. The aim is to develop data structures and a technical map of rules.

A logical data model helps add additional information to the conceptual model elements. It illustrates data elements’ structure while establishing the relationship between them. This data model offers a great benefit of providing a strong foundation to form the base of a physical data model. However, the basic modeling structure continues to be generic.

Features of Logical Data Model

It helps describe data requirements for one project but can be integrated with various other data models based on the project scope.

  • The model is developed and designed separately from the DBMS.
  • Data attributes have data types with accurate length and precision.
  • At this level, no key (primary or secondary) is defined and you must adjust and verify the connector details, which were set before for relationships.

Physical Data Models

This data model represents database-specific implementation and an application of the logical data model. It assists in generating schema and offers database abstraction. This happens due to the richness of meta-data offered by a physical data model.

Features of Physical Data Model

This data model assists in visualizing database structure. It facilitates to model the constraints, database columns keys, triggers, indexes, and characteristics of various Relational Database Management Systems (RDBMS).

  • A physical database model describes data required for one application or project that can also be integrated with various other models on a project scope basis.
  • It contains relationships among tables which addresses nullability and cardinality of the relationships.
  • The model is developed for a particular DBMS version, technology, or data storage required in the project.
  • Columns must have accurate data types, default values, and lengths assigned.
  • Foreign and primary keys, indexes, views, authorizations, and access profiles are defined.

The Final Words

The complexity of a data model increases as we move from conceptual to logical and then to physical modeling phases. It’s best that you always start with understanding the different entities in your database and how they relate to each other, at a higher level (conceptual model). Then, understand the data details without stressing about their implementation (logical model). Finally, realize the implementation of your data model in a particular database, i.e. physical data model. However, considering a data warehousing project, both the conceptual and logical data model may be considered as one deliverable.

Astera Centerprise