SCHEMAS & DATA MODELLING IN POWER BI

#beginners #data #dataengineering #microsoft

Introduction

Effective data management is a critical component of any organisation that aids in smarter, data-driven business decisions. At the heart of this lies a foundational understanding of Schemas and Data Modelling — two concepts that work together to define how data is structured, stored, and ultimately used to generate meaningful insights.

Data Modelling

Data Modelling is the process of designing and organising data structures to support databases. It provides a structural representation that demonstrates how data is stored, organised, and manipulated — serving as the blueprint for how information flows through a system.

Key Components of Data Modelling

Entities
Entities are real-world concepts that can be distinctly identified and about which data can be stored. For example, in a retail context, a Customer, a Product, or a Sale would each represent a separate entity.

Attributes
Attributes are the characteristics that describe or define an entity. They serve as the data points that can be used to sort, filter, or order a dataset. For instance, a Customer entity might have attributes such as Name, Age, and Location.

Relationships

Relationships refer to the connections between entities and their attributes. These connections help ensure that the model accurately reflects real-world actions or dependencies between data points. There are three types of relationships in a data model:

• One-to-One: One entity is related to exactly one instance of another entity. For example, one employee may be assigned exactly one staff ID.
• One-to-Many: The most common type in data modelling. One entity can have multiple instances of another entity. For example, one customer can place many orders.
• Many-to-Many: Multiple instances of one entity are related to multiple instances of another entity. This is the most complex type of relationship and is often resolved using a bridge or junction table to make it manageable.

Fact and Dimension Tables

Fact Table
The Fact Table is the core of a data model. It stores raw quantitative data for analysis — holding numerical values and metrics that can be aggregated to answer key business questions. Examples include sales revenue, units sold, or transaction counts.
Dimension Table
Dimension Tables provide the necessary context to Fact Tables. They contain descriptive attributes that can be used to filter, group, and label the data stored in a Fact Table. For example, a Date dimension might include attributes like Day, Month, Quarter, and Year, enabling time-based analysis of sales figures.

Schemas

Schemas define the structure and organisation of data within a data model. They determine how data is connected and related, which in turn influences the efficiency and performance of data queries and reports. A schema represents a logical grouping of tables that are associated with each other. Understanding schemas is essential for designing data models that support comprehensive, reliable analysis.
Types of Schemas

Star Schema
The Star Schema is the most commonly used schema in data warehousing — and in Power BI. It features a single central Fact Table surrounded by multiple Dimension Tables, with each Dimension Table connecting directly to the Fact Table. This layout resembles a star shape, hence the name. Its simplicity makes it easy to query and highly performant, making it the preferred choice for most reporting scenarios.

Snowflake Schema
The Snowflake Schema is a normalised version of the Star Schema. In this structure, Dimension Tables are further divided into sub-dimension tables, creating a more layered and complex arrangement. Normalisation reduces data redundancy by breaking dimension tables into multiple related tables. While this results in a web-like structure resembling a snowflake, it can add complexity to queries and may impact report performance if not managed carefully.

Why Good Modelling is Critical for Performance and Reporting

A well-designed data model simplifies complexity by visually mapping how different entities relate to each other, making it easier to understand, manage, and analyse datasets. Beyond clarity, good modelling delivers several tangible benefits:

• Ensures data consistency and quality through simplified database design and management.
• Improves query performance by reducing unnecessary calculations and data redundancy.
• Enhances data storage efficiency, keeping models lean and responsive.
• Provides the flexibility to scale models as data volumes grow and business needs evolve.
• Simplifies troubleshooting by making data relationships transparent and logical.
Together, these benefits create data models that are not only reliable today but adaptable for the future — supporting effective, data-driven decisions at every level of an organisation.

Conclusion

Schemas are vital for building an effective data model in Power BI. When deciding which schema to use, it is essential to balance the benefits of normalisation against the need for simplicity and performance. Rather than treating schemas as mutually exclusive choices, combining the strengths of each — the accessibility of the Star Schema with the reduced redundancy of the Snowflake Schema — can lead to stronger, more efficient data model designs.

The result is improved query performance, better data storage efficiency, and greater overall data operations — all of which contribute to scalable models that enable smarter, more confident business decisions.