Niyibigira Ornella

Posted on Feb 8

Schemas and Data Modelling in Power BI

#analytics #database #dataengineering #microsoft

Introduction

When talking about data modelling it is important to mention that it is one of the most critical step in building reliable reports in Power BI. Data modelling involves structuring raw data into meaningful tables, defining relationships between them, and organising them in a way that supports efficient querying and visualisation. In Power BI a well designed data model ensures accuracy when it comes to reporting, it also improves performance.

Schemas in Data Modeling

Schema refers to the logical structure of a database or data model that defines how tables are organised and related.
In Power BI, schemas are used to optimise data storage, retrieval, and reporting.
A schema in Power BI is crucial because it defines how data is structured, stored, and connected in a data model.
The two most used schemas in Power BI are star schema and snow flake schemas. Each schema organises data differently which in return influences performance, the complexity and usability. Optimised schemas improve query speed and report loading time, a schema also improves data accuracy meaning that proper relationships prevent incorrect aggregations or duplication, with a clear schema it is easier to create reports and dashboards and efficient schemas lead to better DAX performance and calculations.

In power BI there are also other types of schemas apart from star and snow flake schemas.
We have a galaxy schema which consists of multiple fact tables sharing common dimension tables. It is an extension of the Star Schema, designed for handling complex analytical requirements.

*Fact Tables *

A fact table contains measurable, quantitative data used for analysis. These measurements, known as facts, typically represent business processes such as sales, orders, transactions, or financial metrics.
Common characteristics of fact tables include: Large number of rows, numerical values such as revenue, quantity, cost, and profit. A fact table also includes foreign keys linking to dimension tables.

Dimension Tables

A dimension table contains descriptive attributes that provide context to the numerical data stored in fact tables. These tables help users filter, group, and categorise data for analysis.

Common characteristics of dimension tables include:
Smaller in size compared to fact tables, contain descriptive fields such as names, categories, and dates and are also used for slicing and filtering data.

Together, fact and dimension tables form the foundation of analytical data models in Power BI. Sub-Dimension Tables then further break down dimension tables to reduce data redundancy.

Star Schema

The star schema is the most widely recommended data modelling technique in Power BI. In this schema, a central fact table is connected directly to multiple dimension tables, forming a star-like structure.
A star schema stores transactional data, contain descriptive attributes and Uses one-to-many relationships, reducing complexity and improving query speed.

The structure of a star schema is as follows:

One central fact table
Multiple dimension tables directly connected to the fact table
No relationships between dimension tables

Advantages of Star Schema

Simple structure with fewer joins improves query speed meaning that it has a faster performance.
Works efficiently with Power BI calculations
Simple relationships make data modelling easier so it is easier to understand.
Enhances Power BI’s performance for dashboards.

Snowflake Schema

A Snowflake Schema is a data modelling approach in Power BI where dimension tables are further normalised into sub-dimension tables. It is a more complex version of the star schema for this design the dimension tables are normalised into multiple related tables, creating a branching structure that resembles a snowflake.

Structure of Snowflake Schema

One central fact table
Dimension tables broken down into sub-dimensions
More relationships and table joins

Advantages of Snowflake Schema

Less Data Redundancy — Normalized tables reduce duplication.
Better Data Integrity — Structured data ensures consistency.
Efficient for Large Datasets — Optimized for big data storage.
Easier Maintenance — Updates are more manageable.

Disadvantages of Snowflake Schema

More complex design.
Slower performance due to multiple joins.
Harder to understand and maintain.

Good Data Modelling in Power BI

A well-designed model, particularly using star schema, minimises the number of joins required during queries, this significantly improves report refresh speed and visual loading time.Correct relationships and table structures ensure that data is aggregated correctly, preventing misleading insights, duplication, or missing values.
Well-structured models are intuitive, allowing users to easily navigate data, build reports, and derive insights without confusion and a strong data model allows datasets to grow without significant performance degradation, ensuring long-term usability.

Conclusion

Creating an optimized data model is key to ensuring fast, scalable, and insightful reporting in Power BI. The choice of schema impacts performance, storage, and complexity and choosing the right schema helps in faster data retrieval, efficient reporting, and seamless Power BI performance. For most cases, Star Schema is the winner, schemas and data modelling play a vital role in the success of Power BI reports. Understanding concepts such as fact and dimension tables, star and snowflake schemas, and table relationships enables analysts to design efficient, scalable, and accurate data models.

DEV Community

Schemas and Data Modelling in Power BI

Top comments (0)