Judy Wambugu

Posted on Feb 3

Schemas and Data Modelling in Power BI: A Beginner-Friendly Guide

#powerbi #schema #datarelationships #datamodellinginpowerbi

When working with Power BI, one of the most important (and often overlooked) step is data modelling. A good data model makes your reports faster, easier to build and more accurate. A bad model can lead to slow performance, confusing visuals and incorrect results.

In this article we will explore:

What data modelling is
Fact and dimension tables
Common schema and model types such as Star schema and Snowflake schema
Relationships in Power BI
Why good data modelling is critical for performance and reporting accuracy

What is Data Modelling in Power BI?

Data Modelling is the process of organizing tables and defining relationships between them so that Power BI understands how data connects. Data modelling is not just a technical step, it directly affects performance, accuracy, and usability of Power BI reports.

A data model answers questions like:

How are sales linked to customers?
How do dates connect to transactions?
Which tables store descriptive data and which store numbers?

Power BI uses this model to:

Calculate measures correctly
Filter visuals properly
Optimize performance using its internal engine (Vertipaq). This engine performs best when data is structured in a simple and predictable way.

Fact Tables and Dimension Tables

Most Power BI models are built using fact tables and dimension tables.

Fact Tables

A fact table stores:

Numeric values (measures)
Business events or transactions

Examples of facts:

Sales amount
Quantity sold
Revenue
Cost
Profit

Typical characteristics:

Very large (many rows)
Contains foreign keys linking to dimension tables
Used for calculations and aggregations

Example fact table: Sales
| DateID | ProductID | CustomerID | SalesAmount | Quantity |
|------|----------|-----------|------------|----------|

Dimension Tables

A dimension table stores descriptive information that gives context to facts.

Examples of dimensions:

Customers
Products
Dates
Locations
Categories

Typical characteristics:

Smaller than fact tables
Contain unique values (one row per item)
Used for filtering, slicing, and grouping data

Example dimension table: Product
| ProductID | ProductName | Category | Brand |
|----------|------------|----------|-------|

Common schema and model types

What Is a Star Schema?

A star schema is the most recommended data model in Power BI.

In a star schema:

There is a single fact table sits at the center
There are multiple dimension tables that connect directly to the fact table. There is also no relationships between dimensions.
The model looks like a star

Some advantages of using a star schema model include:

Better/fast performance. A clean star schema reduces memory usage
Simple DAX formulas
Easier filtering and slicing
Fewer relationship issues. Poor models with unnecessary relationships slow everything down.

Some disadvantages of a star schema model include:

Data redundancy since dimension tables in a star schema are large and often store repeated descriptive values, for instance a product dimension can repeat category name and the brand name even though there may be a category or brand dimensions as well
More complex relationships
Harder DAX calculations
Less intuitive for report users

What Is a Snowflake Schema?

A snowflake schema is a more complex version of a star schema.

In a snowflake schema:

There is also a central fact table
There are also multiple dimension tables which are broken into multiple related tables. These dimensions connect to other dimensions instead of directly to the fact table

Advantages of a snowflake schema include:

Reduces data redundancy especially when the dimension tables are very large
Normalized structure (similar to databases)

Disadvantages of a snowflake schema include:

Slower performance in Power BI
More complex relationships
Harder DAX calculations
Less intuitive for report users

Relationships in Power BI

Relationships define how tables are connected.

Common Relationship Types

1. One-to-Many (1:*)

This is the default and most recommended relationship in Power BI. It is characterized by:
One dimension record which relates to many fact records For instance: One customer → many sales or One product → many transactions
It is the most preferred relationship type because it is simple, fast, easier to use for accurate filtering and works perfectly with a star schema.

2. Many-to-One(*:1)

This is the same relationship as one-to-many, just viewed from the other direction.

3. One-to-One(1:1)

It is characterised by one row in one tables matching exactly one row in another table. For instance: Employee table ↔ Employee details table.
It is mostly used when splitting a wide table into two or for security reasons.

4. Many-to-Many(:)

It is characterised by multiple rows in one table relating to multiple rows in another table. For instance: products belonging to multiple categories or customers in multiple segments.
Can cause ambiguity filtering and incorrect results if not handled carefully

Relationship Direction (Filter Direction)

Single direction: Filters flow in one direction only. A typical flow would be from dimension → fact (recommended). It is preferable because it is predictable, has better performance and there are fewer circular dependencies.
Both directions: Filters in both directions. For instance: Dimension ⇄ Fact. It is mostly used where there is many-to-many relationships and complex slicer interactions. It's downside is that it can cause confusing results and slower performance.

Why Good Data Modelling is critical for Performance and Reporting Accuracy

In conclusion, good data modelling is the foundation of every successful Power BI report. Even with clean data and beautiful visuals, a poorly designed data model can cause slow performance and incorrect results.
Power BI relies heavily on the structure of the data model to decide how data is filtered, aggregated, and calculated.
Good modelling helps performance because there are fewer tables and relationships to scan, there is reduced memory usage, faster query execution and smoother interactions (slicers, filters, visuals).

DEV Community