DEV Community

Daniel Kemei
Daniel Kemei

Posted on

Getting started with Data Modeling in Power BI

What is Data Modelling?

Data modeling is the process of organizing and structuring your data tables and defining the relationships between them.This creates a logical framework that allows Power BI to understand how your data connects and flows.
There are two types of table in Power BI:
-Fact Tables
-Dimension Tables

Fact Tables

Fact tables contain the measurable and quantitative data in your model.They contain numbers you want to analyze, such as sales amounts, quantities sold, revenue, costs, or transaction counts.Fact tables typically contain many rows and have foreign keys that link to dimension tables.

Dimension Tables

Dimension tables provide the descriptive context for your facts.A dimension table contains a key column that acts as a unique identifier, and other columns which support filtering and grouping your data.

Star schema

Star schema contains a central fact table connected to multiple dimension tables.

It is the most common and recommended data modeling approach in Power BI.
In a star schema, you have:

-One or more fact tables at the center containing your measurements

-Dimension tables radiating outward, each connecting directly to the fact table

-No connections between dimension tables themselves

Why Star Schema is Recommended

Power BI's engine is specifically optimized for star schemas. The way it compresses data and performs calculations works best with this structure, resulting in faster query performance and smaller file sizes.
Writing DAX formulas becomes more straightforward when your model follows a star schema, because the relationships are predictable and
consistent.

Snowflake dimensions

A snowflake dimension is a set of normalized tables for a single business entity.Products are assigned to subcategories, and subcategories are in turn assigned to categories.

While snowflake schemas can reduce data redundancy and save storage space in traditional databases, they're generally not recommended for Power BI for several reasons:

-Power BI loads more tables, which is less efficient from storage and performance perspectives. These tables must include columns to support model relationships, and it can result in a larger model size.

-Longer relationship filter propagation chains need to be traversed, which might be less efficient than filters applied to a single table.

-The Data pane presents more model tables to report authors, which can result in a less intuitive experience, especially when snowflake dimension tables contain only one or two columns.

-It's not possible to create a hierarchy that comprises columns from more than one table.

Relationships

Relationships are the connections between tables that allow Power BI to combine data from multiple sources.

Types of Relationships

One-to-one relationship

One-to-one relationships occur in two main scenarios: degenerate dimensions derived from fact tables, and row data that spans across tables where a single business entity is loaded as two or more model tables.
For Intra Source Group Relationships:
When a one-to-one intra source group relationship exists between tables,it is reccommended to consolidate the data into a single model table by merging Power Query queries using a left outer join.
For Cross Source Group Relationships:
When a one-to-one cross source group relationship exists between tables, there's no alternative model design unless you pre-consolidate the data in your data source, as Power BI evaluates these as limited relationships.

Many-to-Many Relationships

Many-to-many scenarios occur in three situations: relating two dimension tables, relating dimension tables to higher grain fact tables, and the many-to-many scenario involving entities like bank customers and bank accounts.

Active vs Inactive Relationships

Active Relationships

  • By default, only one active relationship can exist between two tables.
  • Active relationships propagate filters automatically in visuals.
  • Use active relationships where most filtering should occur

Inactive Relationships

  • Inactive relationships do not filter by default.
  • Use them when you have multiple possible joins (like a fact table with two date columns)

Bi-Directional relationship

A bi-directional relationship sends filters in both directions between tables, instead of the usual one-way flow.
When Bi-Directional Is Useful

  • Special cases such as many-to-many with a bridge table or one-to-one relationships.
  • Scenarios where slicers should reflect only values with data (like Excel pivot slicers).
  • Limited dimension-to-dimension analysis where filters need to propagate across multiple tables.

Why Good Data Modeling Matters

The quality of your data model directly impacts three critical areas: performance, accuracy, and usability.

Practical Tips for Building Your Model

-Start by identifying your fact tables—what are you measuring? Then identify the dimensions—what context do you need to analyze those measurements?

-Transform your data in Power Query to create star schemas. Merge snowflaked dimensions, remove unnecessary columns, and ensure your dimension tables have unique keys.
-Create a calendar table using DAX or Power Query, even if you have date columns in your fact tables. This gives you complete control over fiscal years, holidays, and custom date groupings.
-Name your tables and columns clearly and consistently. Use spaces in names to make them readable
-Hide technical columns like keys and internal IDs that users don't need to see in reports. Keep your field list clean and focused on business-relevant fields.

Top comments (0)