Introduction
In Power BI projects, many reporting issues, such as slow performance, incorrect totals, or overly complex DAX formulas, often stem from one root cause: poor data modeling. While visuals and measures usually get most of the attention, the data model is the true foundation of any reliable analytical solution.
Data modeling is the process of structuring data in a way that supports efficient analysis, accurate relationships, and meaningful insights. In Power BI, this typically means organizing data into fact tables and dimension tables, following proven data warehousing principles.
As Ralph Kimball explains in The Data Warehouse Toolkit:
_Dimensions provide the 'who, what, when, where, why, and how' context surrounding business process events._
This article provides a practical, beginner-friendly guide to data modeling in Power BI, using Sales and FactBudget CSV datasets as working examples. The explanations are guided by concepts from Ralph Kimball’s The Data Warehouse Toolkit and practical demonstrations inspired by the Pragmatic Works Power BI Data Modeling video.
Why Data Modeling Is Important in Power BI
- Improved report performance
- Simpler and more readable DAX measures
- Accurate filtering and aggregations
- Easier maintenance and scalability
- Consistent business logic across reports
Power BI is not just a visualization tool; it is also an analytical engine. Without a proper model, even the best visuals can produce misleading results.
Business Scenario and Datasets Used
To demonstrate practical data modeling concepts, this article uses two related fact tables: the Sales table, which holds actual transactional sales data, and the FactBudget table, which holds planned or budgeted values.
These datasets allow analysis of actual performance versus planned targets, which is a common real-world business scenario. By modeling these tables correctly, we can compare revenue against budget, calculate variances, and evaluate performance trends over time.
Both fact tables share common descriptive data such as products, dates, and markets, making them ideal for demonstrating star and snowflake schemas in Power BI.
How We Arrive at a Data Model in Power BI
Before any relationships are created in the Model view, a good data model begins in Power Query. This is where raw data is shaped, cleaned, and organized into fact and dimension tables. The steps below describe a practical, repeatable approach used in real-world Power BI projects.
Step 1: Load the Source Data
The process starts by loading the source files:
- Open Power BI Desktop
- Select Get Data
- Choose the appropriate source (e.g., Text/CSV, Excel, database)
In this example, we load:
- Sales data (actual transactions)
- FactBudget data (planned or budget figures)
Once loaded, we select Transform Data to open Power Query.
Step 2: Identify Facts and Dimensions
With the data visible in Power Query, the next step is to determine which tables represent business processes (facts) and which attributes describe those processes (dimensions).
The Sales and FactBudget tables are kept as fact tables, while descriptive fields such as Product, Date, Market, or Department are candidates for dimension tables.
This approach follows Kimball’s principle of separating measurements from descriptive context.
Step 3: Create Dimension Tables from Fact Data
Rather than importing separate dimension files, Power BI allows dimensions to be created directly from fact tables.
For each required dimension:
- Duplicate the fact table (Right-click → Duplicate)
- Rename the duplicated table (e.g., DimProduct, DimMarket)
- Deselect all columns, then select only the columns relevant to that dimension
- Use Remove Duplicates to ensure one unique row per dimension member
For example:
- DimProduct keeps only Product-related columns
- DimMarket keeps only Market or Region columns
This results in clean, compact dimension tables that connect efficiently to fact tables.
(Screenshot: Power Query showing duplicated table and removed columns)
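For readers who prefer to see the idea in code, the same dimension can be sketched as a DAX calculated table instead of a Power Query duplicate. This is only an illustration; the column names below are assumptions, not taken from the sample files.

```dax
-- Hypothetical sketch: DimProduct built as a DAX calculated table
-- (assumes the Sales fact table has ProductKey and ProductName columns)
DimProduct =
DISTINCT (
    SELECTCOLUMNS (
        Sales,
        "ProductKey", Sales[ProductKey],
        "ProductName", Sales[ProductName]
    )
)
```

The Power Query route shown above is still preferable in practice, because the shaping happens before the data reaches the model.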
Step 4: Create a Proper Date Dimension (DimDate)
Using a dedicated Date dimension is a core data warehousing best practice. While Power BI allows the use of date columns directly from fact tables, this approach is limited and not recommended for analytical models.
A true Date dimension allows analysts to answer questions such as:
- Is this date a weekday or weekend?
- What month or quarter does it belong to?
- How do holidays affect performance?
To achieve this, we create a Date table using Power Query, based on a function by Devin Knight.
Adding the Date Dimension Using Power Query
The steps are:
- Go to Transform Data → Power Query Editor
- Select Home → New Source → Blank Query
- Open Advanced Editor
- Paste the Date dimension function code
- Rename the query (e.g., fnDimDate)
After saving the function:
- Right-click the function → Invoke
- Provide a Start Date and End Date
- Power BI generates a full Date dimension table
This approach is based on the method described by Devin Knight: Creating a Date Dimension with Power Query (devinknightsql.com)
Why Use a Date Dimension Instead of a Fact Date Column?
Relying on a raw date column from a fact table limits analytical capability. A dedicated Date dimension:
- Enables advanced time intelligence
- Provides consistent filtering across multiple fact tables
- Allows identification of weekdays, weekends, holidays, and fiscal periods
- Improves model clarity and reusability
This is why both Sales and FactBudget tables connect to the same DimDate table in the model.
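As a small illustration of what the shared DimDate enables, a year-to-date measure becomes a one-liner once DimDate is marked as a date table. The measure and column names below are assumptions.

```dax
-- Illustrative time-intelligence measure using the shared DimDate table
-- (assumes a Sales[SalesAmount] column; names are hypothetical)
Sales YTD =
TOTALYTD ( SUM ( Sales[SalesAmount] ), DimDate[Date] )
```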
Understanding Fact Tables
Fact tables store quantitative, measurable data generated by business processes. In this example:
Sales Fact Table
The Sales table contains transactional metrics such as:
- Sales amount
- Quantity sold
- Revenue
- Profit
The grain of the Sales table is defined at the level of a specific transaction, typically by product, date, and market.
FactBudget Table
The FactBudget table stores planned or forecasted metrics such as:
- Budgeted sales amount
- Budgeted revenue
Unlike transactional data, budget data is often recorded at a higher level (for example, monthly or by department), which influences how it is modeled.
Dimension Tables: Providing Business Context
Dimension tables contain descriptive attributes that give meaning to numeric facts. According to Ralph Kimball:
“Dimensions provide the descriptive context for facts.”
Common dimensions used in this model include:
- DimDate – when the transaction occurred
- DimProduct – what was sold
- DimCustomer / Market – who bought it and where
Dimensions answer critical business questions:
- Who made the purchase?
- What product was sold?
- When did it occur?
- Where did it happen?
- How or why did it occur (channel, promotion, etc.)?
Date Dimension Theory and Implementation
Using a dedicated Date dimension is a best practice in data modeling. Instead of relying on raw date columns from fact tables, a Date table provides:
- Consistent time-based analysis
- Support for time intelligence functions
- Clear relationships across multiple fact tables
Typical Date Attributes
A Date dimension commonly includes:
- DateKey (e.g., YYYYMMDD)
- Full date
- Year
- Quarter
- Month number
- Month name
In Power BI, a Date table can be generated using DAX. This article includes a Date table created using custom date code, which is then related to both the Sales and FactBudget tables.
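For reference, a minimal DAX version of such a Date table might look like the sketch below. The 2020–2025 date range is an assumption for illustration; the article's model uses the Power Query function described earlier.

```dax
-- Minimal sketch of a Date dimension generated in DAX
-- (date range is an assumption; adjust to cover the fact data)
DimDate =
ADDCOLUMNS (
    CALENDAR ( DATE ( 2020, 1, 1 ), DATE ( 2025, 12, 31 ) ),
    "DateKey", VALUE ( FORMAT ( [Date], "YYYYMMDD" ) ),
    "Year", YEAR ( [Date] ),
    "Quarter", "Q" & QUARTER ( [Date] ),
    "Month Number", MONTH ( [Date] ),
    "Month Name", FORMAT ( [Date], "MMMM" )
)
```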
(Screenshot: Date table and its relationship to fact tables)
Surrogate Keys and Relationship Design
In analytical models, surrogate (index) keys are preferred over textual fields. Examples include:
- DateKey
- ProductKey
- CustomerKey
Surrogate keys improve:
- Performance
- Relationship consistency
- Integration across multiple fact tables
They are especially useful when combining data from different systems or when natural keys are inconsistent or complex.
Star Schema: Sales Data Model
The star schema is the most common and recommended modeling pattern in Power BI.
In the Sales model:
- The Sales fact table sits at the center
- Dimension tables surround it
- Relationships are one-to-many
- Filters flow from dimensions to facts
This structure simplifies reporting and ensures efficient query performance.
(Screenshot: Sales star schema in Power BI Model View)
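To see that filter flow in action, consider a measure that filters the fact table through a dimension attribute. The table and column names here are hypothetical.

```dax
-- Illustrative measure: the filter on DimProduct flows across the
-- one-to-many relationship into the Sales fact table
Bike Sales =
CALCULATE (
    SUM ( Sales[SalesAmount] ),      -- aggregate from the fact table
    DimProduct[Category] = "Bikes"   -- filter applied on the dimension
)
```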
Snowflake Schema: FactBudget Model
A snowflake schema occurs when dimension tables are further normalized into additional related tables.
In the FactBudget model:
- Budget data may link to higher-level entities such as departments or regions
- These dimensions connect to other dimension tables rather than directly to the fact table
While snowflake schemas add complexity, they are sometimes necessary, particularly for planning and budgeting data.
(Screenshot: Snowflake structure for FactBudget)
Integrating Star and Snowflake Schemas
Power BI allows multiple fact tables to coexist within a single model when they share common dimensions.
In this example:
- Sales uses a star schema
- FactBudget uses a snowflake schema
- Both connect through shared dimensions such as Date and Product
This integration enables:
- Actual vs budget comparisons
- Variance analysis
- Performance tracking across time and products
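As a sketch of what these comparisons might look like in DAX, the measures below compare the two fact tables through their shared dimensions. All table, column, and measure names are illustrative.

```dax
-- Illustrative actual-vs-budget measures over shared dimensions
-- (assumes Sales[SalesAmount] and FactBudget[BudgetAmount] columns)
Total Sales = SUM ( Sales[SalesAmount] )
Total Budget = SUM ( FactBudget[BudgetAmount] )
Budget Variance = [Total Sales] - [Total Budget]
Budget Variance % = DIVIDE ( [Budget Variance], [Total Budget] )
```

Because both fact tables filter through the same DimDate and DimProduct, these measures stay correct in any visual sliced by date or product.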
Why Linux Matters for Data Engineering
Most data engineering tools run on Linux. Tools such as Apache Hadoop, Spark, Kafka, Airflow, Docker, and Kubernetes are primarily designed for Linux environments.
Here’s why Linux matters:
- Server dominance – Most servers in the cloud (AWS, Azure, GCP) run on Linux.
- Performance and stability – Linux handles large-scale data processing efficiently.
- Automation-friendly – Powerful command-line tools for scripting and scheduling jobs.
- Open source – Free, customizable, and widely supported.
As a data engineer, you will often:
- Connect to Linux servers via SSH
- Edit configuration files
- Run data processing scripts
- Monitor logs and system resources
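A typical session might look like the following sketch. The host name, script, and log path are placeholders, not details from a real environment.

```bash
# Connect to a remote Linux server over SSH (placeholder host)
ssh damaris@data-server.example.com

# Edit a pipeline configuration file
nano etl_config.conf

# Run a data processing script (placeholder script name)
python3 process_sales.py

# Follow a log file as new entries arrive (placeholder path)
tail -f /var/log/etl/pipeline.log
```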
Understanding Linux is therefore a core skill.
Understanding the Linux Terminal
The terminal, also called the command line or shell, allows you to interact with the Linux system by typing commands.
Example of a terminal prompt:
damaris@ubuntu:~$
This shows:
- Username: damaris
- Machine name: ubuntu
- Current directory: ~ (home directory)
Basic Linux Commands for Beginners
1. Check the current directory
To print the current working directory, use:
pwd
Output: /home/damaris
2. List files and folders
To list the contents of a directory, use:
ls # list files and folders
ls -l # long format, showing permissions, owner, size, and date
3. Create a directory
Use the mkdir command to create a directory.
Example:
mkdir data_projects
4. Navigate Between Directories
Use the cd command to move between directories.
Let's navigate into the data_projects directory we created earlier:
cd data_projects
To move up one directory level, use cd ..
Example:
cd data_projects
ls #lists files in the directory
cd ..
5. Create a File
Use the touch command to create a file.
Example:
touch sample.txt
6. View File Content
To view the contents of a file, use the cat command:
cat sample.txt
Why Text Editors Matter in Data Engineering
As a data engineer, you will constantly edit SQL scripts, Python files, shell scripts, and configuration files (YAML, JSON, .conf).
Linux provides powerful terminal-based text editors. The most common are Vi/Vim and Nano.
Using the Nano Editor in Ubuntu (Detailed Beginner Guide)
What Is Nano?
Nano is a simple, beginner-friendly text editor that runs inside the Ubuntu terminal. Unlike Vi/Vim, Nano does not use modes, which makes it much easier for new Linux users to learn and use.
For data engineers, Nano is commonly used to:
- Edit configuration files (.conf, .yaml, .json)
- Write quick notes or scripts
- Modify ETL pipeline settings
- Edit files on remote Linux servers
Nano is preinstalled on most Ubuntu systems, so you don’t need to install anything.
Opening the Terminal in Ubuntu
Before using Nano, you need to open the terminal.
You can do this in any of the following ways:
- Press Ctrl + Alt + T
- Search for Terminal in the Applications menu
You will see something like:
damaris@ubuntu:~$
This means you are in your home directory.
Creating a File Using Nano
To create a new file using Nano, type:
nano data_notes.txt
Then press Enter.
What Happens Next?
- If the file does not exist, Nano creates it.
- If the file already exists, Nano opens it for editing.
You will now see the Nano editor screen.
Understanding the Nano Editor Interface
When Nano opens, the screen has three main parts:
1. Main Editing Area (Center)
This is where you type your text.
Example:
This file contains notes for our data engineering project.
Source: MySQL
Destination: Data Warehouse
2. Status Bar (Bottom)
At the bottom of the screen, you’ll see something like:
^G Get Help ^O Write Out ^W Where Is ^K Cut ^X Exit
The ^ symbol means the Ctrl key.
So:
- ^O means Ctrl + O
- ^X means Ctrl + X
This shortcut list is one of Nano’s biggest advantages.
3. File Name Display (Top)
At the top, Nano shows the file name you are editing:
GNU nano 6.2 data_notes.txt
Typing Text in Nano
Nano starts in editing mode immediately.
You can begin typing right away without pressing any special keys.
Example:
ETL Pipeline Notes
------------------
Extract data from PostgreSQL
Transform data using Python
Load data into the warehouse
There is no insert mode or command mode like in Vi.
Saving a File in Nano
To save your work:
Press Ctrl + O (Write Out)
Nano will ask:
File Name to Write: data_notes.txt
Press Enter
Your file is now saved.
Exiting Nano
To exit Nano:
Ctrl + X
If You Have Unsaved Changes
Nano will ask:
Save modified buffer?
- Press Y → Save changes
- Press N → Exit without saving
- Press Ctrl + C → Cancel exit
Opening an Existing File with Nano
To edit an existing file:
nano data_notes.txt
This opens the file so you can modify it.
Editing Text in Nano
Moving the Cursor
You can move around using:
- Arrow keys ↑ ↓ ← →
- Page Up / Page Down
Deleting Text
- Backspace → Delete previous character
- Delete key → Delete next character
Cutting and Pasting Text
Cut a Line
Ctrl + K
This cuts the entire line.
Paste a Line
Ctrl + U
This pastes the last cut text.
Searching for Text in Nano
To search within a file:
Ctrl + W
Type the word you want to find and press Enter.
Example:
warehouse
Practical Example: Editing a Configuration File
Imagine you are a data engineer editing a pipeline configuration file.
**Step 1: Open the file**
nano etl_config.conf
**Step 2: Add configuration details**
source_database=mysql
source_host=localhost
destination=warehouse
batch_size=500
**Step 3: Save and exit**
Ctrl + O → Enter
Ctrl + X
Viewing the File from Terminal
After exiting Nano, you can confirm the file content using:
cat etl_config.conf
Output:
source_database=mysql
source_host=localhost
destination=warehouse
batch_size=500
Common Nano Shortcuts (Beginner Must-Know)
| Shortcut | Action |
| --- | --- |
| Ctrl + O | Save file |
| Ctrl + X | Exit Nano |
| Ctrl + K | Cut line |
| Ctrl + U | Paste |
| Ctrl + W | Search |
| Ctrl + G | Help |