Sharon M.

Posted on Feb 2

Schemas and Data Modelling in Power BI

#powerbi #windows #beginners #schema

When most people hear Power BI, they immediately think of dashboards, colorful charts, large numeric cards, clickable slicers, and polished visuals. These elements are what users interact with, so it is natural to assume they are what makes a report “good.”

In reality, however, dashboards are only the final layer. Long before any visual appears on the screen, critical decisions have already been made. These decisions determine whether a report is fast or slow, reliable or misleading, intuitive or frustrating. That earlier and often invisible stage is data modelling.

Data modelling in Power BI is the process of organizing data so that Power BI understands what the data represents and how different pieces of information relate to one another. It involves deciding:

which tables are needed,
what each table should contain,
how those tables should be connected.

When this structure is well designed, Power BI feels logical and predictable. When it is not, even simple questions can return confusing or incorrect results. In other words, the quality of a Power BI report is decided before the first chart is ever created.

What “Schema” Means in Power BI

In Power BI, a schema refers to the overall structure of the data model. This is not a theoretical concept; it is the actual layout you see in Model view, including the tables and the relationships between them.

A schema answers very practical questions:

What tables exist in the model?
Which tables store measurements, and which store descriptions?
When a user clicks a slicer, how does Power BI know which data to include?

Power BI does not “reason” about data in a human way. Instead, it follows the paths you define. The schema determines:

how filters move from one table to another,
how totals and averages are calculated,
and how fast visuals respond when users interact with the report.

Two schema patterns appear most frequently in Power BI models:
Star schema
Snowflake schema

Understanding the difference between these two explains why some Power BI models feel simple and trustworthy, while others feel fragile and unpredictable.

Fact Tables and Dimension Tables: Understanding the Roles of Tables

Most Power BI models are built using two types of tables. Understanding what each one does is the foundation of data modelling.

Fact Tables: Recording What Happened

A fact table records events. Each row represents something that actually occurred.

In a dataset such as Kenya crops data, a single row in the fact table might represent:

a specific crop,
grown in a specific county,
during a specific year or season,
with a measurable outcome such as yield in kilograms.

Because these events are recorded repeatedly over time, fact tables typically:

grow very large,
repeat the same crops or counties many times,
focus on numeric values that can be summed, averaged, or counted.

A fact table does not explain what a crop is or where a county is located. It simply records that something happened.

Dimension Tables: Giving Meaning to the Events

Dimension tables exist to describe and contextualize the facts. Instead of repeating names and descriptions in every row of the fact table, that information is stored once in separate tables, such as:

a Crop table that stores crop names and types,
a County table containing county names,
a Date table containing years or seasons.

Dimension tables typically:

change slowly compared to fact tables,
contain descriptive rather than numerical data,
are used to filter, group, and label results in reports.

When you select a county or crop in a slicer, Power BI relies on the dimension table to determine which rows in the fact table should be included. This separation is what makes analysis both efficient and accurate.
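To make the split concrete, here is a minimal sketch in plain Python rather than Power BI itself. The table names, keys, and yield figures are invented for illustration and do not come from a real Kenya crops dataset.

```python
# Dimension tables: each key appears once and carries descriptive attributes.
# All names and values here are hypothetical.
crops = {
    1: {"name": "Maize", "type": "Cereal"},
    2: {"name": "Tea", "type": "Cash crop"},
}
counties = {
    10: {"name": "Nakuru"},
    20: {"name": "Kericho"},
}

# Fact table: one row per recorded event, holding only keys and a numeric measure.
yield_facts = [
    {"crop_id": 1, "county_id": 10, "year": 2021, "yield_kg": 1200},
    {"crop_id": 1, "county_id": 10, "year": 2022, "yield_kg": 1350},
    {"crop_id": 2, "county_id": 20, "year": 2022, "yield_kg": 800},
]


def describe(row):
    """Attach readable labels to a fact row by looking up its keys."""
    return (
        crops[row["crop_id"]]["name"],
        counties[row["county_id"]]["name"],
        row["year"],
        row["yield_kg"],
    )


labelled = [describe(row) for row in yield_facts]
```

Notice that "Maize" and "Nakuru" are stored once, even though the fact table refers to them in several rows.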

The Star Schema: A Structure That Matches How Power BI Thinks

The star schema is the most effective and widely recommended structure for Power BI models.

In a star schema:

one fact table sits at the center (for example, crop yield records),
each dimension table connects directly to that fact table (crop, county, date),
dimension tables do not connect to each other.

This structure aligns closely with how Power BI processes filters.

When you select a county in a slicer, Power BI:

Looks at the County table.
Identifies the selected county’s unique key.
Follows the relationship directly to the fact table.
Keeps only the matching rows.
Performs calculations using those rows.

Because each dimension connects straight to the fact table, filters move directly to the data being analyzed. Power BI does not need to pass through intermediary tables, so calculations behave consistently.

As a result, much of the analytical logic is handled by the structure itself, reducing the need for complex formulas later.
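The five slicer steps above can be imitated in a few lines of plain Python. This is only a sketch of the idea, not how Power BI's engine is implemented, and the table contents and the function name `total_yield_for` are hypothetical.

```python
# Hypothetical dimension and fact tables (illustrative values only).
county_dim = [
    {"county_id": 10, "county": "Nakuru"},
    {"county_id": 20, "county": "Kericho"},
]
yield_fact = [
    {"county_id": 10, "yield_kg": 1200},
    {"county_id": 10, "yield_kg": 1350},
    {"county_id": 20, "yield_kg": 800},
]


def total_yield_for(county_name):
    # Steps 1-2: look at the County table and find the selected county's key.
    keys = {row["county_id"] for row in county_dim if row["county"] == county_name}
    # Steps 3-4: follow the relationship and keep only the matching fact rows.
    matching = [row for row in yield_fact if row["county_id"] in keys]
    # Step 5: perform the calculation using those rows.
    return sum(row["yield_kg"] for row in matching)


nakuru_total = total_yield_for("Nakuru")
```

There is exactly one hop from the dimension to the fact table, which is why the result is easy to predict.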

Why the Star Schema Performs Better in Power BI

Power BI stores data in columns and is optimized for fast aggregation. It performs best when relationships are simple and unambiguous.

In a star schema, you will observe that:

Power BI follows one clear relationship path,
fewer joins are required to answer questions,
the model is easier to understand and debug.

As a result, reports load faster, slicers respond more smoothly and DAX formulas tend to be shorter and easier to reason about.

The Snowflake Schema: A Bit More Complex

A snowflake schema starts with the same idea as a star schema but splits descriptive information across multiple related tables.

For example, instead of storing all location details in a single County table, the data might be organized as:

a County table that stores county information,
a Region table that stores regional information,
a Country table that stores country information.

When a user selects a country, Power BI must follow a longer path before reaching the data:

Start at the Country table.
Move to the Region table.
Move to the County table.
Finally, reach the fact table.

Each additional step increases processing work for Power BI and increases the chance of errors if any relationship is incorrect.

While snowflake schemas reduce duplicated data, they create challenges in Power BI: filters must travel through multiple tables, and more relationships must be managed. As a result, it becomes harder to predict how calculations will behave.

For this reason, snowflake schemas are common in source systems but are often reshaped into star schemas for reporting.
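As a rough sketch of that reshaping, the snippet below folds the Country, Region, and County tables described above into a single County dimension. It uses plain Python with invented keys and values. In Power BI itself, this kind of flattening would more likely be done in Power Query before the data is loaded.

```python
# Hypothetical snowflaked location tables (illustrative values only).
countries = {1: {"country": "Kenya"}}
regions = {100: {"region": "Rift Valley", "country_id": 1}}
counties = {
    10: {"county": "Nakuru", "region_id": 100},
    20: {"county": "Kericho", "region_id": 100},
}


def flatten_location():
    """Fold region and country attributes into one row per county."""
    flat = {}
    for county_id, county in counties.items():
        region = regions[county["region_id"]]
        country = countries[region["country_id"]]
        flat[county_id] = {
            "county": county["county"],
            "region": region["region"],
            "country": country["country"],
        }
    return flat


# One wide dimension table that can connect directly to the fact table.
county_dim = flatten_location()
```

The region and country names are now repeated on each county row, which is the duplication a snowflake schema avoids. In exchange, a filter on country reaches the fact table in one hop.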

Relationships: How Tables Actually Work Together

Relationships define how tables communicate and how filters flow.

When you select a county, crop, or year in a slicer, Power BI does not search the fact table directly. It looks at the dimension table, identifies the matching key, follows the relationship to the fact table, and filters the fact rows accordingly.

In a well-designed model:

each dimension table contains unique values (each crop or county appears once),
fact tables contain many related records linked to those values,
filters flow from dimension tables to the fact table.

This mirrors real-world logic: one county can have many crop records, and one crop can appear across many years.
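Two of these properties can be checked before the data reaches the model. The sketch below is one possible approach in plain Python, with hypothetical table contents and helper names (`duplicate_keys`, `orphan_keys`). It looks for dimension keys that appear more than once and for fact rows that point at a key missing from the dimension.

```python
from collections import Counter

# Hypothetical tables (illustrative values only).
crop_dim = [
    {"crop_id": 1, "crop": "Maize"},
    {"crop_id": 2, "crop": "Tea"},
]
yield_fact = [
    {"crop_id": 1, "yield_kg": 1200},
    {"crop_id": 2, "yield_kg": 800},
    {"crop_id": 3, "yield_kg": 500},  # no matching row in crop_dim
]


def duplicate_keys(dim_rows, key):
    """Return key values that appear more than once in a dimension table."""
    counts = Counter(row[key] for row in dim_rows)
    return sorted(k for k, n in counts.items() if n > 1)


def orphan_keys(fact_rows, dim_rows, key):
    """Return fact key values that have no matching dimension row."""
    known = {row[key] for row in dim_rows}
    return sorted({row[key] for row in fact_rows} - known)


dupes = duplicate_keys(crop_dim, "crop_id")
orphans = orphan_keys(yield_fact, crop_dim, "crop_id")
```

An empty `dupes` list suggests the dimension is safe to use on the "one" side of a relationship. A non-empty `orphans` list points to fact rows that no crop selection would ever match.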

Cardinality: Understanding “One” and “Many”

Cardinality describes how many rows in one table relate to rows in another.

One-to-Many means one row in a dimension table relates to many rows in the fact table.
One-to-One means one row matches exactly one row in another table (rare in reporting).
Many-to-Many means multiple rows relate to multiple rows (can cause duplicated totals if not handled carefully).

Note: A model with incorrect cardinality may still produce results, but those results may not represent reality.
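The duplicated-totals problem is easy to reproduce outside Power BI. The sketch below uses an ordinary join in plain Python, which is not how Power BI evaluates relationships, and all of the data is invented. It shows what can happen when a table assumed to be on the "one" side contains a repeated key.

```python
# A County table that wrongly lists county 10 twice (illustrative data).
county_dim = [
    {"county_id": 10, "county": "Nakuru"},
    {"county_id": 10, "county": "Nakuru "},  # accidental duplicate with a trailing space
    {"county_id": 20, "county": "Kericho"},
]
yield_fact = [
    {"county_id": 10, "yield_kg": 1200},
    {"county_id": 20, "yield_kg": 800},
]

# Joining on county_id matches the single Nakuru fact row twice.
joined = [
    {**fact, "county": dim["county"]}
    for fact in yield_fact
    for dim in county_dim
    if fact["county_id"] == dim["county_id"]
]

true_total = sum(row["yield_kg"] for row in yield_fact)   # 2000
joined_total = sum(row["yield_kg"] for row in joined)     # 3200, inflated by the duplicate
```

The calculation runs without any error, yet the total is 1,200 kg too high. A wrong number that looks plausible is exactly the kind of result the note above warns about.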

Why Good Data Modelling Matters

Data modelling affects every Power BI report in three key ways.

Performance

Simple structures reduce processing work, resulting in faster visuals and smoother interaction.

Accuracy

Correct relationships ensure each fact is counted once, preventing inflated totals and misleading averages.

Simplicity

Clear models make reports easier to build, understand, and maintain. Complex DAX is often a sign of a model that needs improvement.

Effective models typically:

separate measurements from descriptions,
use star schemas where possible,
define relationships clearly,
rely on the model to handle logic instead of forcing visuals to compensate.

When this foundation is solid, Power BI becomes easier to use and its results become easier to trust. Schemas and data modelling directly determine whether Power BI produces reliable insight or confusing results. By understanding fact and dimension tables, choosing appropriate schemas, and defining relationships carefully, analysts create reports that are fast, accurate, and understandable. For more information, see Microsoft's documentation on Power BI.

Also, feel free to leave a comment sharing how you approach data modelling in your own Power BI projects. Discussion and different perspectives are always welcome.

DEV Community