DEV Community

Cover image for Data Modeling, Joins, Relationships, and Different Schemas in Power BI
Maina kelvin
Maina kelvin

Posted on

Data Modeling, Joins, Relationships, and Different Schemas in Power BI

Power BI is often sold on its visuals, sleek charts, interactive dashboards, and drill-throughs. But the truth every experienced Power BI developer learns quickly is this: a report is only as good as the data model underneath it. A beautiful dashboard built on a poorly designed model will eventually break down with slow performance, incorrect totals, confusing filter behavior, or numbers that simply don't match what the business expects.

This article walks through the core concepts that make or break a Power BI model: data modeling fundamentals, how relationships and "joins" actually work, and the most common schema designs used in real-world projects.

1. Why Data Modeling Matters More Than Visuals

When you load data into Power BI from Excel, SQL Server, SharePoint, or any other source, you're not just importing rows and columns — you're building a semantic model (formerly called a "data model"). This model defines:

  • What tables exist
  • How those tables relate to one another
  • What calculations (measures) live on top of the data
  • How filters flow from one table to another when a user clicks a slicer or visual

Get this right, and your reports are fast, accurate, and easy to extend. Get it wrong, and you end up writing increasingly complicated DAX formulas to compensate for a structure that's fighting you the whole way (I learnt this the hard way😀😀)

2. Power BI Doesn't Really Have "Joins" — It Has Relationships

If you come from a SQL background, your instinct is to think in terms of JOIN statements, INNER JOIN, LEFT JOIN, and so on. Power BI's Power Query (the data-loading layer) does support merge queries, which behave as SQL joins, but that's only half the story.

Merges in Power Query (the "join" step)

In Power Query, you can merge two tables using options that map directly to SQL join types:

  • Left Outer – all rows from the first table, matching rows from the second
  • Right Outer – all rows from the second table, matching rows from the first
  • Full Outer – all rows from both tables
  • Inner – only matching rows from both tables
  • Left Anti – only rows from the first table with no match in the second
  • Right Anti – only rows from the second table with no match in the first

This happens during data transformation, before the data lands in the model. It physically combines columns from two queries into one table.

Relationships in the Data Model (the real backbone)

Once data is loaded, Power BI doesn't keep joining tables on the fly the way a SQL query would. Instead, you define relationships between tables — logical links based on matching key columns (usually IDs). At query time, the engine uses these relationships to filter and aggregate data across tables without physically merging them.

This is the more important concept for 90% of Power BI work. Most well-modeled reports rarely need Power Query merges at all — they rely on relationships instead, because relationships are faster, more flexible, and easier to maintain.

Key takeaway: Use merges in Power Query when you need to combine columns into a single physical table (e.g., enriching a table with a lookup value). Use relationships when you want two tables to interact dynamically in visuals and filters.

3. Anatomy of a Relationship

Every relationship in Power BI has three defining characteristics:

a) Cardinality

  • One-to-many (1:*) – the most common and recommended type. One row in a "lookup" table (e.g., Customer) relates to many rows in a "data" table (e.g., Sales).
  • Many-to-many (:) – both tables can have duplicate key values. Useful but riskier; can cause ambiguous aggregations if not handled carefully.
  • One-to-one (1:1) – rare; usually signals the two tables should be merged into one.

b) Cross-Filter Direction

  • Single direction – filters flow one way only (typically from the "one" side to the "many" side). This is the default and the safest choice for most star-schema models.
  • Bidirectional – filters flow both ways. Powerful but can introduce ambiguity and performance issues if overused. Use sparingly and intentionally.

c) Active vs. Inactive

A pair of tables can have multiple relationships, but only one can be active at a time (shown as a solid line; inactive ones are dashed). Inactive relationships can still be used in DAX via the USERELATIONSHIP function — this is common when, for example, a Sales table has both an "Order Date" and a "Ship Date" that both relate to a Date table.

4. Star Schema: The Gold Standard

The most recommended structure in Power BI (and in data warehousing generally) is the star schema.

Structure:

  • One or more fact tables in the center — these hold transactional/event data with measures (Sales, Orders, Clicks, etc.) and lots of rows.
  • Several dimension tables around it — these hold descriptive attributes (Customer, Product, Date, Region) with relatively few rows.
        Dim_Customer
              |
Dim_Product — Fact_Sales — Dim_Date
              |
         Dim_Region
Enter fullscreen mode Exit fullscreen mode

Why star schema works so well in Power BI

  1. Performance — Power BI's VertiPaq engine (the in-memory columnar storage engine) is optimized for this shape. Compression is more efficient when descriptive text lives in small dimension tables rather than being repeated millions of times in a fact table.
  2. Simplicity for DAX — Filters propagate cleanly from dimensions to facts in a single direction, which avoids ambiguous calculation paths.
  3. Reusability — A single Date or Customer dimension table can filter multiple fact tables at once.
  4. Easier maintenance — Business logic and hierarchies (e.g., Year > Quarter > Month) live in one place, not scattered across many tables.

5. Snowflake Schema: Star Schema's More Normalized Cousin

A snowflake schema takes the star schema and normalizes the dimension tables further — breaking a dimension into multiple related sub-tables.

For example, instead of one flat Dim_Product table containing Product Name, Category, and Subcategory, a snowflake schema would split this into:

Dim_Product — Dim_Subcategory — Dim_Category
Enter fullscreen mode Exit fullscreen mode

When it happens: Often it's not a deliberate design choice but a side effect of importing data directly from a normalized relational database (like a transactional SQL system) without reshaping it first.

Trade-offs:

  • ✅ Saves storage space in the source system (less relevant in Power BI, since VertiPaq compression already handles repeated text efficiently)
  • ❌ More relationships to maintain
  • ❌ Filters have to "hop" through more tables, which can hurt both performance and DAX clarity
  • ❌ More complex for report builders and end users browsing the field list

General guidance: In Power BI, it's usually better to flatten snowflake structures into a clean star schema during the Power Query stage, rather than leaving multiple small dimension tables chained together. Fewer, wider dimension tables almost always outperform many narrow, linked ones.

6. Other Schema Patterns Worth Knowing

Galaxy Schema (Fact Constellation)

Multiple fact tables share common dimension tables. For example, Fact_Sales and Fact_Returns might both connect to Dim_Product and Dim_Date. This is very common in real business models and is essentially "multiple stars sharing points."

Single Flat Table

Everything crammed into one wide table — no relationships needed. This works for very small, simple datasets but breaks down quickly: it bloats file size, slows down calculations, and makes time intelligence and reusable hierarchies much harder to manage. Generally discouraged once you have more than a trivial dataset.

Many-to-Many Bridge Tables

Sometimes two dimensions have a natural many-to-many relationship (e.g., Customers who can belong to multiple Sales Regions). Rather than forcing a direct many-to-many relationship, best practice is often to introduce a bridge table that resolves the relationship into two clean one-to-many links.

7. Practical Best Practices

  • Build a dedicated Date table and mark it as a date table in Power BI. This unlocks time intelligence functions (SAMEPERIODLASTYEAR, TOTALYTD, etc.) and avoids inconsistent date handling across fact tables.
  • Hide foreign keys used only for relationships from the report view, so end users aren't confused by raw ID columns.
  • Prefer single-direction filtering unless you have a specific, well-understood reason for bidirectional filters.
  • Avoid circular relationships — Power BI won't allow two active relationships to create a loop, so plan your model before connecting everything to everything.
  • Name tables and columns for business users, not database administrators — Customer Name, not cust_nm_txt.
  • Use Power Query to reshape, not the model to compensate. If your model is awkward, it's often because the data wasn t cleaned or flattened enough upstream.

8. Conceptual Summary

Concept What It Does Where It Happens
Merge (Join) Physically combines two tables' columns based on a key Power Query
Relationship Logically links tables so filters flow between them Data Model (Model View)
Star Schema Fact table(s) surrounded by denormalized dimensions Model design
Snowflake Schema Dimensions normalized into sub-tables Model design
Galaxy Schema Multiple fact tables sharing dimensions Model design
Bridge Table Resolves many-to-many relationships cleanly Model design

Final Thoughts

Power BI rewards modelers who think like data architects, not spreadsheet users. The temptation to import everything as one giant flat table, or to lean on Power Query merges for everything, usually leads to a model that's slow and brittle. Investing time upfront in a clean star schema — with well-defined relationships, a proper date table, and sensible cardinality — pays off every time someone adds a new report, a new measure, or a new data source down the line.

Good data modeling isn't the flashy part of Power BI. But it's the part that determines whether everything else actually works.

Top comments (0)