DEV Community

Cover image for Understanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained.
Frederick M
Frederick M

Posted on

Understanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained.

If you’ve ever felt confused about joins vs relationships, or why your Power BI report is giving incorrect totals, this is where data modeling comes in.

This guide breaks it down simply, with real-world examples and practical steps inside Power BI.


1. What is Data Modeling?

Data modeling is how you structure your data so Power BI can:

  • Understand relationships between tables
  • Aggregate data correctly
  • Perform fast and accurate calculations

Think of it like this:

Data modeling = organizing your data into a clean, logical system before analysis.


2. Joins in Power BI (Power Query)

Joins happen before data is loaded, inside Power Query. They physically combine tables.

Where to create joins:

  1. Go to Home → Transform Data
  2. Open Power Query
  3. Select a table
  4. Click Merge Queries
  5. Choose another table and matching column(s)
  6. Select join type

Types of Joins (with real-life examples)

1. INNER JOIN

Returns only matching records.

Example:

  • Customers table
  • Orders table

Only customers who placed orders appear.

Customers      Orders
A              A
B              B
C              -

Result:
A, B
Enter fullscreen mode Exit fullscreen mode

2. LEFT JOIN (Most Common)

Returns all records from left + matches from right.

Example:
All customers, even those without orders.

Result:
A (order)
B (order)
C (null)
Enter fullscreen mode Exit fullscreen mode

3. RIGHT JOIN

Opposite of LEFT JOIN.

Returns all records from the right table.


4. FULL OUTER JOIN

Returns everything from both tables.

Result:
A (match)
B (match)
C (left only)
D (right only)
Enter fullscreen mode Exit fullscreen mode

5. LEFT ANTI JOIN

Returns rows in left table with no match in right.

Example:
Customers who NEVER ordered.


6. RIGHT ANTI JOIN

Returns rows in right table with no match in left.


When to use joins

Use joins when:

  • You need to combine data permanently
  • You're shaping raw data before modeling

3. Relationships in Power BI

Relationships are created after loading data. They do NOT merge tables, just connect them.

Where to create relationships:

  • Go to Model View
  • Drag a column from one table to another OR
  • Go to Manage Relationships → New

Types of Relationships

1. One-to-Many (1:M), Most Common

  • One side = unique values (Dimension)
  • Many side = repeated values (Fact)

Example:

  • Customers (unique IDs)
  • Orders (many orders per customer)

2. Many-to-Many (M:M)

Both sides contain duplicates.

Use carefully, can cause ambiguity.


3. One-to-One (1:1)

Rare. Both tables have unique keys.


Cardinality (IMPORTANT)

Defines how tables relate:

  • 1 → Many
  • Many → Many
  • 1 → 1

Cross Filter Direction

Controls how filters flow:

Single Direction

  • Filters flow one way (recommended)

Both Direction

  • Filters flow both ways (can cause confusion if misused)

Active vs Inactive Relationships

  • Active: Used by default
  • Inactive: Requires DAX (USERELATIONSHIP)

Example:

  • Order Date (active)
  • Ship Date (inactive)

4. Joins vs Relationships (Critical Difference)

Joins Relationships
Done in Power Query Done in Model View
Combines tables Keeps tables separate
Static Dynamic
Increases table size More efficient

Rule of thumb:

Use relationships whenever possible. Avoid unnecessary joins.


5. Fact vs Dimension Tables

Fact Table

  • Contains measurable data
  • Large
  • Example:

    • Sales Amount
    • Quantity

Dimension Table

  • Contains descriptive data
  • Smaller
  • Example:

    • Customer Name
    • Product Category

6. Data Modeling Schemas


1. Star Schema (BEST PRACTICE ⭐)

        Customers
            |
Products — Sales — Dates
Enter fullscreen mode Exit fullscreen mode
  • Fact table in center
  • Dimensions around it

Advantages:

  • Fast
  • Simple
  • Scalable

2. Snowflake Schema

Dimensions are normalized into multiple tables.

Product → Category → Department
Enter fullscreen mode Exit fullscreen mode

Pros:

  • Less redundancy

Cons:

  • More complex
  • Slower

3. Flat Table (DLAT - Denormalized)

Everything in one table.

Pros:

  • Simple to start

Cons:

  • Poor performance
  • Hard to maintain

7. Role-Playing Dimensions

A single dimension used multiple times.

Example:
Date table used as:

  • Order Date
  • Ship Date
  • Delivery Date

Solution:

  • Duplicate the Date table
  • Create separate relationships

8. Common Data Modeling Issues

1. Many-to-Many confusion

Fix:

  • Introduce a bridge table

2. Circular relationships

Fix:

  • Remove unnecessary relationships

3. Incorrect totals

Fix:

  • Check relationship direction & cardinality

4. Duplicate keys in dimension table

Fix:

  • Ensure uniqueness

9. Step-by-Step Workflow (Practical)

Step 1: Load Data

  • Get Data → Import tables

Step 2: Clean Data (Power Query)

  • Remove duplicates
  • Fix data types
  • Create joins only if necessary

Step 3: Build Relationships

  • Go to Model View
  • Connect tables using keys

Step 4: Validate Model

  • Check:

    • Cardinality
    • Filter direction
    • Active relationships

Step 5: Create Measures (DAX)

Example:

Total Sales = SUM(Sales[Amount])
Enter fullscreen mode Exit fullscreen mode

Final Thoughts

If you remember nothing else, remember this:

  • Use Star Schema
  • Prefer relationships over joins
  • Keep dimension tables clean and unique
  • Avoid many-to-many unless necessary

Once your model is clean, everything else (DAX, visuals, performance) becomes easier.

Top comments (0)