DEV Community

Gracemunyi
Gracemunyi

Posted on

Understanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained

This is a complete beginner guide covering:

A) SQL Joins & Power BI Relationships
  • SQL Joins
  • Power BI Relationships
B) Fact vs Dimension Tables, Star Schema, Snowflake Schema, and Common Concepts
  • Fact Tables
  • Dimension Tables
  • Star Schema
  • Snowflake Schema
C) Modeling Issues
  • Common Modeling Issues
  • Diagrams
  • SQL Code
  • Step-by-step Instructions
Data Modeling Explained

Lets say, a retail company has three tables: Sales, Customers, and Products. We require to model the data by organizing data tables and defining how they connect to each other so that every sale is linked to the right customer and the right product thus be able answer business questions accurately.

1. SQL Joins

A join combines rows from two tables based on a shared column. Before creating relationships in Power BI, understanding joins helps you know what data you are working with.

We use two simple tables throughout all join examples below.

Example

Table A — customers

id name country
1 Maria Germany
2 John USA
3 Georg USA
4 Martin Germany
5 Peter USA

Table B - orders

order_id sales customer_id
1001 $35 1
1002 $15 2
1003 $20 3
1004 $10 6

Note: Peter and Martin have no orders. Order 1004 has customer_id 6
which does not exist in the customers table.

INNER JOIN:Matching rows only

Returns only rows where a match exists in both tables. Peter, Martin,
and Order 1004 are all excluded.

Venn diagram

Inner Join Venn Diagram

INNER JOIN = centre match only

SQL syntax

SELECT c.id, c.name, o.order_id, o.sales
FROM customers AS c
INNER JOIN orders AS o
  ON c.id = o.customer_id;

-- Only rows matching in BOTH tables returned
Enter fullscreen mode Exit fullscreen mode

Result of merged tables after executing INNER JOIN

c.id c.name o.order_id o.sales
1 Maria 1001 $35
2 John 1002 $15
3 Georg 1003 $20

Use INNER JOIN when you only want customers who have actually placed
an order.

LEFT JOIN: All from left, matched from right

Returns all rows from the left table. Where no match exists, NULL
is shown.

Venn diagram

Left Join Venn Diagram
LEFT JOIN = entire left circle

SQL syntax

SELECT c.id, c.name, o.order_id, o.sales
FROM customers AS c
LEFT JOIN orders AS o
  ON c.id = o.customer_id;

-- All customers returned, NULL where no order exists
Enter fullscreen mode Exit fullscreen mode

Result of merged tables after executing LEFT JOIN

c.id c.name o.order_id o.sales
1 Maria 1001 $35
2 John 1002 $15
3 Georg 1003 $20
4 Martin NULL NULL
5 Peter NULL NULL

Use LEFT JOIN when you want all customers including those who havenever placed an order.

RIGHT JOIN: All from right, matched from left

Returns all rows from the right table. Where no matching customer
exists, NULL is shown.

SQL syntax

SELECT c.id, c.name, o.order_id, o.sales
FROM customers AS c
RIGHT JOIN orders AS o
  ON c.id = o.customer_id;

-- All orders returned, NULL where no customer exists
Enter fullscreen mode Exit fullscreen mode

Result of merged tables after executing RIGHT JOIN

c.id c.name o.order_id o.sales
1 Maria 1001 $35
2 John 1002 $15
3 Georg 1003 $20
NULL NULL 1004 $10

Most analysts prefer to swap tables and use LEFT JOIN.


FULL OUTER JOIN: Everything from both tables

Returns all rows from both tables. Unmatched rows get NULL on the
missing side.

Venn diagram

Full Outer Join Venn Diagram

FULL OUTER JOIN = both full circles

SQL syntax

SELECT c.id, c.name, o.order_id, o.sales
FROM customers AS c
FULL OUTER JOIN orders AS o
  ON c.id = o.customer_id;

-- ALL rows from both tables, NULL on missing sides
Enter fullscreen mode Exit fullscreen mode

Result of merged tables after executing FULL OUTER JOIN

c.id c.name o.order_id o.sales
1 Maria 1001 $35
2 John 1002 $15
3 Georg 1003 $20
4 Martin NULL NULL
5 Peter NULL NULL
NULL NULL 1004 $10

LEFT ANTI JOIN — unmatched from left only

Returns only rows from the left table that have no match in the
right table.

Left Anti Join Venn Diagram

SQL syntax

SELECT c.id, c.name
FROM customers AS c
LEFT JOIN orders AS o
  ON c.id = o.customer_id
WHERE o.customer_id IS NULL;

-- Only customers with NO matching order
Enter fullscreen mode Exit fullscreen mode

Result of merged tables after executing LEFT ANTI JOIN

c.id c.name
4 Martin
5 Peter

Use LEFT ANTI to answer: "Which customers have never placed an order?"


RIGHT ANTI JOIN : Unmatched from right only

Returns only rows from the right table that have no match in the left table.

Right Anti Join Veen Diagram

SQL syntax

SELECT o.order_id, o.customer_id, o.sales
FROM customers AS c
RIGHT JOIN orders AS o
  ON c.id = o.customer_id
WHERE c.id IS NULL;

-- Only orders with NO matching customer
Enter fullscreen mode Exit fullscreen mode

Result of merged tables after executing RIGHT ANTI JOIN

o.order_id o.customer_id o.sales
1004 6 (no match) $10

Use RIGHT ANTI for data quality checks: "Which orders have an
invalid customer ID?"


Joins in Power BI

Feature SQL Join / Power Query Merge Power BI Relationship
What it does Physically combines rows into one table Keeps tables separate, filters flow between them
When it runs At data load time At query time when you click a slicer or visual
Best for Flat tables, one-off analysis Star schema models with slicers and visuals

Simple rule: Use a merge in Power Query when you need one combined
table. Use a relationship in Model View when you want tables to filter
each other in your report.

How to create a join in Power Query — step by step

  1. Open Power BI Desktop and load your tables
  2. Click HomeTransform Data to open Power Query Editor
  3. Select the left table (e.g. Customers) in the left panel
  4. Click HomeMerge Queries
  5. Select the right table (Orders), click the matching column in each table, then choose your join type from the dropdown
  6. Click OK — a merged column appears. Click the expand icon to choose which columns to bring in
  7. Click Close and Apply to load the merged table back into Power BI

Available join types in the Power Query dialog:

  • Left Outer (all from left). It is the most common
  • Right Outer
  • Full Outer
  • Inner
  • Left Anti
  • Right Anti

2. Relationships in Power BI

Relationships in Power BI keep tables separate but allow filters to flow between them. You manage them in Model View or through the Manage Relationships dialog.

Types of relationships

One-to-Many (1:M): Most common-One customer has many orders

1-M

Many-to-Many (M:M): Many students take many courses

M-M

One-to-One (1:1): Each employee has one profile |

1-1

Cross-filter direction

  • Single direction: Filters flow one way, from dimension to fact table. Default and safest.
  • Both directions :Filters flow both ways. Useful but can cause circular dependency issues if overused.

How to create a relationship in Model View — step by step

  1. Click the Model View icon on the left sidebar
  2. You will see all your tables displayed as boxes with their columns
  3. Drag the CustomerID column from the Customers table and drop it onto CustomerID in the Orders table
  4. A relationship line appears — 1 on the Customers side, * on the Orders side
  5. Double-click the line to edit cardinality, cross-filter direction, or active/inactive status ---

3. Data Warehouse Schemas

Fact Tables vs Dimension Tables

Feature Fact Table Dimension Table
Stores Measurable events Descriptive context
Row count Many — can be millions Few — hundreds to thousands
Contains Amounts, quantities, foreign keys Names, categories, dates, locations
Examples Sales, Orders, Transactions Customers, Products, Date, Location

Simple rule: If a table answers "how much or how many" it is
a Fact table. If it answers "who, what, when, or where" it is a
Dimension table.


Star Schema, Snowflake Schema, and Flat Table

Flat table

Everything in one single wide table. Easy to start but causes data duplication and incorrect totals when filtering.

Flat Table

Star schema: Recommended for Power BI

Its is one central fact table directly connected to dimension tables. Simple, fast, and best for DAX and Power BI slicers.

In a star schema, customer details live in one row only

Example of one in power bi

Star

Steps to build a star schema in Power BI

  1. Load your Fact table (Sales) and all Dimension tables into Power BI
  2. Go to Model View on the left sidebar
  3. Arrange the Fact table in the center and Dimension tables around it
  4. Drag the foreign key in the Fact table to the primary key in each Dimension table to create a 1:M relationship
  5. Verify all lines show 1 on the dimension side and * on the fact side

Snowflake schema

Similar to a star schema but dimension tables are further broken into sub-tables. More complex and slower in Power BI. Only use snowflake if your data warehouse already uses this structure.

Example of one in Powerbi

Snowflake


When to use each schema

Schema Use when Avoid when
Star Building any Power BI report Never avoid it. Always the default choice
Snowflake Data warehouse already uses it Building a new model from scratch
Flat table Quick one-off analysis on small data Data is updated regularly

6. Role-Playing Dimensions

A role-playing dimension is one dimension table used multiple times in a fact table for different purposes.

Example: Orders fact table with three date columns:

Column Meaning
OrderDateKey When the customer placed the order
ShipDateKey When the order was dispatched
DeliveryDateKey When the customer received it

All three foreign keys point to the same Date dimension table.

How to set this up in Power BI

  1. Create one active relationship from the Date table to OrderDateKey
  2. Create two inactive relationships to ShipDateKey and DeliveryDateKey
  3. Use USERELATIONSHIP() in your DAX measures to activate the right one:
ShippedRevenue =
CALCULATE(
    SUM(Sales[Amount]),
    USERELATIONSHIP(Date[DateKey], Sales[ShipDateKey])
)
Enter fullscreen mode Exit fullscreen mode

7. Common Modeling Issues to Avoid

Issue Why it is a problem How to fix it
Many-to-many relationships Causes duplicated totals Use a bridge table or restructure your model
Bi-directional filters everywhere Creates circular dependencies Use single direction by default
No dedicated Date table Time intelligence DAX will not work Always create a proper Date dimension table
Flat table models Data duplication causes wrong aggregations Switch to a star schema
Wrong cardinality direction Breaks filter propagation Put 1 on dimension side, * on fact side
Calculated columns on fact tables Slows down large tables — stored in memory Use DAX measures instead

8. Key Takeaways

  • Use star schema as your default one fact table, dimensions around it
  • SQL joins combine tables physically: relationships keep them separate and filter across them
  • 1:M relationships are the most common and safest type in Power BI.
  • Always create a dedicated Date table for time intelligence
  • A LEFT JOIN keeps all rows from the left table including those with no match
  • Avoid flat tables and bi-directional filters unless you have a specific reason.
  • Use role-playing dimensions when the same dimension table is needed multiple times in one fact table

This was the second update on my learning journey to becoming a data engineer. If you found this helpful, feel comment on other areas of learning and improvement.

Top comments (0)