Understanding Data Modeling in Power BI

#beginners #microsoft #data #analytics

Introduction
Power Bi is a powerful tool that I have recently been undertaking that has challenged me and opened my eyes differently. Slowly by slowly a few things have greatly stood out to me right from the basics to a couple of advanced concepts.
Before Power BI can draw a chart or a visualization, it must understand how tables relate.
That structure is the data model. A strong model produces fast, accurate insights a weak model leads to wrong results. Power BI works in three layers: Power Query (clean), Data Model (relate) and Report View (visualize).

Joins (Power Query)
A join physically combines rows from two tables based on a shared column. In Power BI, joins live inside Power Query, accessed via Home→Merge Queries. The join type you select determines exactly which rows survive the merge. For any join: open Power Query, select your first table, go to Merge Queries→Merge as New, choose the second table, click the matching key column in each, then pick a join.
Common types: Inner join (matches only), Left join (all left and matches), Right join, Full join and Anti joins for finding missing data.

*Relationships *(Model View)
Joins physically merge tables. Relationships are different, they keep tables separate but tell Power BI how they logically connect. When you build a visual, Power BI follows these relationship lines to pull the right numbers from the right rows automatically.
Create relationships two ways: visually in Model View by dragging a column from one table onto the matching column in another or precisely via Home→Manage Relationships→New, where you select both tables, click matching columns, set cardinality and cross-filter direction and click OK.

Cross- filter direction controls which direction filters flow across a relationship. Single means filters travel from the "one" side (dimension) to the "many" side (fact) — selecting a Customer filters the Orders table, but not vice versa. Both allows
filtering in both directions, which can cause circular dependencies and unexpected DAX
results. Leave it as Single unless you have a specific need to change it.

Cardinality: one to many is best practice. Avoid many-to-many unless necessary. Cross-filter
direction should usually remain single to prevent ambiguity.

Remember- Joins combine data during refresh, increasing size. Relationships link tables when querying, making models efficient.

**Fact & Dimension Tables
**Fact tables store numeric data (e.g., sales). Dimension tables store descriptive attributes (e.g., customers, dates). Fact tables sit on the many side while dimensions on the one side.

****Star Schema
**This is the most recommended schema to use. It contains a central fact table that connects to multiple dimension tables. This structure is fast, scalable and DAX-friendly.

**Before vs After Modeling
**Flat table: duplicated data, large size, slow performance. Star schema has a clean structure, smaller size and faster queries. It reduces a lot of redundancy and repetition and formats trickle
down throughout the tables that are in the schema.

*Common Issues to look out for while working with data *
• Wrong totals: duplicate keys you can fix this by removing duplicates.
• No filtering is tied to a missing relationship.
• Slow model has too many text columns in the fact table.

DEV Community

Understanding Data Modeling in Power BI

Top comments (0)