<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kendi</title>
    <description>The latest articles on DEV Community by Kendi (@kendixy).</description>
    <link>https://dev.to/kendixy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3843192%2Fd6c04fe7-e604-408d-938a-3af520fd81d1.png</url>
      <title>DEV Community: Kendi</title>
      <link>https://dev.to/kendixy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kendixy"/>
    <language>en</language>
    <item>
      <title>Power BI Data Modeling Explained Simply: Joins, Relationships, and Schemas</title>
      <dc:creator>Kendi</dc:creator>
      <pubDate>Sat, 28 Mar 2026 20:43:42 +0000</pubDate>
      <link>https://dev.to/kendixy/power-bi-data-modeling-explained-simply-joins-relationships-and-schemas-b0a</link>
      <guid>https://dev.to/kendixy/power-bi-data-modeling-explained-simply-joins-relationships-and-schemas-b0a</guid>
      <description>&lt;p&gt;Here is something nobody tells you when you start learning Power BI: the visuals are the easy part. Anyone can drag a bar chart onto a canvas. What separates a report that works from one that lies to you is what happens before you touch a single visual — the data model.&lt;/p&gt;

&lt;p&gt;Get the model right and your numbers are accurate, your reports are fast, and your DAX measures are simple. Get it wrong and no amount of formatting or fancy visuals will fix it. You will just have a beautifully designed wrong answer.&lt;/p&gt;

&lt;p&gt;This article covers everything you need to build models that actually work.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Data Modeling?
&lt;/h2&gt;

&lt;p&gt;Data modeling is the process of organizing your tables and defining how they connect to each other so that analysis is accurate, efficient, and scalable.&lt;/p&gt;

&lt;p&gt;In Power BI specifically, your data model determines three things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Whether your calculations are correct&lt;/li&gt;
&lt;li&gt;How fast your report runs&lt;/li&gt;
&lt;li&gt;How easily someone can explore the data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A weak model produces incorrect aggregations, duplicate counts, and reports that take forever to load. A strong model makes all of that disappear.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 1: SQL Joins — Combining Tables in Power Query
&lt;/h2&gt;

&lt;p&gt;Joins are how you physically combine two tables into one based on a shared column called a key. In Power BI, joins happen in &lt;strong&gt;Power Query&lt;/strong&gt; during data preparation — before anything reaches your model.&lt;/p&gt;

&lt;p&gt;Let us use a consistent example throughout. You have two tables:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Staff Table&lt;/strong&gt;&lt;br&gt;
| StaffID | StaffName |&lt;br&gt;
|---------|-----------|&lt;br&gt;
| S01 | James |&lt;br&gt;
| S02 | Amina |&lt;br&gt;
| S03 | Peter |&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Training Table&lt;/strong&gt;&lt;br&gt;
| StaffID | Course |&lt;br&gt;
|---------|--------|&lt;br&gt;
| S01 | Excel |&lt;br&gt;
| S02 | Power BI |&lt;br&gt;
| S04 | SQL |&lt;/p&gt;

&lt;p&gt;Notice: Peter (S03) has no training record. The SQL course (S04) has no matching staff member. This mismatch is exactly what each join type handles differently.&lt;/p&gt;


&lt;h3&gt;
  
  
  INNER JOIN — Only What Matches in Both
&lt;/h3&gt;

&lt;p&gt;Returns rows that have a match in &lt;strong&gt;both&lt;/strong&gt; tables. Anything without a match on either side is excluded.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;br&gt;
| StaffID | StaffName | Course |&lt;br&gt;
|---------|-----------|--------|&lt;br&gt;
| S01 | James | Excel |&lt;br&gt;
| S02 | Amina | Power BI |&lt;/p&gt;

&lt;p&gt;Peter is excluded — no training record. The SQL course is excluded — no matching staff.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use case:&lt;/strong&gt; A clean attendance report showing only staff with confirmed training records.&lt;/p&gt;


&lt;h3&gt;
  
  
  LEFT JOIN — Keep Everything on the Left
&lt;/h3&gt;

&lt;p&gt;Returns all rows from the left table. Rows from the right table are included only where a match exists. No match means null.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;br&gt;
| StaffID | StaffName | Course |&lt;br&gt;
|---------|-----------|--------|&lt;br&gt;
| S01 | James | Excel |&lt;br&gt;
| S02 | Amina | Power BI |&lt;br&gt;
| S03 | Peter | null |&lt;/p&gt;

&lt;p&gt;Peter stays — but his Course is null because no training record exists for him.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use case:&lt;/strong&gt; All staff members, flagging those who have not yet completed any training.&lt;/p&gt;


&lt;h3&gt;
  
  
  RIGHT JOIN — Keep Everything on the Right
&lt;/h3&gt;

&lt;p&gt;Mirror of the LEFT JOIN. All rows from the right table are kept. Left table rows are included only where a match exists.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;br&gt;
| StaffID | StaffName | Course |&lt;br&gt;
|---------|-----------|--------|&lt;br&gt;
| S01 | James | Excel |&lt;br&gt;
| S02 | Amina | Power BI |&lt;br&gt;
| S04 | null | SQL |&lt;/p&gt;

&lt;p&gt;The SQL course stays — but StaffName is null because S04 does not exist in the Staff table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use case:&lt;/strong&gt; All training courses offered, identifying any assigned to staff members who no longer exist in the system.&lt;/p&gt;


&lt;h3&gt;
  
  
  FULL OUTER JOIN — Everything from Both Tables
&lt;/h3&gt;

&lt;p&gt;Returns every row from both tables. Where there is no match, nulls fill the gap on the missing side. Nothing is excluded.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;br&gt;
| StaffID | StaffName | Course |&lt;br&gt;
|---------|-----------|--------|&lt;br&gt;
| S01 | James | Excel |&lt;br&gt;
| S02 | Amina | Power BI |&lt;br&gt;
| S03 | Peter | null |&lt;br&gt;
| S04 | null | SQL |&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use case:&lt;/strong&gt; A full audit comparing your HR system against your training system — surfacing every mismatch in both directions at once.&lt;/p&gt;


&lt;h3&gt;
  
  
  LEFT ANTI JOIN — Only the Unmatched Left Rows
&lt;/h3&gt;

&lt;p&gt;Returns only rows from the left table that have &lt;strong&gt;no match&lt;/strong&gt; in the right table. The opposite of an INNER JOIN in a sense.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;br&gt;
| StaffID | StaffName |&lt;br&gt;
|---------|-----------|&lt;br&gt;
| S03 | Peter |&lt;/p&gt;

&lt;p&gt;Only Peter — the staff member with no training record.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use case:&lt;/strong&gt; Finding staff members who have not attended any training. A compliance check.&lt;/p&gt;


&lt;h3&gt;
  
  
  RIGHT ANTI JOIN — Only the Unmatched Right Rows
&lt;/h3&gt;

&lt;p&gt;Returns only rows from the right table that have no match in the left table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;br&gt;
| StaffID | Course |&lt;br&gt;
|---------|--------|&lt;br&gt;
| S04 | SQL |&lt;/p&gt;

&lt;p&gt;Only the SQL course — assigned to a StaffID that does not exist.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use case:&lt;/strong&gt; Identifying orphaned records in a system — training records pointing to staff members who no longer exist.&lt;/p&gt;


&lt;h3&gt;
  
  
  How to Create Joins in Power Query
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;strong&gt;Home tab → Transform Data&lt;/strong&gt; to open Power Query Editor&lt;/li&gt;
&lt;li&gt;Select the table you want as your left table&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Home tab → Merge Queries&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select the second table from the dropdown&lt;/li&gt;
&lt;li&gt;Click the matching column in each table to set the key&lt;/li&gt;
&lt;li&gt;Choose your &lt;strong&gt;Join Kind&lt;/strong&gt; from the dropdown&lt;/li&gt;
&lt;li&gt;Click OK&lt;/li&gt;
&lt;li&gt;Click the expand icon on the new merged column to select which columns to bring in&lt;/li&gt;
&lt;/ol&gt;


&lt;h3&gt;
  
  
  Join Summary Table
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Join Type&lt;/th&gt;
&lt;th&gt;Keeps from Left&lt;/th&gt;
&lt;th&gt;Keeps from Right&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Inner&lt;/td&gt;
&lt;td&gt;Matching only&lt;/td&gt;
&lt;td&gt;Matching only&lt;/td&gt;
&lt;td&gt;Clean matched data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Left&lt;/td&gt;
&lt;td&gt;All rows&lt;/td&gt;
&lt;td&gt;Matching only&lt;/td&gt;
&lt;td&gt;Enrich left, flag gaps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Right&lt;/td&gt;
&lt;td&gt;Matching only&lt;/td&gt;
&lt;td&gt;All rows&lt;/td&gt;
&lt;td&gt;Enrich right, flag gaps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Full Outer&lt;/td&gt;
&lt;td&gt;All rows&lt;/td&gt;
&lt;td&gt;All rows&lt;/td&gt;
&lt;td&gt;Full system audit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Left Anti&lt;/td&gt;
&lt;td&gt;Unmatched only&lt;/td&gt;
&lt;td&gt;Nothing&lt;/td&gt;
&lt;td&gt;Find missing references&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Right Anti&lt;/td&gt;
&lt;td&gt;Nothing&lt;/td&gt;
&lt;td&gt;Unmatched only&lt;/td&gt;
&lt;td&gt;Find orphaned records&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;


&lt;h2&gt;
  
  
  Part 2: Power BI Relationships — Joins vs Relationships
&lt;/h2&gt;

&lt;p&gt;This distinction trips up almost everyone starting out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A join physically merges two tables into one.&lt;/strong&gt; The result is a single flat table. You use this during data preparation in Power Query.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A relationship keeps tables separate and creates a logical filter connection between them.&lt;/strong&gt; The data stays in its own table. Power BI uses the relationship to know how filters should flow when someone interacts with a report.&lt;/p&gt;

&lt;p&gt;In most professional Power BI models, relationships handle the core structure. Joins are reserved for specific data preparation steps where you genuinely need to flatten or enrich a table before loading it into the model.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Joins&lt;/th&gt;
&lt;th&gt;Relationships&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Where&lt;/td&gt;
&lt;td&gt;Power Query&lt;/td&gt;
&lt;td&gt;Model View&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tables&lt;/td&gt;
&lt;td&gt;Physically combined&lt;/td&gt;
&lt;td&gt;Stay separate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Performance&lt;/td&gt;
&lt;td&gt;Heavier&lt;/td&gt;
&lt;td&gt;More efficient&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Flexibility&lt;/td&gt;
&lt;td&gt;Static&lt;/td&gt;
&lt;td&gt;Dynamic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Use for&lt;/td&gt;
&lt;td&gt;Data preparation&lt;/td&gt;
&lt;td&gt;Analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;


&lt;h3&gt;
  
  
  How to Create Relationships in Power BI
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Method 1 — Drag and drop in Model View:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Click the &lt;strong&gt;Model View&lt;/strong&gt; icon on the left sidebar&lt;/li&gt;
&lt;li&gt;Find the key column in one table&lt;/li&gt;
&lt;li&gt;Drag it onto the matching key column in the other table&lt;/li&gt;
&lt;li&gt;A relationship line connects the two tables&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Method 2 — Manage Relationships:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;strong&gt;Modeling tab → Manage Relationships&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;New&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select both tables and their matching columns&lt;/li&gt;
&lt;li&gt;Set cardinality and cross-filter direction&lt;/li&gt;
&lt;li&gt;Click OK&lt;/li&gt;
&lt;/ol&gt;


&lt;h3&gt;
  
  
  Cardinality — What Kind of Relationship Is It?
&lt;/h3&gt;
&lt;h4&gt;
  
  
  One-to-Many (1:M) — Use This as Your Default
&lt;/h4&gt;

&lt;p&gt;One row in the first table matches many rows in the second.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; One Department can have many Employees. The Departments table lists each department once. The Employees table has that department ID repeated across many rows.&lt;/p&gt;

&lt;p&gt;This is the most common, most efficient, and most reliable relationship type in Power BI. Build your model around 1:M relationships wherever possible. In Model View it shows as &lt;strong&gt;1&lt;/strong&gt; on one side and ***** on the other.&lt;/p&gt;
&lt;h4&gt;
  
  
  Many-to-Many (M:M) — Use With Real Caution
&lt;/h4&gt;

&lt;p&gt;Many rows in both tables can match each other.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Doctors and Hospitals. One doctor works at multiple hospitals. One hospital employs multiple doctors.&lt;/p&gt;

&lt;p&gt;Power BI supports M:M natively but it introduces ambiguous filtering and can produce incorrect totals. Where possible, resolve M:M by introducing a bridge table — a third table that breaks the relationship into two clean 1:M relationships.&lt;/p&gt;
&lt;h4&gt;
  
  
  One-to-One (1:1) — Rare and Usually Unnecessary
&lt;/h4&gt;

&lt;p&gt;Each row in one table matches exactly one row in the other.&lt;/p&gt;

&lt;p&gt;If you have a 1:1 relationship, ask yourself honestly whether these tables should just be merged. They usually should.&lt;/p&gt;


&lt;h3&gt;
  
  
  Active vs Inactive Relationships
&lt;/h3&gt;

&lt;p&gt;Power BI allows only &lt;strong&gt;one active relationship&lt;/strong&gt; between any two tables. Active relationships are what visuals and DAX measures use by default. Additional relationships between the same tables must be inactive — shown as dashed lines in Model View.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When do you need inactive relationships?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A Date table connected to a fact table with multiple date columns is the classic case. Say your Orders table has OrderDate, ShipDate, and DeliveryDate. You can only have one active relationship to your Date table. The others are inactive.&lt;/p&gt;

&lt;p&gt;To use an inactive relationship in a measure, activate it temporarily with USERELATIONSHIP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Sales by Ship Date = CALCULATE(
    SUM(Orders[Revenue]),
    USERELATIONSHIP(Orders[ShipDate], Calendar[Date])
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This measure calculates revenue filtered by ShipDate instead of the default OrderDate — without permanently changing the model.&lt;/p&gt;




&lt;h3&gt;
  
  
  Cross-Filter Direction — Which Way Do Filters Flow?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Single direction:&lt;/strong&gt; Filters flow from the "one" side to the "many" side only. This is the default and the right choice for most relationships.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Both directions (bidirectional):&lt;/strong&gt; Filters flow both ways simultaneously.&lt;/p&gt;

&lt;p&gt;Bidirectional sounds helpful but it creates real problems — ambiguous filter paths, performance degradation, and measures that produce incorrect results in ways that are very hard to diagnose. Start with single direction always. Only consider bidirectional after testing confirms you genuinely need it and the behavior is correct.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 3: Fact Tables and Dimension Tables
&lt;/h2&gt;

&lt;p&gt;Every professional data model is built around this distinction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fact Tables
&lt;/h3&gt;

&lt;p&gt;A fact table stores &lt;strong&gt;measurable events&lt;/strong&gt; — things that happened. Each row is one transaction or event.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Contains numbers you want to measure: revenue, quantity, duration, count&lt;/li&gt;
&lt;li&gt;Contains foreign keys that point to dimension tables&lt;/li&gt;
&lt;li&gt;Typically long — many rows&lt;/li&gt;
&lt;li&gt;Typically narrow — few columns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Examples:&lt;/strong&gt; Hospital admissions, e-commerce orders, bank transactions, flight bookings&lt;/p&gt;

&lt;h3&gt;
  
  
  Dimension Tables
&lt;/h3&gt;

&lt;p&gt;A dimension table stores &lt;strong&gt;descriptive context&lt;/strong&gt; about those events — the who, what, where, and when.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Contains descriptive attributes: names, categories, locations, dates&lt;/li&gt;
&lt;li&gt;Contains a primary key that the fact table references&lt;/li&gt;
&lt;li&gt;Typically short — few rows&lt;/li&gt;
&lt;li&gt;Typically wide — many columns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Examples:&lt;/strong&gt; Patients, Products, Branches, Calendar&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The practical rule:&lt;/strong&gt; Your slicers and filters come from dimension tables. Your aggregations and measures come from fact tables. Keep them separate.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 4: Schemas — The Overall Shape of Your Model
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Star Schema — Start Here Every Time
&lt;/h3&gt;

&lt;p&gt;A central fact table connects directly to surrounding dimension tables. One hop from any dimension to the fact. No chains.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;              [Calendar]
                  |
[Customer] — [Sales Fact] — [Product]
                  |
              [Branch]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean, fast, easy to maintain. Power BI's calculation engine is specifically optimized for this structure. Your DAX measures will be simpler and your reports will be faster on a star schema than any other structure. This is the recommended starting point for almost every Power BI model.&lt;/p&gt;




&lt;h3&gt;
  
  
  Snowflake Schema — Normalized but Complex
&lt;/h3&gt;

&lt;p&gt;Dimension tables are broken into sub-dimensions connected in chains.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Region] ← [Branch] — [Sales Fact] — [Product] → [Category] → [Department]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead of a single Branch dimension with Region included as a column, Region becomes its own separate table connected to Branch.&lt;/p&gt;

&lt;p&gt;This reduces data redundancy — useful in enterprise data warehouses where storage and integrity matter. In Power BI reporting however, the additional complexity adds joins, slows queries, and makes the model harder to navigate. If your source data arrives in a snowflake structure, consider flattening dimension chains in Power Query before loading into the model.&lt;/p&gt;




&lt;h3&gt;
  
  
  Flat Table (Denormalized) — Simple but Limited
&lt;/h3&gt;

&lt;p&gt;Everything — facts and dimensions — in a single table. No relationships needed.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;OrderDate&lt;/th&gt;
&lt;th&gt;CustomerName&lt;/th&gt;
&lt;th&gt;City&lt;/th&gt;
&lt;th&gt;ProductName&lt;/th&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Revenue&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;01/03/2026&lt;/td&gt;
&lt;td&gt;James&lt;/td&gt;
&lt;td&gt;Nairobi&lt;/td&gt;
&lt;td&gt;Laptop&lt;/td&gt;
&lt;td&gt;Electronics&lt;/td&gt;
&lt;td&gt;85000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;02/03/2026&lt;/td&gt;
&lt;td&gt;Amina&lt;/td&gt;
&lt;td&gt;Mombasa&lt;/td&gt;
&lt;td&gt;Phone&lt;/td&gt;
&lt;td&gt;Electronics&lt;/td&gt;
&lt;td&gt;42000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Fast to build, easy to understand. But "Electronics" is repeated in every electronics row. Change a category name and you are updating thousands of cells. Performance degrades quickly as row count grows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use flat tables for:&lt;/strong&gt; Quick one-off analysis, very small datasets, early prototyping before building a proper model. Avoid them for anything that will be maintained, updated, or scaled.&lt;/p&gt;




&lt;h3&gt;
  
  
  Schema Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Schema&lt;/th&gt;
&lt;th&gt;Structure&lt;/th&gt;
&lt;th&gt;Performance&lt;/th&gt;
&lt;th&gt;Complexity&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Star&lt;/td&gt;
&lt;td&gt;Fact + direct dimensions&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Most Power BI reporting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snowflake&lt;/td&gt;
&lt;td&gt;Fact + chained dimensions&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Enterprise warehouses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Flat&lt;/td&gt;
&lt;td&gt;Single table&lt;/td&gt;
&lt;td&gt;Poor at scale&lt;/td&gt;
&lt;td&gt;Very low&lt;/td&gt;
&lt;td&gt;Small, simple, one-off&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Part 5: Role-Playing Dimensions
&lt;/h2&gt;

&lt;p&gt;A role-playing dimension is one dimension table used multiple times in the same model, each time in a different context.&lt;/p&gt;

&lt;p&gt;The Date table is the most common example. A Hospital fact table might have AdmissionDate, DischargeDate, and SurgeryDate — all referencing the same Calendar dimension but each representing a different point in time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution 1 — One active, rest inactive:&lt;/strong&gt;&lt;br&gt;
Create one active relationship between Calendar and AdmissionDate. Create inactive relationships to DischargeDate and SurgeryDate. Use USERELATIONSHIP in DAX when you need the inactive ones.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Discharges This Month = CALCULATE(
    COUNT(Admissions[PatientID]),
    USERELATIONSHIP(Admissions[DischargeDate], Calendar[Date])
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Solution 2 — Duplicate the dimension table:&lt;/strong&gt;&lt;br&gt;
Create three separate Calendar tables — AdmissionCalendar, DischargeCalendar, SurgeryCalendar — each with its own active relationship. More relationships to maintain but no inactive relationship complexity in DAX.&lt;/p&gt;

&lt;p&gt;For most scenarios, Solution 1 is cleaner.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 6: Common Modeling Issues
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Many-to-Many without a bridge table&lt;/strong&gt;&lt;br&gt;
Connecting two fact tables directly produces unreliable totals. Fix by introducing a shared dimension table that both relate to through 1:M relationships.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bidirectional filtering everywhere&lt;/strong&gt;&lt;br&gt;
Slows the model and creates ambiguous results. Default to single direction. Only use Both direction after deliberate testing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Circular relationships&lt;/strong&gt;&lt;br&gt;
Table A → Table B → Table C → Table A creates a loop Power BI cannot resolve. Fix by identifying and removing the redundant relationship.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No dedicated Date table&lt;/strong&gt;&lt;br&gt;
Using date columns from fact tables directly breaks time intelligence functions. Always create a continuous Date dimension table, mark it as a Date table in Power BI (right-click table in Model View → Mark as Date Table), and use it as the single source for all date filtering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Loading unnecessary columns&lt;/strong&gt;&lt;br&gt;
Every column you load into the model consumes memory. In Power Query, remove columns you will not use before loading. Keep the model lean.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Recommended Workflow
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Load your data&lt;/strong&gt; into Power Query&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2 — Clean and prepare&lt;/strong&gt; using joins and transformations in Power Query&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3 — Build the model&lt;/strong&gt; in Model View using 1:M relationships in a star schema&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4 — Create a Date table&lt;/strong&gt; and mark it appropriately&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5 — Write DAX measures&lt;/strong&gt; on top of the clean model&lt;/p&gt;

&lt;p&gt;Following this sequence means your measures are built on a solid foundation. Skipping to DAX before the model is right is the most common reason Power BI reports produce numbers nobody trusts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;The most important insight in data modeling is also the simplest one: &lt;strong&gt;keep facts and dimensions separate, connect them with 1:M relationships, and structure them as a star.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Everything else in this article — the join types, cardinality options, schema variations, role-playing dimensions — is either building on that foundation or explaining what happens when you deviate from it.&lt;/p&gt;

&lt;p&gt;Master the star schema with clean 1:M relationships first. You will handle 90% of real-world Power BI modeling scenarios with that alone.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of my data science learning journey.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>analytics</category>
      <category>beginners</category>
      <category>data</category>
      <category>database</category>
    </item>
    <item>
      <title>How Excel is Used in Real-World Data Analysis</title>
      <dc:creator>Kendi</dc:creator>
      <pubDate>Wed, 25 Mar 2026 13:10:38 +0000</pubDate>
      <link>https://dev.to/kendixy/how-excel-is-used-in-real-world-data-analysis-db4</link>
      <guid>https://dev.to/kendixy/how-excel-is-used-in-real-world-data-analysis-db4</guid>
      <description>&lt;h2&gt;
  
  
  How Excel is Used in Real-World Data Analysis
&lt;/h2&gt;

&lt;p&gt;Excel is a spreadsheet application by Microsoft that organizes data into a grid of rows and columns. Each box in that grid is called a cell, and each cell can hold a number, text, a date, or a formula that calculates something automatically.&lt;/p&gt;

&lt;p&gt;On top of that grid, Excel gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hundreds of built-in functions for math, text, dates, and logic&lt;/li&gt;
&lt;li&gt;Power Query for importing and transforming data from any source&lt;/li&gt;
&lt;li&gt;PivotTables that summarize thousands of rows in seconds&lt;/li&gt;
&lt;li&gt;Charts and visualizations that turn numbers into stories&lt;/li&gt;
&lt;li&gt;Data validation to control what gets entered into cells&lt;/li&gt;
&lt;li&gt;Conditional formatting that highlights patterns automatically&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Excel is used in finance, healthcare, logistics, marketing, research, and government — basically anywhere data exists and decisions need to be made. It sits at a sweet spot between accessibility and depth. A beginner can use it on day one. An expert can still find new things to learn after years.&lt;/p&gt;




&lt;h2&gt;
  
  
  How I Used Excel in a Real Project
&lt;/h2&gt;

&lt;p&gt;For a recent project I analyzed product performance data from Jumia — one of Africa's largest e-commerce platforms — covering product prices, discounts, customer reviews, and ratings. Here is exactly how I approached it.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Make a Copy of Your Original Data
&lt;/h3&gt;

&lt;p&gt;Before touching anything, I created a copy of the raw data and kept the original sheet untouched. Data cleaning is destructive — you delete rows, overwrite values, change formats. Without a backup, there is no going back.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Understand Your Data Before Changing Anything
&lt;/h3&gt;

&lt;p&gt;I used &lt;strong&gt;CTRL + END&lt;/strong&gt; to see how many rows and columns I had, then scrolled through to spot obvious problems — price columns with "KSh" symbols, ratings stored as text like "4.5 out of 5", negative review counts, and blank cells scattered throughout.&lt;/p&gt;

&lt;p&gt;Understanding your data first tells you exactly what needs fixing and in what order. It prevents you from applying the wrong solution to the wrong problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Data Cleaning — Where the Real Work Happens
&lt;/h3&gt;

&lt;p&gt;Data cleaning is unglamorous but it is the foundation everything else stands on. In professional data work it takes 60 to 80 percent of total project time. The key steps I worked through were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Removing duplicates&lt;/strong&gt; — ensuring each product appeared only once using Data tab → Remove Duplicates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fixing data types&lt;/strong&gt; — converting price columns from text to numbers, extracting numeric ratings from text like "4.5 out of 5"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handling missing values&lt;/strong&gt; — replacing blank cells with the column average rather than deleting rows and losing data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standardizing text&lt;/strong&gt; — making sure category values like "Nairobi" and "nairobi" were consistent using PROPER, UPPER, and LOWER functions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this is exciting. All of it is essential.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Formulas That Fascinated Me
&lt;/h2&gt;

&lt;h3&gt;
  
  
  XLOOKUP — The Lookup Function Excel Should Have Had From the Start
&lt;/h3&gt;

&lt;p&gt;Before XLOOKUP, the standard way to look up data was VLOOKUP — and it had serious limitations. It could only search left to right, broke silently when columns were inserted, and required hardcoded column numbers that made formulas fragile.&lt;/p&gt;

&lt;p&gt;XLOOKUP fixes all of that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=XLOOKUP(lookup_value, lookup_range, return_range, if_not_found)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It searches in any direction, handles missing values natively, uses column names instead of numbers, and can return multiple columns at once. One formula replaces what used to require complex workarounds.&lt;/p&gt;

&lt;p&gt;The one caveat — XLOOKUP is only available in Excel 2021 and Microsoft 365. If you are on an older version, INDEX/MATCH is the next best alternative.&lt;/p&gt;

&lt;h3&gt;
  
  
  IFS — Replacing Messy Nested IFs
&lt;/h3&gt;

&lt;p&gt;IFS checks multiple conditions in sequence and returns the first match. It replaced what would otherwise be deeply nested IF statements that are almost impossible to read or debug.&lt;/p&gt;

&lt;p&gt;For example, classifying products by rating:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=IFS(F2&amp;gt;=4.5, "Excellent", F2&amp;gt;=3, "Average", F2&amp;lt;3, "Poor")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean, readable, and easy to update. Nesting three IFs inside each other to do the same thing is a maintenance nightmare.&lt;/p&gt;

&lt;h3&gt;
  
  
  PivotTables — Summarizing Data Without a Single Formula
&lt;/h3&gt;

&lt;p&gt;A PivotTable takes your raw data and lets you summarize it any way you want — by category, by date, by product — without writing a single formula. You drag fields into rows, columns, and values, and Excel does the rest.&lt;/p&gt;

&lt;p&gt;One tip that makes PivotTables significantly more reliable: convert your clean data into an Excel Table first using &lt;strong&gt;CTRL + T&lt;/strong&gt;. Tables expand automatically when new rows are added, so your PivotTable always captures the full dataset on refresh.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Dashboard — Making Data Usable for Everyone
&lt;/h3&gt;

&lt;p&gt;The final step was bringing everything together in an interactive dashboard — a single sheet where any business stakeholder could explore the data without touching the underlying numbers. KPI cards at the top, charts in the middle, and slicers that filter every chart simultaneously with a single click.&lt;/p&gt;

&lt;p&gt;This is the difference between data analysis and data communication. The numbers only matter if the right people can read them.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The formulas are learnable. The functions have documentation. What nobody tells you is that the hardest part of data analysis is not the technical side — it is the discipline.&lt;/p&gt;

&lt;p&gt;The discipline to make a backup before touching anything. The discipline to understand your data before changing it. The discipline to clean thoroughly before analyzing. The discipline to question your results before presenting them.&lt;/p&gt;

&lt;p&gt;Excel gave me powerful tools. But the two steps that made the biggest difference were the simplest ones — making a copy of the original data and actually reading through it before writing a single formula.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Written as part of a data science learning program at LuxDev HQ.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>datascience</category>
      <category>analytics</category>
    </item>
  </channel>
</rss>
