DEV Community

Ryan Murunga
Ryan Murunga

Posted on

Data Modelling in Power Bi

INTRODUCTION

Power Bi is a comprehensive business analytic tool developed by microsoft.

Power Bi is used for:

  1. Connect data .
  2. Clean and transform data.
  3. Create relationships between tables
  4. Analyze data
  5. Build Reports
  6. Create dashboards and share insights

Some of the importance of Power Bi is;

  1. Cleaning messy data
  2. Analyze large amounts of data
  3. Create Interactive dashboards
  4. Identify trends & patterns
  5. Make data driven decisions
  6. Present insights visually

Core components of power bi include:

  • Power BI Desktop - Is a free Windows application used to connect to data sources, clean and transform information, model relationships, and build reports.
  • Power BI Service - Is a cloud-based platform accessible via web browsers.
  • Power BI Mobile - Is a Native iOS and Android apps that allow you to monitor business metrics and interact with reports on the go.

DAX FUNCTIONS

DAX - Data Analysis Expression (Is a formula language used in power Bi to create calculations and analyze logics.)

DAX is used to create ;

  • Calculated columns
  • Calculated tables

DAX functions allow us to create calculations that are not direclty available in the raw dataset.

Here are some of few common DAX function;

  1. Sum Is a measure used to calculate a total value.
Total Sales = SUM(Sales[SalesAmount])
Enter fullscreen mode Exit fullscreen mode
  1. Average Is a measure used to calculate the average of values.
Average Sales = AVERAGE(Sales[Amount])
Enter fullscreen mode Exit fullscreen mode
  1. Count Is used to count the number of rows in a column that contain numbers.
Total Orders = COUNT(Sales[OrderID])
Enter fullscreen mode Exit fullscreen mode
  1. IF Is used to create conditions. It checks if a condition is true or false and returns a specific result based on that.
Sales Category = IF(Sales[Amount] > 10000, “High”, “Low”)
Enter fullscreen mode Exit fullscreen mode
  1. RANKX It assigns ranks to values based on a specific measure.
Sales Rank = RANKX(ALL(Products), SUM(Sales[Amount]))
Enter fullscreen mode Exit fullscreen mode
  1. ISBLANK It checks if a value is blank.
Check Sales = IF(ISBLANK(Sales[Amount]), “No Data”, Sales[Amount])
Enter fullscreen mode Exit fullscreen mode
  1. NOW It returns the current date and time.
Current DateTime = NOW()
Enter fullscreen mode Exit fullscreen mode

After a short overview of PowerBi lets now understand what is Data Modelling.

Data Modelling - is the process of structuring and defining relationships between your tables to enable fast, accurate, and scalable data analysis.

Relationships - link data across different tables so your visuals can aggregate, filter, and slice data correctly.

There are 3 main relationships

  1. One to Many - A database where a single record in a one table is linked to a table with multiple records
    e.g
    One patient can have many hospital visits

  2. One to One - A database where a single records corresponds to exactly one unique in another record.
    e.g
    One patient can have one patient profile

  3. Many to Many - A database where multiple records in one entity are associated with multiple records in another entity.
    e.g
    One patient can have many diagnosis

SCHEMAS
This refers to how data is organized in a database.

The main types of schema are ;

  1. Flat Table
  2. Star Schema
  3. Snowflake Schema

1. Flat table
Is a single large table that contain all the information in one place

Characteristics

  1. Has many columns
  2. Combines both ; Descriptive columns, Numeric Columns, ID columns,Repeated values

Example of a flat table

Advantages

  1. Simple for small datasets
  2. Easy to load in PowerBI
  3. Easy to read

Disdvantages

  1. May slow down reports
  2. Can create data quality problems 3.Contain reported data
  3. Can become very large datasets

2. Star Schema
Is a data model where a central fact table connects directly to multiple dimension table.

Fact-table Sits in the middle
Dimension Tables surround it

Example of a star schema

Fact table
Stores events , transaction or activities that happened
The fact table contains:

  • Keys that connnect to dimension tables
  • Numeric Values that can be calculated

Dimension Table
Stores descriptive information
Helps us filter , group and describe the fact table

Advantages of Star schema

  1. Reduces repeated data
  2. Improves report performance
  3. Easier to understand
  4. Makes DAX measures easier

Star schema is recommended because

  • Dimension tables contain descreption
  • Filter flow from dimensions to facts
  • Fact tables contains numbers and keys

3. SNOWFLAKE SCHEMA
This is when dimension tables are normalized and split into multiple, related sub-dimension tables.
Dimension tables are split into smaller related tables

Example of a snowflake schema

Snowflake schema is useful when;

  • Creating a more normalized model
  • Reduce repeated data
  • Have very large dimension tables

Differences between Star Schema and Snowflake schema

AREA STAR SCHEMA SNOWFLAKE SCHEMA
Performance in Power BI Usually Faster Can be slower if too many relationships
Simplicity Easier for beginners More Complex
Structure Fact table connected directly to dimensions Dimensions are split into smaller dimensions
Storage More repeated dimension data Less data
Best for Reporting and dashboards More normalized database style model

Top comments (0)