What is a Schema?
A schema is essentially the blueprint or structure of how data is organized in a database or data model. Think of it as the map that shows:
• What tables exist
• What fields (columns) are in those tables
• How those tables relate to each other
In other words, a schema defines the logical arrangement of data and the relationships between different parts of that data.
What is Data Modelling?
Data modeling is the process of designing how data is structured, stored, and related to each other so it can be used effectively for analysis and reporting.
It’s like creating a blueprint for your data: deciding what tables you need, what fields go into them, and how they connect.
Types of Schemas in B.I
- Star Schema
- Snowflake Schema
- Galaxy Schema (Fact Constellation)
1. Star Schema
A Star Schema is a way of organizing data where:
• A central Fact Table holds quantitative, transactional data (numbers, metrics).
• Surrounding Dimension Tables hold descriptive attributes (categories, labels, hierarchies).
• The structure looks like a ⭐ star: fact table in the middle, dimensions radiating outward.
Components of a Star Schema
Fact Table
• Contains measures (numeric values) and foreign keys linking to dimensions.
• Usually very large because it stores detailed records.
Example:
Sales Fact Table → OrderID, CustomerID, ProductID, DateID, Quantity, Revenue.Dimension Tables
• Contain descriptive attributes used for filtering, grouping, and slicing data.
• Smaller tables compared to fact tables.
Examples:
(i) Customer Dimension → CustomerID, Name, Age, Region.
(ii) Product Dimension → ProductID, Name, Category, Brand.
(iii) Date Dimension → DateID, Year, Quarter, Month, Day.Relationships
• One-to-Many: Each dimension record can link to many fact records.
Example: One customer can place many orders.
• In Power BI, these relationships are defined in the Model view.
Why use Star Schema?
• Performance: Queries run faster compared to other types of schemas
• Simplicity: Easy for analysts and business users to understand.
• Flexibility: Supports hierarchies, drill-downs, and DAX measures in Power BI.
• Best Practice: Microsoft recommends star schema for Power BI semantic models.
2. Snowflake Schema
A Snowflake Schema is a type of data model where dimension tables are normalized into multiple related tables.
• Instead of having one big dimension table (like in a star schema), you split it into smaller tables to remove redundancy.
• The structure looks like a ❄️ snowflake because dimensions branch out into sub-dimensions.
Structure
• Fact Table: Holds transactional/numeric data (e.g., Sales: OrderID, ProductID, CustomerID, DateID, Revenue).
• Dimension Tables: Broken down into multiple related tables.
Example:
• Customer → City → Region → Country
• Product → Category → Department
Fact Table: Sales
• OrderID
• CustomerID
• ProductID
• DateID
• Quantity, Revenue
Dimension Tables (normalized)
• Customer (CustomerID, Name, CityID)
• City (CityID, CityName, RegionID)
• Region (RegionID, RegionName, CountryID)
• Country (CountryID, CountryName)
Here, instead of storing Customer → City → Region → Country in one table, you break it into multiple linked tables.
Advantages
• Saves storage space (no repeated values like “Kenya” in every customer row).
• Reduces redundancy and improves data integrity.
• Useful for very large datasets with complex hierarchies.
Disadvantages
• More complex to design and maintain.
• Queries are slower because they must join across many tables.
• Harder for business users to understand compared to star schema.
Relationships
• Still based on one-to-many relationships.
• But dimensions are linked through multiple levels (Customer → City → Region → Country).
• In Power BI, this means more joins and more complex relationships in the model view.
Good data modeling is critical because it directly impacts Power BI’s performance, accuracy, and scalability. A well-structured model ensures fast queries, correct calculations, and intuitive reporting, while poor modeling leads to slow dashboards, incorrect insights and wasted memory.
Summary: Importance of Good Data Modeling in Power BI
• Performance Optimization:
A well-designed model (especially using a star schema) reduces the number of joins, improves query speed, and minimizes memory usage—making dashboards faster and more responsive.
• Accurate Reporting:
Clean relationships between fact and dimension tables ensure correct aggregations, filters, and slicers. This prevents misleading totals and broken visuals.
• Scalability and Maintainability:
Structured models are easier to update, extend, and troubleshoot. They support reusable DAX measures and consistent logic across reports.
• User-Friendly Design:
A clear schema makes it easier for analysts and stakeholders to understand the data, build reports, and explore insights confidently.
• Best Practices Alignment:
Microsoft recommends using star schemas, one-to-many relationships, and dedicated date tables to ensure optimal performance and functionality.
In short: Good modeling is the foundation of reliable, fast, and scalable analytics. It turns raw data into trusted insights.


Top comments (2)
Well explained—keep up the good work!
Thank you