Introduction
Data doesn't become useful the moment you collect it, rather, it becomes useful when it is well structured. The structure is what schemas and data modeling are all about. Whether you're working in Excel or PowerBi or a full data warehouse, getting to know, understanding schemas and data modeling is the foundational skill for any data analyst.
This article breaks down schemas and data modeling in an elaborate way well suited for beginers.
Now lets dive in.
A Schema
(The blueprint)
Refers to the way tables are structred and the interelation they have with eachother. in other words, schemas define the logical structure, organization and relationships of tables within a data model. Majorly, they narrow down on structuring data for optimal performance and reporting.
Data Modeling
(The Plan)
It essentially refers to the crucial process of arranging and creating an interelation between data tables to create a logical, efficient sematic model, which in turn helps in creating accurate analysis and powerful reports by defining relationships, creating DAX computations(Measures) and having organized data by using schemas within Power querry and Model view.
Lets now connect the two. Data modeling is the process of designing a blueprint for organizing data based on various business needs and requirements, while a schema is the ground and technical implimentation of that blueptint within DBMS. The interelation is a direct progression from concept to physical reality.
The data model guides the schema design - The schema is the actualization of the model.
Why Schemas Matter
Make data easier to understand
Reduce query performance
Reduce data duplication
Support scalable reporting and analysis
Types of Schemas in Power BI
1. Star Schema
Refers to a mature modeling approach widely adopted by the relational data warehouses. It needs user to classify their model tables as their dimension or fact. It is designed for high performance. In addition it consists one or more central Fact tables (With metrics/measures) sorrounded by de-normalised dimensions tables (Containing Descriptive attributes)
The strucure:
Fact Table(Center) >> Contains Numeric measures (Things you aggregate) and foreigns keys to dimensions.
Dimension Tables (Around it) >> Contains descriptive attributes and one row per entity.
Reasons why Star Schema is recomended
Faster performance
Simpler DAX
Fewer relationship
Easy to explain to stakeholders
Tips:
If it describes something>> Dimension
If it is a sum of a number>> Fact
2. Snowflake Schemas
This is a more complex, and widely normalised version of the Star schema. This refers to when dimensions are normalised into multiple tabls. This is further broken down into Sub-dimensions to reduce data redundancy. Its basically an extension of the star schema often utilised for complex, hierachical data, improving storage efficiency at the cost of Query performance due to more required joins.
Pros: Reduces duplication
Cons: Slower, More complex DAX, and harder to maintain.
NOTE: Flatten demensions where possible. Storage is cheap, Confusion is expensive.**
Comparing A Good data model Vs A Bad data model.
| Good Model | Bad Model |
|---|---|
| Multiple small dimension tables | One big table with over 50 columns |
| Simple relationships | Many to many relationships everywhere |
| No duplicates | Duplicate in columns |
Role of Power Query in data modeling.
Removes duplicates
Creates dimension tables
Geenarate surrogate keys
Clean and starndardize columns
Common Modeling mistakes and how to avoid them
| Using data columns in fact table only | Create a proper Data table |
|---|---|
| Many-to-many relationships | Introduce bridge tables |
| Too many bidirectional filters | Redesign tables |
| Calculated columns everwhere | Prefer measures |
Data modeling is the bridge between raw data and clean insights, once your model are solid, DAX becomes silple and not scary.
Schemas and Data modeling may feel abstract at first, but they are what seperates messy data from insightful and reliable analytics systsems.Mastering them will Improve your dashboards, reduce errors, and make analysis scalable.

Top comments (0)