Power BI has shown me how multifaceted and multipurpose it can be, from data cleaning all the way to data visualization.
Let's begin with some term definitions:
Joins-These physically combine columns from different tables into one. They make it possible to assemble data from multiple sources.
Modeling-This is the process of defining how business data and tables are structured and connected. It turns chaotic data into organized "business objects", using relationships and DAX (Data Analysis Expressions) to ensure fast performance and accurate reporting.
Relationships-These are logical connections between tables that allow data from multiple sources to be combined and analyzed together. They propagate filters automatically across tables, ensuring that visuals accurately reflect data based on shared keys or IDs.
Schemas-These define the structural organization of your data model, dictating how tables are connected and interact. The correct Schema influences the efficiency and performance of data queries and reports.
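To make the difference between a join and a relationship concrete, here is a minimal Python sketch. It is not how Power BI works internally; the tables, column names, and helper functions are all made up for illustration. The join copies the category column into one wide table, while the relationship leaves both tables as they are and lets a filter on one table reach the other through the shared key.

```python
# Hypothetical sample data; table and column names are illustrative only.
products = [
    {"product_id": 1, "category": "Bikes"},
    {"product_id": 2, "category": "Helmets"},
]
sales = [
    {"sale_id": 100, "product_id": 1, "amount": 250.0},
    {"sale_id": 101, "product_id": 2, "amount": 40.0},
    {"sale_id": 102, "product_id": 1, "amount": 300.0},
]


def join_tables(sales, products):
    """Join: build one wide table by copying the product column onto each sale."""
    by_id = {p["product_id"]: p for p in products}
    return [{**s, "category": by_id[s["product_id"]]["category"]} for s in sales]


def filter_through_relationship(sales, products, category):
    """Relationship: tables stay separate; a filter on products
    reaches sales through the shared product_id key."""
    keys = {p["product_id"] for p in products if p["category"] == category}
    return [s for s in sales if s["product_id"] in keys]


joined = join_tables(sales, products)
bike_sales = filter_through_relationship(sales, products, "Bikes")
bike_total = sum(s["amount"] for s in bike_sales)

print(joined[0])   # the sale row now carries a category column
print(bike_total)  # 550.0
```

Notice that after the relationship-style filter, the original sales rows still have no category column; only the joined copy does.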
Now that we have defined our terms, let's see how they work and what each one does.
I will focus on schemas because modeling, joins, and relationships are simply integral parts of them.
The two most common types of schemas are:
Star Schema
This is the gold standard for Power BI modeling. It creates a simple, "star-like" pattern in your Model view. It is the most widely recommended schema because it is the simplest style and has the widest use.
The star schema contains two types of tables:
• Fact Tables: Positioned at the center. They contain quantitative, measurable data (e.g., sales revenue, transaction quantities) along with the key columns (e.g., date, product, or customer IDs) that link each row to the dimension tables.
• Dimension Tables: Surrounding the fact table. They contain descriptive data used to filter and group your facts (e.g., product categories, customer names, regions).
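The fact and dimension layout described above can be sketched outside Power BI using Python's built-in sqlite3 module. This is only an illustration under assumed names: the tables (fact_sales, dim_product, dim_customer), their columns, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory database; all table names, columns, and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE dim_customer (
    customer_id INTEGER PRIMARY KEY, customer_name TEXT, region TEXT);
-- The fact table holds measures plus one key per dimension.
CREATE TABLE fact_sales (
    sale_id INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    quantity INTEGER,
    revenue REAL);
INSERT INTO dim_product VALUES (1, 'Road Bike', 'Bikes'), (2, 'Helmet', 'Accessories');
INSERT INTO dim_customer VALUES (10, 'Amina', 'East'), (11, 'Brian', 'West');
INSERT INTO fact_sales VALUES
    (100, 1, 10, 1, 900.0),
    (101, 2, 10, 2, 80.0),
    (102, 1, 11, 1, 950.0);
""")

# Grouping a measure by a descriptive attribute needs one join per dimension.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(revenue_by_category)  # [('Accessories', 80.0), ('Bikes', 1850.0)]
```

The numbers live in the fact table and the labels live in the dimensions, so any "revenue by something" question is a single hop from the center of the star.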
Benefits of star schemas
• Fact/dimensional models like star schemas are simple to understand and implement, and make it easy for end users to find the data they need. They can be applied to data marts and other data resources.
• Great for simple queries because of their reduced dependency on joins when accessing the data, as compared to normalized models like snowflake schemas.
• Adapt well to fit OLAP models.
• Improved query performance as compared to normalized data, because star schemas attempt to avoid computationally expensive joins.
Snowflake Schema
A snowflake schema is a logical database arrangement in a data warehouse where dimension tables are normalized and broken down into related sub-tables.
Resembling a snowflake, it places a central fact table at the core, with branched, hierarchical dimension tables radiating outward.
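As a rough sketch of what "normalized into sub-tables" means, the example below uses Python's built-in sqlite3 module with made-up tables. The product dimension no longer stores the category name itself; it points to a separate dim_category table (a hypothetical name), so reaching the category from the fact table takes two joins.

```python
import sqlite3

# In-memory database; all table names, columns, and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Sub-dimension: each category name is stored exactly once.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
-- The product dimension refers to the category by key instead of by name.
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id INTEGER REFERENCES dim_category(category_id));
CREATE TABLE fact_sales (
    sale_id INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    revenue REAL);
INSERT INTO dim_category VALUES (1, 'Bikes'), (2, 'Accessories');
INSERT INTO dim_product VALUES (1, 'Road Bike', 1), (2, 'Helmet', 2), (3, 'Gloves', 2);
INSERT INTO fact_sales VALUES (100, 1, 900.0), (101, 2, 80.0), (102, 3, 25.0);
""")

# Fact -> product -> category: two joins to group revenue by category.
snowflake_revenue = conn.execute("""
    SELECT c.category_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()

print(snowflake_revenue)  # [('Accessories', 105.0), ('Bikes', 900.0)]
```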
Characteristics of Snowflake Schema
• Requires less storage because dimension data is normalized, so repeated values are stored only once.
• Supports multiple hierarchies within a dimension.
• Performance decreases because of extra joins.
• Dimensions can be sourced from different systems and integrated cleanly.
• Good when dimension tables contain attributes at different grains.
Benefits of a Snowflake schema
• Improves data integrity through normalization.
• Reduces redundancy and storage usage.
• Supports detailed hierarchical drill-down.
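The integrity and redundancy points can be illustrated with a small Python sketch. The data and helper functions here are hypothetical and only meant to show the idea: when a category name is repeated on every product row, renaming it means touching every copy, whereas a normalized category table needs a single update.

```python
# Denormalized (star-style) product dimension: the category name repeats per row.
star_products = [
    {"product_id": 1, "name": "Helmet", "category": "Accessories"},
    {"product_id": 2, "name": "Gloves", "category": "Accessories"},
    {"product_id": 3, "name": "Bottle", "category": "Accessories"},
]

# Normalized (snowflake-style): the name is stored once and referenced by key.
categories = {1: "Accessories"}
snowflake_products = [
    {"product_id": 1, "name": "Helmet", "category_id": 1},
    {"product_id": 2, "name": "Gloves", "category_id": 1},
    {"product_id": 3, "name": "Bottle", "category_id": 1},
]


def rename_in_star(rows, old, new):
    """Rename a category in place and return how many rows had to change."""
    changed = 0
    for row in rows:
        if row["category"] == old:
            row["category"] = new
            changed += 1
    return changed


def rename_in_snowflake(categories, category_id, new):
    """Rename a category in the lookup table; always exactly one update."""
    categories[category_id] = new
    return 1


star_updates = rename_in_star(star_products, "Accessories", "Gear")
snowflake_updates = rename_in_snowflake(categories, 1, "Gear")

print(star_updates, snowflake_updates)  # 3 1
```

Missing one of the repeated copies in the denormalized version would leave two spellings of the same category, which is the kind of inconsistency normalization is meant to prevent.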
How to determine which schema to use?
• Star schema and snowflake schema differ primarily in their structure, where snowflake schema is an expanded form of star schema that further normalizes data into additional sub-dimension tables.
• Star schema utilizes a denormalized structure with fewer joins to optimize query performance and speed, while snowflake schema uses normalization to achieve greater storage efficiency and data integrity.
• You can determine which schema is right for you by comparing their uses, benefits, drawbacks, and performance in key areas like scalability and ease of use.
This is just a small overview of the subject. To learn more, grab yourself some data, sit down at your desk, get curious, and get interactive. All the best!