DEV Community

Fanuel Awidhi
Fanuel Awidhi

Posted on

From Tables to Insights: Understanding Data Modelling, Joins, Relationships and Data Schemas

Data Modelling is both an art and science because we are constantly trying to shape data so that the insights gained are as precise as possible.

Data models are the building blocks of visualization and reporting.

A data model consists of one or more tables and several relationships assuming that more than just one or two tables exists.

It uses diagrams, symbols and textual definitions to visually represent how data is captured, stored and used.

To create efficient measures,we hone the data model, emphasizing table and relationship management.

Core Components

  • Fact Table- Contains the numbers intended for calculations(e.g, Revenue, Yield, Quantity)

  • Dimension Table- Contains the labels or attributes used to slice, group and filter those numbers(e.g, Date, County, Crop Type)

Understanding Relationships in Data Modelling

A relationship is the connection between entities in a data model, which in turn reflect business rules.

It connects two tables using matching columns.

Relationship between entities can be either one-to-one, one-to-many and many-to-many.
Key Relationship Terms

  • Primary Key- Uniquely identifies each row in a dimension table
  • Foreign Key- It is the matching key in another table.

Cardinality
Cardinality describes how rows in one table relate to other rows in another table.

The main relationship types are:

One-to-many
It is the most common and recommended relationship type in Power BI.
It mean one row exists on the "one" side(usually a dimension table), while many rows exists on the "many" side(usually a fact table).
It may also be displayed as many-to-one, depending on which table is selected first.

Many-to-many
This relationship occurs when both tables contain multiple rows for the same key.
In many cases, many-to-many relationships are a signal that a data model needs to be restructured, often by introducing a proper dimension table or adjusting granularity.

One-to-one
This relationship means each row in one table matches exactly one row in the other table.

Schemas

A schema is the way tables are organized in a database or reporting model.
In Power BI, it helps us organize our data model so that reports are easier, faster and more accurate.

Types of Schemas

Star Schema- Is the most widely recommended and optimal design in Power BI. The center of the star is a Fact Tables which contains measurable, quantitative data, for example, sales revenue, quantity sold, or transaction amounts. The fact tables contain foreign keys that link to dimension tables. Dimension tables form points of the star. They provide descriptive context for the facts. Dimension tables contain primary keys and are used to filter and group data in your reports.

Snowflake Schema- Is an extension of the star schema. In a snowflake schema, dimension tables are normalized, which means they are broken down into further sub-dimension tables. It reduces data redundancy but requires more complex relationships between tables, which can decrease query performance in Power BI.

Joins

A join is setting up a relationship between two or more table to pull data.

The data is commonly mapped together using a primary key, foreign key or a combination, which is referred to as a composite key.

Join Types

Inner Join- Only matching rows are visible.

Left Outer Join- All items in the first table appear, but only matching items from the second.

Right Outer Join- All items in the second table appear, but only matchin items from the first.

Full Outer Join- All rows appear.

Left Anti Join- Returns all rows from the first table where a match in the second table does not exist.

Right Anti Join- Returns all rows from the second table where a match in the first table does not exist.

Top comments (0)