Introduction.
One of the most difficult aspects for a data analyst looking to become successful is learning how data from various tables relates. SQL queries, SQL Dashboards in Power BI, or designing a database, three things are seen throughout:
*Relationships
*Schemas
- Joins
The below concepts are the foundation of relational databases. Once you know them, you will find data easier to analyze.
In this article, we will discuss each concept, and provide some examples.
What is a database relationship?
A relationship is a linkage between two or more tables by the same column.
Imagine an online store with two tables:
Customers
| CustomerID | Name |
|---|---|
| 101 | Alice |
| 102 | Brian |
| 103 | Carol |
Orders
| OrderID | CustomerID | Product |
|---|---|---|
| 5001 | 101 | Laptop |
| 5002 | 101 | Mouse |
| 5003 | 103 | Keyboard |
Notice that the two tables have a column named CustomerID.
The column in that column is the person who placed the order.
If there were no relationships, each table would be considered a separate table.
Types of relationships
One-to-one relationship
One row in one table matches one and only one row in another.
One to many relationship
This is the most common relationship. This relationship is commonly leveraged in power BI models.
Many to many relationship
There are several records in one table that are associated with several records in another table.
What is database schema
Is the logical organization of a database is called a schema.
It defines:
*Tables
- Columns
- Relationships
- Constraints
Database designers build schemas before data is stored, similar to how an architect designs a building before it is built. For example the blueprint of a building.
Common Schema Types
Star Schema
The best model to use in Power BI. It consists of :
- One Fact table
- Multiple Dimension tables For example;
Products
|
Customers ---- Sales ---- Dates
|
Employees
Advantages:
- Fast performance
- Easy to understand
- Optimized for reporting
- Recommended by Microsoft
Snowflake Schema
A snowflake schema forces dimension tables to be broken into other tables.
Example
Sales
|
Products
|
Category
|
Department
Advantages
- Reduces redundancy
- Better data consistency
Disadvantages
- More joins
- More complex queries
Galaxy Schema
Also known as a Fact Constellation. It has several fact tables with dimension tables.
Example
Sales ------ Customers
|
Inventory ---- Products
This is normally used in large organization.
What Is a SQL Join?
A join is a method used to create a result set from two or more tables based on the key column of one table that matches the key column of another.
Joins are essential for being able to query data from multiple tables.
INNER JOIN
Its Only the matching records will be returned.
Example
SELECT
Customers.Name,
Orders.Product
FROM Customers
INNER JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;
Result
| Name | Product |
|---|---|
| Alice | Laptop |
| Alice | Mouse |
| Carol | Keyboard |
Brian is not included as he has no orders.
LEFT JOIN
Selects all rows from left table.
If there are matching records in the right table, they are added.
SELECT
Customers.Name,
Orders.Product
FROM Customers
LEFT JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;
Result
| Name | Product |
|---|---|
| Alice | Laptop |
| Alice | Mouse |
| Brian | NULL |
| Carol | Keyboard |
RIGHT JOIN
Returns all the rows from the right table.
If there is a mismatch on the left side then those matches are null.
FULL OUTER JOIN
Returns:
- All matching records All records that do not match between the two tables. This can be helpful to detect missing data.
CROSS JOIN
Makes all possibilities.
If one table has 10 rows and another table has 20 rows then the result contain.
A table joins with itself. Useful for:
- Employee-manager relationships
- Organizational hierarchies
- Family trees
Relationships vs Joins
Relationships and joins are different, yet appear as if they are.
| Relationships | Joins |
|---|---|
| Permanent join of tables | Temporary operations |
| Defined in the data model | Written in SQL |
| Automatically used by Power BI | Used when you query the data |
| Enhance the navigation of the model | Add data to the model during execution |
Best Practices
Use a Star Schema as much as you can
1.create relationships using unique keys.
If possible, do not use many-to-many relationships.
- Use meaningful primary keys.
3.Minimize unnecessary joins.
- Maintain dimension tables with no duplicate data.
Real-World Example
Suppose you are developing a Sales Dashboard in Power BI.
You might have:
- Sales (Fact table)
- Customers
- Products
- Date
- Regions
The relationships link the tables together and Power BI uses the relationships to sum up sales by customer, by product, by month, or by region. In SQL you would have to join the tables beforehand and then work out your results.
Conclusion
The ability to understand relationships, schemas and joins are some of the most important skills for any Data Analyst. Relationships are the way that tables are related to each other, schemas provide a structure for the entire database, and joins allow to retrieve related data from other tables.
Understanding these concepts will help you design more effective SQL queries, create efficient Power BI data models, and design scalable databases.

Top comments (1)
This awesome well....ellaborated