DEV Community

Karen Langat
Karen Langat

Posted on

Mastering SQL Joins and Window Functions

Introduction

SQL joins are used to combine rows from two or more tables based on related columns, while window functions perform calculations across a set of related rows within a single result set without collapsing individual rows.

In this article, we'll break down these two in a practical way showing their use in analytics, backend systems and data projects.

SQL Joins

A JOIN combines rows from two or more tables based on a related column between them.They help in:

  • Retrieving connected data stored across multiple tables.
  • Matching records using common columns.
  • Improving data analysis by combining related information.
  • Creating meaningful result sets from separate tables.

We'll use the following 2 tables:
Customers
| customer_id | name |
| ----------- | ----- |
| 1 | Alice |
| 2 | Bob |
| 3 | Carol |

Orders
| order_id | customer_id | amount |
| -------- | ----------- | ------ |
| 101 | 1 | 250 |
| 102 | 1 | 300 |
| 103 | 2 | 150 |

1. INNER JOIN

Is used to retrieve rows where matching values exist in both tables.It helps in combining records based on a related column.
Syntax:

SELECT 
    c.customer_id,
    c.name,
    o.order_id,
    o.amount
FROM Customers c
INNER JOIN Orders o
    ON c.customer_id = o.customer_id;
Enter fullscreen mode Exit fullscreen mode

It returns only the matching rows:

Note: We can also write JOIN instead of INNER JOIN. JOIN is same as INNER JOIN.

2. LEFT JOIN (LEFT OUTER JOIN)

Returns all rows from the left table and matching rows from the right. It displays NULL values where no match exists in the right table.

It is useful when you don't want to lose data from the primary table.

SELECT 
    c.customer_id,
    c.name,
    o.order_id,
    o.amount
FROM Customers c
LEFT JOIN Orders o
    ON c.customer_id = o.customer_id;
Enter fullscreen mode Exit fullscreen mode

Now Carol appears with NULL in order columns:

3. RIGHT JOIN

Used to retrieve all rows from the right table and the matching rows from the left table.
It is less commonly used because you can usually rewrite it as a LEFT JOIN by swapping tables.

4. FULL OUTER JOIN

Returns all rows from both tables. If no match exists, NULLs fill the missing side.

SELECT 
    c.customer_id,
    c.name,
    o.order_id,
    o.amount
FROM Customers c
FULL JOIN Orders o
    ON c.customer_id = o.customer_id;
Enter fullscreen mode Exit fullscreen mode

When to use Joins

Use Joins when:

  • You need to combine related datasets
  • Your database is normalized
  • You want enriched or relational views of data

Window Functions

SQL window functions allow performing calculations across a set of rows that are related to the current row, without collapsing the result into a single value. They are commonly used for tasks like aggregates, rankings and running totals.

Basic Syntax

FUNCTION_NAME() OVER (
    PARTITION BY column
    ORDER BY column
)
Enter fullscreen mode Exit fullscreen mode

The OVER clause defines the β€œwindow” of rows for the calculation. It can:

  • PARTITION BY: It divides the data into groups using PARTITION BY.
  • ORDER BY: It specifies the order of rows within each group using ORDER BY.

We will use the Sales dataset below:
| sale_id | region | amount |
| ------- | ------ | ------ |
| 1 | East | 200 |
| 2 | East | 200 |
| 3 | East | 100 |
| 4 | West | 300 |
| 5 | West | 150 |

Types of Window Functions

1. Aggregate Window Functions

Aggregate window functions calculate aggregates over a window of rows while retaining individual rows. Common aggregate functions include:

  • SUM(): Sums values within a window.
  • AVG(): Calculates the average value within a window.
  • COUNT(): Counts the rows within a window.
  • MAX(): Returns the maximum value in the window.
  • MIN(): Returns the minimum value in the window.

Example:

SELECT
    sale_id,
    region,
    amount,
    AVG(amount) OVER (
        PARTITION BY region
    ) AS avg_region_sales
FROM Sales;
Enter fullscreen mode Exit fullscreen mode

Each Row shows its region's average:

2. Ranking Window Functions

These functions provide rankings of rows within a partition based on specific criteria.
They include:

RANK()

Used to assign ranks to rows within a group based on a specific order. If there's a tie, ranks are stripped.
Example:

SELECT
    sale_id,
    region,
    amount,
    RANK() OVER (
        PARTITION BY region
        ORDER BY amount DESC
    ) AS rank_in_region
FROM Sales;
Enter fullscreen mode Exit fullscreen mode

Returns:

DENSE RANK()

Assigns rank without skipping numbers.

SELECT
    sale_id,
    region,
    amount,
    DENSE_RANK() OVER (
        PARTITION BY region
        ORDER BY amount DESC
    ) AS dense_rank_in_region
FROM Sales;
Enter fullscreen mode Exit fullscreen mode

There is no gap in the ranking numbers:

ROW NUMBER()

Gives a unique number to each row in the result set.

SELECT
    sale_id,
    region,
    amount,
    ROW_NUMBER() OVER (
        PARTITION BY region
        ORDER BY amount DESC
    ) AS row_number_in_region
FROM Sales;
Enter fullscreen mode Exit fullscreen mode

Returns:

When using Window Functions, a few issues could come up. To prevent these:

  • Partition carefully: Without Partition by, the whole table is treated as one group.
  • Check Order by : It controls the calculation order in the window.
  • Optimize performance: Window functions can be slow on large datasets; use indexes if needed.

Conclusion

Joins and Window Functions are two powerful tools in SQL and together, they unlock advanced data analysis.

Joins allow you to bring data together from multiple tables, forming a complete and meaningful dataset. They are foundational in relational databases and essential for analytics, reporting, backend systems, and data engineering workflows.

Window Functions, on the other hand, let you perform advanced calculations across related rows without losing detail. Instead of collapsing data like GROUP BY, they preserve each row while adding powerful insights such as rankings, running totals, comparisons, and percentiles.

Top comments (0)