Isika Millicent

Posted on Mar 9 • Edited on Mar 12

Mastering SQL Joins and Window Functions with Real Examples

#sql #tutorial #learning

If you have been writing SQL for a while, you have probably come across joins and window functions. These two features appear in a large percentage of real-world SQL queries, and for good reason. They are among the most powerful tools SQL has to offer.

However, they can be confusing at first. Understanding how tables connect with joins and how window functions analyze data across rows is a key step in becoming comfortable with SQL.

In this article, we will walk through the most common types of joins using a simple example and explain how they work.

1. What are Joins?

In SQL, joins are used to combine rows from two or more tables based on a related column between them.
Think of it as sliding two spreadsheets together so that matching values line up, the join type determines what you do with rows that have no partner on the other side.

To understand joins better, we will use the following two tables.

In the Customers table, customer_id is the primary key.
In the Orders table, order_id is the primary key, while customer_id acts as a foreign key that references the Customers table.

Table 1: Customers Table

| customer_id | customer_name |
|------------|---------------|
| 1          | Alice         |
| 2          | Bob           |
| 3          | Charlie       |
| 4          | Diana         |

Table 2: Orders Table

| order_id | customer_id | total_amount |
|----------|------------|--------------|
| 101      | 1          | 250          |
| 102      | 2          | 150          |
| 103      | 1          | 300          |
| 104      | 3          | 200          |
| 105      | 5          | 100          |

NB:

Customer 4 (Diana) has no orders
Order 105 belongs to customer_id 5, which does not exist in Customers.

A) INNER Join

Returns rows where there is a match in both tables.
If there’s no match, the result set will not include those records.

SELECT c.customer_name, o.order_id, o.total_amount FROM Customers c INNER JOIN Orders o ON c.customer_id = o.customer_id;

Result: The above query returns only customers who actually have orders.

customer_name   order_id    total_amount
Alice             101          250
Alice             103          300
Bob               102          150
Charlie           104          200

NB
Notice that Diana (no orders) and Order 105 (no matching customer) are both absent. Only rows with a valid match on both sides make it through.

This is the most common join.
It is mostly when you want records that exist in both tables eg orders with valid customers, students with enrolled courses.

B. LEFT(OUTER) Join

This join returns all records from the left table and the matched records from the right table.
If there’s no match, NULL values are returned for the right table’s columns.

In this case, the left table is the customers table and on the right is the orders table.

SELECT c.customer_name, o.order_id, o.total_amount FROM Customers c LEFT JOIN Orders o ON c.customer_id = o.customer_id;

Result: The above query returns all customers even those with no orders.

customer_name   order_id   total_amount
Alice              101        250
Alice              103        300
Bob                102        150
Charlie            104        200
Diana              NULL       NULL

Diana now appears in the results, even though she has no orders — her order columns simply return NULL.
Left Join is the foundation of the anti-join pattern — filtering to WHERE o.order_id IS NULL gives you only the customers who have never placed an order.

B.i) Anti Join
An Anti Join is used when you want to find records in one table that do not have a corresponding match in another table.

For Example:

Find customers who haven’t purchased anything
Identify products with no sales

Using the above table, LEFT JOIN + WHERE NULL
SELECT c.customer_id, c.customer_name FROM Customers c LEFT JOIN Orders o ON c.customer_id = o.customer_id WHERE o.order_id IS NULL;

Result: For customers without matching orders, order_idreturns NULL.
The WHERE o.order_id IS NULL condition filters these unmatched rows.
This is the anti join pattern.

customer_id    customer_name
4                 Diana

C) RIGHT Join

Returns all rows from the right table and matching rows from the left table.

SELECT c.customer_name, o.order_id, o.total_amount FROM Customers c RIGHT JOIN Orders o ON c.customer_id = o.customer_id;

Result: The above query returns all orders, even those without matching customers.

customer_name   order_id   total_amount
Alice             101         250
Alice             103         300
Bob               102         150
Charlie           104         200
NULL              105         100

d) FULL OUTER Join

Returns all rows when there is a match in one of the tables.

SELECT c.customer_name, o.order_id, o.total_amount FROM Customers c FULL OUTER JOIN Orders o ON c.customer_id = o.customer_id;

Result: Includes all customers and all orders, matching where possible. Combine both LEFT + RIGHT and everything shown

customer_name   order_id    total_amount
Alice             101           250
Alice             103           300
Bob               102           150
Charlie           104           200
Diana             NULL          NULL
NULL              105          100

e) SELF Join

A Self Join is when a table is joined with itself.

Instead of joining two different tables, you treat the same table as if it were two separate tables by using table aliases.

Self joins are commonly used when a table contains hierarchical or related data within itself, such as:

employees and managers
categories and subcategories

Example Table: Employees

employee_id     employee_name    manager_id
1                  Alice           NULL
2                  Bob             1
3                  Charlie         1
4                  Diana           2

Alice is the top manager (no manager).
Bob and Charlie report to Alice.
Diana reports to Bob.

SELECT e.employee_name AS employee, m.employee_name AS manager FROM employees e LEFT JOIN employees m ON e.manager_id = m.employee_id;

Result: The above query essentially asks: "For each employee, find the person whose employee_id matches their manager_id."

employee     manager
Alice         NULL
Bob           Alice
Charlie       Alice
Diana         Bob

Let me explain:

The employees table is referenced twice in the query.
e represents the employee.
m represents the manager.
We match manager_id from one instance of the table to employee_id in the other.

2. Window Functions

While joins help you combine data across tables, window functions help you analyze data across rows — without collapsing those rows into a single result the way GROUP BY does.

This is the key distinction: GROUP BY aggregates rows and destroys their individual identity. A window function performs the same kind of calculation but preserves every row, attaching the result as a new column alongside the original data.

The syntax uses OVER()

The OVER() clause is what transforms an ordinary aggregate into a window function. An empty OVER() applies the function across the entire result set.
PARTITION BY subdivides that set into groups, restarting the calculation for each one.

a) Ranking Functions (ROW_NUMBER, RANK, DENSE_RANK)

Suppose you want to rank each order by its total_amount — highest to lowest.
Three ranking functions can do this, and their difference only shows up when values tie.

SELECT o.order_id, c.customer_name, o.total_amount,
    -- Always unique; no ties are possible (1, 2, 3, 4 …)
ROW_NUMBER() OVER (ORDER BY o.total_amount DESC) AS row_num,
    -- Tied rows share a rank; the next rank skips (1, 1, 3 …)
RANK() OVER (ORDER BY o.total_amount DESC) AS rnk,
    -- Tied rows share a rank; no ranks are skipped (1, 1, 2 …)
DENSE_RANK() OVER (ORDER BY o.total_amount DESC) AS dense_rnk
FROM Orders o
INNER JOIN Customers c ON c.customer_id = o.customer_id;

Result:

order_id	customer_name	total_amount	row_num	rnk	dense_rnk
103	Alice	300	1	1	1
101	Alice	250	2	2	2
104	Charlie	200	3	3	3
102	Bob	150	4	4	4

NB:

Order 105 does not appear because it has no matching customer — the INNER JOIN filters it out.

b) Aggregates Over a Window — SUM, AVG, COUNT

One of the most powerful things window functions enable is comparing each row against an overall or group-level aggregate — without GROUP BY collapsing your data.

Here, we calculate the total revenue across all orders and each order's percentage share of that total:

SELECT o.order_id, c.customer_name, o.total_amount,    SUM(o.total_amount) OVER () AS grand_total,
FROM Orders o
INNER JOIN Customers c ON c.customer_id = o.customer_id;

Result:

order_id	customer_name	total_amount	grand_total
101	Alice	250	900
102	Bob	150	900
103	Alice	300	900
104	Charlie	200	900

Notice that grand_total is the same on every row.
The OVER() means "look across the entire result set."
Each row keeps its own identity while also having access to the overall total.

c) LAG and LEAD — Comparing Orders Across Rows

LAG looks at the previous row's value and LEAD looks at the next row's value.
Here, we can use LAG to show how much each of Alice's orders changed compared to her previous one.

SELECT o.order_id, c.customer_name, o.total_amount,
    -- We are now looking for the previous order's amount for the same customer
    -- o.total_amount is the column we want to “look back” at.
    -- the` 1 `returns how many rows back we want to look (1 row back in this case)
LAG(o.total_amount, 1) OVER (PARTITION BY o.customer_id
        ORDER BY o.order_id) AS prev_order_amount,
    -- Difference from the previous order using minus sign
 o.total_amount - LAG(o.total_amount, 1) 
       OVER (PARTITION BY o.customer_id
       ORDER BY o.order_id) AS change_from_prev

FROM Orders o
INNER JOIN Customers c ON c.customer_id = o.customer_id
ORDER BY c.customer_name, o.order_id;

Result:

order_id	customer_name	total_amount	prev_order_amount	change_from_prev
101	Alice	250	NULL	NULL
103	Alice	300	250	50
102	Bob	150	NULL	NULL
104	Charlie	200	NULL	NULL

Alice's second order (103) was 50 more than her first (101). Bob and - Charlie each only have one order, so their prev_order_amount returns NULL — there is no previous row to look back at.

d) NTILE — Bucketing Customers by Spend

NTILE(n) divides rows into n roughly equal buckets and assigns each row a bucket number. It is perfect for segmenting customers by how much they have spent.

Since Diana has no orders, we first join and aggregate, then apply NTILE to rank the customers who do have orders:

SELECT c.customer_name, SUM(o.total_amount)                               AS total_spent,
NTILE(3) OVER (ORDER BY SUM(o.total_amount) DESC) AS spend_tier,
CASE NTILE(3) OVER (ORDER BY SUM(o.total_amount) DESC)
WHEN 1 THEN 'High Spender'
WHEN 2 THEN 'Mid Spender'
ELSE 'Low Spender'
END AS segment

FROM Orders o
INNER JOIN Customers c ON c.customer_id = o.customer_id
GROUP BY c.customer_name
ORDER BY total_spent DESC;

Result:

customer_name	total_spent	spend_tier	segment
Alice	550	1	High Spender
Charlie	200	2	Mid Spender
Bob	150	3	Low Spender

*NB
What NTILE(3) does is divide the results into 3 equal groups (tiles).

Alice, with a combined spend of 550, lands in Tier 1. Bob and Charlie each have a single order and are bucketed accordingly. Diana does not appear because she has no orders in the Orders table.

In conclusion, as mentioned earlier, these two concepts require a lot of practice to master. The best way to build mastery is to work with a dataset you already have and understand or a spreadsheet you have exported and deliberately write one query using each join type and one query using each of the window functions covered here.

DEV Community

Mastering SQL Joins and Window Functions with Real Examples

1. What are Joins?

A) INNER Join

B. LEFT(OUTER) Join

C) RIGHT Join

d) FULL OUTER Join

e) SELF Join

2. Window Functions

a) Ranking Functions (ROW_NUMBER, RANK, DENSE_RANK)

b) Aggregates Over a Window — SUM, AVG, COUNT

c) LAG and LEAD — Comparing Orders Across Rows

d) NTILE — Bucketing Customers by Spend

Top comments (0)