DEV Community

Isika Millicent
Isika Millicent

Posted on

Mastering SQL Joins and Window Functions with Real Examples

If you have been writing SQL for a while, you have probably come across joins and window functions. These two features appear in a large percentage of real-world SQL queries, and for good reason. They are among the most powerful tools SQL has to offer.

However, they can be confusing at first. Understanding how tables connect with joins and how window functions analyze data across rows is a key step in becoming comfortable with SQL.

In this article, we will walk through the most common types of joins using a simple example and explain how they work.

1. What are Joins?

In SQL, joins are used to combine rows from two or more tables based on a related column between them.
Think of it as sliding two spreadsheets together so that matching values line up, the join type determines what you do with rows that have no partner on the other side.

To understand joins better, we will use the following two tables.

In the Customers table, customer_id is the primary key.
In the Orders table, order_id is the primary key, while customer_id acts as a foreign key that references the Customers table.

Table 1: Customers Table

| customer_id | customer_name |
|------------|---------------|
| 1          | Alice         |
| 2          | Bob           |
| 3          | Charlie       |
| 4          | Diana         |
Enter fullscreen mode Exit fullscreen mode

Table 2: Orders Table

| order_id | customer_id | total_amount |
|----------|------------|--------------|
| 101      | 1          | 250          |
| 102      | 2          | 150          |
| 103      | 1          | 300          |
| 104      | 3          | 200          |
| 105      | 5          | 100          |
Enter fullscreen mode Exit fullscreen mode

NB:

  • Customer 4 (Diana) has no orders
  • Order 105 belongs to customer_id 5, which does not exist in Customers.

A) INNER Join

Returns rows where there is a match in both tables.
If there’s no match, the result set will not include those records.

SELECT c.customer_name, o.order_id, o.total_amount
FROM Customers c
INNER JOIN Orders o
ON c.customer_id = o.customer_id;

Result: The above query returns only customers who actually have orders.

customer_name   order_id    total_amount
Alice             101          250
Alice             103          300
Bob               102          150
Charlie           104          200
Enter fullscreen mode Exit fullscreen mode

NB
Notice that Diana (no orders) and Order 105 (no matching customer) are both absent. Only rows with a valid match on both sides make it through.

Inner Join

This is the most common join.
It is mostly when you want records that exist in both tables eg orders with valid customers, students with enrolled courses.

B. LEFT(OUTER) Join

This join returns all records from the left table and the matched records from the right table.
If there’s no match, NULL values are returned for the right table’s columns.

In this case, the left table is the customers table and on the right is the orders table.

SELECT c.customer_name, o.order_id, o.total_amount
FROM Customers c
LEFT JOIN Orders o
ON c.customer_id = o.customer_id;

Result: The above query returns all customers even those with no orders.

customer_name   order_id   total_amount
Alice              101        250
Alice              103        300
Bob                102        150
Charlie            104        200
Diana              NULL       NULL
Enter fullscreen mode Exit fullscreen mode

NB

  • Diana now appears in the results, even though she has no orders — her order columns simply return NULL.
  • Left Join is the foundation of the anti-join pattern — filtering to WHERE o.order_id IS NULL gives you only the customers who have never placed an order.

B.i) Anti Join
An Anti Join is used when you want to find records in one table that do not have a corresponding match in another table.

For Example:

  • Find customers who haven’t purchased anything
  • Identify products with no sales

Using the above table, LEFT JOIN + WHERE NULL
SELECT c.customer_id, c.customer_name
FROM Customers c
LEFT JOIN Orders o
ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;

Result: For customers without matching orders, order_idreturns NULL.
The WHERE o.order_id IS NULL condition filters these unmatched rows.
This is the anti join pattern.

customer_id    customer_name
4                 Diana
Enter fullscreen mode Exit fullscreen mode

C) RIGHT Join

Returns all rows from the right table and matching rows from the left table.

SELECT c.customer_name, o.order_id, o.total_amount
FROM Customers c
RIGHT JOIN Orders o
ON c.customer_id = o.customer_id;

Result: The above query returns all orders, even those without matching customers.

customer_name   order_id   total_amount
Alice             101         250
Alice             103         300
Bob               102         150
Charlie           104         200
NULL              105         100
Enter fullscreen mode Exit fullscreen mode

d) FULL OUTER Join

Returns all rows when there is a match in one of the tables.

SELECT c.customer_name, o.order_id, o.total_amount
FROM Customers c
FULL OUTER JOIN Orders o
ON c.customer_id = o.customer_id;

Result: Includes all customers and all orders, matching where possible. Combine both LEFT + RIGHT and everything shown

customer_name   order_id    total_amount
Alice             101           250
Alice             103           300
Bob               102           150
Charlie           104           200
Diana             NULL          NULL
NULL              105          100
Enter fullscreen mode Exit fullscreen mode

e) SELF Join

A Self Join is when a table is joined with itself.

Instead of joining two different tables, you treat the same table as if it were two separate tables by using table aliases.

Self joins are commonly used when a table contains hierarchical or related data within itself, such as:

  • employees and managers
  • categories and subcategories

Example Table: Employees

employee_id     employee_name    manager_id
1                  Alice           NULL
2                  Bob             1
3                  Charlie         1
4                  Diana           2
Enter fullscreen mode Exit fullscreen mode

NB

  • Alice is the top manager (no manager).
  • Bob and Charlie report to Alice.
  • Diana reports to Bob.

SELECT
e.employee_name AS employee,
m.employee_name AS manager
FROM employees e
LEFT JOIN employees m
ON e.manager_id = m.employee_id;

Result: The above query essentially asks: "For each employee, find the person whose employee_id matches their manager_id."

employee     manager
Alice         NULL
Bob           Alice
Charlie       Alice
Diana         Bob
Enter fullscreen mode Exit fullscreen mode

Let me explain:

  • The employees table is referenced twice in the query.
  • e represents the employee.
  • m represents the manager.
  • We match manager_id from one instance of the table to employee_id in the other.

2. Window Functions

While joins help you combine data across tables, window functions help you analyze data across rows — without collapsing those rows into a single result the way GROUP BY does.

This is the key distinction: GROUP BY aggregates rows and destroys their individual identity. A window function performs the same kind of calculation but preserves every row, attaching the result as a new column alongside the original data.

The syntax uses OVER()

The OVER() clause is what transforms an ordinary aggregate into a window function. An empty OVER() applies the function across the entire result set.
PARTITION BY subdivides that set into groups, restarting the calculation for each one.

a) Ranking Functions (ROW_NUMBER, RANK, DENSE_RANK)

Suppose you want to rank each order by its total_amount — highest to lowest.
Three ranking functions can do this, and their difference only shows up when values tie.

SELECT o.order_id, c.customer_name, o.total_amount,
    -- Always unique; no ties are possible (1, 2, 3, 4 …)
ROW_NUMBER() OVER (ORDER BY o.total_amount DESC) AS row_num,
    -- Tied rows share a rank; the next rank skips (1, 1, 3 …)
RANK() OVER (ORDER BY o.total_amount DESC) AS rnk,
    -- Tied rows share a rank; no ranks are skipped (1, 1, 2 …)
DENSE_RANK() OVER (ORDER BY o.total_amount DESC) AS dense_rnk
FROM Orders o
INNER JOIN Customers c ON c.customer_id = o.customer_id;
Enter fullscreen mode Exit fullscreen mode

Result:

order_id customer_name total_amount row_num rnk dense_rnk
103 Alice 300 1 1 1
101 Alice 250 2 2 2
104 Charlie 200 3 3 3
102 Bob 150 4 4 4

NB:

  • Order 105 does not appear because it has no matching customer — the INNER JOIN filters it out.

b) Aggregates Over a Window — SUM, AVG, COUNT

One of the most powerful things window functions enable is comparing each row against an overall or group-level aggregate — without GROUP BY collapsing your data.

Here, we calculate the total revenue across all orders and each order's percentage share of that total:

SELECT o.order_id, c.customer_name, o.total_amount,    SUM(o.total_amount) OVER () AS grand_total,
FROM Orders o
INNER JOIN Customers c ON c.customer_id = o.customer_id;
Enter fullscreen mode Exit fullscreen mode

Result:

order_id customer_name total_amount grand_total
101 Alice 250 900
102 Bob 150 900
103 Alice 300 900
104 Charlie 200 900

NB

  • Notice that grand_total is the same on every row.
  • The OVER() means "look across the entire result set."
  • Each row keeps its own identity while also having access to the overall total.

c) LAG and LEAD — Comparing Orders Across Rows

LAG looks at the previous row's value and LEAD looks at the next row's value.
Here, we can use LAG to show how much each of Alice's orders changed compared to her previous one.

SELECT o.order_id, c.customer_name, o.total_amount,
    -- We are now looking for the previous order's amount for the same customer
    -- o.total_amount is the column we want to “look back” at.
    -- the` 1 `returns how many rows back we want to look (1 row back in this case)
LAG(o.total_amount, 1) OVER (PARTITION BY o.customer_id
        ORDER BY o.order_id) AS prev_order_amount,
    -- Difference from the previous order using minus sign
 o.total_amount - LAG(o.total_amount, 1) 
       OVER (PARTITION BY o.customer_id
       ORDER BY o.order_id) AS change_from_prev

FROM Orders o
INNER JOIN Customers c ON c.customer_id = o.customer_id
ORDER BY c.customer_name, o.order_id;
Enter fullscreen mode Exit fullscreen mode

Result:

order_id customer_name total_amount prev_order_amount change_from_prev
101 Alice 250 NULL NULL
103 Alice 300 250 50
102 Bob 150 NULL NULL
104 Charlie 200 NULL NULL

NB

  • Alice's second order (103) was 50 more than her first (101). Bob and - Charlie each only have one order, so their prev_order_amount returns NULL — there is no previous row to look back at.

d) NTILE — Bucketing Customers by Spend

NTILE(n) divides rows into n roughly equal buckets and assigns each row a bucket number. It is perfect for segmenting customers by how much they have spent.

Since Diana has no orders, we first join and aggregate, then apply NTILE to rank the customers who do have orders:

SELECT c.customer_name, SUM(o.total_amount)                               AS total_spent,
NTILE(3) OVER (ORDER BY SUM(o.total_amount) DESC) AS spend_tier,
CASE NTILE(3) OVER (ORDER BY SUM(o.total_amount) DESC)
WHEN 1 THEN 'High Spender'
WHEN 2 THEN 'Mid Spender'
ELSE 'Low Spender'
END AS segment

FROM Orders o
INNER JOIN Customers c ON c.customer_id = o.customer_id
GROUP BY c.customer_name
ORDER BY total_spent DESC;
Enter fullscreen mode Exit fullscreen mode

Result:

customer_name total_spent spend_tier segment
Alice 550 1 High Spender
Charlie 200 2 Mid Spender
Bob 150 3 Low Spender

*NB
What NTILE(3) does is divide the results into 3 equal groups (tiles).

Alice, with a combined spend of 550, lands in Tier 1. Bob and Charlie each have a single order and are bucketed accordingly. Diana does not appear because she has no orders in the Orders table.

In conclusion, as mentioned earlier, these two concepts require a lot of practice to master. The best way to build mastery is to work with a dataset you already have and understand or a spreadsheet you have exported and deliberately write one query using each join type and one query using each of the window functions covered here.

Top comments (0)