DEV Community

Cover image for SQL Joins & Window Functions: The Skills That Separate Analysts from Beginners
Mburu
Mburu

Posted on

SQL Joins & Window Functions: The Skills That Separate Analysts from Beginners

You have learned the basics of SQL - 'SELECT', 'WHERE', 'GROUP BY' and now you are ready to level up: to move from I know SQL to I can analyze data. I will help you understand Joins and Window Functions and with these you`ll be able to answer much more complex questions from your data.

PART 1: SQL JOINS

Imagine you have two tables:

  • One table has Customers
  • Another table has Orders

If you want to know which customer made which order,
you will need a JOIN.

What Is a JOIN?

A Join lets you combine these two tables so you can see the customer alongside their order all in one result.
In simple terms: A Join connects rows from two or more tables based on a related column between them.

Example Tables

Customers Table

customer_id name
1 Shujaa
2 Achieng
3 Brian

Orders Table

order_id customer_id amount
101 1 5000
102 1 3000
103 2 7000

INNER JOIN

This is the Most Common join and returns only the rows where there is a match in both tables.

sql
SELECT
c.name,
o.order_id,
o.amount
FROM Customers c
INNER JOIN Orders o
ON c.customer_id = o.customer_id;

What Happens?

  • Shujaa → 2 orders
  • Achieng → 1 order
  • Brian → No orders → NOT shown

Only records that match in both tables are returned.

LEFT JOIN

Returns all rows from the left table, and the matching rows from the right table. If there's no match, you get NULL on the right side.

sql
SELECT
c.name,
o.order_id,
o.amount
FROM Customers c
LEFT JOIN Orders o
ON c.customer_id = o.customer_id;

What Happens?

  • Shujaa → 2 orders
  • Achieng → 1 order
  • Brian → NULL values for order columns

LEFT JOIN keeps all customers even those who haven`t placed an order yet. Those customers will show up with NULL in the order column.

RIGHT JOIN

The opposite of LEFT JOIN. Returns all rows from the right table, and matching rows from the left.

SELECT 
    c.name,
    o.order_id,
    o.amount
FROM Customers c
RIGHT JOIN Orders o
    ON c.customer_id = o.customer_id;
Enter fullscreen mode Exit fullscreen mode

Keeps all orders even if a customer is missing.
Pro tip: RIGHT JOINs are less common. Most developers just flip the table order and use a LEFT JOIN instead: same result, easier to read.

FULL OUTER JOIN

It has give me everything vibes. Returns all rows from both tables, with NULL where there's no match on either side.

SELECT 
    c.name,
    o.order_id,
    o.amount
FROM Customers c
FULL OUTER JOIN Orders o
    ON c.customer_id = o.customer_id;
Enter fullscreen mode Exit fullscreen mode

Keeps all records from both sides.

CROSS JOIN

Every combination possible

Returns the Cartesian product every row from the left table paired with every row from the right table.

SELECT c.name,o.amount
FROM customers
CROSS JOIN amounts;
Enter fullscreen mode Exit fullscreen mode

If Table A has 100 rows and Table B has 100 rows, you get 10,000 rows back so use with caution!

Why JOINS Matter in Real Life

Joins allow you to:

  • Connect sales with products
  • Connect patients with visits
  • Connect employees with departments
  • Connect transactions with accounts

Without joins, analysis is incomplete.

PART 2: WINDOW FUNCTIONS

If JOINs combine tables, Window functions analyze data without collapsing it. A regular GROUP BY aggregation collapses your rows into one summary row per group. But what if you want the aggregated value AND the individual row at the same time?

That's exactly what Window Functions do.

What Is a Window Function?

A Window Function performs a calculation across a set of rows related to the current row without collapsing them into one.
Think of it like looking through a sliding window across your data. The window moves row by row, and for each row, it calculates something based on a defined group of surrounding rows.

Basic Example: Total Sales Per Customer

SELECT 
    customer_id,
    order_id,
    amount,
    SUM(amount) OVER (PARTITION BY customer_id) AS total_per_customer
FROM Orders;
Enter fullscreen mode Exit fullscreen mode

What Is Happening?

PARTITION BY customer_id
= Group by customer internally

But it does NOT remove rows.

Instead, it adds a new column showing:

  • Shujaa → 8000 (5000 + 3000)
  • Achieng → 7000

Each order still appears individually.

Common Window Functions

ROW_NUMBER()

Assigns a sequential number to each row within a partition.

SELECT 
    customer_id,
    order_id,
    amount,
    ROW_NUMBER() OVER (
        PARTITION BY customer_id
        ORDER BY amount DESC
    ) AS rank_per_customer
FROM Orders;
Enter fullscreen mode Exit fullscreen mode

This helps answer:

  • What is each customer’s highest purchase?
  • Which transaction came first?

RANK() and DENSE RANK()

Both rank rows, but handle ties differently:

  • RANK() — Skips numbers after a tie (1, 2, 2, 4)
  • DENSE_RANK() — Does NOT skip numbers after a tie (1, 2, 2, 3)
SELECT 
    amount,
    RANK() OVER (ORDER BY amount DESC) AS sales_rank
    DENSE RANK() OVER (ORDER BY amount DESC) AS sales_dense_rank
FROM Orders;
Enter fullscreen mode Exit fullscreen mode

If two orders have same amount → same rank.

Running Total

SELECT 
    order_id,
    amount,
    SUM(amount) OVER (
        ORDER BY order_id
    ) AS running_total
FROM Orders;
Enter fullscreen mode Exit fullscreen mode

Shows cumulative growth over time.

Very powerful for dashboards.

LAG() and LEAD()

Look backwards and forwards

  • LAG() — Gets the value from the previous row
  • LEAD() — Gets the value from the next row
SELECT
  month,
  revenue,
  LAG(revenue, 1) OVER (ORDER BY month) AS prev_month_revenue,
  revenue - LAG(revenue, 1) OVER (ORDER BY month) AS month_over_month_change
FROM sales;
Enter fullscreen mode Exit fullscreen mode

Use case: Calculate month-over-month revenue growth without a self-join.

SUM(), AVG(), COUNT() as Window Functions

Running totals and moving averages

Yes, your familiar aggregate functions can be used as window functions too!

-- Running total of sales
SELECT
  order_date,
  amount,
  SUM(amount) OVER (ORDER BY order_date) AS running_total
FROM orders;
Enter fullscreen mode Exit fullscreen mode
-- Moving average over the last 3 rows
SELECT
  order_date,
  amount,
  AVG(amount) OVER (
    ORDER BY order_date
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) AS moving_avg_3
FROM orders;
Enter fullscreen mode Exit fullscreen mode

NTILE()

Divide rows into buckets

Splits rows into N equal groups (quartiles, deciles, percentiles, etc.).

SELECT
  name,
  salary,
  NTILE(4) OVER (ORDER BY salary DESC) AS salary_quartile
FROM employees;
Enter fullscreen mode Exit fullscreen mode

Use case: Segment customers into top 25%, second 25%, etc.

Window Functions - Key Points Summary

Function What It Does
ROW_NUMBER() Unique sequential number per row
RANK() Rank with gaps after ties
DENSE_RANK() Rank without gaps after ties
LAG() Value from a previous row
LEAD() Value from a next row
SUM() OVER() Running total
AVG() OVER() Moving average
NTILE(n) Divide rows into n buckets

JOIN vs WINDOW FUNCTION

Feature JOIN Window Function
Combines tables ✅ Yes ❌ No
Keeps all rows Depends ✅ Yes
Used for ranking ✅ Yes
Used for cumulative totals ✅ Yes

They solve different problems.

Real Business Example 1

Imagine you're analyzing hospital data:

You might:

  • JOIN patients with visits
  • JOIN visits with prescriptions
  • Then use window functions to:

    • Rank top diseases
    • Calculate running monthly admissions
    • Find most expensive prescriptions per patient

This is how analysts think.

Real Business Example 2

Let's say you work at an e-commerce company and want to answer this question:

For each customer, show their total orders, their most recent order date, and rank them by total spend.

SELECT
  c.name,
  COUNT(o.id) AS total_orders,
  MAX(o.order_date) AS last_order_date,
  SUM(o.amount) AS total_spend,
  RANK() OVER (ORDER BY SUM(o.amount) DESC) AS spend_rank
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.name
ORDER BY spend_rank;
Enter fullscreen mode Exit fullscreen mode

This single query uses both a LEFT JOIN (to include customers with no orders) and a Window Function (RANK()) to rank customers by spend - all in one clean result.

Key Concepts Explained Simply

JOIN = Attach Information

-You have information in different places.
JOIN brings them together.
-Master INNER, LEFT, and FULL OUTER and you'll cover 95% of real-world use cases.

WINDOW FUNCTION = Calculations that need context from surrounding rows without losing your individual row data.

  • The OVER() clause is the heart of every window function. Learn PARTITION BY and ORDER BY inside it first.
  • LAG() and LEAD() are underrated gems. They eliminate the need for messy self-joins when comparing rows.
  • You can combine Joins and Window Functions in the same query for powerful, real-world analytics

Beginner Mistakes to Avoid

❌ Using GROUP BY when you need row-level detail
❌ Forgetting the ON condition in JOIN
❌ Mixing up PARTITION BY and GROUP BY
❌ Not ordering inside window functions when needed

What You Should Remember

  1. JOINs combine data from multiple tables.
  2. INNER JOIN returns matches only.
  3. LEFT JOIN keeps everything from the left table.
  4. Window functions calculate across rows without collapsing them.
  5. PARTITION BY groups data internally.
  6. ORDER BY inside OVER() controls ranking and running totals.
  7. As an analysts, you can use JOINs and Window Functions together in real projects.

In My Own Words

If SQL were a toolbox:

JOINs are like screws and bolts - they connect pieces.
Window functions are like measuring tools - they help you analyze the structure without breaking it apart.

If you found this helpful, try this challenge:

Challenge: Write a query that finds the highest purchase per customer using JOIN and ROW_NUMBER().

Happy querying! If this helped you, share it with a fellow developer/analyst who's just getting started with SQL.

Top comments (0)