DEV Community

Mohamed Idris
Mohamed Idris

Posted on

Part 14: Window Functions (Ninja Mode)

This post is part of the SQL: Zero to Ninja series.

Your boss asks: "Show me each user's orders, and next to every order put its rank, like their 1st order, 2nd order, 3rd, by total." You reach for GROUP BY... and hit a wall. GROUP BY squashes all of a user's orders into one row. But you wanted to keep every order AND add a rank. You need both. That is exactly what window functions do, and they are the move that turns you into a SQL ninja.

The idea in one line

A window function computes a value across a group of rows (a "window") while keeping every single row, unlike GROUP BY, which collapses them into one.

The metaphor: the bakery queue

You are standing in line at the bakery. A little screen says you are customer number 4. You can see your position in the line. But here is the thing: you are still you, still standing there as yourself. You did not get blended into "the line." You kept your identity and also got a number.

That is a window function. Every row stays itself, and you bolt on extra info computed from the rows around it (your position, the running total, the person before you).

GROUP BY is the opposite. It is like the baker saying "I do not care about individuals, there are 12 people total." Twelve people become one number. Useful sometimes, but you lost everybody.

The shape: OVER (PARTITION BY ... ORDER BY ...)

A window function always has an OVER (...) clause. That is what makes it a window function.

SELECT
  name,
  total,
  ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY total DESC) AS rank_in_user
FROM orders;
Enter fullscreen mode Exit fullscreen mode

Read OVER (...) in plain English:

PARTITION BY user_id   -->  split rows into one queue per user
ORDER BY total DESC     -->  inside each queue, line them up biggest first
ROW_NUMBER()            -->  hand each row its position number
Enter fullscreen mode Exit fullscreen mode

PARTITION BY is like having a separate bakery line per user. ORDER BY decides who stands where. No PARTITION BY? Then it is one big single line for the whole table.

Numbering rows: ROW_NUMBER, RANK, DENSE_RANK

These three all number rows, but they treat ties differently. Say two orders have the same total of 100.

total   ROW_NUMBER   RANK   DENSE_RANK
 150        1          1         1
 100        2          2         2
 100        3          2         2
  80        4          4         3
Enter fullscreen mode Exit fullscreen mode
  • ROW_NUMBER(): always 1, 2, 3, 4. No ties allowed, it just picks an order. Great for "give me exactly one row each."
  • RANK(): ties share a number, then it skips ahead (1, 2, 2, 4). Like the Olympics, two silvers and no bronze.
  • DENSE_RANK(): ties share a number, but it does not skip (1, 2, 2, 3). No gap.

Pick ROW_NUMBER when you need a unique number, RANK/DENSE_RANK when ties should genuinely tie.

Running totals with SUM() OVER

Add ORDER BY inside the window and SUM becomes a running total (each row adds itself to everything before it):

SELECT
  id,
  total,
  SUM(total) OVER (ORDER BY created_at) AS running_total
FROM orders;
Enter fullscreen mode Exit fullscreen mode
id  total   running_total
 1    50         50
 2    30         80     (50 + 30)
 3    20        100     (80 + 20)
Enter fullscreen mode Exit fullscreen mode

Every row is still here, and now each one knows the total so far. Try doing that with GROUP BY. You cannot, because GROUP BY would have crushed these into one row.

Peeking at neighbors: LAG and LEAD

LAG() looks at the previous row, LEAD() looks at the next one. Perfect for "compare this order to the user's last order."

SELECT
  user_id,
  total,
  LAG(total) OVER (PARTITION BY user_id ORDER BY created_at) AS previous_total
FROM orders;
Enter fullscreen mode Exit fullscreen mode
user_id  total   previous_total
   7       50         NULL    (first order, nobody before)
   7       80         50      (look back one row)
   7       60         80
Enter fullscreen mode Exit fullscreen mode

Now subtract total - previous_total and you instantly see if a user is spending more or less than last time.

The killer real case: top 3 orders per user

Here is the request from the top of the post: rank each user's orders, biggest first. ROW_NUMBER with PARTITION BY nails it:

SELECT
  user_id,
  total,
  ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY total DESC) AS rn
FROM orders;
Enter fullscreen mode Exit fullscreen mode

Same trick ranks products by price inside each category:

SELECT
  category,
  name,
  price,
  ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank
FROM products;
Enter fullscreen mode Exit fullscreen mode

Notice the difference from GROUP BY:

GROUP BY category       -->  ONE row per category (you lose the products)
ROW_NUMBER() OVER (...) -->  EVERY product stays, plus a rank column
Enter fullscreen mode Exit fullscreen mode

That is the whole point. The window keeps everybody and adds a column.

Gotcha: you cannot filter a window result in WHERE

This is the big one. Window functions run after WHERE. So you cannot say WHERE price_rank <= 3, the rank does not exist yet when WHERE runs.

-- WRONG: price_rank is not known during WHERE
SELECT category, name,
  ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank
FROM products
WHERE price_rank <= 3;   -- error!
Enter fullscreen mode Exit fullscreen mode

The fix is to wrap it in a subquery or a CTE (remember Part 10?), then filter on the outside:

-- RIGHT: compute the rank inside, filter it outside
WITH ranked AS (
  SELECT category, name,
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank
  FROM products
)
SELECT category, name
FROM ranked
WHERE price_rank <= 3;   -- now it works, the column exists out here
Enter fullscreen mode Exit fullscreen mode

That CTE wrapper is the standard "top N per group" pattern. Memorize it.

Gotchas juniors hit

  1. Filtering on a window in WHERE. It is not there yet. Wrap it in a CTE or subquery and filter outside.
  2. Forgetting PARTITION BY. Without it, your "rank per user" becomes "rank across the whole table." The partition is what resets the count for each group.
  3. Mixing up RANK and DENSE_RANK on ties. RANK leaves gaps (1, 2, 2, 4), DENSE_RANK does not (1, 2, 2, 3). Pick on purpose.

Recap

  • A window function keeps every row and adds a value computed from nearby rows. GROUP BY collapses rows; windows do not.
  • Shape: function OVER (PARTITION BY ... ORDER BY ...).
  • ROW_NUMBER, RANK, DENSE_RANK number rows and differ on ties. SUM() OVER (ORDER BY ...) gives running totals. LAG/LEAD peek at neighbors.
  • "Top N per group" = ROW_NUMBER() inside a CTE, then filter in the outer WHERE.
  • Windows run after WHERE, so filter their results in a wrapper, not in WHERE.

Your turn

Write a query that gives each user only their single most expensive order (use ROW_NUMBER() and a CTE wrapper). Which column do you PARTITION BY, and what do you keep where rn = 1? Explain it to a friend and the ninja headband is yours.


And that is the series. You went from "what even is a database" all the way to window functions, the stuff a lot of working devs still find scary. You can query, filter, join, group, change data safely, model relationships, use subqueries and CTEs, index for speed, wrap things in transactions, block SQL injection, and now slice data with windows. That is a real, job-ready SQL toolkit.

So go build something. A tiny dashboard, a leaderboard, a "top products" report, anything that makes you write real queries against real data. That is how it sticks. Want to revisit a topic or share the series with a friend? Head back to the README for the full roadmap. Congratulations, ninja. You finished.

Top comments (1)

Collapse
 
edriso profile image
Mohamed Idris

Part 14 Practice: Wield the Windows

The final round. These use the shared schema (users, orders, products, order_items). Try each before peeking. Remember the golden rule: you cannot filter a window result in WHERE, so wrap it in a CTE when you need to.


1. Number each user's orders from oldest to newest

Show user_id, total, and a column order_seq that counts 1, 2, 3 within each user, oldest first.

Solution

SELECT
  user_id,
  total,
  ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at) AS order_seq
FROM orders;
Enter fullscreen mode Exit fullscreen mode

Why: PARTITION BY user_id gives each user their own counting line, and ORDER BY created_at decides the order. ROW_NUMBER() hands out 1, 2, 3 and never ties.


2. Rank products by price inside each category

Show category, name, price, and a price_rank where the most expensive product in each category is rank 1. Use a rank that lets ties share a number with no gaps.

Solution

SELECT
  category,
  name,
  price,
  DENSE_RANK() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank
FROM products;
Enter fullscreen mode Exit fullscreen mode

Why: DENSE_RANK() lets two equally priced products both be rank 1, and the next one is rank 2 (no skipped number). Swap in RANK() if you want gaps after ties.


3. Build a running total of order revenue over time

Show each order's id, total, and a running_total that adds up all order totals up to and including that order, oldest first.

Solution

SELECT
  id,
  total,
  SUM(total) OVER (ORDER BY created_at) AS running_total
FROM orders;
Enter fullscreen mode Exit fullscreen mode

Why: SUM(...) OVER (ORDER BY created_at) turns a plain sum into a running total. Each row keeps itself and also shows the total so far. No PARTITION BY here, so it runs across the whole table as one line.


4. Compare each order to the user's previous order

For each order show user_id, total, the user's previous_total, and the difference.

Solution

SELECT
  user_id,
  total,
  LAG(total) OVER (PARTITION BY user_id ORDER BY created_at) AS previous_total,
  total - LAG(total) OVER (PARTITION BY user_id ORDER BY created_at) AS difference
FROM orders;
Enter fullscreen mode Exit fullscreen mode

Why: LAG() peeks one row back inside each user's line. The first order has no previous row, so previous_total (and the difference) is NULL.


5. Top 3 orders per user (the CTE wrapper)

List each user's three biggest orders by total. You must filter on the rank, so wrap it.

Solution

WITH ranked AS (
  SELECT
    user_id,
    total,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY total DESC) AS rn
  FROM orders
)
SELECT user_id, total
FROM ranked
WHERE rn <= 3;
Enter fullscreen mode Exit fullscreen mode

Why: You cannot put WHERE rn <= 3 in the same query that creates rn, because windows run after WHERE. The CTE computes the rank first, then the outer query filters it. This is the standard top-N-per-group pattern.


6. Each user's single most expensive order

Return one row per user: their user_id and the total of their priciest order.

Solution

WITH ranked AS (
  SELECT
    user_id,
    total,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY total DESC) AS rn
  FROM orders
)
SELECT user_id, total
FROM ranked
WHERE rn = 1;
Enter fullscreen mode Exit fullscreen mode

Why: Same wrapper trick, but you keep only rn = 1. ROW_NUMBER() guarantees exactly one row at the top per user, even if two orders tie on total.


You did it. You just used the tools that separate beginners from confident SQL devs: ranking, running totals, peeking at neighbors, and top-N-per-group. That is the whole series wrapped up.

Now go build something real and put these queries to work. A leaderboard, a sales report, a "your last order vs this one" feature, pick anything that excites you. Want the full map of everything you learned? It is all in the README. Congratulations, ninja. You finished the series.