SQL Window Functions
SQL Window Functions are advanced SQL functions used to perform calculations across a set of rows while still keeping the individual row details.
Unlike GROUP BY, which collapses rows into a single summary row, Window Functions allow you to:
- Keep all original rows
- Add calculated values beside them
- Perform ranking, comparisons, running totals, analytics, etc.
The Problem with GROUP BY
Problem: When you use GROUP BY, rows collapse into summary data.
Suppose we want employee details along with the maximum salary in their department:
SELECT dept_name, MAX(salary)
FROM employee
GROUP BY dept_name;
Output:
| dept_name | max_salary |
|---|---|
| HR | 8000 |
| IT | 12000 |
Issue: Employee rows are lost—only grouped data remains.
The Window Functions Solution
Window Functions solve this by allowing aggregate calculations without collapsing rows. They keep all original data while adding calculated values alongside.
Basic Syntax
FUNCTION_NAME(...) OVER(
PARTITION BY column
ORDER BY column
)
Understanding the OVER() Clause
The OVER() clause defines the window (set of rows) on which the function operates.
OVER() Without PARTITION
When OVER() has no PARTITION BY, the entire table becomes one window:
SELECT
emp_name,
salary,
MAX(salary) OVER() AS max_salary
FROM employee;
Result: Every row gets the same maximum salary value.
| emp_name | salary | max_salary |
|---|---|---|
| A | 5000 | 12000 |
| B | 7000 | 12000 |
| C | 12000 | 12000 |
PARTITION BY
PARTITION BY divides rows into groups/windows. The function calculation restarts for each partition.
SELECT
emp_name,
dept_name,
salary,
MAX(salary) OVER(PARTITION BY dept_name) AS dept_max_salary
FROM employee;
Result: Employees are grouped by department, and maximum salary is calculated separately per department.
| emp_name | dept_name | salary | dept_max_salary |
|---|---|---|---|
| A | HR | 5000 | 8000 |
| B | HR | 8000 | 8000 |
| C | IT | 12000 | 12000 |
ORDER BY Inside OVER()
Defines ordering within each partition/window. This is especially important for:
- Ranking functions
- Running totals
- Previous/next row calculations
ROW_NUMBER()
Assigns a unique sequential number to each row.
Syntax:
ROW_NUMBER() OVER(
PARTITION BY column
ORDER BY column
)
Example:
SELECT
emp_name,
dept_name,
salary,
ROW_NUMBER() OVER(
PARTITION BY dept_name
ORDER BY salary DESC
) AS rn
FROM employee;
Rows are divided by department, with the highest salary getting row number 1, which resets for each department.
| emp_name | dept_name | salary | rn |
|---|---|---|---|
| B | HR | 8000 | 1 |
| A | HR | 5000 | 2 |
| C | IT | 12000 | 1 |
RANK()
Assigns ranking with gaps when ties occur.
SELECT
emp_name,
salary,
RANK() OVER(
ORDER BY salary DESC
) AS rnk
FROM employee;
| emp_name | salary | rank |
|---|---|---|
| A | 12000 | 1 |
| B | 10000 | 2 |
| C | 10000 | 2 |
| D | 8000 | 4 |
Important: Ranks skip numbers after ties (sequence: 1, 2, 2, 4).
DENSE_RANK()
Similar to RANK(), but does NOT skip numbers.
SELECT
emp_name,
salary,
DENSE_RANK() OVER(
ORDER BY salary DESC
) AS drnk
FROM employee;
| emp_name | salary | dense_rank |
|---|---|---|
| A | 12000 | 1 |
| B | 10000 | 2 |
| C | 10000 | 2 |
| D | 8000 | 3 |
Difference Between RANK and DENSE_RANK:
| Function | Output |
|---|---|
| RANK | 1, 2, 2, 4 |
| DENSE_RANK | 1, 2, 2, 3 |
LAG() - Access Previous Row
Used to retrieve data from a previous row.
Syntax:
LAG(column, offset, default_value)
OVER(ORDER BY column)
| Parameter | Meaning |
|---|---|
| column | Value to fetch |
| offset | How many rows back (default: 1) |
| default_value | Returned if previous row doesn't exist |
Example:
SELECT
emp_name,
salary,
LAG(salary) OVER(
ORDER BY joining_date
) AS prev_salary
FROM employee;
| emp_name | salary | prev_salary |
|---|---|---|
| A | 5000 | NULL |
| B | 7000 | 5000 |
| C | 9000 | 7000 |
LEAD() - Access Next Row
Used to retrieve data from the next row.
SELECT
emp_name,
salary,
LEAD(salary) OVER(
ORDER BY joining_date
) AS next_salary
FROM employee;
Practical Example: Top N Employees Per Department
Goal: Find the top 2 highest-paid employees from each department.
Why the straightforward approach fails:
SELECT *
FROM employee
WHERE ROW_NUMBER() OVER(...) <= 2; -- ❌ This doesn't work
Window functions are calculated after the WHERE clause, so this query fails.
Correct approach using CTE:
WITH ranked_employees AS (
SELECT
emp_name,
dept_name,
salary,
ROW_NUMBER() OVER(
PARTITION BY dept_name
ORDER BY salary DESC
) AS rn
FROM employee
)
SELECT *
FROM ranked_employees
WHERE rn <= 2;
SQL Execution Order (Important)
Understanding the execution sequence explains why window functions cannot be used directly in WHERE:
FROM
WHERE
GROUP BY
HAVING
WINDOW FUNCTIONS ← Functions calculated here
SELECT
ORDER BY
LIMIT
Since WHERE executes before window functions, WHERE rn <= 2 cannot reference the window function column rn. Solution: Use a subquery or CTE.
Comparing Rows: LAG() with CASE
SELECT
emp_name,
salary,
LAG(salary) OVER(
ORDER BY joining_date
) AS prev_salary,
CASE
WHEN salary > LAG(salary) OVER(ORDER BY joining_date)
THEN 'Higher'
WHEN salary < LAG(salary) OVER(ORDER BY joining_date)
THEN 'Lower'
ELSE 'Same'
END AS comparison
FROM employee;
Commonly Used Window Functions
| Function | Purpose |
|---|---|
| ROW_NUMBER | Unique sequential numbering |
| RANK | Ranking with gaps on ties |
| DENSE_RANK | Ranking without gaps |
| LAG | Get previous row value |
| LEAD | Get next row value |
| FIRST_VALUE | First value in window |
| LAST_VALUE | Last value in window |
| NTILE | Divide rows into equal buckets |
| SUM() OVER | Running total |
| AVG() OVER | Moving average |
Running Total Example
SELECT
emp_name,
salary,
SUM(salary) OVER(
ORDER BY emp_id
) AS running_total
FROM employee;
Key Comparisons for Interviews
GROUP BY vs Window Functions
| GROUP BY | Window Function |
|---|---|
| Collapses rows | Keeps rows |
| One row per group | Original rows remain |
| Aggregate only | Aggregate + row details |
RANK vs DENSE_RANK vs ROW_NUMBER
| Function | Tie Handling | Example |
|---|---|---|
| ROW_NUMBER | Always unique | 1, 2, 3, 4 |
| RANK | Skips rank | 1, 2, 2, 4 |
| DENSE_RANK | No skip | 1, 2, 2, 3 |
Most Asked Interview Query
Find the top 3 salaries per department:
WITH cte AS (
SELECT
emp_name,
dept_name,
salary,
DENSE_RANK() OVER(
PARTITION BY dept_name
ORDER BY salary DESC
) AS drnk
FROM employee
)
SELECT *
FROM cte
WHERE drnk <= 3;
Key Interview Points
Differences Between GROUP BY and Window Functions:
- GROUP BY collapses rows into summary data—you lose individual employee details
- Window Functions keep all rows while adding calculated values alongside
- Use GROUP BY for pure aggregation; use Window Functions when you need both detail and summary
When to Use RANK vs DENSE_RANK:
- RANK() when gaps in rank numbers are acceptable (e.g., 1st, 2nd, 2nd, 4th)
- DENSE_RANK() when you want consecutive ranks without gaps
Why Window Functions in WHERE Don't Work:
SQL execution order processes WHERE before window functions. Use CTEs or subqueries instead.
Best Practices
- Use
PARTITION BYintentionally—partition only when you need separate calculations per group - Always specify
ORDER BYfor ranking functions; without it, results may be unpredictable - Use CTEs for readability when combining window functions with WHERE clauses
- Prefer
DENSE_RANKwhen skipped rankings don't fit your business logic - Add indexes on PARTITION BY and ORDER BY columns for performance on large datasets
Quick Reference
Window Function Formula:
FUNCTION() OVER(
PARTITION BY ... -- Group rows
ORDER BY ... -- Define sequence
)
Remember:
-
OVER()= defines the calculation window -
PARTITION BY= creates logical groups -
ORDER BY= defines row sequence within the window - Ranking: ROW_NUMBER (always unique) → RANK (with gaps) → DENSE_RANK (no gaps)
- Navigation: LAG (previous) ↔ LEAD (next)
Summary
Window Functions are one of the most powerful SQL features for analytics, reporting, ranking, time-series analysis, running totals, and comparative analysis. They're heavily used in:
- Data Engineering
- Backend Development
- Business Intelligence
- Data Analytics
- SQL Interviews
Master them, and you'll handle complex analytical queries with ease.
Top comments (0)