Abhishek Tiwari

Posted on May 29

SQL Window Function (Simple way)

#beginners #database #sql #tutorial

SQL Window Functions

SQL Window Functions are advanced SQL functions used to perform calculations across a set of rows while still keeping the individual row details.

Unlike GROUP BY, which collapses rows into a single summary row, Window Functions allow you to:

Keep all original rows
Add calculated values beside them
Perform ranking, comparisons, running totals, analytics, etc.

The Problem with GROUP BY

Problem: When you use GROUP BY, rows collapse into summary data.

Suppose we want employee details along with the maximum salary in their department:

SELECT dept_name, MAX(salary)
FROM employee
GROUP BY dept_name;

Output:

dept_name	max_salary
HR	8000
IT	12000

Issue: Employee rows are lost—only grouped data remains.

The Window Functions Solution

Window Functions solve this by allowing aggregate calculations without collapsing rows. They keep all original data while adding calculated values alongside.

Basic Syntax

FUNCTION_NAME(...) OVER(
    PARTITION BY column
    ORDER BY column
)

Understanding the OVER() Clause

The OVER() clause defines the window (set of rows) on which the function operates.

OVER() Without PARTITION

When OVER() has no PARTITION BY, the entire table becomes one window:

SELECT
    emp_name,
    salary,
    MAX(salary) OVER() AS max_salary
FROM employee;

Result: Every row gets the same maximum salary value.

emp_name	salary	max_salary
A	5000	12000
B	7000	12000
C	12000	12000

PARTITION BY

PARTITION BY divides rows into groups/windows. The function calculation restarts for each partition.

SELECT
    emp_name,
    dept_name,
    salary,
    MAX(salary) OVER(PARTITION BY dept_name) AS dept_max_salary
FROM employee;

Result: Employees are grouped by department, and maximum salary is calculated separately per department.

emp_name	dept_name	salary	dept_max_salary
A	HR	5000	8000
B	HR	8000	8000
C	IT	12000	12000

ORDER BY Inside OVER()

Defines ordering within each partition/window. This is especially important for:

Ranking functions
Running totals
Previous/next row calculations

ROW_NUMBER()

Assigns a unique sequential number to each row.

Syntax:

ROW_NUMBER() OVER(
    PARTITION BY column
    ORDER BY column
)

Example:

SELECT
    emp_name,
    dept_name,
    salary,
    ROW_NUMBER() OVER(
        PARTITION BY dept_name
        ORDER BY salary DESC
    ) AS rn
FROM employee;

Rows are divided by department, with the highest salary getting row number 1, which resets for each department.

emp_name	dept_name	salary	rn
B	HR	8000	1
A	HR	5000	2
C	IT	12000	1

RANK()

Assigns ranking with gaps when ties occur.

SELECT
    emp_name,
    salary,
    RANK() OVER(
        ORDER BY salary DESC
    ) AS rnk
FROM employee;

emp_name	salary	rank
A	12000	1
B	10000	2
C	10000	2
D	8000	4

Important: Ranks skip numbers after ties (sequence: 1, 2, 2, 4).

DENSE_RANK()

Similar to RANK(), but does NOT skip numbers.

SELECT
    emp_name,
    salary,
    DENSE_RANK() OVER(
        ORDER BY salary DESC
    ) AS drnk
FROM employee;

emp_name	salary	dense_rank
A	12000	1
B	10000	2
C	10000	2
D	8000	3

Difference Between RANK and DENSE_RANK:

Function	Output
RANK	1, 2, 2, 4
DENSE_RANK	1, 2, 2, 3

LAG() - Access Previous Row

Used to retrieve data from a previous row.

Syntax:

LAG(column, offset, default_value)
OVER(ORDER BY column)

Parameter	Meaning
column	Value to fetch
offset	How many rows back (default: 1)
default_value	Returned if previous row doesn't exist

Example:

SELECT
    emp_name,
    salary,
    LAG(salary) OVER(
        ORDER BY joining_date
    ) AS prev_salary
FROM employee;

emp_name	salary	prev_salary
A	5000	NULL
B	7000	5000
C	9000	7000

LEAD() - Access Next Row

Used to retrieve data from the next row.

SELECT
    emp_name,
    salary,
    LEAD(salary) OVER(
        ORDER BY joining_date
    ) AS next_salary
FROM employee;

Practical Example: Top N Employees Per Department

Goal: Find the top 2 highest-paid employees from each department.

Why the straightforward approach fails:

SELECT *
FROM employee
WHERE ROW_NUMBER() OVER(...) <= 2;  -- ❌ This doesn't work

Window functions are calculated after the WHERE clause, so this query fails.

Correct approach using CTE:

WITH ranked_employees AS (
    SELECT
        emp_name,
        dept_name,
        salary,
        ROW_NUMBER() OVER(
            PARTITION BY dept_name
            ORDER BY salary DESC
        ) AS rn
    FROM employee
)
SELECT *
FROM ranked_employees
WHERE rn <= 2;

SQL Execution Order (Important)

Understanding the execution sequence explains why window functions cannot be used directly in WHERE:

FROM
WHERE
GROUP BY
HAVING
WINDOW FUNCTIONS  ← Functions calculated here
SELECT
ORDER BY
LIMIT

Since WHERE executes before window functions, WHERE rn <= 2 cannot reference the window function column rn. Solution: Use a subquery or CTE.

Comparing Rows: LAG() with CASE

SELECT
    emp_name,
    salary,
    LAG(salary) OVER(
        ORDER BY joining_date
    ) AS prev_salary,
    CASE
        WHEN salary > LAG(salary) OVER(ORDER BY joining_date)
            THEN 'Higher'
        WHEN salary < LAG(salary) OVER(ORDER BY joining_date)
            THEN 'Lower'
        ELSE 'Same'
    END AS comparison
FROM employee;

Commonly Used Window Functions

Function	Purpose
ROW_NUMBER	Unique sequential numbering
RANK	Ranking with gaps on ties
DENSE_RANK	Ranking without gaps
LAG	Get previous row value
LEAD	Get next row value
FIRST_VALUE	First value in window
LAST_VALUE	Last value in window
NTILE	Divide rows into equal buckets
SUM() OVER	Running total
AVG() OVER	Moving average

Running Total Example

SELECT
    emp_name,
    salary,
    SUM(salary) OVER(
        ORDER BY emp_id
    ) AS running_total
FROM employee;

Key Comparisons for Interviews

GROUP BY vs Window Functions

GROUP BY	Window Function
Collapses rows	Keeps rows
One row per group	Original rows remain
Aggregate only	Aggregate + row details

RANK vs DENSE_RANK vs ROW_NUMBER

Function	Tie Handling	Example
ROW_NUMBER	Always unique	1, 2, 3, 4
RANK	Skips rank	1, 2, 2, 4
DENSE_RANK	No skip	1, 2, 2, 3

Most Asked Interview Query

Find the top 3 salaries per department:

WITH cte AS (
    SELECT
        emp_name,
        dept_name,
        salary,
        DENSE_RANK() OVER(
            PARTITION BY dept_name
            ORDER BY salary DESC
        ) AS drnk
    FROM employee
)
SELECT *
FROM cte
WHERE drnk <= 3;

Key Interview Points

Differences Between GROUP BY and Window Functions:

GROUP BY collapses rows into summary data—you lose individual employee details
Window Functions keep all rows while adding calculated values alongside
Use GROUP BY for pure aggregation; use Window Functions when you need both detail and summary

When to Use RANK vs DENSE_RANK:

RANK() when gaps in rank numbers are acceptable (e.g., 1st, 2nd, 2nd, 4th)
DENSE_RANK() when you want consecutive ranks without gaps

Why Window Functions in WHERE Don't Work:

SQL execution order processes WHERE before window functions. Use CTEs or subqueries instead.

Best Practices

Use PARTITION BY intentionally—partition only when you need separate calculations per group
Always specify ORDER BY for ranking functions; without it, results may be unpredictable
Use CTEs for readability when combining window functions with WHERE clauses
Prefer DENSE_RANK when skipped rankings don't fit your business logic
Add indexes on PARTITION BY and ORDER BY columns for performance on large datasets

Quick Reference

Window Function Formula:

FUNCTION() OVER(
    PARTITION BY ...  -- Group rows
    ORDER BY ...      -- Define sequence
)

Remember:

OVER() = defines the calculation window
PARTITION BY = creates logical groups
ORDER BY = defines row sequence within the window
Ranking: ROW_NUMBER (always unique) → RANK (with gaps) → DENSE_RANK (no gaps)
Navigation: LAG (previous) ↔ LEAD (next)

Summary

Window Functions are one of the most powerful SQL features for analytics, reporting, ranking, time-series analysis, running totals, and comparative analysis. They're heavily used in:

Data Engineering
Backend Development
Business Intelligence
Data Analytics
SQL Interviews

Master them, and you'll handle complex analytical queries with ease.

DEV Community