DEV Community

Cover image for Day 46: A Guide to ClickHouse® Window Functions
Kanishga Subramani
Kanishga Subramani

Posted on

Day 46: A Guide to ClickHouse® Window Functions

Introduction

Analyzing data often requires comparing rows, calculating rankings, finding running totals, or computing moving averages without changing the number of rows returned. Traditionally, these tasks required complex self-joins or nested subqueries, making SQL queries harder to write and maintain.

ClickHouse® Window Functions simplify these analytical operations by allowing calculations across a group of related rows while preserving every row in the result set. This enables cleaner SQL, better readability, and more powerful analytics.

In this guide, you'll learn what window functions are, how the OVER() clause works, and how to use the most common window functions in ClickHouse® with practical examples.


What Are Window Functions?

A window function performs calculations over a set of rows called a window.

Unlike GROUP BY, which combines multiple rows into a single result, window functions calculate values while keeping every original row.

Consider the following employee salary data:

Employee Department Salary
Alice HR 50,000
Bob HR 65,000
Charlie HR 55,000

Using GROUP BY, you can calculate the average salary for HR, but the result contains only one row.

With a window function, every employee remains visible while also displaying the department's average salary.

Employee Salary Department Average
Alice 50,000 56,666
Bob 65,000 56,666
Charlie 55,000 56,666

This ability to retain individual rows while performing analytical calculations is what makes window functions so powerful.


Basic Syntax

Every window function uses the OVER() clause.

SELECT
    column_name,
    window_function() OVER (
        PARTITION BY column
        ORDER BY column
    )
FROM table_name;
Enter fullscreen mode Exit fullscreen mode

The OVER() clause defines how the window is created.

It can specify:

  • How rows are grouped (PARTITION BY)
  • How rows are ordered (ORDER BY)
  • Which rows belong to the window (ROWS BETWEEN)

Sample Table

We'll use the following table throughout the examples.

CREATE TABLE employee_sales
(
    Employee String,
    Department String,
    SaleDate Date,
    Sales UInt32
)
ENGINE = MergeTree
ORDER BY SaleDate;
Enter fullscreen mode Exit fullscreen mode

Sample data:

Employee Department SaleDate Sales
Alice Electronics 2025-01-01 120
Bob Electronics 2025-01-02 150
Charlie Electronics 2025-01-03 180
David Furniture 2025-01-01 90
Emma Furniture 2025-01-01 110

Ranking Functions

ROW_NUMBER()

Assigns a unique sequential number to every row.

SELECT
    Employee,
    Sales,
    ROW_NUMBER() OVER (ORDER BY Sales DESC) AS RowNum
FROM employee_sales;
Enter fullscreen mode Exit fullscreen mode

Output:

Employee Sales RowNum
Charlie 180 1
Bob 150 2
Alice 120 3
Emma 110 4

Common Use Cases

  • Ranking reports
  • Pagination
  • Top-N queries

RANK()

Assigns the same rank to equal values.

SELECT
    Employee,
    Sales,
    RANK() OVER (ORDER BY Sales DESC) AS Rank
FROM employee_sales;
Enter fullscreen mode Exit fullscreen mode

Output:

Employee Sales Rank
Alice 200 1
Bob 180 2
Charlie 180 2
David 150 4

Notice that Rank 3 is skipped because two employees share Rank 2.


DENSE_RANK()

Works like RANK(), but does not leave gaps.

SELECT
    Employee,
    Sales,
    DENSE_RANK() OVER (ORDER BY Sales DESC) AS DenseRank
FROM employee_sales;
Enter fullscreen mode Exit fullscreen mode

Output:

Employee Sales DenseRank
Alice 200 1
Bob 180 2
Charlie 180 2
David 150 3

NTILE()

Splits ordered rows into approximately equal-sized buckets.

SELECT
    Employee,
    Sales,
    NTILE(2) OVER (ORDER BY Sales DESC) AS SalesGroup
FROM employee_sales;
Enter fullscreen mode Exit fullscreen mode

Output:

Employee Sales SalesGroup
Charlie 180 1
Bob 150 1
Alice 120 2
David 90 2

Common Use Cases

  • Customer segmentation
  • Sales quartiles
  • Performance grading
  • Ranking analysis

Aggregate Window Functions

Traditional aggregate functions such as SUM(), AVG(), MIN(), MAX(), and COUNT() can also operate as window functions.


SUM() OVER()

Calculates running totals.

SELECT
    SaleDate,
    Sales,
    SUM(Sales)
    OVER (ORDER BY SaleDate) AS RunningTotal
FROM employee_sales;
Enter fullscreen mode Exit fullscreen mode

Output:

SaleDate Sales RunningTotal
Jan 1 120 120
Jan 2 150 270
Jan 3 180 450

Ideal for cumulative sales reports.


AVG() OVER()

Calculates averages without collapsing rows.

SELECT
    Employee,
    Department,
    Sales,
    AVG(Sales)
    OVER (PARTITION BY Department) AS DepartmentAverage
FROM employee_sales;
Enter fullscreen mode Exit fullscreen mode

Output:

Employee Department Sales DepartmentAverage
Alice Electronics 120 150
Bob Electronics 150 150
Charlie Electronics 180 150

Moving Average

Moving averages become very simple with window functions.

SELECT
    SaleDate,
    Sales,
    AVG(Sales)
    OVER (
        ORDER BY SaleDate
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS MovingAverage
FROM employee_sales;
Enter fullscreen mode Exit fullscreen mode

Output:

SaleDate Sales MovingAverage
Jan 1 120 120
Jan 2 150 135
Jan 3 180 150

The calculation includes:

  • Current row
  • Previous row
  • Two rows before the current row

Understanding Window Frames

Window frames determine which rows participate in the calculation.

Example:

ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
Enter fullscreen mode Exit fullscreen mode

This means:

  • 2 PRECEDING → Include the previous two rows.
  • CURRENT ROW → Include the current row.

Window frames are commonly used for:

  • Moving averages
  • Rolling sums
  • Sliding calculations
  • Time-series analytics

Navigation Functions

LAG()

Returns the value from the previous row.

SELECT
    SaleDate,
    Sales,
    LAG(Sales)
    OVER (ORDER BY SaleDate) AS PreviousSales
FROM employee_sales;
Enter fullscreen mode Exit fullscreen mode

Output:

SaleDate Sales PreviousSales
Jan 1 120 NULL
Jan 2 150 120
Jan 3 180 150

Useful for day-over-day comparisons.


LEAD()

Returns the value from the next row.

SELECT
    SaleDate,
    Sales,
    LEAD(Sales)
    OVER (ORDER BY SaleDate) AS NextSales
FROM employee_sales;
Enter fullscreen mode Exit fullscreen mode

Output:

SaleDate Sales NextSales
Jan 1 120 150
Jan 2 150 180
Jan 3 180 NULL

Useful for forecasting and sequential analysis.


FIRST_VALUE() and LAST_VALUE()

Retrieve the first or last value from the window.

SELECT
    Employee,
    Sales,
    FIRST_VALUE(Sales)
    OVER (
        ORDER BY Sales DESC
    ) AS HighestSale,
    LAST_VALUE(Sales)
    OVER (
        ORDER BY Sales
        ROWS BETWEEN UNBOUNDED PRECEDING
        AND UNBOUNDED FOLLOWING
    ) AS LastSale
FROM employee_sales;
Enter fullscreen mode Exit fullscreen mode

Output:

Employee Sales HighestSale LastSale
Charlie 180 180 180
Bob 150 180 180
Alice 120 180 180

Here:

  • FIRST_VALUE() returns the highest sales value after sorting in descending order.
  • LAST_VALUE() returns the final value across the complete window.

Window Functions vs GROUP BY

GROUP BY Window Functions
Combines rows Keeps all rows
One result per group One result per row
Used for summaries Used for row-level analytics
Cannot easily compare neighboring rows Easily compares previous, next, ranked, and cumulative values

Performance Best Practices

For better performance when using window functions in ClickHouse®:

  • Choose an efficient ORDER BY key.
  • Use PARTITION BY only when necessary.
  • Filter rows before applying window functions whenever possible.
  • Keep window frames as small as practical.
  • Use MergeTree tables for analytical workloads.
  • Avoid unnecessary sorting on very large datasets.

Conclusion

Window Functions are one of the most powerful SQL features available in ClickHouse® for analytical workloads. They allow you to perform rankings, running totals, moving averages, cumulative calculations, and row-to-row comparisons while preserving every row in the result set.

By understanding the OVER() clause and mastering functions such as ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), SUM(), AVG(), LAG(), LEAD(), FIRST_VALUE(), and LAST_VALUE(), you can write cleaner SQL and solve complex analytical problems without relying on complicated joins or subqueries.

For anyone working with ClickHouse®, mastering window functions is an essential skill for building efficient, readable, and high-performance analytical queries.

Top comments (0)