Mohamed Hussain S
Why 'FINAL' in ClickHouse Is Usually a Design Smell

If you're using FINAL regularly in ClickHouse, your schema might be wrong.

Not always.
But often.

FINAL is one of those features that feels like a lifesaver when you first discover it. You see duplicates, you add FINAL, and suddenly everything looks correct.

Problem solved… right?

Not exactly.

Let’s break down what’s really happening.


First: Why Do Duplicates Even Exist?

To understand FINAL, we need to understand how ClickHouse actually stores data.

ClickHouse (especially MergeTree-family engines) works like this:

  • Each insert creates a new part
  • Parts are immutable
  • Background merges combine parts asynchronously
  • Deduplication (in engines like ReplacingMergeTree) happens during merges

That last point is critical.

If you’re using ReplacingMergeTree, rows with the same sorting key (the ORDER BY key) can temporarily coexist in different parts. They only get deduplicated when a background merge runs.

Until that merge happens → duplicates are visible.

This is not a bug.

It’s by design.

ClickHouse prioritizes write speed and asynchronous compaction over immediate consistency.
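Here’s a minimal sketch of that behavior (table and column names are illustrative):

```sql
-- Illustrative table: rows with the same key coexist until a merge runs
CREATE TABLE user_state
(
    id      UInt64,
    value   String,
    version UInt64
)
ENGINE = ReplacingMergeTree(version)
ORDER BY id;

INSERT INTO user_state VALUES (123, 'old', 1);
INSERT INTO user_state VALUES (123, 'new', 2);  -- lands in a separate part

-- Before a background merge, this can return BOTH rows:
SELECT * FROM user_state WHERE id = 123;

-- OPTIMIZE ... FINAL forces a merge on disk (fine for a demo, expensive in production):
OPTIMIZE TABLE user_state FINAL;
SELECT * FROM user_state WHERE id = 123;
```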


What FINAL Actually Does

When you run:

```sql
SELECT * FROM table FINAL
```

You are telling ClickHouse:

“Ignore background merge state. Reconstruct the final merged result at query time.”

Internally, FINAL:

  • Reads all relevant parts
  • Applies deduplication logic during the SELECT
  • Performs a merge-like process in memory
  • Disables certain read optimizations

It does not simply “remove duplicates.”

It forces a read-time merge pipeline.

And that’s expensive.


Why FINAL Kills Performance

Using FINAL shifts background work into your foreground query.

Here’s what that means in practice:

1. More Parts Must Be Read

Without FINAL, ClickHouse can use its sparse primary index to skip whole granules of data.

With FINAL, it must read every part that might contain rows for the same key, since any of them could participate in deduplication.


2. Extra CPU Work

Deduplication logic that normally runs during background merges now runs during your query execution.

You are essentially asking ClickHouse to merge parts on-demand.


3. Higher Memory Usage

Query-time merging requires buffering rows for comparison and collapsing.

On large partitions, this becomes painful.


4. Poor Scaling

On small datasets, you won’t notice much.

On large analytical tables?

You just turned a fast OLAP system into something that behaves suspiciously like a transactional engine doing row-by-row reconciliation.

That’s not what ClickHouse is optimized for.
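If you genuinely need FINAL, recent ClickHouse versions can at least scope the read-time merge to individual partitions via the do_not_merge_across_partitions_select_final setting, letting partitions be processed independently and in parallel. A mitigation, not a fix; the table name is illustrative:

```sql
SELECT *
FROM events FINAL
WHERE id = 123
SETTINGS do_not_merge_across_partitions_select_final = 1;
```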


Why People Reach for FINAL

Because it “fixes” this:

```sql
SELECT * FROM events WHERE id = 123
```

… returns multiple rows.

So they add:

```sql
SELECT * FROM events FINAL WHERE id = 123
```

And now it returns one row.

Looks correct.

But what really happened?

The schema was designed to rely on eventual deduplication, yet the workload expects immediate correctness.

That’s a modeling mismatch.


Better Alternatives

If you find yourself using FINAL frequently, consider these patterns instead:


1. Use a Version Column Properly

With ReplacingMergeTree, define:

```sql
ReplacingMergeTree(version_column)
```

Then design queries using aggregation:

```sql
SELECT
    id,
    argMax(value, version) AS value
FROM table
GROUP BY id
```

This avoids forcing a full FINAL merge.
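To keep that pattern out of every consumer’s query, one option is to wrap it in an ordinary view (names are illustrative):

```sql
-- Every reader of table_latest gets deduplicated "latest" rows
-- without having to remember the argMax pattern.
CREATE VIEW table_latest AS
SELECT
    id,
    argMax(value, version) AS value,
    max(version)           AS version
FROM table
GROUP BY id;

SELECT * FROM table_latest WHERE id = 123;
```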


2. Model Immutable Events

Instead of overwriting rows:

  • Store events as append-only
  • Compute “latest state” in a view or aggregation

ClickHouse thrives on immutable data patterns.
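One way to sketch this, assuming an append-only event log (names are illustrative), is ClickHouse’s LIMIT BY clause, which keeps the first N rows per group after sorting:

```sql
-- Append-only event log; rows are never updated in place
CREATE TABLE order_events
(
    order_id   UInt64,
    status     String,
    event_time DateTime
)
ENGINE = MergeTree
ORDER BY (order_id, event_time);

-- Latest status per order: sort newest-first, keep one row per order_id
SELECT order_id, status, event_time
FROM order_events
ORDER BY order_id, event_time DESC
LIMIT 1 BY order_id;
```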


3. Pre-Aggregate

Use:

  • Materialized views
  • AggregatingMergeTree
  • SummingMergeTree (where appropriate)

Push state computation to ingestion time.
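A sketch of the materialized-view flavor, assuming a raw events table with an id and an event_time column (all names are illustrative):

```sql
-- Rollup table: per-user daily counts, maintained at ingestion time
CREATE TABLE daily_counts
(
    day  Date,
    id   UInt64,
    hits UInt64
)
ENGINE = SummingMergeTree
ORDER BY (day, id);

-- The materialized view runs per insert block and writes into the rollup
CREATE MATERIALIZED VIEW daily_counts_mv TO daily_counts AS
SELECT
    toDate(event_time) AS day,
    id,
    count()            AS hits
FROM events
GROUP BY day, id;

-- Reads hit the small rollup; sum() absorbs parts that haven't merged yet
SELECT day, id, sum(hits) AS hits
FROM daily_counts
GROUP BY day, id;
```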


4. Fix Partitioning Strategy

Sometimes excessive duplicates persist because:

  • Partitions are too large
  • Merges are delayed
  • Merge settings are suboptimal

This is an operational issue, not a query issue.
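You can check whether merges are keeping up by counting active parts per table in system.parts:

```sql
-- A persistently high part count per table suggests merges are
-- lagging behind inserts (or partitions are too granular).
SELECT
    database,
    table,
    count() AS active_parts
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY active_parts DESC;
```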


When FINAL Is Actually Valid

Let’s not be dogmatic.

FINAL is reasonable when:

  • Tables are small
  • It’s a debugging query
  • You’re validating deduplication behavior
  • It’s a one-off backfill
  • You’re in a low-concurrency environment

The issue isn’t that FINAL exists.

The issue is using it in production analytical workloads as a default pattern.


The Core Idea

FINAL is not a feature for everyday querying.

It’s a safety valve.

If your dashboards, APIs, or pipelines depend on FINAL, you’re pushing background merge work into your critical query path.

And that’s usually a signal that:

  • The schema needs refinement
  • The workload expectations are mismatched
  • Or the modeling strategy needs to change

The Mental Model to Keep

ClickHouse is optimized for:

  • Append-heavy workloads
  • Asynchronous compaction
  • Analytical scans
  • Pre-aggregation patterns

If you force it to behave like a row-level transactional engine at read time, you pay the price.


Final Thought

FINAL solves symptoms.

Good schema design solves causes.

If you use it occasionally, fine.

If you rely on it regularly, it’s probably time to rethink your modeling strategy.
