Mohamed Hussain S

Posted on May 25

Why Too Many Parts Hurt ClickHouse Performance

#clickhouse #database #dataengineering #backend

A lot of people initially think ClickHouse performance problems come from:

large queries
bad joins
massive datasets
missing indexes

And honestly, those things can matter.

But one of the most common operational problems in ClickHouse often starts much earlier:

too many tiny parts.

This is one of those issues that usually stays invisible at first.

Then suddenly:

merges fall behind
queries slow down
memory usage increases
inserts become unstable

And the cluster starts behaving strangely.

Every Insert Creates Parts

This is the first thing that’s important to understand.

In MergeTree-based engines, ClickHouse stores data as immutable parts.

Something as simple as:

INSERT INTO events VALUES (...);

creates at least one new part on disk, one for each partition the inserted rows fall into.

And this is completely normal.

ClickHouse is designed around this storage model.

So:

parts themselves are not the problem.

The real issue starts when parts begin accumulating faster than merges can stabilize them.
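A quick way to see this for yourself is to insert a couple of rows and look at system.parts. This is only a sketch: the column layout of events is an assumption for illustration (the article only names the table), and it assumes regular synchronous inserts.

```sql
-- Hypothetical schema for illustration; only the table name comes from the article.
CREATE TABLE IF NOT EXISTS events
(
    timestamp DateTime,
    user_id   UInt64,
    action    String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (timestamp, user_id);

-- Two separate INSERT statements, so two separate parts are written.
INSERT INTO events VALUES ('2024-05-01 10:00:00', 1, 'click');
INSERT INTO events VALUES ('2024-05-01 10:00:01', 2, 'view');

-- List the active parts of the table.
-- A background merge may already have combined them by the time this runs.
SELECT name, partition, rows, bytes_on_disk
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'events'
  AND active
ORDER BY name;
```

Running the SELECT again a little later usually shows fewer, larger parts, which is the merge process doing its job.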

Why Tiny Inserts Become Dangerous

At smaller scale, tiny inserts may seem harmless.

For example:

inserting row-by-row
extremely frequent micro-batches
tiny streaming flush intervals

Initially:

everything still works.

But over time, the number of parts starts growing aggressively.

Now ClickHouse has to manage:

more metadata
more merges
more scheduling
more file operations

This creates operational overhead.

Meaning:

the system starts spending increasing resources managing fragmentation itself.

Why Merges Matter So Much

ClickHouse relies heavily on background merges.

These merges:

combine smaller parts
reduce fragmentation
improve compression
optimize query performance

Under healthy ingestion patterns, merges naturally keep the system stable over time.

That is the ideal state.

But problems start when:

parts created per second
        >
parts merged per second

Now fragmented parts begin accumulating faster than ClickHouse can compact them.

And this is usually where instability slowly starts building.
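One rough way to watch this balance is system.part_log, which records part events when it is enabled in the server configuration (it may not be on your setup). The sketch below assumes a table named events in the current database. Keep in mind that one MergeParts event is one finished merge that consumed several source parts, so the two columns are a trend signal, not an exact equation.

```sql
-- Parts created vs. merges completed per minute over the last hour.
-- Requires system.part_log to be enabled on the server.
SELECT
    toStartOfMinute(event_time)        AS minute,
    countIf(event_type = 'NewPart')    AS parts_created,
    countIf(event_type = 'MergeParts') AS merges_completed
FROM system.part_log
WHERE database = currentDatabase()
  AND table = 'events'
  AND event_time > now() - INTERVAL 1 HOUR
GROUP BY minute
ORDER BY minute;

-- Merges that are running right now, with their progress.
SELECT
    database,
    table,
    round(elapsed, 1)  AS seconds_running,
    round(progress, 2) AS progress,
    num_parts
FROM system.merges;
```

If parts_created keeps climbing while merges_completed stays flat, fragmentation is building.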

The Dangerous Part Is That It Builds Slowly

This is what makes the issue tricky operationally.

You usually do not notice the problem immediately.

The cluster may look perfectly healthy initially.

Then gradually:

insert latency increases
merges lag behind
CPU usage becomes unstable
queries become heavier
replication slows down

And eventually ClickHouse may start throwing errors like:

Too many parts

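At that point, a partition has typically accumulated more active parts than the parts_to_throw_insert MergeTree setting allows, and the merge system is already under serious pressure.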
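Before it gets that far, it helps to check how many active parts each partition holds and what the server's limits are. A minimal sketch, again assuming a table named events in the current database:

```sql
-- Active parts per partition, worst partitions first.
SELECT
    partition,
    count()   AS active_parts,
    sum(rows) AS total_rows
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'events'
  AND active
GROUP BY partition
ORDER BY active_parts DESC
LIMIT 10;

-- The server-wide thresholds at which inserts are slowed down or rejected.
-- Defaults differ between ClickHouse versions, so read them from your own server.
SELECT name, value
FROM system.merge_tree_settings
WHERE name IN ('parts_to_delay_insert', 'parts_to_throw_insert');
```

Comparing active_parts for the busiest partition against those two values shows how much headroom is left.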

Queries Also Become More Expensive

A lot of people think parts only affect inserts.

But queries suffer too.

Because queries now need to:

open more parts
scan more metadata
coordinate more files

Even when the actual dataset itself is not massive.

So sometimes:

performance degradation comes more from fragmentation than raw data volume.

That is a very important operational insight.
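To check how many parts a specific query touches, EXPLAIN with indexes = 1 is a reasonable starting point. The filter below is just an illustration, assuming events has a timestamp column as in the partitioning examples in this article.

```sql
-- Shows, for each index step, how many parts and granules were selected
-- out of the total available for the table.
EXPLAIN indexes = 1
SELECT count()
FROM events
WHERE timestamp >= '2024-05-01 00:00:00';
```

If the part counts in that output are in the hundreds or thousands for a table that is not very large, fragmentation is a likely suspect.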

FINAL Does Not Really Solve This

One thing that’s important to understand:

FINAL is not really a solution for too many parts.

For example:

SELECT *
FROM events FINAL;

FINAL applies the engine's merge logic (for example, ReplacingMergeTree deduplication) during query execution.

But the fragmented parts still physically exist underneath.

So if the system already has excessive fragmentation:

queries still scan many parts
merge pressure still exists
query execution can become heavier

Which means:

FINAL can actually become more expensive when fragmentation becomes unhealthy.

The real fix is usually improving ingestion and merge behavior itself.
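Here is a small sketch of what that looks like. The events_latest table and its columns are hypothetical and exist only for this example; ReplacingMergeTree is used because it is an engine where FINAL changes the result.

```sql
-- Hypothetical table: keeps the latest version of each event_id.
CREATE TABLE IF NOT EXISTS events_latest
(
    event_id UInt64,
    status   String,
    updated  DateTime
)
ENGINE = ReplacingMergeTree(updated)
ORDER BY event_id;

-- Two inserts for the same key, written as two separate parts.
INSERT INTO events_latest VALUES (1, 'pending', '2024-05-01 10:00:00');
INSERT INTO events_latest VALUES (1, 'done',    '2024-05-01 10:05:00');

-- Without FINAL, both versions may be returned until a merge happens.
SELECT * FROM events_latest;

-- With FINAL, duplicates are collapsed while the query runs.
SELECT * FROM events_latest FINAL;

-- FINAL did not merge anything on disk: the parts are still there.
SELECT count() AS active_parts
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'events_latest'
  AND active;
```

The query result looks clean, but the part count only drops once a background merge actually runs.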

Over-Partitioning Can Quietly Make This Worse

Another thing that often accelerates part explosion is overly granular partitioning.

For example:

PARTITION BY toStartOfHour(timestamp)

instead of something broader like:

PARTITION BY toYYYYMM(timestamp)

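A part never spans partitions, so an insert creates at least one part for every partition its rows fall into. Now even small inserts may create parts across many partitions simultaneously.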

Which means:

a single insert can end up creating multiple fragmented parts underneath.

And because merges only combine parts within the same partition, merge pressure increases much faster than expected over time.
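This is easy to reproduce on a scratch table. The events_hourly table below is hypothetical and only exists to show the effect of hourly partitioning.

```sql
-- Hypothetical table with hourly partitions.
CREATE TABLE IF NOT EXISTS events_hourly
(
    timestamp DateTime,
    action    String
)
ENGINE = MergeTree
PARTITION BY toStartOfHour(timestamp)
ORDER BY timestamp;

-- One INSERT, three rows, three different hours.
INSERT INTO events_hourly VALUES
    ('2024-05-01 10:15:00', 'click'),
    ('2024-05-01 11:15:00', 'view'),
    ('2024-05-01 12:15:00', 'click');

-- One part per hour touched, so a single INSERT produced three parts.
-- These parts can never be merged together because they live in different partitions.
SELECT partition, name, rows
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'events_hourly'
  AND active
ORDER BY partition;
```

With monthly partitioning, the same insert would have produced a single part.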

ClickHouse Also Has Ways to Help

Modern ClickHouse versions also support asynchronous inserts (the async_insert setting) to help reduce excessive tiny-part creation.

Instead of immediately flushing every small insert into separate parts, ClickHouse can buffer inserts internally before writing larger parts to disk.

This helps reduce fragmentation and merge pressure in workloads that naturally produce smaller inserts.

But async inserts are not a replacement for healthy ingestion patterns themselves.

Stable batching still matters a lot.
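As a rough sketch, async inserts can be turned on per statement. The events schema here is an assumption for illustration; async_insert and wait_for_async_insert are real ClickHouse settings.

```sql
-- Hypothetical schema for illustration; only the table name comes from the article.
CREATE TABLE IF NOT EXISTS events
(
    timestamp DateTime,
    user_id   UInt64,
    action    String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (timestamp, user_id);

-- The server buffers this row together with other async inserts
-- and writes them to disk as one larger part when the buffer is flushed.
-- wait_for_async_insert = 1 makes the client wait until that flush has happened.
INSERT INTO events
SETTINGS async_insert = 1, wait_for_async_insert = 1
VALUES ('2024-05-01 10:00:00', 1, 'click');
```

When the buffer is flushed depends on settings such as async_insert_max_data_size and async_insert_busy_timeout_ms, so it is worth checking their values on your own version before relying on this.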

Why Batch Size Matters So Much

ClickHouse generally performs much better with:

larger batches
fewer inserts
healthier merge behavior

Because fewer parts mean:

fewer merges
lower metadata overhead
better compression
more efficient scans

This is one of the reasons ClickHouse ingestion patterns often look very different from traditional OLTP systems.
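The difference is easiest to see side by side. This is a toy sketch with three rows and an assumed events schema; real batches would be far larger (the ClickHouse documentation suggests at least 1,000 rows per insert).

```sql
-- Hypothetical schema for illustration; only the table name comes from the article.
CREATE TABLE IF NOT EXISTS events
(
    timestamp DateTime,
    user_id   UInt64,
    action    String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (timestamp, user_id);

-- Anti-pattern: one row per statement, so each statement writes its own part.
INSERT INTO events VALUES ('2024-05-01 10:00:00', 1, 'click');
INSERT INTO events VALUES ('2024-05-01 10:00:01', 2, 'view');
INSERT INTO events VALUES ('2024-05-01 10:00:02', 3, 'click');

-- Preferred: the same rows in one statement.
-- All rows fall into the same monthly partition, so they are written as a single part.
INSERT INTO events VALUES
    ('2024-05-01 10:00:00', 1, 'click'),
    ('2024-05-01 10:00:01', 2, 'view'),
    ('2024-05-01 10:00:02', 3, 'click');
```

In an application, this usually means collecting rows in the client or in a queue and flushing them on a size or time threshold instead of inserting on every event.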

Too Many Parts Also Affects Startup and Recovery

Another thing people often discover late:

Large numbers of parts also affect:

startup time
replication recovery
metadata loading
server restarts

Because ClickHouse now has to:

scan part metadata
validate parts
rebuild internal state

before the server becomes fully operational again.

So the issue is not just query performance.

It becomes an overall operational stability problem.
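To get a feel for how much part metadata a server is carrying, a server-wide count per table is a reasonable first look. This sketch only uses system.parts and makes no assumptions about your schema.

```sql
-- Part counts per table across the whole server, most fragmented tables first.
-- Inactive parts are leftovers from merges that have not been cleaned up yet.
SELECT
    database,
    table,
    countIf(active)     AS active_parts,
    countIf(NOT active) AS inactive_parts,
    uniqExact(partition) AS partitions
FROM system.parts
GROUP BY database, table
ORDER BY active_parts DESC
LIMIT 20;
```

Tables near the top of this list are the ones most likely to slow down restarts and replication recovery.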

The Important Lesson

One thing I’ve noticed with ClickHouse is that many performance problems are actually merge-management problems underneath.

And too many parts is one of the clearest examples of that.

Because the issue usually is not:

“ClickHouse cannot handle large data.”

The issue is more often:

fragmentation and merge pressure slowly became unhealthy.

That is a very different operational problem.

Final Thought

ClickHouse is extremely good at handling massive analytical workloads.

But it performs best when the storage engine is allowed to merge parts efficiently.

And sometimes the biggest performance problem is not the query itself.

It is the thousands of tiny fragmented parts quietly building underneath the system over time.
