Introduction
One of the reasons ClickHouse® can handle massive analytical workloads with exceptional speed is its storage architecture. Unlike traditional databases that frequently modify existing files, ClickHouse stores incoming data as immutable units called data parts. These parts are later combined through a background process known as merging.
This design allows ClickHouse to achieve both high ingestion throughput and fast query execution. Every insert operation creates a new part, while background merges continuously optimize storage and improve query efficiency without interrupting users.
Understanding how data parts and merges work is essential for anyone managing ClickHouse in production. It helps explain why batch inserts are recommended, why excessive small inserts can cause performance issues, and how ClickHouse maintains efficiency as datasets grow.
In this article, we'll explore how data parts are created, how background merges operate, how to monitor them, and the best practices for keeping MergeTree tables healthy.
What Are Data Parts?
A data part is the smallest physical storage unit within a MergeTree table.
Whenever data is inserted into a MergeTree-based table, ClickHouse creates a brand-new immutable data part. Instead of modifying existing files, new data is appended as a separate part.
Each data part contains everything required to serve queries efficiently, including:
- Compressed column files
- Primary indexes
- Mark files
- Checksums
- Table metadata
Because data parts are immutable, ClickHouse can perform concurrent reads and writes efficiently without locking existing data.
Why ClickHouse Uses Data Parts
The immutable-part architecture provides several advantages over traditional update-in-place storage models.
Key benefits include:
- Fast insert performance
- Lock-free write operations
- Efficient compression
- Simplified storage management
- Background optimization through merges
- Better scalability for analytical workloads
Rather than spending resources constantly reorganizing data during inserts, ClickHouse defers optimization work to background merge processes.
Creating a MergeTree Table
Let's create a simple table for storing order information:
CREATE TABLE default.orders
(
order_id UInt32,
country LowCardinality(String),
status LowCardinality(String),
amount Float64,
order_date Date
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(order_date)
ORDER BY (country, status, order_date);
This table is partitioned by month and sorted by country, status, and order date.
How Data Parts Are Created
Every INSERT operation generates a new data part.
First Insert
INSERT INTO default.orders VALUES
(1,'IN','completed',1200,'2024-01-05'),
(2,'US','pending',450,'2024-01-06');
This creates the first data part.
Second Insert
INSERT INTO default.orders VALUES
(3,'GB','completed',890,'2024-01-07'),
(4,'DE','cancelled',300,'2024-01-08');
This creates another independent data part.
Although users see a single logical table, ClickHouse internally stores these inserts as separate physical parts until they are merged later.
After multiple inserts, a table may contain dozens or even hundreds of active parts.
The Problem with Too Many Parts
While creating parts is inexpensive, having too many small parts can negatively impact performance.
When thousands of tiny parts accumulate, ClickHouse must manage additional files and metadata, leading to:
- Slower query execution
- Increased memory usage
- Higher file descriptor consumption
- Additional merge workload
- Reduced overall efficiency
This is why ClickHouse strongly recommends batching inserts instead of inserting one row at a time.
For example, inserting 50,000 rows in a single batch is significantly more efficient than executing 50,000 individual insert statements.
What Happens During a Merge?
A merge is a background operation that combines multiple smaller parts into a larger consolidated part.
Consider the following example:
Before Merge
Part A → Rows 1–2
Part B → Rows 3–4
Part C → Rows 5–6
After Merge
Part D → Rows 1–6
Once the new merged part is successfully created:
- The merged part becomes active.
- Older parts are marked inactive.
- Inactive parts are eventually removed from disk.
This process occurs automatically without requiring user intervention.
How Merges Work Internally
Background merges follow several stages.
1. Selecting Parts
ClickHouse continuously evaluates which parts should be merged.
Factors considered include:
- Part size
- Number of parts in a partition
- Age of parts
- Available disk resources
- Merge scheduling priorities
2. Reading Existing Parts
The selected parts are read from disk.
Since parts are immutable, ongoing queries can continue accessing them while the merge is in progress.
3. Sorting Data
Rows from all participating parts are combined and sorted according to the table's ORDER BY key.
This ensures that the newly generated part preserves the storage order expected by MergeTree.
4. Writing a New Part
ClickHouse writes a completely new data part containing:
- Compressed column files
- Primary indexes
- Mark files
- Metadata
- Checksums
5. Activating the New Part
After the new part has been fully written and validated, ClickHouse atomically activates it.
The old parts are immediately marked as inactive.
6. Cleaning Up Old Parts
Inactive parts remain on disk temporarily and are later removed automatically according to configured retention settings.
MergeTree Storage Architecture
A MergeTree table consists of multiple independent data parts.
Table
│
├── Part A
│ ├── Column Files (.bin)
│ ├── Mark Files (.mrk2)
│ ├── Primary Index
│ ├── Checksums
│ └── Metadata
│
├── Part B
│
└── Part C
Each part contains all information necessary for query execution.
This architecture enables ClickHouse to scale efficiently while maintaining high read and write performance.
Understanding Part Names
ClickHouse assigns structured names to data parts using the format:
{partition}_{min_block}_{max_block}_{level}
Example:
202401_1_3_1
Meaning:
| Component | Description |
|---|---|
| 202401 | January 2024 partition |
| 1 | Minimum block number |
| 3 | Maximum block number |
| 1 | Merge level |
The merge level indicates how many merge operations contributed to the creation of the part.
Higher levels generally indicate larger and more consolidated parts.
Why Background Merges Matter
Background merges provide several important benefits:
- Reduce the total number of parts
- Improve query performance
- Increase compression efficiency
- Lower disk seek overhead
- Simplify storage organization
- Reduce metadata management costs
Without merges, tables would accumulate thousands of tiny parts, eventually affecting performance.
Monitoring Data Parts
ClickHouse provides the system.parts table for inspecting active and inactive parts.
View Active Parts
SELECT
partition,
name,
rows,
bytes_on_disk,
active
FROM system.parts
WHERE database = 'default'
AND table = 'orders'
AND active = 1;
Example output:
| partition | name | rows | bytes_on_disk | active |
|---|---|---|---|---|
| 202401 | 202401_1_1_0 | 2 | 1024 | 1 |
| 202401 | 202401_2_2_0 | 2 | 1024 | 1 |
This query helps monitor how many active parts exist in a table.
Monitoring Running Merges
You can inspect ongoing merge operations using system.merges.
SELECT
database,
table,
elapsed,
progress,
num_parts,
result_part_name
FROM system.merges
WHERE database = 'default'
AND table = 'orders';
Typical output:
| database | table | elapsed | progress | num_parts | result_part_name |
|---|---|---|---|---|---|
| default | orders | 2.3 | 0.65 | 4 | 202401_1_4_1 |
Important fields include:
- elapsed – duration of the merge
- progress – completion percentage
- num_parts – number of parts involved
- result_part_name – name of the resulting merged part
Monitoring this table can help identify merge bottlenecks.
Triggering Manual Merges
In most situations, ClickHouse handles merges automatically.
However, manual merges can be triggered when necessary.
Merge Table Parts
OPTIMIZE TABLE default.orders;
Merge a Specific Partition
OPTIMIZE TABLE default.orders
PARTITION '202401';
Force a Complete Merge
OPTIMIZE TABLE default.orders FINAL;
Important Warning
OPTIMIZE TABLE ... FINAL forces all parts within a partition to be merged into a single part.
On large tables, this operation can consume significant CPU, memory, and disk resources. It should only be used when absolutely necessary and preferably during low-traffic periods.
Best Practices
To maintain healthy MergeTree tables:
Use Large Batch Inserts
Aim for:
- Minimum: 1,000 rows per insert
- Recommended: 10,000–100,000 rows per insert
This minimizes the number of generated parts.
Monitor Part Counts
Regularly check system.parts.
If a partition contains hundreds of active parts, investigate ingestion patterns.
Avoid Excessive FINAL Operations
Repeatedly running OPTIMIZE FINAL can consume significant resources and reduce overall system efficiency.
Choose Partitioning Carefully
For most analytical workloads, monthly partitions using toYYYYMM() are preferable to daily partitions.
Allow Background Merges to Work
ClickHouse includes sophisticated merge scheduling mechanisms.
Avoid interrupting or disabling background merge processes unless absolutely necessary.
Monitor Merge Activity
Use system.merges to ensure merges are keeping up with ingestion workloads.
Common Mistakes
Many ClickHouse performance problems originate from a few recurring mistakes:
- Inserting one row at a time
- Creating excessive numbers of partitions
- Ignoring growing part counts
- Running
OPTIMIZE FINALrepeatedly - Disabling background merges
Avoiding these issues can significantly improve cluster performance and stability.
Data Parts vs Background Merges
| Data Parts | Background Merges |
|---|---|
| Created during inserts | Run automatically |
| Immutable storage units | Combine multiple parts |
| Store table data | Optimize storage layout |
| Queried directly | Execute asynchronously |
| Increase with ingestion | Reduce overall part count |
Conclusion
Data parts and background merges are fundamental to how ClickHouse achieves high-performance analytics at scale. Every insert creates an immutable data part, allowing ingestion to remain fast and efficient, while background merges continuously optimize storage by combining smaller parts into larger ones.
This architecture enables ClickHouse to deliver excellent write throughput, strong compression, and fast query execution without requiring constant manual maintenance.
By understanding how parts are created, how merges operate, and how to monitor storage behavior through system tables such as system.parts and system.merges, you can build more efficient ingestion pipelines and maintain healthy MergeTree deployments.
The most important rule remains simple: batch your inserts, monitor part counts, and let ClickHouse handle storage optimization through its merge engine.
#100DaysOfClickHouse #ClickHouse #DatabaseInternals #MergeTree #DataEngineering #BigData #Analytics #OLAP #DatabasePerformance #Day57
Top comments (0)