
ClickHouse: The Real-Time Analytics Database Engineers Are Fighting Over

I still remember the moment. 3 AM. A production dashboard stuck loading for 45 seconds. My CTO on the phone asking why we couldn't just "query faster."

The problem wasn't our engineers. It was our database. PostgreSQL was buckling under 50 million rows. We needed something built for analytics at scale. Not another general-purpose tool pretending to be fast.

That's when I discovered ClickHouse.

What is ClickHouse? According to ClickHouse.com, it's an open-source, column-oriented OLAP database management system for real-time analytical queries. Translation: It crushes aggregations on billions of rows in milliseconds. Row-oriented databases (MySQL, PostgreSQL) store data row by row. ClickHouse stores data by columns. This makes it insanely efficient for analytics where you only query a few columns across massive datasets.

Here's what I've learned building production systems with ClickHouse—the good, the bad, and the gotchas that documentation won't tell you. You'll see why it's dominating real-time analytics, how to set it up properly, and where it falls apart.


ClickHouse isn't just another database. It's a radical departure from what most engineers expect.

Column-oriented storage means each column lives in its own file on disk. When you run SELECT AVG(revenue) FROM sales WHERE date > '2024-01-01', ClickHouse only reads the revenue and date columns. It ignores everything else. Compare that to a row database, which reads entire rows before discarding most data.

Vectorized query execution is the secret sauce. Most databases process one row at a time. ClickHouse processes batches of rows—typically 1024 at once—using CPU SIMD instructions. According to ClickHouse's GitHub repository, this approach achieves 10-100x speed improvements over row-oriented systems for analytical queries.

MergeTree engine is where the magic happens. Data is written in sorted chunks called "parts." Background processes merge these parts into larger ones. This design enables:

  • Lightning-fast inserts (append-only writes)
  • Fast range scans (queries hit sorted data)
  • Excellent compression (similar data sits together on disk)

Here's how you'd create a basic table:

sql
CREATE TABLE events (
event_time DateTime,
user_id UInt64,
event_type String,
revenue Float64
) ENGINE = MergeTree()
ORDER BY (event_time, user_id)
PARTITION BY toYYYYMM(event_time);

The ORDER BY clause isn't optional—it defines the sort order for data parts. Choose it carefully. In my experience, this is the single most impactful performance decision you'll make.


I've seen ClickHouse scan 500 million rows and return a GROUP BY aggregation in under 200 milliseconds. This isn't theoretical. According to the Wikipedia article on ClickHouse, it processes hundreds of millions to over a billion rows per second per server for simple aggregations.

This speed means you can stop pre-aggregating data. Stop building those nightly rollup tables. Stop caching everything. ClickHouse handles raw data at query time.

Column-oriented storage compresses beautifully. Same data type, repeated values, sorted order—ClickHouse exploits all three. I've seen 20TB of raw data compress to under 4TB in practice.

ClickHouse handles thousands of inserts per second without breaking a sweat. Each insert writes one or more new parts; background merges consolidate them asynchronously without blocking new writes.

Your team already knows SQL. ClickHouse supports standard SQL with some extensions, so the learning curve is gentle for experienced engineers.

ClickHouse is Apache 2.0 licensed. The company ClickHouse Inc. provides enterprise features, but the core is free and open. According to their LinkedIn company page, they've raised significant funding and employ core contributors.


The fastest way to try it is Docker:

bash
docker run -d --name some-clickhouse-server \
-p 8123:8123 \
-p 9000:9000 \
--ulimit nofile=262144:262144 \
clickhouse/clickhouse-server:latest

Port 8123 is the HTTP interface; port 9000 is the native TCP protocol. Use the HTTP interface for simple queries and integrations, and the native protocol for performance-critical applications.
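
To sanity-check the setup, send a query straight from the shell. A minimal sketch, assuming the container above is running locally:

bash
# Query over the HTTP interface (port 8123); the result comes back as plain text
curl "http://localhost:8123" --data "SELECT version()"

# Same query over the native protocol, using the client bundled in the container
docker exec -it some-clickhouse-server clickhouse-client --query "SELECT version()"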

Partitioning isn't for data organization. It's for data lifecycle management. Here's the pattern I use:

sql
CREATE TABLE events (
event_time DateTime,
user_id UInt64,
event_type String,
payload String
) ENGINE = MergeTree()
ORDER BY (event_time, user_id)
PARTITION BY toYYYYMM(event_time)
TTL event_time + INTERVAL 90 DAY DELETE;

The TTL clause automatically deletes data older than 90 days. Partitioning by month makes this efficient—ClickHouse drops entire partitions instead of scanning rows.

Common pitfall: Don't partition too granularly. One partition per day creates thousands of tiny parts. The merge process chokes. Partition by month for most workloads. Partition by week only if you delete data weekly.

ClickHouse isn't designed for frequent updates. But sometimes you need them. Here's the correct approach:

sql
-- Don't do this:
-- UPDATE events SET event_type = 'conversion' WHERE user_id = 12345;

-- Do this instead:
-- 1. Create a ReplacingMergeTree
CREATE TABLE events (
event_time DateTime,
user_id UInt64,
event_type String,
revenue Float64,
version UInt32
) ENGINE = ReplacingMergeTree(version)
ORDER BY (user_id, event_time);

-- 2. Insert an updated row with higher version
INSERT INTO events VALUES (now(), 12345, 'conversion', 99.99, 2);

The ReplacingMergeTree deduplicates rows during background merges. Duplicates exist temporarily but are resolved eventually.
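
If you need the deduplicated view before the merges catch up, you can ask for it at query time with FINAL. It costs extra work per query, so treat it as a convenience rather than the default:

sql
-- FINAL collapses rows sharing the same ORDER BY key, keeping the highest version
SELECT *
FROM events FINAL
WHERE user_id = 12345;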

Sometimes even query-time aggregation over raw data isn't fast enough. Use materialized views with AggregatingMergeTree:

sql
-- Create a destination for pre-aggregated data
CREATE TABLE daily_revenue (
date Date,
total_revenue AggregateFunction(sum, Float64),
user_count AggregateFunction(uniq, UInt64)
) ENGINE = AggregatingMergeTree()
ORDER BY date;

-- Create a materialized view to populate it
CREATE MATERIALIZED VIEW daily_revenue_mv
TO daily_revenue
AS SELECT
toDate(event_time) as date,
sumState(revenue) as total_revenue,
uniqState(user_id) as user_count
FROM events
GROUP BY date;

Query the pre-aggregated table with sumMerge and uniqMerge functions. Sub-millisecond queries on billion-row datasets.
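
Concretely, the rollup query looks like this; the -Merge combinators finalize the partial aggregation states stored in the table:

sql
SELECT
    date,
    sumMerge(total_revenue) AS revenue,
    uniqMerge(user_count) AS users
FROM daily_revenue
GROUP BY date
ORDER BY date;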


ClickHouse stores data sorted by ORDER BY columns. Choose columns you filter by most frequently. If you always filter by event_time and user_id, put them first in the ORDER BY clause.

ClickHouse's best-practices documentation stresses that the order of columns in ORDER BY matters: lead with the columns you filter on most, and order the rest from lower to higher cardinality.

Here's the rule I follow:

  • Put the column you filter on most often first
  • Follow with other filter columns, lower cardinality before higher
  • Leave columns you never filter on out of the sorting key

Bad: ORDER BY (user_id, event_time, revenue) — queries rarely filter by revenue, so it doesn't belong in the key, and leading with user_id defeats time-range scans

Good: ORDER BY (event_time, user_id) — 90% of queries filter by time range
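
One way to verify the sorting key is pulling its weight is EXPLAIN with index analysis. It reports how many parts and granules the primary key prunes for a given filter (a sketch against the events table from earlier):

sql
-- Shows which parts/granules the primary index eliminates for this predicate
EXPLAIN indexes = 1
SELECT count()
FROM events
WHERE event_time >= '2024-01-01 00:00:00';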

This kills performance:

bash
curl -X POST "http://localhost:8123" --data "INSERT INTO events (event_time, user_id, event_type) VALUES (now(), 1, 'click')"
curl -X POST "http://localhost:8123" --data "INSERT INTO events (event_time, user_id, event_type) VALUES (now(), 2, 'click')"

Instead, batch inserts to 50,000-100,000 rows per insert:

bash
echo '{"event_time":"2024-01-01 00:00:00","user_id":1,"event_type":"click"}
{"event_time":"2024-01-01 00:00:01","user_id":2,"event_type":"purchase"}' |
curl -X POST "http://localhost:8123/?query=INSERT+INTO+events+FORMAT+JSONEachRow" --data-binary @-

Background merges consume CPU and IO. If merges fall behind, query performance degrades. Monitor system.merges:

sql
SELECT
database,
table,
count() as active_merges,
sum(rows_written) as total_written
FROM system.merges
WHERE database = currentDatabase()
GROUP BY database, table;

A healthy system has 0-5 active merges per table. More than 10 means you're inserting too fast or partitions are too small.
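
It's also worth watching active part counts directly; a partition sitting at hundreds of active parts usually means inserts are too small or too frequent:

sql
-- Active parts per partition; sustained triple-digit counts are a warning sign
SELECT table, partition, count() AS active_parts
FROM system.parts
WHERE active AND database = currentDatabase()
GROUP BY table, partition
ORDER BY active_parts DESC
LIMIT 10;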

ClickHouse supports specialized data types that save space:

sql
-- Instead of String for dates
CREATE TABLE sales (
date Date, -- 2 bytes instead of String
product_id UInt32, -- 4 bytes
revenue Decimal(10,2) -- precision without Float64 overhead
) ENGINE = MergeTree()
ORDER BY date;

According to ClickHouse's best practices blog, using appropriate integer types over strings can reduce storage by 60-70%.
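
In the same spirit, LowCardinality(String) pays off for string columns with a bounded set of values, since ClickHouse dictionary-encodes them. A small sketch (the table name is illustrative):

sql
CREATE TABLE page_views (
    date Date,
    event_type LowCardinality(String), -- dictionary-encoded; ideal for a handful of distinct values
    country LowCardinality(String),
    user_id UInt64
) ENGINE = MergeTree()
ORDER BY (date, event_type);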


ClickHouse isn't for every workload. Here's where it shines and where it fails.

Where it shines:

  • Real-time dashboards — Sub-second queries on billions of events
  • Log analytics — High-volume ingestion with complex filtering
  • Time-series analysis — Group by minute/hour/day on massive datasets
  • User-facing analytics — When customers need fast queries on their data

Where it falls apart:

  • Transactional workloads — No ACID transactions. Not designed for point updates.
  • High-concurrency single-row lookups — ClickHouse optimizes for scans, not lookups. PostgreSQL wins here.
  • Frequent small updates — Every update creates new parts. The merge overhead kills performance.

As one engineer noted on Reddit's r/dataengineering: "ClickHouse is amazing for analytics but a terrible choice for OLTP. People try to use it for everything and get frustrated."

Running ClickHouse in production isn't trivial. The agent-skills best practices guide highlights several operational challenges:

Memory management — ClickHouse uses memory aggressively during queries. A single poorly written query can OOM your node. Set max_memory_usage limits per query.

Replication complexity — ClickHouse replication works through ZooKeeper (or ClickHouse Keeper). It's reliable but adds operational overhead. A majority of Keeper nodes (two out of three in a typical ensemble) must stay healthy to maintain quorum.

Schema evolution — Adding columns is easy. Removing or changing column types requires table recreation. Plan your schema carefully.


Problem: A GROUP BY query uses 32GB RAM and crashes.

Solution: Use external aggregation:

sql
-- Enable disk-based aggregation
SET max_bytes_before_external_group_by = 20000000000; -- 20GB
SET max_memory_usage = 40000000000; -- 40GB

This spills intermediate results to disk when memory exceeds 20GB. Queries run slower but don't crash.

Problem: Inserting 1 million rows/second causes merge backlog.

Solution: Tune merge settings:

sql
-- The relevant MergeTree settings; set them per table, or in the merge_tree section of config.xml
ALTER TABLE events MODIFY SETTING
    parts_to_delay_insert = 300,
    parts_to_throw_insert = 600,
    max_delay_to_insert = 10;

Higher thresholds allow more parts before blocking inserts. Trade-off: slower merges, more disk usage.

Problem: Same events inserted twice due to retries.

Solution: Use ReplacingMergeTree with a version column:

sql
CREATE TABLE events (
event_id UInt64, -- Unique identifier
event_time DateTime,
user_id UInt64,
event_type String,
version UInt32
) ENGINE = ReplacingMergeTree(version)
ORDER BY (event_id);

Deduplication happens during merges, not at insert time. It's eventual consistency, not immediate.
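
If you occasionally need duplicates gone right now, you can force a merge. It's expensive on large tables, so treat it as an admin operation, not something to run per query:

sql
-- Forces an unscheduled merge so ReplacingMergeTree deduplication happens immediately
OPTIMIZE TABLE events FINAL;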

Problem: Complex joins timeout after 30 seconds.

Solution: Increase timeout and optimize join strategy:

sql
SET max_execution_time = 300; -- 5 minutes

-- Use Global JOIN for large dimension tables
SELECT *
FROM events
GLOBAL JOIN users_map ON events.user_id = users_map.user_id
WHERE event_type = 'purchase';

GLOBAL JOIN evaluates the right-hand table once on the initiating node and sends the result to every other node, instead of each node resolving it on its own.


Is ClickHouse really faster than PostgreSQL?

Yes, for analytical queries on large datasets. ClickHouse is 10-100x faster for aggregations and scans. PostgreSQL wins for transactional workloads with frequent updates, complex joins, and point lookups.

Can ClickHouse handle real-time data ingestion?

Absolutely. It handles thousands of inserts per second without blocking. Data becomes queryable within 1-2 seconds after insert. For sub-second availability, use the Buffer table engine.

How does ClickHouse compare to Apache Druid and Pinot?

ClickHouse has simpler deployment (single binary vs. many services), better compression, and more standard SQL. Druid and Pinot offer better real-time ingestion and lower latency for streaming data. Choose ClickHouse for simpler operations.

Which table engine should I use?

Use MergeTree for most workloads. Use ReplacingMergeTree for upsert scenarios. Use SummingMergeTree for pre-aggregated counters. Use a materialized view feeding an AggregatingMergeTree table for pre-computed rollups.

How many concurrent queries can ClickHouse handle?

It handles 100-200 concurrent queries per node well. Beyond that, use a proxy such as chproxy for connection pooling. Each query uses significant memory, so concurrency is limited by RAM.

How do I update existing data?

Use ReplacingMergeTree with a version column. Insert new data with a higher version. Old data is deduplicated during merges. Mutations (ALTER TABLE ... UPDATE) exist but rewrite whole parts in the background, so for anything frequent you insert new rows with updated values instead.

What's the right partitioning strategy?

Partition by month for most time-series data. Partition by week only if you need granular data deletion. Avoid partitioning by day—too many partitions hurt merge performance. The Reddit best practices thread recommends starting with monthly partitions for event data.

Does ClickHouse support JOINs?

Yes, but carefully. ClickHouse JOINs are hash-based and memory-intensive. Use dictionary-optimized lookups for dimension tables. Avoid joining large tables—denormalize instead.
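
For dimension lookups specifically, a dictionary usually beats a hash JOIN. A minimal sketch, assuming a hypothetical users table with a country column:

sql
-- Dictionary backed by a ClickHouse table, refreshed every 5-10 minutes
CREATE DICTIONARY users_dict (
    user_id UInt64,
    country String
)
PRIMARY KEY user_id
SOURCE(CLICKHOUSE(TABLE 'users'))
LAYOUT(HASHED())
LIFETIME(MIN 300 MAX 600);

-- In-memory point lookups instead of a memory-hungry JOIN
SELECT event_type, dictGet('users_dict', 'country', user_id) AS country
FROM events;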


ClickHouse is the most performant open-source analytical database I've worked with. It's not a silver bullet—transactional workloads, high-concurrency lookups, and frequent updates need different tools. But for real-time analytics on billions of rows, nothing comes close.

Start locally with Docker. Create a small table, insert some data, run GROUP BY queries. Feel the speed. Then think about your partitioning and sorting key. Everything downstream depends on that decision.

If you're building user-facing analytics, internal dashboards, or log analysis systems, ClickHouse will change how you think about data. I've seen engineering teams go from 30-second query times to 200-millisecond responses. It's transformative when used correctly.

Next step: Spin up ClickHouse, ingest your log data, and run your first aggregation. You'll see the difference immediately.



Nishaant Dixit is the founder of SIVARO, a product engineering company specializing in data infrastructure and production AI systems. He's been building systems processing 200K events/sec since 2018. Connect on LinkedIn: https://www.linkedin.com/in/nishaant-veer-dixit


Sources

  1. ClickHouse Official Site — https://clickhouse.com/
  2. ClickHouse GitHub Repository — https://github.com/clickhouse/clickhouse
  3. ClickHouse Wikipedia — https://en.wikipedia.org/wiki/ClickHouse
  4. ClickHouse LinkedIn — https://www.linkedin.com/company/clickhouseinc
  5. Reddit: "Is Clickhouse a good choice?" — https://www.reddit.com/r/dataengineering/comments/1re8djt/is_clickhouse_a_good_choice/
  6. ClickHouse GitHub Organization — https://github.com/clickhouse
  7. ClickHouse Best Practices Docs — https://clickhouse.com/docs/best-practices
  8. ClickHouse 10 Best Practices Blog — https://clickhouse.com/blog/10-best-practice-tips
  9. ClickHouse Agent Skills Best Practices — https://github.com/ClickHouse/agent-skills/blob/main/skills/clickhouse-best-practices/SKILL.md
  10. Reddit: "Tips and Best-Practices for Clickhouse" — https://www.reddit.com/r/dataengineering/comments/1hx9tzv/tips_and_bestpractices_for_clickhouse/

Originally published at https://sivaro.in/articles/clickhouse-the-real-time-analytics-database-engineers-are.
