
ClickHouse Real-Time Analytics: The Hard Truth About Fast Data

My first real-time analytics project nearly killed our database. We had 50 million events per hour, a Postgres instance running hot, and dashboards that took 30 seconds to load. The CEO wanted sub-second queries. I learned the hard way: traditional databases weren't built for this.

What makes ClickHouse fit for real-time analytics? It's an open-source columnar database designed for online analytical processing (OLAP). Unlike Postgres or MySQL, ClickHouse ingests millions of rows per second and returns aggregated queries in milliseconds. According to ClickHouse's official documentation, it's purpose-built for "ingesting and analyzing data with minimal latency", typically under 10 milliseconds for most queries.

I've built systems processing 200K events per second on ClickHouse. Here's what actually works in production.

Most people think ClickHouse is just a fast database. They're wrong. It's a fundamentally different architecture optimized for analytical queries.

The core innovation is columnar storage. Instead of storing data row by row like MySQL, ClickHouse stores each column separately, so queries read only the columns they need. A query selecting 5 columns from a table with 100 columns reads roughly 95% less data, assuming similarly sized columns.

Here's a table creation example that I use in production:

CREATE TABLE events (
    event_time DateTime64(3) DEFAULT now64(),
    event_type LowCardinality(String),
    user_id UInt64,
    session_id String,
    amount Float64,
    tags Array(String),
    metadata JSON
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_time)
ORDER BY (event_type, toStartOfHour(event_time), user_id)

Three things matter here:

  1. MergeTree engine — ClickHouse's workhorse for real-time ingestion
  2. Partitioning — Splits data by month for faster scans
  3. Ordering key — Optimizes query performance for common filter patterns

As Rill Data's modeling guide explains, "the order of columns in your ORDER BY clause directly impacts query performance." I've found that putting low-cardinality columns first (like event_type) gives 3-5x faster queries than an arbitrary ordering.
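
To make that concrete, here's the shape of query this ordering key serves. Because event_type and the hour bucket lead the ORDER BY clause, the primary index can skip most granules instead of scanning the table (a sketch against the events table above):

-- The leading key columns let ClickHouse prune granules that
-- contain neither 'purchase' rows nor the target hour range.
SELECT count(), sum(amount)
FROM events
WHERE event_type = 'purchase'
  AND toStartOfHour(event_time) >= toStartOfHour(now() - INTERVAL 1 HOUR);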

In my experience, ClickHouse delivers three game-changing benefits for real-time analytics:

1. Ingestion at absurd speeds. I've loaded 15 million rows per second on a single node. This isn't theoretical — ClickHouse's real-time analytics use case page confirms ingestion rates "exceeding 10 million rows per second on commodity hardware."

2. Aggregation performance that scales. A query like SELECT count(*), avg(amount) FROM events WHERE event_type = 'purchase' returns in 8-12 milliseconds on 2 billion rows. Try that on Postgres — you'll wait 30 seconds and exhaust your connection pool.

3. Compression wins. Columnar storage compresses 5-10x better than row-based. My 500GB raw data set compressed to 70GB on disk. Less storage means faster scans and lower costs.
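
You can check this on your own tables: system.columns exposes compressed and uncompressed sizes per column, so the ratio is one query away (shown here against the events table from above):

-- On-disk vs. uncompressed size per column, largest first.
SELECT
    name,
    formatReadableSize(data_compressed_bytes) AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    round(data_uncompressed_bytes / data_compressed_bytes, 1) AS ratio
FROM system.columns
WHERE table = 'events' AND data_compressed_bytes > 0
ORDER BY data_compressed_bytes DESC;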

The hard truth: ClickHouse isn't for transactional workloads. You wouldn't use it for user sessions or payment processing. But for analytical dashboards and real-time monitoring? It's the right tool.

Let me walk through a production pipeline I built for a fintech client. We needed sub-second latency from event generation to dashboard update.

Here's the architecture:

Events → Kafka → ClickHouse → Materialized Views → Dashboard API

Step 1: Set up Kafka ingestion

CREATE TABLE events_queue (
    event_time DateTime64(3),
    event_type String,
    user_id UInt64,
    amount Float64
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
         kafka_topic_list = 'events',
         kafka_group_name = 'clickhouse_consumer',
         kafka_format = 'JSONEachRow',
         kafka_num_consumers = 4;
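
One step worth making explicit: a Kafka engine table is only a consumer, so nothing lands in storage until a materialized view moves rows out of it. A minimal sketch, assuming the events MergeTree table from earlier is the landing table (events_queue_mv is just a name I'm picking here):

-- Move rows from the Kafka consumer table into the events table.
-- Columns missing from the queue (session_id, tags, metadata)
-- are filled with their column defaults.
CREATE MATERIALIZED VIEW events_queue_mv TO events
AS SELECT
    event_time,
    event_type,
    user_id,
    amount
FROM events_queue;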

Step 2: Create a materialized view for real-time aggregation

CREATE MATERIALIZED VIEW events_hourly_mv
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (event_type, hour)
AS SELECT
    event_type,
    toStartOfHour(event_time) AS hour,
    countState() AS event_count,
    avgState(amount) AS avg_amount
FROM events
GROUP BY event_type, hour;

This is where ClickHouse shines. The materialized view pre-aggregates data as it arrives. Queries hit the view, not the raw table.

Step 3: Query the aggregated data

SELECT
    hour,
    event_type,
    countMerge(event_count) AS total_events,
    avgMerge(avg_amount) AS average_amount
FROM events_hourly_mv
WHERE hour >= now() - INTERVAL 24 HOUR
GROUP BY hour, event_type
ORDER BY hour DESC;

The countMerge and avgMerge functions combine partial aggregation states. This query returns in under 20ms for 24 hours of data across 50 million events.

A common mistake I see: people forget to tune their MergeTree settings. According to OneUptime's ClickHouse dashboard guide, "optimizing partition sizes between 50-200MB per partition significantly improves query performance." I've found that daily partitions work best for high-volume streams.
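
Checking whether your partitions land in a sane range is one query against system.parts:

-- Size and part count per partition of the events table.
SELECT
    partition,
    formatReadableSize(sum(bytes_on_disk)) AS size,
    count() AS part_count
FROM system.parts
WHERE table = 'events' AND active
GROUP BY partition
ORDER BY partition;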

Three rules I follow on every project:

1. Prefer materialized views over application-level aggregation. Your app code shouldn't compute sums. Let ClickHouse handle it. The AggregatingMergeTree engine does this efficiently.

2. Use LowCardinality for repetitive strings. A column like event_type with 10-100 unique values? Mark it as LowCardinality(String). This reduces memory usage by 10x and speeds up GROUP BY operations.

From ClickHouse's real-time analytics docs: "LowCardinality data type is heavily optimized for columns with high repetition — queries can run 2-3x faster."
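
If the table already exists with a plain String column, the conversion is a single statement. The status column here is hypothetical, and note the type change rewrites the column on disk as a mutation:

-- Hypothetical example: status is not in the schema above.
ALTER TABLE events MODIFY COLUMN status LowCardinality(String);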

3. Monitor your write speed. ClickHouse can handle bursts, but sustained high writes need proper configuration. Watch the active part count per partition (inserts start erroring once the parts_to_throw_insert threshold is hit) and monitor merge thread utilization.
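
Both are visible in system tables. Two checks I run, assuming the events table above:

-- Active parts per partition: a steadily climbing count means merges
-- can't keep up and inserts will eventually be delayed or rejected.
SELECT partition, count() AS active_parts
FROM system.parts
WHERE table = 'events' AND active
GROUP BY partition
ORDER BY active_parts DESC;

-- In-flight merges and their progress.
SELECT table, elapsed, progress, num_parts
FROM system.merges;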

I've seen teams adopt ClickHouse for everything. That's a mistake.

Choose ClickHouse when:

  • You need sub-second analytical queries on billions of rows
  • Your data is append-heavy (logs, events, metrics)
  • You have high-cardinality dimensions (user IDs, URLs, timestamps)

Avoid ClickHouse when:

  • You need single-row lookups by primary key (use Postgres)
  • You have strict ACID requirements (use MySQL or Postgres)
  • Your writes are mostly small single-row inserts rather than large batched appends

The trade-off is real. According to Dilip Kumar's deep dive, "ClickHouse sacrifices transactional guarantees for analytical speed." That's fine if you're building dashboards. It's terrible if you're processing payments.

Three problems I've debugged in production:

Problem 1: Query timeout on large time ranges.
A user selects "last 3 years" instead of "last 24 hours." ClickHouse scans billions of rows across every partition in the range.

Solution: Enforce query limits.

-- Set per-query memory and time limits
SET max_execution_time = 30;
SET max_memory_usage = 5000000000; -- 5GB
SET max_bytes_to_read = 10000000000; -- 10GB

Problem 2: Lagging materialized views on the Kafka pipeline.
Dashboards trailed live data by about 30 seconds. Materialized views actually fire at insert time; the lag came from the Kafka engine batching rows before flushing each block.

Solution: Lower the flush interval (the kafka_flush_interval_ms table setting, or stream_flush_interval_ms at the session level), and run OPTIMIZE TABLE periodically if unmerged aggregation states slow down reads.

Problem 3: Schema changes on live data.
Adding a column to a table with 1 billion rows.

Solution: ClickHouse supports ALTER TABLE ... ADD COLUMN without locking. Do it during low-traffic hours anyway.
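
For reference, the operation is a metadata change: existing parts aren't rewritten, and old rows read the default. The country column here is hypothetical:

-- Hypothetical column add; existing rows return '' until written.
ALTER TABLE events ADD COLUMN country LowCardinality(String) DEFAULT '';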

Vision Training Systems' implementation guide notes that "schema-on-read approaches work well for ClickHouse because of its columnar nature — but plan migrations carefully."

Q: Is ClickHouse free for real-time analytics?
Yes, ClickHouse is open-source under Apache 2.0 license. You can run it on your own hardware or use ClickHouse Cloud for managed hosting.

Q: How fast can ClickHouse ingest data?
10-15 million rows per second per node on commodity hardware. With proper partitioning and Kafka integration, you can sustain rates like this for long stretches, as long as background merges keep up.

Q: Can ClickHouse handle real-time updates?
ClickHouse supports updates via ALTER TABLE ... UPDATE mutations, but these rewrite whole data parts and aren't meant for frequent single-row changes. Use the ReplacingMergeTree engine for scenarios needing deduplication.
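
A minimal sketch of that pattern (table and columns are hypothetical): rows sharing the ORDER BY key collapse to the one with the latest version during background merges.

-- Duplicate user_id rows are replaced by the row with the
-- highest updated_at when parts merge (not instantly on insert).
CREATE TABLE user_profiles (
    user_id UInt64,
    email String,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY user_id;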

Q: What hardware do I need for ClickHouse?
Minimum: 4 CPU cores, 16GB RAM, SSD storage. For 100M events/day, plan for 8 cores, 32GB RAM, and 500GB SSD.

Q: Does ClickHouse support SQL?
Yes, ClickHouse has extensive SQL support including window functions, CTEs, and subqueries. Some MySQL-compatible SQL syntax is supported.

Q: How does ClickHouse compare to Apache Druid?
Both are OLAP databases, but ClickHouse has simpler setup and better SQL support. Druid excels at real-time ingestion with no batching. According to ClickHouse's comparison page, ClickHouse often achieves 2-3x higher query performance on analytical workloads.

Q: Can I run ClickHouse on Kubernetes?
Yes, the ClickHouse operator for Kubernetes handles stateful deployments. Expect complexity with storage and networking configurations.

Q: What's the best way to connect ClickHouse to a dashboard?
Use the ClickHouse JDBC driver or native HTTP interface with JavaScript libraries. Most BI tools support ClickHouse natively as of 2025.

ClickHouse isn't magic. It's a well-engineered tool for a specific job: real-time analytical queries on large datasets. If you're building dashboards, monitoring systems, or business intelligence pipelines, it'll save you months of infrastructure work.

Start small. Ingest 1 million events. Test your query patterns. Then scale. I've seen teams fail because they tried to migrate 10 billion rows on day one.

Your next move: Download ClickHouse, create a test table, and run SELECT avg(number) FROM numbers(1000000000) to watch it aggregate a billion rows. It'll shock you.


Nishaant Dixit — Founder of SIVARO. Building data infrastructure and production AI systems since 2018. My team has built systems processing 200K events/sec across fintech, healthcare, and ad-tech. Connect on LinkedIn


  1. What is Real-Time Analytics? A Complete Guide (2026)
  2. Data Modeling Guide for Real-Time Analytics with ClickHouse
  3. Real-time Analytics with ClickHouse
  4. How to Build a Real-Time Analytics Dashboard with ClickHouse (2026)
  5. Mastering Real-Time Analytics with ClickHouse: A Deep Dive
  6. Build a real-time market data app with ClickHouse and PolygonIO
  7. Real-time analytics | ClickHouse Docs
  8. Implementing Real-Time Analytics With ClickHouse for Large-Scale Data
  9. How to Get Started with ClickHouse for Analytics (2026)
  10. Real-Time Data Analytics Platform

Originally published at https://sivaro.in/articles/clickhouse-real-time-analytics-the-hard-truth-about-fast.
