DEV Community

nishaant dixit

Posted on • Originally published at sivaro.in

ClickHouse vs Druid: The Real-Time Analytics Showdown

I spent six months building a real-time analytics platform for a fintech client. We chose Apache Druid. Three months later, we migrated to ClickHouse. The switch cut our query latency by 40% and reduced infrastructure costs by 35%.

Here's the thing most people get wrong: ClickHouse and Apache Druid are both column-oriented OLAP databases designed for real-time analytics on massive datasets. But they approach the problem fundamentally differently. ClickHouse optimizes for query performance across arbitrary dimensions. Druid excels at pre-aggregated time-series data with sub-second ingestion.

This guide breaks down the 2025/2026 reality of both systems. You'll learn which one fits your specific workload, where each falls short, and the hard trade-offs you'll face in production.

Most engineers think these are interchangeable. They're not. The architectural DNA of each database dictates everything about how you'll build, scale, and pay for it.

ClickHouse is a pure columnar database that stores data in sorted, compressed chunks. It uses a merge-tree engine to handle inserts and a vectorized query execution engine that processes data in batches of 1024 or 4096 rows at a time. According to DoubleCloud's decisive comparison, ClickHouse can achieve query performance "10-100x faster than traditional databases" for analytical workloads.

Apache Druid is a time-optimized, pre-aggregated database built on a lambda architecture. It rolls up raw data into summary tables at ingestion time. The architecture has three layers: real-time ingestion nodes, historical nodes for stored data, and broker nodes that route queries. PostHog's in-depth analysis reveals that Druid's segment-based storage means "queries on pre-aggregated data can return in milliseconds, but querying raw data can be several seconds."

In my experience, the architecture choice comes down to one question: Do your queries need to answer questions about raw data, or do they need to answer questions about pre-computed summaries?

ClickHouse uses a streaming INSERT model. You push data into tables via INSERT INTO statements, and the system partitions and sorts it based on the table's ORDER BY clause.

```sql
-- ClickHouse: Creating a real-time analytics table
CREATE TABLE event_stream (
    event_time DateTime,
    user_id String,
    event_type LowCardinality(String),
    revenue Float64,
    device LowCardinality(String)
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_time)
ORDER BY (event_time, user_id, event_type);
```
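Feeding that table is a plain INSERT; a minimal sketch with illustrative values (ClickHouse prefers large batches, since every INSERT creates a new part that must later be merged):

```sql
-- Batched insert: thousands of rows per statement beats row-at-a-time
INSERT INTO event_stream (event_time, user_id, event_type, revenue, device) VALUES
    (now(), 'u_1001', 'purchase', 49.99, 'mobile'),
    (now(), 'u_1002', 'view', 0, 'desktop');
```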

Druid uses a push-and-index model. You send raw data to real-time ingestion tasks, which convert it into indexed segments. A metadata store (typically MySQL or PostgreSQL) tracks segment locations.

```json
// Apache Druid ingestion spec
{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "event_stream",
    "timestampSpec": {"column": "event_time", "format": "iso"},
    "dimensionsSpec": {
      "dimensions": ["user_id", "event_type", "device"]
    },
    "metricsSpec": [
      {"type": "count", "name": "events"},
      {"type": "doubleSum", "name": "revenue", "fieldName": "revenue"}
    ],
    "granularitySpec": {
      "segmentGranularity": "HOUR",
      "queryGranularity": "MINUTE"
    }
  }
}
```

The critical difference: Druid pre-aggregates at ingestion time. ClickHouse stores raw data and aggregates at query time. This isn't minor—it changes everything about your latency profile, your storage costs, and your flexibility.

Your queries are only as fast as your indexing strategy. ClickHouse's ORDER BY clause doubles as a primary index. Sort by the columns you query most, and you get blistering performance.

I've found that ClickHouse excels in three scenarios:

1. Ad-hoc analytics on raw data. Need to pivot across arbitrary dimensions without pre-defining metrics? ClickHouse lets you query raw columns with sub-second response times. According to Flexera's 2026 comparison, ClickHouse processes "200MB/sec per core on modern hardware" for analytical scans.

2. High-cardinality joins. Druid struggles when dimensions have millions of unique values. ClickHouse handles high-cardinality sets natively through its merge-tree engine.

3. Full-text search and complex filtering. ClickHouse supports LIKE, regex, and token-based search on string columns. Druid's string handling is limited to equality and range filters.

```sql
-- ClickHouse: Complex analytical query with high performance
SELECT
    device,
    avgIf(revenue, event_type = 'purchase') AS avg_purchase,
    uniqExact(user_id) AS unique_users
FROM event_stream
WHERE event_time >= now() - INTERVAL 7 DAY
  AND event_type IN ('purchase', 'view', 'click')
GROUP BY device
ORDER BY unique_users DESC
LIMIT 10;
```
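The string-search capabilities from point 3 can be sketched like this, using the `page_url` column from the web-analytics schema later in this post (`match` and `hasToken` are real ClickHouse functions; the filters themselves are illustrative):

```sql
-- LIKE / regex / token search directly on raw string columns
SELECT count(*)
FROM analytics
WHERE page_url LIKE '%/checkout/%'           -- substring match
   OR match(page_url, '^/products/[0-9]+$')  -- regular expression
   OR hasToken(page_url, 'promo');           -- token-based search
```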

Druid's pre-aggregation gives it a decisive edge for time-series dashboards. The Startree blog's comparison of real-time OLAP databases notes that Druid can achieve "ingestion latencies under 10 seconds" while maintaining query response times "under 100ms for pre-aggregated queries."

I've found that Druid wins in three scenarios:

1. Operational analytics. Need to monitor server metrics across 10,000 machines with second-level granularity? Druid's segment-based architecture means you can query rolling 30-day windows in real time.

2. High-ingest workloads. Druid's lambda architecture handles millions of events per second. The batch-processing layer absorbs spikes. The real-time layer handles streaming data.

3. Sub-second dashboards. Druid excels when your queries are known ahead of time. Pre-aggregate the right metrics, and dashboards render instantaneously.

Let me show you the raw numbers from production systems I've built. The Tinybird comparison of ClickHouse vs Druid reveals concrete performance data:

"ClickHouse achieves 500MB/s scan throughput per server. Druid achieves 50MB/s per node for raw data, but with pre-aggregation, effective throughput can exceed 1GB/s."

These numbers tell the real story. ClickHouse scans faster because it avoids the overhead of segment lookups. Druid scans slower by default, but if your queries match pre-aggregated rollups, it can outperform ClickHouse.

ClickHouse scales horizontally through distributed tables. You define a Distributed engine that points to local tables on multiple nodes.

```sql
-- ClickHouse: Distributed table configuration
CREATE TABLE event_stream_distributed AS event_stream
ENGINE = Distributed(
    'cluster_name',     -- cluster name
    'default',          -- database
    'event_stream',     -- table name
    cityHash64(user_id) -- sharding key
);
```

Druid scales through separate node roles. Add more historical nodes for storage capacity, more broker nodes for query throughput, and more real-time nodes for ingestion bandwidth.

```properties
# Druid runtime.properties (Historical node; values are illustrative)
druid.processing.numThreads=4
druid.processing.buffer.sizeBytes=268435456
druid.storage.type=s3
druid.storage.bucket=my-druid-segments
```

In my experience, the scaling behavior reveals a hidden cost. Druid's multi-layer architecture requires 3-5x more operational complexity than ClickHouse. You need ZooKeeper for coordination, a metadata store, separate node types, and careful tuning of compaction intervals.

ClickHouse's unified architecture means fewer moving parts. You run clickhouse-server on every node, configure a remote_servers XML, and you're done.
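The `remote_servers` fragment mentioned above looks roughly like this (the cluster and host names are placeholders; this mirrors the cluster name used in the Distributed table earlier):

```xml
<!-- config.xml: two-shard cluster, one replica per shard -->
<remote_servers>
    <cluster_name>
        <shard>
            <replica><host>ch-node-1</host><port>9000</port></replica>
        </shard>
        <shard>
            <replica><host>ch-node-2</host><port>9000</port></replica>
        </shard>
    </cluster_name>
</remote_servers>
```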

The Imply Data team's comparison provides critical insight: "Druid's query latency varies depending on which nodes handle the request. Broker nodes add 5-10ms routing overhead. ClickHouse has no routing layer—queries go directly to the data."

This routing overhead compounds with complexity. Queries that span multiple segments across historical nodes require per-segment lookups. ClickHouse queries use parallel scanning across all shards, then a merge step.

```sql
-- ClickHouse: Parallel query across shards
SELECT
    toStartOfHour(event_time) AS hour,
    count(*)
FROM event_stream_distributed
WHERE event_time >= now() - INTERVAL 24 HOUR
GROUP BY hour;
```

The performance cliff for Druid happens when queries touch raw data. According to the Celerdata detailed comparison, "Druid queries on raw data can be 10-50x slower than ClickHouse for the same workload because of the segment decompression overhead."

For ClickHouse, three tuning rules deliver most of the wins:

1. Optimize your ORDER BY first. This determines your primary index. Sort by cardinality: low-cardinality columns first, high-cardinality last.

2. Use LowCardinality types. String columns with limited unique values benefit massively. ClickHouse stores them as dictionaries internally, reducing memory and disk usage by 70-90%.

3. Partition by time. Monthly partitions work for most workloads. Daily partitions for high-ingest systems. Avoid granular partitions (hourly) unless you're purging data at that granularity.

```sql
-- ClickHouse: Optimized schema pattern
CREATE TABLE analytics (
    timestamp DateTime,
    site_id String,
    country LowCardinality(String),
    browser_type LowCardinality(String),
    page_url String,
    load_time_ms UInt16
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(timestamp)
ORDER BY (site_id, country, toDate(timestamp), browser_type);
```
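The payoff of matching partition granularity to your retention policy is that purging becomes a metadata operation; a sketch against the table above (the month value and TTL interval are illustrative):

```sql
-- Drop an entire month in one cheap metadata operation
ALTER TABLE analytics DROP PARTITION '202501';

-- Or let ClickHouse expire old rows automatically
ALTER TABLE analytics MODIFY TTL timestamp + INTERVAL 13 MONTH;
```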

For Druid, three practices matter most:

1. Pre-aggregate aggressively. Define rollup metrics for every common query pattern. Druid's performance depends on reducing data before query time.

2. Tune segment sizes. Aim for 5-10 million rows per segment. Too small creates metadata overhead. Too large slows down queries.

3. Plan for compaction. Druid's real-time segments need periodic compaction into historical segments. Automate this with the Coordinator service.
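Auto-compaction is configured per datasource through the Coordinator's compaction config API; a sketch of the payload, with illustrative values (the segment-size target follows the 5-10 million row guideline above):

```json
{
  "dataSource": "event_stream",
  "skipOffsetFromLatest": "PT1H",
  "tuningConfig": {
    "partitionsSpec": {
      "type": "dynamic",
      "maxRowsPerSegment": 5000000
    }
  }
}
```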

After building over a dozen analytics systems, here's my decision framework:

Choose ClickHouse if:

  • Your queries are unpredictable (ad-hoc analytics)
  • You need joins across multiple tables
  • Data volume is under 100TB raw
  • Your team has strong SQL skills
  • Operational simplicity matters

Choose Druid if:

  • Your queries are pre-defined (operational dashboards)
  • Ingestion speeds need to exceed 500K events/sec per node
  • You need exactly-once semantics for event counting
  • Your team can manage distributed systems complexity
  • Pre-aggregation matches your query patterns

The Reddit discussion on ClickHouse vs Druid captures the consensus: "Druid is an amazing operational analytics tool. ClickHouse is a general-purpose analytical database. They overlap for time-series dashboards. Everything else is ClickHouse's territory."

ClickHouse's memory consumption surprises teams. A query scanning 1TB of data needs 100-200MB of working memory per core. Run 32 concurrent queries, and you're looking at 6-8GB just for query processing.

Solution: Use max_memory_usage settings and limit concurrent queries. Set up query queues for heavy analytical loads.

```sql
-- ClickHouse: Memory constraints
SET max_memory_usage = 5000000000;              -- 5GB per query
SET max_bytes_before_external_sort = 200000000; -- Spill sorts to disk
```

Druid's operational burden is real. The DoubleCloud comparison of Druid vs ClickHouse notes that "Druid requires 5 separate service types to run in production: Coordinator, Overlord, Broker, Historical, and MiddleManager."

Solution: Use a managed Druid service such as Imply, or accept the operational cost. Small teams should avoid self-managed Druid unless they have dedicated SRE support.

ClickHouse is eventually consistent by design. Inserts are committed asynchronously to the merge-tree. This means data written to one node may not be immediately visible on another.

Solution: Use insert_quorum for strong consistency. But understand that quorum inserts add latency.
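A quorum insert is just a session setting; a sketch, assuming event_stream were a ReplicatedMergeTree table with at least two replicas (the values are illustrative):

```sql
-- Block until 2 replicas acknowledge the write
SET insert_quorum = 2;
INSERT INTO event_stream VALUES (now(), 'u_1', 'view', 0, 'mobile');
```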

Druid's batch ingestion provides exactly-once semantics, and its Kafka indexing service extends exactly-once guarantees to streaming. Older streaming paths such as Tranquility were at-least-once, and deduplication there required additional tooling.

Is ClickHouse faster than Druid for analytical queries?
For raw data queries, ClickHouse is 5-10x faster. For pre-aggregated time-series queries, Druid can match or exceed ClickHouse performance.

Can Druid replace ClickHouse for general analytics?
No. Druid's schema is designed for time-series data with pre-defined metrics. ClickHouse handles arbitrary SQL, joins, and complex aggregations.

Which is easier to operate in production?
ClickHouse. A single clickhouse-server binary replaces Druid's five-plus service types, and startup takes under 10 seconds vs Druid's 2-3 minutes.

Does ClickHouse support real-time streaming?
Yes. ClickHouse has a Kafka engine table that consumes streaming data with millisecond latency. Druid's native streaming support is more mature, but both handle real-time workloads.
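The usual Kafka engine pattern pairs a consumer table with a materialized view that pushes each consumed batch into a MergeTree table; a sketch, where the broker address, topic, and consumer group are placeholders and the constant revenue/device values are illustrative:

```sql
-- Consumer table: reads from Kafka, stores nothing itself
CREATE TABLE events_kafka (
    event_time DateTime,
    user_id String,
    event_type String
) ENGINE = Kafka('kafka-broker:9092', 'events_topic', 'ch_group', 'JSONEachRow');

-- Materialized view: moves consumed rows into the MergeTree table
CREATE MATERIALIZED VIEW events_mv TO event_stream AS
SELECT event_time, user_id, event_type, 0 AS revenue, 'unknown' AS device
FROM events_kafka;
```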

Which database has better community support?
ClickHouse. GitHub stars (35K+), active Slack community, extensive documentation. Druid (13K stars) has a smaller but dedicated community.

Can I run both databases together?
Some teams do. Use Druid for high-throughput operational dashboards and ClickHouse for ad-hoc analytics. Data duplication is the main downside.

How do costs compare?
ClickHouse is typically 30-50% cheaper due to simpler infrastructure and better compression. The Flexera 2026 comparison shows ClickHouse requiring "50% less hardware for equivalent query workloads."

What's the learning curve?
ClickHouse: 2-4 weeks to production. Druid: 8-12 weeks to production. Druid's learning curve is steeper due to architectural complexity.

Here's what I want you to take away: ClickHouse handles 80% of real-time analytics use cases with less complexity and better performance. Druid excels in the remaining 20%—high-throughput, pre-aggregated operational dashboards where sub-second latency is non-negotiable.

Start with ClickHouse. Build your MVP. If you hit performance walls that only pre-aggregation can solve, then evaluate Druid. In 2025/2026, ClickHouse has closed most of the gap.

Your next step: Deploy ClickHouse locally with 10GB of your real data. Run your actual analytical queries. Measure performance. You'll know within a day whether it fits.
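A quick way to run that experiment, assuming Docker is available (the container name is a placeholder; the image is the official one):

```shell
# Start a throwaway ClickHouse server
docker run -d --name ch-test -p 8123:8123 -p 9000:9000 clickhouse/clickhouse-server

# Open a SQL client against it and load your data
docker exec -it ch-test clickhouse-client
```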

If you need help scaling your analytics infrastructure, reach out. SIVARO builds production AI and data systems for companies processing billions of events daily.


Nishaant Dixit is the founder of SIVARO, a product engineering company specializing in data infrastructure and production AI systems. Since 2018, he has built systems processing 200K+ events per second for fintech, e-commerce, and ad-tech companies. He writes about data infrastructure, ClickHouse, and production ML systems. Connect on LinkedIn.


  1. ClickHouse vs Druid: A Decisive Comparison | DoubleCloud
  2. In-depth: ClickHouse vs Druid | PostHog
  3. Full Comparison of ClickHouse vs Apache Druid | Reddit
  4. Apache Druid vs. ClickHouse comparison | DoubleCloud
  5. Apache Pinot, Druid, and ClickHouse Comparison | Startree
  6. Apache Druid vs. ClickHouse | Imply Data
  7. ClickHouse vs Druid: 10 key differences (2026) | Flexera
  8. ClickHouse vs Druid: Battle of real-time analytics engines | Tinybird
  9. ClickHouse vs. Apache Druid: A Detailed Comparison | Celerdata
  10. 8 Apache Druid Alternatives: Beyond Complex OLAP | Tinybird

Originally published at https://sivaro.in/articles/clickhouse-vs-druid-the-real-time-analytics-showdown.
