When we designed @logtide/reservoir, the pluggable storage abstraction layer for Logtide, we had to make a real decision: which database should be the default for an observability platform?
The conventional wisdom says: time-series data at scale → ClickHouse. It's what everyone building in this space seems to reach for: SigNoz is built on it, Uber rebuilt its logging platform on it, and plenty of others are moving the same way.
We didn't. We picked TimescaleDB as our default, with ClickHouse available for enterprise deployments and MongoDB for teams already invested in that ecosystem.
We built a proper benchmark suite and ran it. Here are the actual numbers.
The Setup
All three engines were benchmarked under identical conditions, running in Docker on the same machine, seeded with the same synthetic dataset, tested at four volume tiers: 1K, 10K, 100K, and 1M records.
Three data types were tested separately (logs, spans from distributed traces, and metrics) because the query patterns are fundamentally different for each. Each test ran 3 iterations after 1 warmup round. Results are p50 latency unless otherwise noted.
The benchmark suite is open source: it ships in Logtide's repository, and you can run it yourself.
Ingestion: Where ClickHouse Has a Problem
The first thing that jumped out was ClickHouse's ingestion behavior at small-to-medium batch sizes.
Log ingestion p50 latency (batch 1,000):
| Engine | 1K rows | 10K rows | 100K rows | 1M rows |
|---|---|---|---|---|
| TimescaleDB | 17.6ms | 14.2ms | 13.9ms | 13.3ms |
| ClickHouse | 400.1ms | 400.4ms | 399.8ms | 400.0ms |
| MongoDB | 37.0ms | 39.5ms | 37.2ms | — |
ClickHouse sits at almost exactly 400ms for batch 1,000 across all volume tiers. That's not a coincidence; it's ClickHouse's async insert behavior. When async_insert = 1 is enabled (common in modern clients and managed services), ClickHouse buffers writes in memory and flushes them when async_insert_busy_timeout_ms elapses. Our setup has that timeout at 400ms. The 400 isn't a random number; it's a configured flush interval.
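For reference, a sketch of what that configuration amounts to at the session level (we're assuming wait_for_async_insert is on, which is what makes the flush interval show up as client-visible insert latency):

```sql
-- ClickHouse async insert settings (session-level sketch):
SET async_insert = 1;                    -- buffer inserts server-side instead of writing a part per insert
SET wait_for_async_insert = 1;           -- client blocks until the buffer flushes to storage
SET async_insert_busy_timeout_ms = 400;  -- flush interval: the ~400ms floor in the table above
```

With wait_for_async_insert = 0 the client returns immediately and the latency disappears from the benchmark, but the cost moves elsewhere: unflushed data can be lost on a crash.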
The buffering exists precisely because ClickHouse doesn't handle high-frequency small writes well natively. Its columnar storage format requires merging data into sorted parts, a process that's expensive if triggered on every small insert. Async inserts are the workaround: batch writes in memory, flush periodically, pay the merge cost less often. It's the right design for bulk analytics ingestion. It's the wrong design if you're pushing logs from 10 microservices every few seconds.
This matters a lot for observability workloads. When your application is logging in real time, you're not sending 10,000-log batches. You're sending small, frequent writes. At batch 100, ClickHouse delivers 250 ops/s. TimescaleDB delivers 14,200 ops/s. That's a 56x difference at a batch size that's very common in practice.
ClickHouse catches up at batch 10,000: 83,843 ops/s vs 120,934 ops/s for TimescaleDB. At bulk-ingestion scale they're comparable, but you need to be running at that scale to benefit.
MongoDB sits in the middle: a consistent ~25K ops/s regardless of batch size, with no timing artifacts. Predictable, if not spectacular.
Query Latency: The Result That Settles the Debate
This is where the numbers get dramatic.
Log query p50 latency at 100K records:
| Operation | TimescaleDB | ClickHouse | MongoDB |
|---|---|---|---|
| Single service filter | 0.47ms | 44.8ms | 304ms |
| Multi-filter | 0.48ms | 35.2ms | 309ms |
| Full-text search | 0.45ms | 32.2ms | 39.9ms |
| Narrow time range (1h) | 0.49ms | 8.7ms | 3.4ms |
| Pagination (offset 1000) | 0.40ms | 85.8ms | 320ms |
| Aggregate 1h buckets | 0.41ms | 15.1ms | 376ms |
TimescaleDB is answering filtered log queries in under half a millisecond at 100K records. ClickHouse takes 35-85ms for the same queries. MongoDB takes 300-400ms.
The scaling story is equally stark. At 1M records, TimescaleDB's query latency barely moves: still 0.46ms for a service filter. ClickHouse degrades to 244ms. MongoDB wasn't tested at 1M for logs (the 100K numbers already showed where things were heading).
This is the TimescaleDB superpower: hypertable partitioning + continuous aggregates. Most log queries filter by time range and service. TimescaleDB chunks data by time, and those chunks are indexed by service. The queries skip entire partitions instead of scanning. The continuous aggregates make count and aggregate queries nearly free because the work is already done.
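As an illustration, here is a minimal sketch of that layout; table and column names are ours for the example, not Logtide's actual schema:

```sql
-- Hypertable: rows are chunked by time, and each chunk carries
-- its own (service, time) index, so time+service filters prune chunks.
CREATE TABLE logs (
  time    TIMESTAMPTZ NOT NULL,
  service TEXT        NOT NULL,
  level   TEXT,
  message TEXT
);
SELECT create_hypertable('logs', 'time', chunk_time_interval => INTERVAL '1 day');
CREATE INDEX ON logs (service, time DESC);

-- Continuous aggregate: hourly counts are maintained incrementally,
-- so dashboard aggregations read precomputed rows instead of raw chunks.
CREATE MATERIALIZED VIEW logs_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket, service, count(*) AS n
FROM logs
GROUP BY bucket, service;
```

A query like `WHERE service = 'api' AND time > now() - interval '1 hour'` then touches at most one or two chunks, and hourly rollups come straight from `logs_hourly`.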
The One Place ClickHouse Wins
There's an important exception to the TimescaleDB dominance: count operations at scale.
Count p50 at 1M records:
| Operation | TimescaleDB | ClickHouse |
|---|---|---|
| Full count | 0.38ms | 11.25ms |
| Filtered count | 0.43ms | 14.42ms |
Wait, TimescaleDB wins here too? Yes, because of the countEstimate optimization we built: instead of COUNT(*), we use the planner's row estimates from EXPLAIN for approximate counts. Zero scan, sub-millisecond.
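The idea, sketched in plain PostgreSQL (not Logtide's exact code; table and filter are illustrative):

```sql
-- Approximate filtered count without a scan: ask the planner.
-- The JSON output contains "Plan Rows", the planner's row estimate.
EXPLAIN (FORMAT JSON) SELECT * FROM logs WHERE service = 'api';

-- For an unfiltered table count, the catalog estimate is even cheaper:
SELECT reltuples::bigint AS approx_rows FROM pg_class WHERE relname = 'logs';
```

Estimates drift between ANALYZE runs, so this only works for UI counts where "about 1.2M" is as useful as the exact number.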
Where ClickHouse genuinely competes is aggregate throughput at high volume: at 1M records, its 1-minute-bucket aggregate runs at 55,507 ops/s, in the same range as TimescaleDB. ClickHouse is built for columnar analytical queries over huge datasets; if you're running complex analytics across months of data with many group-by combinations, it'll pull ahead.
For the interactive dashboard queries that dominate observability UIs ("show me the last hour, filtered by this service"), it's not even a fair fight: TimescaleDB wins outright.
Spans: The Interesting Reversal
The span (distributed tracing) results tell a different story from logs.
Trace query p50 at 10K records:
| Operation | TimescaleDB | ClickHouse | MongoDB |
|---|---|---|---|
| Query all traces | 2.5ms | 23.6ms | 1.6ms |
| Query error traces | 1.6ms | 22.6ms | 3.3ms |
| Get trace by ID | 0.29ms | 4.3ms | 0.40ms |
| Service dependencies | 0.42ms | 179ms | 444ms |
MongoDB is faster than TimescaleDB on some trace queries at this scale. The reason: MongoDB's document model fits trace data naturally. A trace is a document with nested spans. The queryTraces (all) query maps directly to a collection scan with a simple index lookup. TimescaleDB has to join spans to reconstruct traces.
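For intuition, a trace stored the document way might look roughly like this (a made-up shape for illustration, not Logtide's actual schema):

```json
{
  "traceId": "4bf92f35",
  "rootService": "checkout",
  "startTime": "2025-01-15T10:32:00Z",
  "spans": [
    { "spanId": "a1", "name": "POST /checkout", "durationMs": 48 },
    { "spanId": "b2", "parentId": "a1", "name": "db.query", "durationMs": 9 }
  ]
}
```

One indexed lookup on `traceId` returns the whole trace; no join needed.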
Both MongoDB and TimescaleDB stay well ahead of ClickHouse on span queries. ClickHouse at 10K concurrent span queries (50 parallel) takes 1.76 seconds. TimescaleDB handles the same load in 10ms. That's what "not designed for point lookups" looks like in practice.
At 100K spans, the MongoDB advantage on trace queries disappears: querySpans (by service) goes from 82ms to 159ms, while TimescaleDB holds at 0.65ms. The document model helps at smaller scales but doesn't index-skip the way hypertables do.
Concurrency: The Story Nobody Tells
Single-query latency is fine for benchmarks. Production workloads are concurrent.
Concurrent log queries (50 parallel) p50:
| Volume | TimescaleDB | ClickHouse | MongoDB |
|---|---|---|---|
| 1K | 6.8ms | 334ms | 665ms |
| 10K | 6.7ms | 401ms | 792ms |
| 100K | 6.2ms | 895ms | 2,380ms |
| 1M | 6.2ms | 6,307ms | — |
TimescaleDB's concurrency numbers are remarkably flat. 50 parallel queries at 100K records: 6.2ms. Same 50 parallel queries at 1M records: still 6.2ms.
ClickHouse at 50 parallel queries on 1M records: 6.3 seconds. PostgreSQL's connection-per-query model and MVCC handle concurrent readers without degradation. ClickHouse's columnar engine serializes heavy queries and saturates threads.
This matters if you're running Logtide for a team. Multiple people with dashboards open, alert evaluations running in the background, scheduled reports firing: that's concurrent load. TimescaleDB absorbs it. ClickHouse struggles with it.
Metrics: MongoDB's Surprise
Metrics data was the unexpected MongoDB story.
Concurrent metric queries (50 parallel) at 100K:
| Engine | p50 |
|---|---|
| TimescaleDB | 6.3ms |
| ClickHouse | 284.9ms |
| MongoDB | 53.7ms |
MongoDB beats ClickHouse on concurrent metric queries by 5x. The reason: our MongoDB metrics implementation uses the native $percentile aggregation pipeline, which MongoDB handles efficiently in-memory at this scale. ClickHouse's columnar approach adds overhead for the many small aggregations typical of metrics dashboards.
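A sketch of what such a pipeline looks like in mongosh (collection and field names are illustrative):

```javascript
// p50/p95/p99 of a metric over a time window, computed server-side.
// $percentile is a $group accumulator available since MongoDB 7.0;
// "approximate" is its t-digest-based method.
db.metrics.aggregate([
  { $match: { name: "http_request_duration_ms",
              ts: { $gte: ISODate("2025-01-15T00:00:00Z") } } },
  { $group: {
      _id: null,
      p: { $percentile: { input: "$value", p: [0.5, 0.95, 0.99], method: "approximate" } }
  } }
])
```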
At 1K and 10K records, MongoDB's metric aggregations (avg, sum, min, max, percentiles) all land in the 11-17ms range: broadly comparable to ClickHouse's 8-21ms, though well behind TimescaleDB's sub-millisecond numbers.
The catch that these latency numbers don't show: MongoDB stores metrics as BSON documents without time-series-specific compression. TimescaleDB uses columnar compression on hypertables, and ClickHouse uses the Gorilla codec (XOR-based) for floats and Delta encoding for timestamps, algorithms designed specifically for the repetitive patterns in metrics data. In practice, the same year of metrics data will occupy significantly less disk on TimescaleDB or ClickHouse than on MongoDB. If storage cost matters at your scale, that tradeoff should factor into the decision.
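For comparison, this is roughly what those codecs look like on a ClickHouse metrics table (a sketch; column names are illustrative):

```sql
CREATE TABLE metrics (
  ts    DateTime64(3)           CODEC(Delta, ZSTD),   -- timestamps compress well as deltas
  name  LowCardinality(String),
  value Float64                 CODEC(Gorilla, ZSTD)  -- XOR-based float compression
) ENGINE = MergeTree
ORDER BY (name, ts);
```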
MongoDB won 4 of 52 benchmark categories at 1K records and 2 at 10K. Small wins, but real ones, mostly around span lookups by trace ID and narrow time-range queries, where its document indexing shines.
The Decision Framework
After seeing these numbers, here's how we think about the choice:
Use TimescaleDB (default) when:
- You're running Logtide for a single team or SMB
- You're already comfortable with PostgreSQL operationally
- You want the lowest query latency across the board
- You have mixed concurrent load (dashboards + alerts + searches)
- You're on AWS RDS for PostgreSQL with TimescaleDB extension, or Aurora PostgreSQL
Use ClickHouse when:
- You're ingesting exclusively in large batches (10K+ per request)
- Your primary use case is analytical queries over months of historical data
- You have a dedicated ops team managing ClickHouse infrastructure
- You're on AWS EC2 with a self-managed ClickHouse cluster
Use MongoDB when:
- You're already running MongoDB in your infrastructure (DocumentDB, Atlas, FerretDB, Cosmos DB in Mongo mode)
- Your workload is trace-heavy with many individual document lookups
- You want to avoid running a separate database just for observability
- You're on AWS DocumentDB and don't want another managed service
The @logtide/reservoir abstraction means the application code doesn't care which engine you pick. You swap the config, run the migrations, and the same Logtide instance works on all three.
What These Numbers Don't Tell You
Benchmarks lie in specific ways, and this one has a scale ceiling you should be aware of.
1M records is not a large dataset. A moderately busy production service can generate 1M logs in minutes. At 100M or 1B rows, where real enterprise observability workloads live, the picture changes. TimescaleDB's B-tree indexes eventually stop fitting in RAM; when that happens, queries start hitting disk and latency climbs non-linearly. ClickHouse's columnar format and extreme compression (often 10:1 or better for log data) mean its working set stays in RAM much longer. At billion-row scale, the engines invert: ClickHouse's full-table scans become faster than TimescaleDB's index misses.
These benchmarks represent SMB-scale workloads: teams generating tens of millions of log entries per day, not hundreds of millions per hour. That's exactly Logtide's target. But if you're evaluating engines for a platform that will eventually ingest at Datadog or Cloudflare scale, treat the 1M results as a floor, not a ceiling.
The other caveats: these tests ran on a single machine, fresh database, warm connection pool, no competing load. Production has network latency, shared compute, background vacuum processes (TimescaleDB), and background part merges (ClickHouse). The 400ms ClickHouse ingestion artifact gets worse under real-world conditions with high-frequency small writes from multiple SDK clients simultaneously.
MongoDB's metrics performance advantage at small scale comes with a storage cost that isn't visible in these benchmarks: MongoDB doesn't compress numeric time-series data the way TimescaleDB (using columnar compression) or ClickHouse (using Gorilla/Delta-Delta encoding) do. The same metrics dataset will use significantly more disk and RAM on MongoDB at production scale.
The benchmark suite is in the repo if you want to run it against your own infrastructure with your own dataset shapes.
Why TimescaleDB Won 96% of Tests
The summary from the benchmark runner:
```
timescale    50 wins (96%)
clickhouse    0 wins ( 0%)
mongodb       4 wins ( 4%)
```
Zero wins for ClickHouse isn't a bug in the benchmark; it's a reflection of the workload. Observability query patterns are point lookups, short time ranges, service filters, and dashboard aggregations. That's TimescaleDB's wheelhouse.
ClickHouse excels at full-table analytics. When you're doing SELECT service, sum(errors) FROM logs WHERE month = 'February' across 500 million rows, ClickHouse will leave TimescaleDB behind. That query pattern doesn't dominate an observability dashboard. It dominates a data warehouse.
We made the right call. But we're glad we have the numbers to prove it now.
@logtide/reservoir is open source; the TimescaleDB, ClickHouse, and MongoDB adapters ship in Logtide 0.8.0.
If you run it against your own setup and get different results, open an issue. We'd genuinely like to know.