Originally published at harshit.cloud on 2025-11-19.
On 500 GB of logs over 7 days, on the same hardware: 94% lower query latencies, 37% smaller storage, and under half the CPU and RAM. The single number that surprised us most was the 12× drop in needle-in-a-haystack search times.
The setup
At Truefoundry we run multi-tenant ML workloads, which means we need fast ad-hoc search, high ingestion throughput, live log tailing, and minimal ops on 4 vCPU / 16 GB nodes. Loki was our default, but past the 1M-active-series mark it started showing 30s+ search latencies and high I/O amplification. So we benchmarked it head-to-head against VictoriaLogs and let the numbers decide.
The contestants in one line:
- Loki: Grafana Labs' log store. Compressed chunks, label-based indexing, LogQL. Brilliant Grafana integration; expensive regex scans and Go GC overhead at scale.
- VictoriaLogs: VictoriaMetrics' columnar LSM log database. Per-field indices, SIMD search, LogsQL. Single binary, low memory footprint, efficient compression.
Methodology in five bullets:
- Workload: 65 MB/s sustained ingestion via flog → Vector → destination
- Dataset: ~500 GB over 7 days across 20 namespaces and 40 apps
- Load test: Locust, 10 virtual users, 43 RPS sustained
- Hardware: 4 vCPU / 8 GiB RAM instances
- Tuning: Block-cache disabled to simulate cold reads
The headline figure
Before the methodology debate, here's what the seven days produced.
The memory line is the one that most directly translates into infrastructure cost. At steady state, VictoriaLogs sat around 1.3 GB while Loki held 6–7 GB. Freeing ~5 GB per node is the difference between bin-packing four tenants on a box and seven.
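The bin-packing claim is easy to sanity-check with a little arithmetic. The sketch below assumes the 16 GB node size from the setup section and a hypothetical 2 GB memory budget per tenant workload. The per-tenant figure is an assumption rather than something we measured, and 6.5 GB is simply the midpoint of Loki's 6-7 GB range, so treat the output as illustrative.

```python
import math

NODE_MEMORY_GB = 16.0     # node size quoted in the setup section
TENANT_BUDGET_GB = 2.0    # hypothetical per-tenant memory budget (assumption)


def tenants_per_node(log_store_gb: float) -> int:
    """Whole tenants that fit once the log store has taken its share of RAM."""
    free_gb = NODE_MEMORY_GB - log_store_gb
    return max(0, math.floor(free_gb / TENANT_BUDGET_GB))


loki_tenants = tenants_per_node(6.5)  # midpoint of Loki's 6-7 GB steady state
vl_tenants = tenants_per_node(1.3)    # VictoriaLogs steady state

print(f"Loki: {loki_tenants} tenants, VictoriaLogs: {vl_tenants} tenants")
```

Under those assumptions the sketch lands on four tenants next to Loki and seven next to VictoriaLogs. A different per-tenant budget shifts the absolute numbers, but the freed memory still buys extra tenants per node.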
Query performance
Four query patterns, run against the same 500 GB / 7-day index:
| Query Type | Loki | VictoriaLogs | Improvement |
|---|---|---|---|
| Stats (24h count) | 2.5s | 1.5s | 40% faster |
| Needle-in-Haystack (500 GB) | 12s | ~900ms | 12× faster |
| Pattern :3000 (7d) | 2.2s | 2.2s | Same |
| Non-existent (500 GB) | Timeout | 2.2s | VL completed |
Key Insight: VictoriaLogs' per-token index turns brute-force line scans into index lookups. Loki, once the label filter is exhausted, has nothing left but a full scan.
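To make the index-versus-scan distinction concrete, here is a toy sketch of the general idea. It is not VictoriaLogs' actual data structure: `TokenIndex`, `tokenize`, and `brute_force` are hypothetical names invented for illustration, and a real engine works on compressed blocks rather than Python sets.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[A-Za-z0-9_]+")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return set(TOKEN_RE.findall(text.lower()))


def brute_force(lines, phrase):
    """Scan-style search: every line is read and checked."""
    return [line for line in lines if phrase in line]


class TokenIndex:
    """Toy inverted index: token -> ids of the lines that contain it."""

    def __init__(self):
        self.lines = []
        self.postings = defaultdict(set)

    def add(self, line):
        line_id = len(self.lines)
        self.lines.append(line)
        for token in tokenize(line):
            self.postings[token].add(line_id)

    def search(self, phrase):
        tokens = tokenize(phrase)
        if not tokens:
            return []
        candidate_sets = []
        for token in tokens:
            ids = self.postings.get(token)
            if not ids:
                # A token missing from the index proves there is no match
                # without reading a single log line.
                return []
            candidate_sets.append(ids)
        candidates = set.intersection(*candidate_sets)
        # Only the few candidate lines are checked for the exact phrase.
        return [self.lines[i] for i in sorted(candidates) if phrase in self.lines[i]]


index = TokenIndex()
for line in [
    "GET /healthz 200 port=3000",
    "[UNIQUE-STATIC-LOG] ID=abc123 XYZ",
    "POST /api/v1/jobs 500 port=3000",
]:
    index.add(line)

hits = index.search("[UNIQUE-STATIC-LOG] ID=abc123 XYZ")
misses = index.search("ID=doesnotexist")
print(hits, misses)
```

The early return on a missing token is the part that matters for the non-existent query: the index can answer "nothing matches" without touching the log data, while a scan has to read everything to reach the same conclusion.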
The two queries that made the case, side by side:
Stats: counting logs over 24 hours
LogQL (Loki):
sum(count_over_time({app="servicefoundry-server"}[24h]))
LogsQL (VictoriaLogs):
{app="servicefoundry-server"} | stats count()
Needle in haystack: finding a single entry across 500 GB
LogQL:
{namespace="truefoundry", app!="grafana"} |= "[UNIQUE-STATIC-LOG] ID=abc123 XYZ"
LogsQL:
{namespace="truefoundry", app!="grafana"} "[UNIQUE-STATIC-LOG] ID=abc123 XYZ"
The non-existent query is the quiet one. Loki times out trying to prove a negative across 500 GB; VictoriaLogs returns "none" in 2.2 seconds. In production that's the difference between a dashboard that loads and one that times out.
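For anyone reproducing the needle-in-a-haystack comparison, the two queries can be sent to each system's HTTP query endpoint. The sketch below only builds the request URLs and sends nothing. The hostnames are hypothetical, the ports are the usual defaults, and `needle_urls` is a helper name made up for this example; the endpoint paths (`/loki/api/v1/query_range` and `/select/logsql/query`) are the publicly documented ones, but check them against the versions you run.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

LOKI_BASE = "http://loki.example.internal:3100"           # hypothetical host
VLOGS_BASE = "http://victorialogs.example.internal:9428"  # hypothetical host

STREAM = '{namespace="truefoundry", app!="grafana"}'
NEEDLE = "[UNIQUE-STATIC-LOG] ID=abc123 XYZ"


def needle_urls(end, window, limit=10):
    """Build the needle-in-a-haystack request URL for both backends."""
    start = end - window
    common = {
        "start": start.isoformat(),  # RFC 3339 timestamps
        "end": end.isoformat(),
        "limit": limit,
    }
    loki_query = f'{STREAM} |= "{NEEDLE}"'  # LogQL line filter
    vlogs_query = f'{STREAM} "{NEEDLE}"'    # LogsQL phrase filter
    return {
        "loki": f"{LOKI_BASE}/loki/api/v1/query_range?"
        + urlencode({"query": loki_query, **common}),
        "victorialogs": f"{VLOGS_BASE}/select/logsql/query?"
        + urlencode({"query": vlogs_query, **common}),
    }


urls = needle_urls(datetime(2025, 11, 19, tzinfo=timezone.utc), timedelta(days=7))
for name, url in urls.items():
    print(name, url)
```

Keeping the time window and limit identical on both sides is the point of the shared `common` dict: the only thing that differs between the two requests is the query language.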
Ingestion under pressure
We pushed both with 120 flog replicas to find the ceiling.
| Metric | Loki | VictoriaLogs | Delta |
|---|---|---|---|
| Peak ingestion | 20 MB/s | 66 MB/s | 3× higher |
| vCPU (sustained) | 4 (throttled) | 2 peak | 50% lower |
| Memory | ~4 GB | ~1.3 GB | 3× lower |
Loki hit the CPU wall first and never recovered. VictoriaLogs absorbed the same firehose with cycles to spare.
Load test under traffic
Locust, 10 concurrent users, simulating real read traffic:
- RPS handled: VictoriaLogs served 36% more requests per second
- p99 latency: 3.6× faster than Loki under load
- Tail latency: consistently lower at every percentile we measured
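Percentile comparisons like the ones above come from per-request latencies. The following is a minimal sketch of that calculation using the nearest-rank method, which is one common definition and not necessarily the one Locust applies internally. The sample latencies are synthetic, and `percentile` and `speedup` are illustrative names.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def speedup(baseline, candidate, p):
    """How many times faster the candidate is at percentile p."""
    return percentile(baseline, p) / percentile(candidate, p)


# Synthetic latencies in milliseconds, for illustration only.
baseline_ms = [float(i) for i in range(1, 101)]  # 1.0 .. 100.0
candidate_ms = [x / 2 for x in baseline_ms]      # exactly twice as fast

for p in (50, 95, 99):
    print(f"p{p}: {speedup(baseline_ms, candidate_ms, p):.1f}x faster")
```

Comparing at several percentiles, not just the median, is what backs a statement like "consistently lower at every percentile": a system can win at p50 and still lose badly in the tail.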
Why the gap is this big
Four design choices doing most of the work:
- Full-text indexing. Per-token indices skip line-by-line filtering entirely.
- Columnar LSM layout. Reads touch only the columns the query asks for; fewer disk seeks.
- Memory discipline. Lower steady-state overhead means more headroom for everything else.
- SIMD search. Vectorised inner loops on commodity CPUs add up over billions of lines.
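The columnar point can be illustrated with a toy comparison. This is not VictoriaLogs' on-disk format: the records, field names, and helper functions below are made up for the example, and "values touched" stands in for bytes read from disk.

```python
from collections import Counter

# Hypothetical log records, stored row by row.
rows = [
    {"_time": 1, "namespace": "truefoundry", "app": "api", "_msg": "GET /healthz 200"},
    {"_time": 2, "namespace": "truefoundry", "app": "worker", "_msg": "job started"},
    {"_time": 3, "namespace": "truefoundry", "app": "api", "_msg": "POST /jobs 500"},
]


def to_columns(records):
    """Pivot row-oriented records into one list per field."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def count_by_app_rows(records):
    """Row layout: every field of every record is read to reach 'app'."""
    counts, touched = Counter(), 0
    for record in records:
        touched += len(record)
        counts[record["app"]] += 1
    return counts, touched


def count_by_app_columns(columns):
    """Column layout: only the 'app' column is read."""
    app_column = columns["app"]
    return Counter(app_column), len(app_column)


row_counts, row_touched = count_by_app_rows(rows)
col_counts, col_touched = count_by_app_columns(to_columns(rows))
print(row_counts == col_counts, row_touched, col_touched)
```

Both layouts return the same counts, but the row version touches 12 values and the column version touches 3. The gap widens with every extra field a record carries, which is why wide, structured logs benefit most.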
When to pick which
Choose VictoriaLogs if:
- Text search and grep-style queries are the primary workload
- Ad-hoc exploration across large windows matters
- Resource efficiency and bin-packing density matter
- You want fewer knobs to tune in production
Choose Loki if:
- Label-based queries dominate; full-text is rare
- Deep Grafana ecosystem integration is non-negotiable
- You already operate Loki at scale and the migration cost outweighs the wins
For us, on this workload, the resource economics decided it. The freed memory per node became real infrastructure savings within a quarter. 12 seconds turned into 900 milliseconds with no tuning, and that's the number I keep quoting six months later.



