DEV Community

ZhengZhiCong
ZhengZhiCong

Posted on

VictoriaMetrics Stream Aggregation: A Three-Year Retrospective (2026)

It's been exactly three years since the first article in this series was published in March 2023. The VictoriaMetrics ecosystem has changed dramatically since then. Let's revisit the problems we laid out, see what the official project has resolved, and assess where our custom stream-metrics-route gateway stands today.


I. The Problems We Identified in 2023

Here's a quick recap of the core issues from the original post:

# Problem 2023 Status
P1 Collection gap inflation Network jitter or performance issues cause time gaps that inflate stream aggregation deltas
P2 Single-node compute limits Stream aggregation has no historical state, fast but single-instance bottlenecked
P3 Distributed task allocation Which compute node should each sample be assigned to?
P4 Out-of-order discarding for same-dimension metrics Multiple nodes computing the same dimension with different time windows causes later values to be discarded
P5 Resource balancing Uneven load across distributed compute nodes
P6 Task ID dimension explosion Stream aggregation inserts node IDs into aggregated time series — the label cardinality grows with every node you add

To address these, we built stream-metrics-route, a Go-based distributed stream aggregation gateway.


II. Three Years Later — What the Official Project Has Done

I reviewed VictoriaMetrics changelogs from v1.86 through v1.138.0 and the official documentation. Here's the scorecard.

✅ Perfectly Resolved

Issues P3, P5: Distributed Task Allocation & Resource Balancing

Official solution: vmagent now natively supports -remoteWrite.shardByURL with consistent hashing sharding.

Starting from v1.86, basic shardByURL was introduced. v1.138.0 (March 2026) was the real milestone — it upgraded the data distribution algorithm from round-robin to consistent hashing, which significantly reduces data redistribution ratios during node changes.

The architecture evolution looks like this:

┌─────────────────┐     ┌─────────────────┐
│ Prometheus      │     │ Prometheus      │
│ Agent 1         │     │ Agent 2         │
└────────┬────────┘     └────────┬────────┘
         │ remote write          │ remote write
         ▼                       ▼
┌─────────────────────────────────────────┐
│           vmagent Cluster               │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐│
│  │vmagent-0 │  │vmagent-1 │  │vmagent-2 ││
│  └─────┬────┘  └─────┬────┘  └─────┬────┘│
│        │             │             │     │
│        └─────────────┼─────────────┘     │
│                      ▼                   │
│           ┌──────────────────┐           │
│           │ Consistent Hash │           │
│           └────────┬─────────┘           │
└────────────────────┼─────────────────────┘
                     │ shard by hash
         ┌───────────┼───────────┐
         ▼           ▼           ▼
   ┌──────────┐ ┌──────────┐ ┌──────────┐
   │vmstorage │ │vmstorage │ │vmstorage │
   │    -0    │ │    -1    │ │    -2    │
   └──────────┘ └──────────┘ └──────────┘
Enter fullscreen mode Exit fullscreen mode

The VictoriaMetrics blog provides specific algorithm recommendations. Combined with VictoriaMetrics Operator, you can manage shards via shardCount.

Issue P2: Single-Node Compute Scaling

vmagent now supports horizontal scaling natively with replicas + shardCount, including HA support — see Issue #5573.

Out-of-Order / Delayed Data Accuracy (P1 — Partial Mitigation)

v1.112.0 (February 2025) was a key release, adding Aggregation Windows — dual-window buffering for histogram and rate calculations. Instead of flushing immediately, the output is delayed by a samples_lag window, which significantly improves accuracy for late-arriving data. The tradeoff: roughly doubled memory usage (maintaining two aggregation windows simultaneously).

How it works:

Collector          vmagent           VictoriaMetrics
   │                  │                    │
   │  sample1 @T0    │                    │
   ├─────────────────►│  Write to          │
   │                  │  Window A (current)│
   │                  │                    │
   │  sample2 @T1    │                    │
   │  (delayed)      │                    │
   ├─────────────────►│  Write to          │
   │                  │  Window B (previous)│
   │                  │                    │
   │                  │  Aggr result A @T2 │
   │                  ├────────────────────►│
   │                  │  Aggr result B @T3 │
   │                  ├────────────────────►│
Enter fullscreen mode Exit fullscreen mode

See the official docs on streaming aggregation windows.

❌ Still Unresolved

True Distributed Stream Aggregation Coordination

vmagent's stream aggregation is single-instance. There is no coordination mechanism between instances — if two vmagent instances aggregate the same metric, you get duplicate or conflicting output. The official recommendation is to use without/by label clauses to divide responsibility between instances, rather than providing a cross-instance coordination protocol.

Task ID Dimension Explosion (P6)

Official vmagent still inserts internal labels (such as _aggr-related labels) into aggregated time series, but lacks a stream_task_id pre-marking plus dimension control design.


III. stream-metrics-route: Current Status and Value

Core Code Architecture

File Role
router.go Routing core — filters metrics based on relabel rules
remotecluster.go Dual hashmod scheduling core
remotewrite.go Remote write HTTP client
kafka.go Kafka producer

The Core Algorithm (from remotecluster.go)

// Dual hashmod scheduling
hash := sortLabelsHashKey(ts.Labels)
dime := hashMod(r.dimension, hash) // First hashmod → task partition ID

ts.Labels = append(ts.Labels, prompb.Label{
    Name:  "stream_task_id",
    Value: strconv.Itoa(dime), // Insert stream_task_id label
})

hashnode = sortLabelsHashKey(filterLabels) // Second hashmod → node selection
tmpch := hashMod(r.uplen, hashnode)       // Which backend writer to send to
Enter fullscreen mode Exit fullscreen mode

Is It Still Needed in 2026?

Yes — but with an adjusted role. The positioning should shift from "full stream aggregation gateway" to "metric distribution routing gateway + Kafka integration layer." The core differentiated value:

  1. Dual hashmod scheduling + stream_task_id pre-injection — tags metrics at the gateway layer, so all downstream nodes route consistently by this ID. This solves dimension control earlier in the pipeline than the official approach.

  2. Multi-backend async distribution — supports async distribution to both Kafka and remote write, solving the "synchronous forwarding blocks the time window" problem from the original post.

  3. Native Prometheus relabeling integration — works with standard Prometheus relabel configs, no custom syntax to learn.


IV. Recommended 2026 Hybrid Architecture

┌─────────────────┐   ┌──────────────────┐
│  Prometheus     │   │  Business System │
│  Agent Cluster  │   │  Metrics (Kafka) │
└────────┬────────┘   └────────┬─────────┘
         │                     │
         ▼                     ▼
    ┌──────────────────────────────┐
    │   stream-metrics-route       │
    │   (Routing Layer)            │
    │   - Dual hashmod scheduling  │
    │   - stream_task_id injection │
    │   - Relabeling               │
    └──────┬──────┬──────┬─────────┘
           │      │      │
    task=0 │ task=1│task=2│
           ▼      ▼      ▼
    ┌─────────────────────────┐
    │   vmagent Cluster       │
    │  (v1.112.0+ with        │
    │   aggregation windows)  │
    └──────────┬──────────────┘
               │
               ▼
    ┌──────────────────┐      ┌────────────┐
    │  VictoriaMetrics │      │   Kafka    │
    │  (Storage)       │      │   (Topic)  │
    └──────────┬───────┘      └────────────┘
               │
               ▼
    ┌──────────────────┐
    │  vmalert         │
    │  Grafana         │
    └──────────────────┘
Enter fullscreen mode Exit fullscreen mode

Key Configuration

vmagent version requirement: >= v1.112.0, with aggregation windows enabled:

# stream aggregation config
- match: 'http_request_duration_seconds_bucket'
  interval: 5m
  without: [instance]
  enable_windows: true   # Critical! Enables dual-window buffering
  outputs: [rate_sum]
Enter fullscreen mode Exit fullscreen mode

V. Evolution Recommendations

Short-term

Action Description
Upgrade vmagent to >= v1.112.0 Enable enable_windows: true to improve histogram aggregation accuracy
Evaluate stream-metrics-route necessity If you have no Kafka requirement or high-cardinality stream_task_id control need, consider migrating away

Medium-term

Action Description
Retain stream-metrics-route as front-end routing only Keep hashmod task allocation + Kafka distribution; remove aggregation responsibility
Disable raw metric persistence Write only stream-aggregated results to storage to reduce volume
Add metadata management module The ruler-handle-process from the original post (dynamic Record Rule by dimension) is worth building or contributing

Long-term

Action Description
Contribute stream_task_id dimension control upstream If the dual hashmod design proves out in production
Improve monitoring metrics Add business-level metrics — queue depth per routing rule, distribution latency, etc.

Putting It All Together

Dimension Assessment
Problem resolution rate ~50% — 2 of 4 core problems resolved via official upgrades; 2 still need custom solutions
Is stream-metrics-route still needed? Yes — repositioned as "metric distribution routing gateway + Kafka integration layer"
Recommended architecture Prometheus → stream-metrics-route → vmagent v1.112.0+ → VictoriaMetrics Storage

Three years is a long time in the observability space. The VictoriaMetrics ecosystem has matured significantly — consistent hashing, aggregation windows, and native sharding all address problems that required custom tooling in 2023. But the hard problems around true distributed stream aggregation coordination and dimension control at the gateway layer remain open.

If you're running a similar stack, the hybrid approach — letting the official project handle what it's good at (single-node aggregation, storage) while keeping custom routing for what it isn't (distributed coordination, dimension pre-injection, Kafka bridging) — has proven to be the right call for us.


Originally published on my blog

Top comments (0)