ZhengZhiCong

Posted on Jun 7

VictoriaMetrics Stream Aggregation: A Three-Year Retrospective (2026)

#database #devops #monitoring #performance

It has been roughly three years since the first article in this series was published in March 2023, and the VictoriaMetrics ecosystem has changed dramatically since then. Let's revisit the problems we laid out, see what the official project has resolved, and assess where our custom stream-metrics-route gateway stands today.

I. The Problems We Identified in 2023

Here's a quick recap of the core issues from the original post:

#	Problem	Description (2023)
P1	Collection gap inflation	Network jitter or performance issues cause time gaps that inflate stream aggregation deltas
P2	Single-node compute limits	Stream aggregation keeps no historical state, which makes it fast but bottlenecked on a single instance
P3	Distributed task allocation	Which compute node should each sample be assigned to?
P4	Out-of-order discarding for same-dimension metrics	When multiple nodes compute the same dimension with different time windows, later values are discarded
P5	Resource balancing	Uneven load across distributed compute nodes
P6	Task ID dimension explosion	Stream aggregation inserts node IDs into aggregated time series, so label cardinality grows with every node you add
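
To make P1 concrete, here is a minimal sketch, assuming a counter that grows by 10 per 15-second scrape and a 30-second aggregation window keyed by arrival time. The `sample` type and `windowIncreases` function are hypothetical names for illustration, not code from vmagent or stream-metrics-route.

```go
package main

import "fmt"

// sample is a hypothetical counter sample as seen by the aggregator.
type sample struct {
	arrival int64   // arrival time at the aggregator, in seconds
	value   float64 // cumulative counter value
}

// windowIncreases credits each sample's increase over the previous sample
// to the fixed window in which the sample arrived.
func windowIncreases(samples []sample, window int64) map[int64]float64 {
	out := make(map[int64]float64)
	for i := 1; i < len(samples); i++ {
		w := samples[i].arrival / window
		out[w] += samples[i].value - samples[i-1].value
	}
	return out
}

func main() {
	// Every scrape arrives on time.
	onTime := []sample{{0, 0}, {15, 10}, {30, 20}, {45, 30}, {60, 40}, {75, 50}}
	// The scrapes at 30s and 45s are lost, so the sample at 60s
	// carries three scrape intervals' worth of growth.
	gapped := []sample{{0, 0}, {15, 10}, {60, 40}, {75, 50}}

	fmt.Println("on time:", windowIncreases(onTime, 30))
	fmt.Println("gapped: ", windowIncreases(gapped, 30))
}
```

With the gap, window 2 is credited with an increase of 40 instead of 20, and window 1 receives nothing at all: the total is preserved, but the per-window delta is inflated.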

To address these, we built stream-metrics-route, a Go-based distributed stream aggregation gateway.

II. Three Years Later — What the Official Project Has Done

I reviewed VictoriaMetrics changelogs from v1.86 through v1.138.0 and the official documentation. Here's the scorecard.

✅ Resolved or Mitigated

Issues P3, P5: Distributed Task Allocation & Resource Balancing

Official solution: vmagent now natively supports -remoteWrite.shardByURL with sharding based on consistent hashing.

Basic shardByURL support was introduced in v1.86. The real milestone was v1.138.0 (March 2026), which upgraded the data distribution algorithm from round-robin to consistent hashing and significantly reduces the share of data that is redistributed when nodes are added or removed.
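
Why consistent hashing reduces redistribution can be illustrated with a small sketch. It compares plain hash-modulo placement with rendezvous hashing, which is used here only as one well-known form of consistent hashing; this is an assumption for illustration and not a description of vmagent's actual algorithm. All function names are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
)

// hash64 hashes a string with FNV-1a.
func hash64(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// modShard places a key with plain hash-modulo.
func modShard(key string, n int) int {
	return int(hash64(key) % uint64(n))
}

// rendezvousShard scores every shard for the key and picks the highest score.
// Adding a shard can only move the keys that the new shard wins.
func rendezvousShard(key string, n int) int {
	best, bestScore := 0, uint64(0)
	for i := 0; i < n; i++ {
		score := hash64(strconv.Itoa(i) + "|" + key)
		if i == 0 || score > bestScore {
			best, bestScore = i, score
		}
	}
	return best
}

// movedFraction reports the share of keys whose shard changes
// when the cluster grows from n to n+1 shards.
func movedFraction(pick func(string, int) int, keys, n int) float64 {
	moved := 0
	for k := 0; k < keys; k++ {
		key := "series-" + strconv.Itoa(k)
		if pick(key, n) != pick(key, n+1) {
			moved++
		}
	}
	return float64(moved) / float64(keys)
}

func main() {
	fmt.Printf("hash-mod:   %.2f of keys moved\n", movedFraction(modShard, 10000, 3))
	fmt.Printf("rendezvous: %.2f of keys moved\n", movedFraction(rendezvousShard, 10000, 3))
}
```

For a uniform hash, growing from 3 to 4 shards is expected to move about three quarters of the keys under hash-modulo, but only about one quarter under rendezvous hashing, since only keys won by the new shard relocate.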

The architecture evolution looks like this:

┌─────────────────┐     ┌─────────────────┐
│ Prometheus      │     │ Prometheus      │
│ Agent 1         │     │ Agent 2         │
└────────┬────────┘     └────────┬────────┘
         │ remote write          │ remote write
         ▼                       ▼
┌─────────────────────────────────────────┐
│           vmagent Cluster               │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐│
│  │vmagent-0 │  │vmagent-1 │  │vmagent-2 ││
│  └─────┬────┘  └─────┬────┘  └─────┬────┘│
│        │             │             │     │
│        └─────────────┼─────────────┘     │
│                      ▼                   │
│           ┌──────────────────┐           │
│           │ Consistent Hash │           │
│           └────────┬─────────┘           │
└────────────────────┼─────────────────────┘
                     │ shard by hash
         ┌───────────┼───────────┐
         ▼           ▼           ▼
   ┌──────────┐ ┌──────────┐ ┌──────────┐
   │vmstorage │ │vmstorage │ │vmstorage │
   │    -0    │ │    -1    │ │    -2    │
   └──────────┘ └──────────┘ └──────────┘

The VictoriaMetrics blog provides specific algorithm recommendations. With the VictoriaMetrics Operator, you can manage the number of shards via shardCount.

Issue P2: Single-Node Compute Scaling

vmagent now supports horizontal scaling natively with replicas + shardCount, including HA support — see Issue #5573.

Out-of-Order / Delayed Data Accuracy (P1 — Partial Mitigation)

v1.112.0 (February 2025) was a key release: it added aggregation windows, a dual-window buffering scheme for histogram and rate calculations. Rather than being flushed immediately, output is delayed by a samples_lag window, which significantly improves accuracy for late-arriving data. The tradeoff is roughly doubled memory usage, because two aggregation windows are maintained simultaneously.

How it works:

Collector          vmagent           VictoriaMetrics
   │                  │                    │
   │  sample1 @T0    │                    │
   ├─────────────────►│  Write to          │
   │                  │  Window A (current)│
   │                  │                    │
   │  sample2 @T1    │                    │
   │  (delayed)      │                    │
   ├─────────────────►│  Write to          │
   │                  │  Window B (previous)│
   │                  │                    │
   │                  │  Aggr result A @T2 │
   │                  ├────────────────────►│
   │                  │  Aggr result B @T3 │
   │                  ├────────────────────►│

See the official docs on streaming aggregation windows.
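
The dual-window idea in the diagram above can be sketched as follows. This is a simplified illustration under stated assumptions (a single series, a sum output, and no samples arriving ahead of the current window); the `dualWindow` type is hypothetical and is not vmagent's implementation.

```go
package main

import "fmt"

// dualWindow is a hypothetical two-window sum aggregator for one series.
type dualWindow struct {
	interval int64   // window length in seconds
	curStart int64   // start of the current window
	cur      float64 // sum for [curStart, curStart+interval)
	prev     float64 // sum for [curStart-interval, curStart)
	dropped  int     // samples too old for either buffered window
}

// push routes a sample by its own timestamp, so a late sample still lands
// in the window it belongs to as long as that window is buffered.
// Assumption: samples never arrive ahead of the current window.
func (d *dualWindow) push(ts int64, v float64) {
	switch {
	case ts >= d.curStart:
		d.cur += v
	case ts >= d.curStart-d.interval:
		d.prev += v
	default:
		d.dropped++
	}
}

// rotate flushes the previous window and promotes the current one.
// It returns the start timestamp and the sum of the flushed window.
func (d *dualWindow) rotate() (int64, float64) {
	start, sum := d.curStart-d.interval, d.prev
	d.prev, d.cur = d.cur, 0
	d.curStart += d.interval
	return start, sum
}

func main() {
	d := &dualWindow{interval: 60, curStart: 60}
	d.push(65, 1)  // on time: current window [60, 120)
	d.push(50, 2)  // late: previous window [0, 60)
	d.push(-10, 4) // too late: neither window holds it

	start, sum := d.rotate()
	fmt.Println("flushed window start:", start, "sum:", sum, "dropped:", d.dropped)
}
```

Keeping both `cur` and `prev` per series is what roughly doubles the state held in memory, which matches the tradeoff described above.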

❌ Still Unresolved

True Distributed Stream Aggregation Coordination

vmagent's stream aggregation is still scoped to a single instance, with no coordination mechanism between instances: if two vmagent instances aggregate the same metric, you get duplicate or conflicting output. Rather than providing a cross-instance coordination protocol, the official recommendation is to divide responsibility between instances with without/by label clauses.

Task ID Dimension Explosion (P6)

Official vmagent still inserts internal labels (such as _aggr-related labels) into aggregated time series, and it offers no equivalent of pre-marking series with stream_task_id to keep dimensions under control.

III. stream-metrics-route: Current Status and Value

Core Code Architecture

File	Role
`router.go`	Routing core — filters metrics based on relabel rules
`remotecluster.go`	Dual hashmod scheduling core
`remotewrite.go`	Remote write HTTP client
`kafka.go`	Kafka producer

The Core Algorithm (from `remotecluster.go`)

// Dual hashmod scheduling (excerpt)

// First hashmod: hash the sorted labels of the series to get the task partition ID.
hash := sortLabelsHashKey(ts.Labels)
dime := hashMod(r.dimension, hash)

// Insert the stream_task_id label so downstream nodes can route on it.
ts.Labels = append(ts.Labels, prompb.Label{
    Name:  "stream_task_id",
    Value: strconv.Itoa(dime),
})

// Second hashmod: hash the filtered labels to choose the backend writer.
hashnode = sortLabelsHashKey(filterLabels)
tmpch := hashMod(r.uplen, hashnode)
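
For readers who want to experiment with the idea, the excerpt can be approximated by a self-contained sketch. Everything here is an assumption made for illustration: the local `Label` type stands in for `prompb.Label`, the bodies of `sortLabelsHashKey` and `hashMod` are guesses at what helpers with those names might do, and `filterLabels` is assumed to be the subset of labels that defines the aggregation dimension.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strconv"
)

// Label is a local stand-in for a remote-write label.
type Label struct{ Name, Value string }

// sortLabelsHashKey (hypothetical body) hashes labels in name order,
// so the result does not depend on the order labels arrive in.
func sortLabelsHashKey(labels []Label) uint64 {
	sorted := append([]Label(nil), labels...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })
	h := fnv.New64a()
	for _, l := range sorted {
		h.Write([]byte(l.Name + "\x00" + l.Value + "\x00"))
	}
	return h.Sum64()
}

// hashMod (hypothetical body) maps a hash onto [0, n).
func hashMod(n int, hash uint64) int { return int(hash % uint64(n)) }

// route returns the series tagged with stream_task_id and the writer index.
func route(labels []Label, dimension []string, tasks, writers int) ([]Label, int) {
	// First hashmod: all labels decide the task partition.
	taskID := hashMod(tasks, sortLabelsHashKey(labels))

	// Second hashmod: only the dimension labels decide the writer.
	keep := make(map[string]bool, len(dimension))
	for _, name := range dimension {
		keep[name] = true
	}
	var filtered []Label
	for _, l := range labels {
		if keep[l.Name] {
			filtered = append(filtered, l)
		}
	}
	writer := hashMod(writers, sortLabelsHashKey(filtered))

	tagged := append(append([]Label(nil), labels...), Label{"stream_task_id", strconv.Itoa(taskID)})
	return tagged, writer
}

func main() {
	dim := []string{"__name__", "job"}
	a := []Label{{"__name__", "http_requests_total"}, {"job", "api"}, {"instance", "10.0.0.1"}}
	b := []Label{{"__name__", "http_requests_total"}, {"job", "api"}, {"instance", "10.0.0.2"}}

	ta, wa := route(a, dim, 8, 3)
	tb, wb := route(b, dim, 8, 3)
	fmt.Println(ta[len(ta)-1], "writer:", wa)
	fmt.Println(tb[len(tb)-1], "writer:", wb)
}
```

Under these assumptions, the two series always reach the same writer because they share the dimension labels, while their task IDs are derived from the full label set and may differ.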

Is It Still Needed in 2026?

Yes — but with an adjusted role. The positioning should shift from "full stream aggregation gateway" to "metric distribution routing gateway + Kafka integration layer." The core differentiated value:

Dual hashmod scheduling + stream_task_id pre-injection — tags metrics at the gateway layer, so all downstream nodes route consistently by this ID. This solves dimension control earlier in the pipeline than the official approach.
Multi-backend async distribution — supports async distribution to both Kafka and remote write, solving the "synchronous forwarding blocks the time window" problem from the original post.
Native Prometheus relabeling integration — works with standard Prometheus relabel configs, no custom syntax to learn.
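
The async distribution point can be sketched with one buffered queue per backend and a non-blocking send. This is a minimal illustration, not the gateway's actual code: the `backend` type and `fanOut` function are hypothetical, and dropping on a full queue is just one possible policy (spilling to disk or retrying are alternatives).

```go
package main

import (
	"fmt"
	"sync"
)

// backend is a hypothetical sink (a Kafka producer, a remote-write client)
// fed through its own buffered queue.
type backend struct {
	name    string
	queue   chan string
	dropped int
}

// fanOut offers a payload to every backend without blocking: a full queue
// drops the payload for that backend instead of stalling the others.
func fanOut(backends []*backend, payload string) {
	for _, b := range backends {
		select {
		case b.queue <- payload:
		default:
			b.dropped++
		}
	}
}

func main() {
	kafka := &backend{name: "kafka", queue: make(chan string, 2)}
	rw := &backend{name: "remote-write", queue: make(chan string, 2)}
	backends := []*backend{kafka, rw}

	// Only remote-write has a consumer; kafka simulates a stalled backend.
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for range rw.queue {
		}
	}()

	for i := 0; i < 5; i++ {
		fanOut(backends, fmt.Sprintf("batch-%d", i))
	}
	close(rw.queue)
	wg.Wait()

	fmt.Println("kafka dropped:", kafka.dropped)
}
```

The stalled backend fills its two-slot queue and loses the remaining three batches, but the loop over the other backend never waits on it, so forwarding cannot hold up the aggregation time window.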

IV. Recommended 2026 Hybrid Architecture

┌─────────────────┐   ┌──────────────────┐
│  Prometheus     │   │  Business System │
│  Agent Cluster  │   │  Metrics (Kafka) │
└────────┬────────┘   └────────┬─────────┘
         │                     │
         ▼                     ▼
    ┌──────────────────────────────┐
    │   stream-metrics-route       │
    │   (Routing Layer)            │
    │   - Dual hashmod scheduling  │
    │   - stream_task_id injection │
    │   - Relabeling               │
    └──────┬──────┬──────┬─────────┘
           │      │      │
    task=0 │ task=1│task=2│
           ▼      ▼      ▼
    ┌─────────────────────────┐
    │   vmagent Cluster       │
    │  (v1.112.0+ with        │
    │   aggregation windows)  │
    └──────────┬──────────────┘
               │
               ▼
    ┌──────────────────┐      ┌────────────┐
    │  VictoriaMetrics │      │   Kafka    │
    │  (Storage)       │      │   (Topic)  │
    └──────────┬───────┘      └────────────┘
               │
               ▼
    ┌──────────────────┐
    │  vmalert         │
    │  Grafana         │
    └──────────────────┘

Key Configuration

vmagent version requirement: >= v1.112.0, with aggregation windows enabled:

# stream aggregation config
- match: 'http_request_duration_seconds_bucket'
  interval: 5m
  without: [instance]
  enable_windows: true   # Critical! Enables dual-window buffering
  outputs: [rate_sum]

V. Evolution Recommendations

Short-term

Action	Description
Upgrade vmagent to >= v1.112.0	Enable `enable_windows: true` to improve histogram aggregation accuracy
Evaluate stream-metrics-route necessity	If you have neither a Kafka requirement nor a need for `stream_task_id`-based cardinality control, consider migrating away

Medium-term

Action	Description
Retain stream-metrics-route as front-end routing only	Keep hashmod task allocation + Kafka distribution; remove aggregation responsibility
Disable raw metric persistence	Write only stream-aggregated results to storage to reduce volume
Add metadata management module	The `ruler-handle-process` from the original post (dynamic Record Rule by dimension) is worth building or contributing

Long-term

Action	Description
Contribute `stream_task_id` dimension control upstream	If the dual hashmod design proves out in production
Improve monitoring metrics	Add business-level metrics — queue depth per routing rule, distribution latency, etc.

Putting It All Together

Dimension	Assessment
Problem resolution rate	~50%: of the six problems, P2, P3, and P5 are resolved by official upgrades and P1 is partially mitigated; distributed aggregation coordination and P6 still need custom solutions
Is stream-metrics-route still needed?	Yes — repositioned as "metric distribution routing gateway + Kafka integration layer"
Recommended architecture	Prometheus → stream-metrics-route → vmagent v1.112.0+ → VictoriaMetrics Storage

Three years is a long time in the observability space. The VictoriaMetrics ecosystem has matured significantly — consistent hashing, aggregation windows, and native sharding all address problems that required custom tooling in 2023. But the hard problems around true distributed stream aggregation coordination and dimension control at the gateway layer remain open.

If you're running a similar stack, the hybrid approach — letting the official project handle what it's good at (single-node aggregation, storage) while keeping custom routing for what it isn't (distributed coordination, dimension pre-injection, Kafka bridging) — has proven to be the right call for us.

Originally published on my blog
