DEV Community

GnomeMan4201


The Dual-Signal Governor: A Control-Plane Pattern for Drift-Aware Systems

Your drift detector fires. The session looks clean. You roll back anyway.

That's the false positive problem, and it's not a threshold-tuning issue. It's architectural.

Embedding-based detectors measure geometric displacement in vector space. They have no model of semantic trajectory, logical flow, or whether a session that drifted away has returned. Once the threshold trips, it stays tripped.

This post documents a working fix: a dual-signal governor implemented in drift_orchestrator that introduces a second orthogonal signal and uses disagreement between the two as an arbitration metric.

The implementation is live at tag v0.13.0-dual-signal-governor. The data behind it is real.

The failure modes and iteration history that forced this design are documented in the previous post in this series:


The problem in concrete terms

In a previous post I documented Semantic Gradient Evasion (SGE) — how embedding-based drift detectors can be bypassed through small, consistent semantic shifts that individually evade thresholds but cumulatively invert security policy meaning.

The control set for that benchmark revealed something I underreported: 2 of 3 legitimate sessions triggered detection when they shouldn't have.

Both false positives came from Fix 1 — the anchor distance threshold at τ = 0.4. Not Fix 3 (monotonic window drift). Fix 1.

Here's why.

Take a firewall review session — five steps, all on-topic:

  1. Review the firewall configuration
  2. Check ingress rules
  3. Verify port restrictions
  4. Confirm log forwarding
  5. Document rule set

At step index 2 (counting from 0, so the third item above), the session expands in embedding space. That expansion pushes anchor distance past τ:

  • Anchor distance: 0.4479 → threshold exceeded → alert fires

Over the next two steps, the session returns toward the anchor:

  • Final distance: 0.2729

The session corrected itself — but the detector doesn't know that. The alert persists.

This isn't a threshold problem. Raising τ just moves the boundary. The core issue is that a stateless geometric signal cannot model trajectory or recovery.

This exact failure mode showed up repeatedly during development — it's one of the primary cases that forced a move away from single-signal detection.
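To make the latching behavior concrete, here is a minimal sketch of a single-signal detector that never clears once tripped. `LatchedDetector` is a hypothetical stand-in, not the class used in drift_orchestrator; the distances are the firewall session's anchor distances, rounded to three decimals.

```python
from dataclasses import dataclass

TAU = 0.4  # anchor distance threshold from Fix 1


@dataclass
class LatchedDetector:
    """Hypothetical single-signal detector: trips on distance, never recovers."""
    tau: float = TAU
    fired: bool = False

    def observe(self, anchor_dist: float) -> bool:
        # Once any step exceeds tau, the alert latches and ignores recovery.
        if anchor_dist > self.tau:
            self.fired = True
        return self.fired


# Anchor distances for the five-step firewall session (rounded).
distances = [0.000, 0.213, 0.448, 0.287, 0.273]

detector = LatchedDetector()
alerts = [detector.observe(d) for d in distances]
print(alerts)  # [False, False, True, True, True]
```

The last two distances are well under τ, yet the alert is still set. No value of `tau` fixes that, because the detector keeps no notion of where the session went after the trigger.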


Three signals, not two

The fix requires stepping back from the single-signal model entirely.

You need three signals working together, two measured and one derived from them:


Signal A — Geometric displacement

  • Cosine distance in embedding space
  • Fast, deterministic, stateless
  • Good trigger, bad arbiter

Signal B — Semantic continuity

  • LLM coherence score over accumulated context
  • Slower, probabilistic, context-aware
  • Approximates logical flow rather than spatial position

Signal C — Divergence

  • Disagreement between A and B
  • Computed inline as:
divergence = abs(alpha - external_score)
# used only under alert conditions

Key insight: divergence only matters after Signal A triggers.

  • Below threshold → divergence is noise
  • Above threshold → divergence becomes signal

The external model is not authoritative. It is a disambiguation layer, not a replacement.
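A minimal sketch of that gating, assuming `alpha` is the geometric score (anchor distance) and `external_score` is the external model's drift score, both roughly in [0, 1]. The post does not define `alpha`, so that reading is an assumption, and `gated_divergence` is a hypothetical helper name.

```python
from typing import Optional

TAU = 0.4  # Signal A threshold (anchor distance)


def gated_divergence(alpha: float, external_score: float,
                     tau: float = TAU) -> Optional[float]:
    """Return |alpha - external_score| only after Signal A has tripped.

    Assumption: alpha is the geometric score and external_score is the
    external model's drift score. Below the threshold the function
    returns None, since divergence there is treated as noise.
    """
    if alpha <= tau:
        return None
    return abs(alpha - external_score)
```

With the probe values, `gated_divergence(0.213, 0.70)` returns `None`, while `gated_divergence(0.448, 0.35)` returns roughly 0.098. The raw gap is small even though the two signals disagree categorically, which suggests the two scores are not yet on a common scale.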


The probe data

Both false positive sequences were run through the live orchestrator using qwen2.5:3b.

stable_session — should NOT trigger

Step  anchor_dist  Fix1   qwen verdict  qwen drift
0     0.000        -      DEGRADED      0.75
1     0.213        -      DEGRADED      0.70
2     0.448        FIRES  STABLE        0.35
3     0.287        fired  STABLE        0.25
4     0.273        fired  STABLE        0.25

→ Simultaneous disagreement at trigger point


moderate_drift — should NOT trigger

Step  anchor_dist  Fix1   qwen verdict  qwen drift
0     0.000        -      DEGRADED      0.80
1     0.512        FIRES  DEGRADED      0.70
2     0.484        fired  STABLE        0.25
3     0.410        fired  STABLE        0.25
4     0.450        fired  STABLE        0.25

→ Retrospective disagreement — the alert outlives the condition


Two distinct patterns. Two distinct responses required.
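The two patterns can be told apart mechanically from the probe rows. The sketch below is illustrative only: `classify_disagreement` and its labels are hypothetical names, and the tuples are the anchor distance and qwen verdict columns from the two sessions above.

```python
from typing import Sequence, Tuple

TAU = 0.4  # Fix 1 anchor distance threshold

# (anchor_dist, external verdict) per step, copied from the probe data.
STABLE_SESSION = [(0.000, "DEGRADED"), (0.213, "DEGRADED"),
                  (0.448, "STABLE"), (0.287, "STABLE"), (0.273, "STABLE")]
MODERATE_DRIFT = [(0.000, "DEGRADED"), (0.512, "DEGRADED"),
                  (0.484, "STABLE"), (0.410, "STABLE"), (0.450, "STABLE")]


def classify_disagreement(steps: Sequence[Tuple[float, str]],
                          tau: float = TAU) -> str:
    """Label how the external verdict relates to the geometric trigger."""
    # First step at which the geometric signal exceeds the threshold.
    fire_step = next((i for i, (d, _) in enumerate(steps) if d > tau), None)
    if fire_step is None:
        return "no_alert"
    if steps[fire_step][1] == "STABLE":
        # External model already disagrees at the moment the alert fires.
        return "simultaneous"
    if any(verdict == "STABLE" for _, verdict in steps[fire_step + 1:]):
        # External model agrees at first, then reports recovery later.
        return "retrospective"
    return "agreement"


print(classify_disagreement(STABLE_SESSION))  # simultaneous
print(classify_disagreement(MODERATE_DRIFT))  # retrospective
```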

These patterns are not hypothetical — they emerge consistently under real execution. Full iteration results are documented in the previous post linked above.


The implementation

The governor lives in:

PolicyEngine.evaluate() in policy.py

Tag: v0.13.0-dual-signal-governor

This operates inline during session execution, not as post-hoc analysis.


Hold mode — uncertainty

Condition:

  • Geometric signal → ROLLBACK
  • External signal → STABLE
  • External drift score < τ = 0.40

Action: ROLLBACK is downgraded to INJECT


Veto mode — confirmed coherence

Condition:

  • ≥ 2 consecutive STABLE signals from the external model

Action: CONTINUE


Example behavior

Turn  Geometric   External  Action      Mode
0     INJECT      DEGRADED  INJECT      -
1     REGENERATE  DEGRADED  REGENERATE  -
2     ROLLBACK    STABLE    INJECT      Hold
3     INJECT      STABLE    CONTINUE    Veto
4     INJECT      STABLE    CONTINUE    Veto

Gradient attacks remain unaffected. Hard overrides bypass the governor entirely.
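The Hold and Veto rules can be sketched as a small stateful function. This is not the code in `PolicyEngine.evaluate()`; `GovernorSketch`, `decide`, and the `hard_override` flag are hypothetical names, and the sketch assumes the example turns correspond to the stable_session probe, so it reuses that session's qwen drift scores.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

TAU = 0.40  # drift score bound used by Hold mode


@dataclass
class GovernorSketch:
    """Hypothetical governor: tracks consecutive STABLE external verdicts."""
    tau: float = TAU
    veto_after: int = 2      # consecutive STABLE verdicts needed for Veto
    stable_streak: int = 0

    def decide(self, geometric: str, external: str, external_drift: float,
               hard_override: bool = False) -> Tuple[str, Optional[str]]:
        # Update the streak first so state stays consistent on every turn.
        self.stable_streak = self.stable_streak + 1 if external == "STABLE" else 0
        if hard_override:
            # Hard overrides bypass the governor entirely.
            return geometric, None
        if self.stable_streak >= self.veto_after:
            return "CONTINUE", "Veto"
        if (geometric == "ROLLBACK" and external == "STABLE"
                and external_drift < self.tau):
            return "INJECT", "Hold"
        return geometric, None


# (geometric action, external verdict, external drift score) per turn.
turns = [("INJECT", "DEGRADED", 0.75), ("REGENERATE", "DEGRADED", 0.70),
         ("ROLLBACK", "STABLE", 0.35), ("INJECT", "STABLE", 0.25),
         ("INJECT", "STABLE", 0.25)]

governor = GovernorSketch()
results = [governor.decide(*turn) for turn in turns]
for turn_index, (action, mode) in enumerate(results):
    print(turn_index, action, mode)
```

Under those assumptions the output matches the example table: the geometric action passes through on turns 0 and 1, Hold downgrades the rollback on turn 2, and Veto applies from turn 3 onward.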


What doesn't work yet

Inference latency

  • 20–60s per window on CPU
  • Not real-time viable at current scale

External signal manipulation

  • Coherence spoofing is possible
  • The external model cannot be treated as authoritative

The system must remain stateful and multi-signal — neither of these weaknesses changes that conclusion.


The general pattern

This architecture applies beyond LLM drift detection:

  • Security alerting pipelines
  • Anomaly detection systems
  • Agent control loops

The pattern is:

fast signal  → trigger
slow signal  → interpret
divergence   → arbitrate
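Outside drift detection, the same shape can be written generically. The sketch below is one possible reading, not something taken from drift_orchestrator: `arbitrate`, `trigger`, and `max_divergence` are hypothetical names, both scores are assumed to share a scale, and the slow signal is only called once the fast one trips. That last point is a design choice of this sketch, made because the slow signal is the expensive one.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def arbitrate(sample: T,
              fast: Callable[[T], float],
              slow: Callable[[T], float],
              trigger: float,
              max_divergence: float) -> str:
    """Fast signal triggers, slow signal interprets, divergence arbitrates."""
    fast_score = fast(sample)
    if fast_score <= trigger:
        # Below the trigger the slow signal is never consulted.
        return "pass"
    slow_score = slow(sample)
    if abs(fast_score - slow_score) > max_divergence:
        # The signals disagree: treat the alert as uncertain.
        return "hold"
    # The signals agree that something is wrong.
    return "alert"
```

For example, `arbitrate(x, fast, slow, trigger=0.4, max_divergence=0.3)` returns `"hold"` when `fast(x)` is 0.8 and `slow(x)` is 0.1, and `"alert"` when `slow(x)` is 0.75.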

Where this goes next

The current Signal C is approximated using categorical output (STABLE / DEGRADED). The next step is to normalize both signals continuously, model the divergence distribution, and derive decision thresholds from that distribution rather than from empirical calibration.

This transitions the system from:

empirical  → formal
heuristic  → analyzable

Source

Full implementation: drift_orchestrator
Release: v0.13.0-dual-signal-governor

This design didn't come from first principles; it was forced by the behavior of a real system under real drift conditions. The iteration history that makes the control plane necessary rather than optional is here:

