Vihaan Vaghela

Posted on • Originally published at systemsbyvihaan.substack.com

I Built a Real-Time Autonomous Infrastructure AI at 14. Here's Everything.

I'm Vihaan. I'm 14 years old. And over the last few months I've been building something called Extremis.
Not a tutorial. Not a hackathon project. A real, production-grade autonomous infrastructure operating system — one that monitors itself, repairs itself, contains its own failures, and governs itself through a full protocol stack without human intervention.
This post is the complete story. Where it started. What it became. How it works. And where it's going.

Where It Started
Extremis started as a traffic intelligence system.
The original goal was simple: predict congestion, prioritize emergency vehicles, optimize traffic flow in real time. Clean problem. Clean solution.
But as the system grew, it became obvious that a smart model is worthless if it can't survive failure. A traffic intelligence platform that goes dark during peak congestion isn't just inconvenient — it's dangerous.
So I built fault tolerance in. And then governance. And then intelligence. And then the whole thing evolved into something I never planned from the start.

Phase 1: The Protocol Stack
The first major milestone was the eight-protocol governance stack — a complete escalation chain that runs from the first anomaly detection all the way to permanent human-authorized termination.
Sentinel → detects anomalies
Serpentine → classifies stress state
Aegis → contains failures
Atlas → rebalances load
Punisher → repairs with AI
Phoenix → orchestrates recovery
Hyperion → autonomous emergency shutdown
Terminus → permanent human-authorized termination
Every protocol has a defined role. Every handoff is deliberate. Nothing overlaps.
Sentinel runs every second collecting latency, CPU, memory, GPU, queue depth, and throughput. It uses z-score statistical analysis against a rolling history of 20 samples to detect anomalies before they become failures. Every anomaly gets categorized and severity-scored. Raw metrics flow at medium priority. Detected anomalies escalate at high priority immediately.
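As a rough illustration of the z-score check described above (a sketch, not the actual Sentinel code): each new sample is compared against the mean and standard deviation of the last 20 samples. The class name `ZScoreDetector` and the cutoff of 3.0 are assumptions made for this example; the post does not state a threshold.

```python
from collections import deque
from statistics import mean, pstdev

WINDOW = 20          # rolling history size stated in the post
Z_THRESHOLD = 3.0    # assumed cutoff; the post does not give one


class ZScoreDetector:
    """Flags a sample whose z-score against recent history exceeds a threshold."""

    def __init__(self, window=WINDOW, threshold=Z_THRESHOLD):
        self.history = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value):
        """Return (is_anomaly, z_score) for value, then add it to the history."""
        z = 0.0
        if len(self.history) >= 2:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma > 0:
                z = (value - mu) / sigma
        self.history.append(value)
        return abs(z) > self.threshold, z


detector = ZScoreDetector()
for latency_ms in [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 10.1]:
    detector.observe(latency_ms)
is_anomaly, z = detector.observe(25.0)   # a sudden latency spike
```

The same detector could be instantiated once per metric (latency, CPU, memory, and so on), so each signal is judged against its own recent history.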
Serpentine synthesizes everything Sentinel produces into a unified stress state across five levels — stable, moderate, critical, failure, catastrophic — using three indices: pressure, congestion, and instability. Catastrophic requires 92% pressure or 95% instability. Not lower. Because catastrophic means exactly that.
Aegis contains failures before they cascade. Nodes above 20% error rate get isolated. Nodes above 95% load get throttled. Global circuit breakers activate under catastrophic state. Every action is logged with explicit reasoning.
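The containment rules above can be sketched as a small rule function. This is an illustrative sketch, not the real Aegis: the `Node` type, the action names, and the choice to let isolation take precedence over throttling are all assumptions. Only the 20% and 95% thresholds and the catastrophic circuit breaker come from the post.

```python
from dataclasses import dataclass

ISOLATE_ERROR_RATE = 0.20   # from the post: isolate above 20% error rate
THROTTLE_LOAD = 0.95        # from the post: throttle above 95% load


@dataclass
class Node:
    name: str
    error_rate: float  # fraction of failed requests, 0.0 to 1.0
    load: float        # utilization, 0.0 to 1.0


def contain(nodes, stress_state):
    """Return a list of (action, target, reason) tuples for one cycle."""
    actions = []
    if stress_state == "catastrophic":
        actions.append(("circuit_break", "global", "stress state is catastrophic"))
    for node in nodes:
        # Assumption: an isolated node is not also throttled.
        if node.error_rate > ISOLATE_ERROR_RATE:
            actions.append(("isolate", node.name,
                            f"error rate {node.error_rate:.0%} > {ISOLATE_ERROR_RATE:.0%}"))
        elif node.load > THROTTLE_LOAD:
            actions.append(("throttle", node.name,
                            f"load {node.load:.0%} > {THROTTLE_LOAD:.0%}"))
    return actions


nodes = [Node("a", 0.02, 0.40), Node("b", 0.35, 0.60), Node("c", 0.01, 0.97)]
actions = contain(nodes, "critical")
```

Returning the reason string with every action mirrors the "logged with explicit reasoning" requirement: the log entry is produced at the same place the decision is made.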
Atlas redistributes load dynamically. Load from overloaded nodes (above 80%) gets shifted to underutilized nodes (below 50%), targeting 65% on both sides after the shift. When redistribution isn't enough, Atlas recommends autoscaler activation.
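A minimal sketch of that rebalancing rule, assuming all nodes have equal capacity and that load is expressed as a utilization fraction. The function name `rebalance` and the greedy pairing order are hypothetical; only the 80%, 50%, and 65% figures come from the post.

```python
HIGH, LOW, TARGET = 0.80, 0.50, 0.65   # thresholds from the post


def rebalance(loads):
    """Plan load shifts between equal-capacity nodes.

    loads maps node name -> utilization (0.0 to 1.0).
    Returns (moves, needs_autoscale), where moves is a list of
    (source, destination, amount) tuples.
    """
    loads = dict(loads)  # work on a copy so the caller's data is untouched
    donors = sorted((n for n in loads if loads[n] > HIGH), key=loads.get, reverse=True)
    receivers = sorted((n for n in loads if loads[n] < LOW), key=loads.get)
    moves = []
    for src in donors:
        for dst in receivers:
            excess = loads[src] - TARGET   # how much the donor must shed
            room = TARGET - loads[dst]     # how much the receiver can absorb
            amount = min(excess, room)
            if amount <= 1e-9:
                continue
            loads[src] -= amount
            loads[dst] += amount
            moves.append((src, dst, round(amount, 4)))
    # If any donor is still overloaded, shifting alone was not enough.
    needs_autoscale = any(loads[n] > HIGH for n in donors)
    return moves, needs_autoscale


moves, needs_autoscale = rebalance({"n1": 0.92, "n2": 0.30, "n3": 0.60})
```

In this example `n3` sits between the two thresholds, so it neither gives nor receives load.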
Punisher is where it gets interesting. A dual-head neural network diagnoses incidents and ranks repair strategies by risk. Governed by an adaptive credit system — 0.02 credits regenerating per second, max 10 — that prevents thrashing. The system gets more conservative as failures accumulate.
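The credit system behaves much like a token bucket. Here is a hedged sketch of that idea: the class name `RepairCredits`, the per-repair `cost`, and the explicit `now` timestamp are assumptions for illustration, while the 0.02 credits per second rate and the cap of 10 come from the post.

```python
class RepairCredits:
    """Token-bucket style budget: repairs spend credits, time refills them."""

    def __init__(self, rate=0.02, capacity=10.0, start=None):
        self.rate = rate            # credits regenerated per second (from the post)
        self.capacity = capacity    # maximum stored credits (from the post)
        self.credits = capacity if start is None else start
        self.last = 0.0             # timestamp of the last update, in seconds

    def _refill(self, now):
        elapsed = max(0.0, now - self.last)
        self.credits = min(self.capacity, self.credits + elapsed * self.rate)
        self.last = now

    def try_spend(self, cost, now):
        """Spend cost credits at time now; return False if the budget is short."""
        self._refill(now)
        if cost > self.credits:
            return False
        self.credits -= cost
        return True


bucket = RepairCredits(start=1.0)
first = bucket.try_spend(1.0, now=0.0)     # budget available
second = bucket.try_spend(1.0, now=10.0)   # only 0.2 credits regenerated so far
third = bucket.try_spend(1.0, now=60.0)    # enough time has passed to refill
```

In a live system `now` would typically come from `time.monotonic()`. Raising `cost` as failures accumulate would be one way to get the "more conservative over time" behavior, though the post does not say how Extremis does it.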
Phoenix brings the system back after repair. Sequenced restarts in dependency order. Container rebuilds for nodes above 85% load. Global runtime state restoration. Conservative by design — more restarts than necessary, never fewer.
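Restarting in dependency order is a topological sort. The sketch below uses Python's standard `graphlib` module (3.9+); the service names and the `restart_plan` function are hypothetical, and the 85% rebuild threshold is the one stated above.

```python
from graphlib import TopologicalSorter

REBUILD_ABOVE = 0.85   # from the post: rebuild containers above 85% load


def restart_plan(dependencies, loads):
    """Order services so that each one restarts after everything it depends on.

    dependencies maps service -> set of services it depends on.
    loads maps service -> utilization (0.0 to 1.0); missing entries count as 0.
    Returns a list of (service, action) tuples.
    """
    order = TopologicalSorter(dependencies).static_order()
    return [
        (svc, "rebuild" if loads.get(svc, 0.0) > REBUILD_ABOVE else "restart")
        for svc in order
    ]


deps = {"api": {"db", "cache"}, "cache": {"db"}, "db": set()}
plan = restart_plan(deps, {"cache": 0.90, "api": 0.40})
```

`TopologicalSorter` raises `graphlib.CycleError` if the dependency graph contains a cycle, which is a useful failure to surface before any restart begins.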
Hyperion is the emergency shutdown. Fires within one second. Halts everything. No complex logic. No second guessing. Irreversible.
Terminus is the human kill switch. Blocks orchestrators, AI agents, and internal services from invoking it. Requires a human being at a physical console with multi-factor authorization. Encrypts a full system state archive before termination. Recovery is possible — but only deliberately, only by a human.

Phase 2: The Intelligence Layer
With the protocol stack in place, the next phase added a full intelligence layer on top of it.
The architecture became:
Input → Sentinel → Serpentine → Anarchy → ORACLE → MORL → PULSAR → Fusion → Meta → Output
Anarchy Protocol — event-triggered bounded simulation. When specific triggers fire — traffic spike, error surge, confidence collapse, novelty detection, strategy failure loop — Anarchy spawns a lightweight micro-simulator and evaluates three candidate strategies:

Baseline (current system decision)
Conservative (stability-focused)
Aggressive (performance-focused)

Each path is scored:
Score = (Accuracy × 0.4)
+ (Stability × 0.3)
- (Latency × 0.15)
- (Resource Cost × 0.15)
Override only fires if the best simulated score beats baseline by at least 10%. No unnecessary overrides. No wasted compute. 30ms total time budget. Hard cap.
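The scoring rule and override gate can be written out directly. This sketch assumes every metric is already normalized to the 0 to 1 range and reads "beats baseline by at least 10%" as a relative margin over the baseline score; both are assumptions, as are the function names and the sample numbers.

```python
MARGIN = 0.10   # from the post: override needs at least a 10% improvement


def score(m):
    """Weighted score from the post; m holds metrics normalized to 0..1."""
    return (m["accuracy"] * 0.4
            + m["stability"] * 0.3
            - m["latency"] * 0.15
            - m["resource_cost"] * 0.15)


def choose(candidates):
    """Return (chosen_strategy, scores); baseline wins unless clearly beaten."""
    scores = {name: score(m) for name, m in candidates.items()}
    best = max(scores, key=scores.get)
    baseline = scores["baseline"]
    # Assumption: the 10% margin is relative to the baseline score.
    if best != "baseline" and scores[best] >= baseline + MARGIN * abs(baseline):
        return best, scores
    return "baseline", scores


candidates = {
    "baseline":     {"accuracy": 0.80, "stability": 0.70, "latency": 0.30, "resource_cost": 0.30},
    "conservative": {"accuracy": 0.78, "stability": 0.90, "latency": 0.35, "resource_cost": 0.25},
    "aggressive":   {"accuracy": 0.85, "stability": 0.55, "latency": 0.20, "resource_cost": 0.50},
}
choice, scores = choose(candidates)
```

With these made-up inputs the conservative path scores about 0.492 against a baseline of 0.44, which clears the 10% margin, so the override fires.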
ORACLE — an RSSM-based world model. Instead of running full simulations, ORACLE predicts future states in latent space. Short-path conservative mode. Confidence scaling. Memory-assisted candidate ordering. The system imagines futures rather than simulating them.
MORL — multi-objective reinforcement learning. Optimizes multiple competing goals simultaneously: minimize travel time, minimize emissions, prioritize emergency vehicles, reduce fuel consumption. Automatically gated off under low-confidence or high-risk conditions.
PULSAR — deterministic micro-adjustment engine. Reversible tuning under hard pulse-count and latency budgets. Small, precise, bounded corrections.
Fusion Layer — coordinates ORACLE, MORL, and PULSAR under safety gates. Trust recalibration. Emergency oracle down-weighting. Safety fallback gate. VIHAAN-aware clamps.
Meta Layer — background controller that monitors runtime signals and applies smoothed control updates only when stable conditions justify adaptation. Gradual. Conservative. Never aggressive.

Phase 3: The Latency Crisis and How I Fixed It
Here's where things got ugly before they got good.
After building the intelligence layer, the system was producing brutal latency numbers. p95 around 100ms. Tail spikes. Instability under stress. Not remotely real-time capable.
I profiled everything.
The culprit: Sentinel was consuming ~97% of runtime.
The original Sentinel ran heavy statistical checks on every cycle — unbounded work patterns that amplified into massive tail latency spikes. Every cycle, Sentinel was doing too much, taking too long, and poisoning everything downstream.
The fix:

Rolling window metrics with O(1) incremental statistics
5ms hard cap on Sentinel's time budget
Heavy analysis moved to bounded background tasks
Strict per-stage budget enforcement across the entire pipeline
Early-exit logic when risk or latency indicators exceed thresholds
Spike detection and temporary mitigation windows after detected spikes
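To make the first item in that list concrete, here is one common way to keep rolling-window statistics in O(1) per sample: maintain a running sum and sum of squares, and subtract the sample that falls out of the window. This is a sketch of the general technique, not the actual Sentinel code, and the class name `RollingStats` is made up.

```python
import math
from collections import deque


class RollingStats:
    """Mean and standard deviation over the last `window` samples, O(1) per push."""

    def __init__(self, window=20):
        self.window = window
        self.samples = deque()
        self.total = 0.0      # running sum of samples in the window
        self.total_sq = 0.0   # running sum of squared samples in the window

    def push(self, x):
        self.samples.append(x)
        self.total += x
        self.total_sq += x * x
        if len(self.samples) > self.window:
            old = self.samples.popleft()   # evict the oldest sample
            self.total -= old
            self.total_sq -= old * old

    @property
    def mean(self):
        return self.total / len(self.samples) if self.samples else 0.0

    @property
    def std(self):
        n = len(self.samples)
        if n < 2:
            return 0.0
        variance = self.total_sq / n - self.mean ** 2
        return math.sqrt(max(variance, 0.0))   # clamp tiny negative rounding error


stats = RollingStats(window=3)
for value in [1.0, 2.0, 3.0, 4.0, 5.0]:
    stats.push(value)
```

The sum-of-squares form can lose precision when values are large and tightly clustered; a windowed variant of Welford's algorithm is a more numerically stable alternative if that becomes a problem.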

One bottleneck. One fix. The entire system went from unstable to real-time.

Phase 4: VIHAAN Protocol
The final piece was the VIHAAN Protocol — the ultimate authority layer named after me.
VIHAAN detects sustained systemic risk signals and escalates deterministically:
normal → warning → critical → vihaan_activated
When activated, VIHAAN applies authoritative overrides:

Disables MORL and PULSAR
Caps oracle weight
Freezes meta layer
Forces baseline decision path

Then drives a deterministic recovery state machine:
stabilize → monitor → gradual_reintroduction → full_restore
The philosophy: discipline over chaos. When uncertainty rises, autonomy is intentionally narrowed. Intelligence is reintroduced gradually, only after evidence of stability. The system earns back its own autonomy.
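A compact sketch of the two state chains as a single deterministic state machine. The state names and the override list come from the post; everything else is an assumption made for illustration, including the `sustain` count of consecutive cycles, the oracle weight cap of 0.2, which states keep the overrides active, and the rule that a risky cycle during recovery sends the system back to `stabilize`.

```python
ESCALATION = ["normal", "warning", "critical", "vihaan_activated"]
RECOVERY = ["stabilize", "monitor", "gradual_reintroduction", "full_restore"]
ORACLE_WEIGHT_CAP = 0.2   # assumed value; the post only says the weight is capped
OVERRIDE_STATES = {"vihaan_activated", "stabilize", "monitor"}   # assumed scope


class VihaanStateMachine:
    def __init__(self, sustain=3):
        self.sustain = sustain   # assumed: consecutive cycles needed to move on
        self.state = "normal"
        self.streak = 0

    def step(self, risky):
        """Advance one cycle given whether a systemic risk signal is present."""
        if self.state in ESCALATION[:-1]:
            # Escalate only on sustained risk; a calm cycle resets the streak.
            self.streak = self.streak + 1 if risky else 0
            if self.streak >= self.sustain:
                self.state = ESCALATION[ESCALATION.index(self.state) + 1]
                self.streak = 0
        elif self.state == "vihaan_activated":
            self.state, self.streak = "stabilize", 0
        elif self.state == "full_restore":
            self.state, self.streak = "normal", 0
        elif risky:
            # Risk during recovery: fall back to the start of the recovery chain.
            self.state, self.streak = "stabilize", 0
        else:
            self.streak += 1
            if self.streak >= self.sustain:
                self.state = RECOVERY[RECOVERY.index(self.state) + 1]
                self.streak = 0
        return self.state

    def overrides(self):
        """Authoritative overrides in force for the current state."""
        if self.state not in OVERRIDE_STATES:
            return {}
        return {"morl_enabled": False, "pulsar_enabled": False,
                "oracle_weight_cap": ORACLE_WEIGHT_CAP,
                "meta_frozen": True, "force_baseline": True}


sm = VihaanStateMachine(sustain=2)
escalation = [sm.step(risky=True) for _ in range(6)]
active_overrides = sm.overrides()
recovery = [sm.step(risky=False) for _ in range(8)]
```

Because every transition depends only on the current state, the streak counter, and one boolean input, the same input sequence always produces the same trace, which is what makes replay from audit logs possible.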
VIHAAN persists memory signatures and audit logs for replay and governance learning. Every decision is traceable.

The Numbers
Here's where Extremis stands right now:
Phase | p95 Latency | Stability
Early profile | ~100ms | Brittle
Post-optimization | ~2ms | Stable
Current (500 cycles) | 1.124ms | Stable
Current measured snapshot:

Avg latency: 0.607ms
p95: 1.124ms
p99: 1.634ms
Max: 1.908ms
Bus dropped events: 0
Bus ordering violations: 0
Regression tests: 32/32 passing

Zero dropped events. Zero ordering violations. Sub-2ms across the board under adversarial stress injection.
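For readers who want to produce the same kind of snapshot from their own runs, here is a small sketch that summarizes a list of per-cycle latencies using the nearest-rank percentile definition. The function names and the synthetic samples are assumptions; this is not the measurement harness behind the numbers above, and other percentile definitions (such as linear interpolation) give slightly different values.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Average, tail percentiles, and maximum for a list of latencies."""
    return {
        "avg": sum(latencies_ms) / len(latencies_ms),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
        "max": max(latencies_ms),
    }


# Synthetic data for illustration only: 100 samples from 1.0 to 100.0 ms.
summary = summarize([float(i) for i in range(1, 101)])
```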

The Design Principles That Made It Work
Bounded everything. Every stage has a hard time cap. No unbounded loops. No open-ended computation. If it can't finish in budget, it exits.
Intelligence is subordinate to safety. The control plane is immutable. Intelligence operates only inside bounded envelopes. Safety contracts are non-negotiable.
Tail risk is a first-class problem. p95/p99 behavior is optimized alongside average latency. Average latency means nothing if your p99 is catastrophic.
Fallback is always available. No matter what fails, a safe baseline action path exists. The system can always do something safe.
Autonomy is earned, not assumed. VIHAAN narrows autonomy under uncertainty and restores it gradually after stability is proven.
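The first and fourth principles combine naturally: give a stage a deadline, and return the safe baseline if it cannot finish in time. The sketch below does this cooperatively by checking the clock between steps. `run_stage` and its parameters are hypothetical names, and because the check is cooperative, a single step that overruns cannot be interrupted partway through.

```python
import time


def run_stage(steps, initial, fallback, budget_s, clock=time.perf_counter):
    """Run steps in order under a time budget.

    steps is a list of functions, each taking the previous result.
    Returns (result, completed). If the budget expires before all steps
    have run, the partial work is discarded and fallback is returned.
    """
    deadline = clock() + budget_s
    result = initial
    for step in steps:
        if clock() >= deadline:
            return fallback, False   # out of budget: take the safe path
        result = step(result)
    return result, True


# Two trivial steps under a 5 ms budget; the fallback is a safe default.
result, completed = run_stage(
    [lambda x: x + 1, lambda x: x * 2],
    initial=0, fallback=-1, budget_s=0.005,
)
```

Passing the clock in as a parameter keeps the budget logic testable with a fake clock, so timing behavior can be verified deterministically.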

What's Next
Extremis keeps growing. On the roadmap:

Graph Neural Network traffic modeling — model the entire road network as a live graph where congestion propagates across topology
Temporal Graph Networks — extend GNN to model how the graph evolves over time
Adversarial Robustness Layer — protect against spoofed sensors and adversarial inputs
Predictive Incident Detection — detect conditions that precede accidents before they happen
Digital Twin Engine — a real-time virtual mirror of the entire network running in parallel

Final Thought
Extremis started as a traffic model.
It became a fault-tolerant protocol stack. Then a real-time adaptive intelligence platform. Then a bounded, governed, auditable system that earns its own autonomy.
The single most important lesson: find your bottleneck before you optimize anything else. One unbounded component — Sentinel at 97% of runtime — was the entire problem. Everything else was noise.
Profile first. Fix the real problem. Build around constraints.
Extremis is no longer experimental. It is controlled, bounded, and real-time capable.
And it's still being built.

I document everything on Substack as I build — systemsbyvihaan.substack.com
I'm 14. This is just the beginning.
