Building a 13-Agent AI System for Real-Time Road Safety Monitoring

#ai #python #nextjs #machinelearning

Kerala, India has one of the highest road accident rates in the country — over 40,000 accidents annually across its narrow, winding highways. I built SurakshaNet, a multi-agent intelligence platform that monitors 6 high-risk road segments in real time using 13 AI agents, Byzantine fault-tolerant voting, and Bayesian belief fusion.

This post covers the architecture, the problems I solved, and what I learned.

The Problem

Traditional road safety systems rely on single data sources — a camera feed or a weather alert. But accidents are multi-causal. A wet road alone is not dangerous. A wet road at night, near a school zone, with heavy traffic and poor visibility — that is dangerous.

I needed a system that could fuse multiple independent signals into a single calibrated risk score, attribute causality, and trigger the right response automatically.

Architecture Overview

The system runs 3 agent clusters in parallel per road segment, every 5 minutes:

SWARM-GUARD (Road Safety): Weather friction analysis using IRC SP:73-2015 standards, historical accident pattern matching from NCRB data, traffic anomaly detection via HERE API, and YOLOv8 visual risk assessment.

DRISHTI (Disaster Response): IMD weather situation awareness, KSDMA resource inventory tracking, census-based population vulnerability scoring, OR-Tools logistics optimization, and Claude-powered counterfactual projection.

SENTINEL (Surveillance): RT-DETR edge perception, OSNet cross-camera re-identification, Bayesian Beta-Binomial threat escalation, and Section 65B-compliant evidence packaging.

Each agent returns a standardized output: risk score (0-1), confidence (0-1), and a vote (ESCALATE / HOLD / DISMISS).
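As a rough sketch, the standardized output could be modeled as a small validated record. The names AgentOutput, Vote, and safe_default are illustrative assumptions, not identifiers from the SurakshaNet codebase.

```python
from dataclasses import dataclass
from enum import Enum


class Vote(str, Enum):
    ESCALATE = "ESCALATE"
    HOLD = "HOLD"
    DISMISS = "DISMISS"


@dataclass(frozen=True)
class AgentOutput:
    agent: str         # hypothetical agent identifier
    risk: float        # 0.0 (safe) to 1.0 (dangerous)
    confidence: float  # 0.0 (no usable data) to 1.0 (fully trusted)
    vote: Vote

    def __post_init__(self) -> None:
        # Reject out-of-range values early so downstream fusion can trust them.
        for name in ("risk", "confidence"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


def safe_default(agent: str) -> AgentOutput:
    """Output for an agent that has no usable data: zero risk, zero confidence, HOLD."""
    return AgentOutput(agent=agent, risk=0.0, confidence=0.0, vote=Vote.HOLD)
```

Validating at construction time means a misbehaving agent fails loudly at its own boundary instead of skewing the consensus stage.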

Consensus: Byzantine Voting + Bayesian Fusion

With 13 agents, some will fail or return unreliable data. The system handles this in two stages.

Stage 1: Byzantine Voting. Each agent's vote is weighted by its confidence: ESCALATE contributes +1 × confidence, DISMISS contributes -1 × confidence, and HOLD contributes 0. Agents with zero confidence are excluded. If more than 50% of agents fail, the system enters degraded mode. The weighted sum determines the consensus: above +0.50 triggers ESCALATE, below -0.50 triggers DISMISS.
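The voting rule can be sketched in a few lines. Two details here are assumptions rather than facts from the description above: the weighted sum is divided by the total confidence of participating agents so that it lands in [-1, 1] and can be compared against ±0.50, and any score between the two thresholds is treated as HOLD. The names Ballot and tally are hypothetical, and a "failed" agent is taken to mean one reporting zero confidence.

```python
from typing import NamedTuple

# Signed contribution of each vote before confidence weighting.
SIGN = {"ESCALATE": 1.0, "HOLD": 0.0, "DISMISS": -1.0}


class Ballot(NamedTuple):
    vote: str
    confidence: float


def tally(ballots: list[Ballot], threshold: float = 0.50) -> dict:
    # Zero-confidence agents are excluded from the vote.
    active = [b for b in ballots if b.confidence > 0.0]
    failed = len(ballots) - len(active)
    degraded = failed * 2 > len(ballots)  # more than 50% of agents failed

    total_conf = sum(b.confidence for b in active)
    if total_conf == 0.0:
        # Nobody can vote: hold and flag degraded mode.
        return {"consensus": "HOLD", "score": 0.0, "degraded": True}

    # Assumption: normalize by total confidence so the score is in [-1, 1].
    score = sum(SIGN[b.vote] * b.confidence for b in active) / total_conf

    if score > threshold:
        consensus = "ESCALATE"
    elif score < -threshold:
        consensus = "DISMISS"
    else:
        consensus = "HOLD"  # assumption: the middle band means HOLD
    return {"consensus": consensus, "score": score, "degraded": degraded}
```

For example, two confident ESCALATE votes (0.9 and 0.8), one HOLD at 0.5, and one weak DISMISS at 0.2 give a normalized score of 1.5 / 2.4 = 0.625, which crosses the +0.50 threshold.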

Stage 2: Bayesian Belief Fusion. Raw scores are fused while accounting for inter-agent correlation. Visual risk and traffic anomaly share a correlation coefficient of 0.6 because both degrade in bad weather. Independent agents are fused via a weighted average, while correlated agents are down-weighted so that shared evidence is not counted twice. The final belief is computed through precision weighting:

fused_belief = (mean_indep * prec_indep + mean_corr * prec_corr) / (prec_indep + prec_corr)
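A minimal sketch of this formula follows. How the means and precisions are derived is not specified above, so this version makes explicit assumptions: each group mean is a confidence-weighted average of risk scores, group precision is the sum of confidences, and the correlated group's precision is divided by the design effect 1 + (n - 1) × rho, a standard correction for n equally correlated measurements. The function names are hypothetical.

```python
Score = tuple[float, float]  # (risk, confidence)


def weighted_mean(scores: list[Score]) -> float:
    total = sum(conf for _, conf in scores)
    return sum(risk * conf for risk, conf in scores) / total if total else 0.0


def fuse(independent: list[Score], correlated: list[Score], rho: float = 0.6) -> float:
    # Assumption: precision of a group is the sum of its confidences.
    prec_indep = sum(conf for _, conf in independent)

    # Assumption: correlated agents carry less information than their count
    # suggests, so shrink their precision by the design effect.
    n = len(correlated)
    design_effect = 1.0 + (n - 1) * rho if n > 0 else 1.0
    prec_corr = sum(conf for _, conf in correlated) / design_effect

    if prec_indep + prec_corr == 0.0:
        return 0.0  # no usable evidence at all

    mean_indep = weighted_mean(independent)
    mean_corr = weighted_mean(correlated)
    return (mean_indep * prec_indep + mean_corr * prec_corr) / (prec_indep + prec_corr)
```

With independent scores (0.4, 1.0) and (0.6, 1.0) and a correlated pair both at (0.9, 0.8), the fused belief is about 0.633. Setting rho to 0 treats the pair as independent and raises it to about 0.678, which shows the double-counting the correction is meant to prevent.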

Causal Attribution

If the system escalates, it does not just say "high risk detected." That would be useless. Instead, it calls Claude (claude-sonnet-4-20250514) with the raw agent scores and asks for a one-paragraph causal attribution explaining why this segment is dangerous right now.

If the Claude API fails, the attribution is set to null — never fabricated. A Slack alert is sent with raw scores marked as needing manual review.
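The null-on-failure rule might look like the sketch below. To keep it self-contained, the model call and the Slack alert are passed in as async callables; ask_model and notify are hypothetical stand-ins, not the Anthropic or Slack client APIs, and the 10-second timeout is an assumed value.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def attribute(
    scores: dict[str, float],
    ask_model: Callable[[dict[str, float]], Awaitable[str]],  # hypothetical LLM wrapper
    notify: Callable[[str], Awaitable[None]],                 # hypothetical alert hook
    timeout_s: float = 10.0,                                  # assumed timeout
) -> Optional[str]:
    """Return a causal attribution, or None if the model call fails."""
    try:
        return await asyncio.wait_for(ask_model(scores), timeout=timeout_s)
    except Exception as exc:
        # Never substitute invented text: alert a human with the raw scores.
        await notify(
            f"Attribution failed ({type(exc).__name__}); "
            f"manual review needed. Raw scores: {scores}"
        )
        return None
```

Returning None rather than a placeholder string lets the frontend distinguish "no explanation available" from a real attribution.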

Tech Stack

Backend: Python 3.12, FastAPI, async SQLAlchemy, PostgreSQL + PostGIS, Redis, LangGraph
AI/ML: YOLOv8, RT-DETR, OSNet, Anthropic Claude API
Frontend: Next.js 14, Leaflet.js, Recharts, Tailwind CSS
Orchestration: LangGraph StateGraph with 3-stage pipeline (vote, fuse, attribute)
Infrastructure: Docker Compose, Alembic migrations, structlog JSON logging

What the Frontend Shows

The platform has 27 pages across multiple dashboards:

Command Center — Full-screen operations view with live risk map, auto-cycling segment display, and real-time metrics
Analytics — Risk heatmap (7x24 day-hour grid), agent reliability table, agent correlation matrix
KSDMA Dashboard — Resource gap visualization, evacuation route planner, monsoon overlay
KSRTC Dashboard — Bus route advisories, depot reports, visual driver fatigue assessment
District Collector Dashboard — All 14 Kerala districts with escalation charts and monsoon preparedness scores
System Health — Agent status by cluster, API rate limits with circuit breaker states

Other features include crowd-sourced hazard reporting with GPS auto-detection, spatiotemporal forecasting, school zone safety, green corridor management, and RTSP camera integration.

Rate Limit Engineering

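Free-tier APIs are the backbone. The monitoring interval is 300 seconds across 6 segments. That is 288 cycles per day, so an API queried once per segment per cycle (Open-Meteo below) sees 288 × 6 = 1,728 calls per day: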

API	Daily Usage	Limit
Open-Meteo	1,728 calls	10,000
HERE Traffic	144 calls	1,000
OpenWeatherMap	Fallback only	1,000
Anthropic Claude	On escalation only	60 RPM

Per-API usage is tracked with Redis token buckets. If a limit is exhausted, the agent skips the call and returns confidence=0.0 and vote=HOLD, so the system never exceeds a quota and never risks having its API keys revoked.
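The token bucket logic itself is simple. The sketch below keeps the bucket in process memory purely for illustration; the real system stores it in Redis, and the class and function names here are assumptions.

```python
import time
from typing import Callable


class TokenBucket:
    """In-memory stand-in for a Redis-backed token bucket."""

    def __init__(self, capacity: float, refill_per_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self.clock = clock
        self.tokens = capacity      # start full
        self.last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        # Refill based on elapsed time, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_s)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def call_or_hold(bucket: TokenBucket, fetch: Callable[[], dict]) -> dict:
    """Call the API only if budget remains; otherwise return the safe default."""
    if not bucket.try_acquire():
        return {"risk": 0.0, "confidence": 0.0, "vote": "HOLD"}
    return fetch()
```

A daily quota maps onto this as capacity equal to the limit and refill_per_s equal to limit / 86400, for example 1000 / 86400 for a 1,000-call daily limit.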

Key Design Decisions

No random numbers for risk scores. Every score is derived from real external data. If data is unavailable, the score is 0.0 with confidence 0.0 and vote HOLD. This was a hard rule from day one.

No blocking calls. The entire application is async: asyncio.gather() runs agents in parallel, httpx.AsyncClient handles HTTP, and create_async_engine provides database access. A single time.sleep() would block the event loop and stall every agent at once, which makes it a production-killing bug.
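A minimal sketch of the parallel fan-out, assuming each agent is an async callable that takes a segment ID and returns a dict. The run_agents name, the 5-second timeout, and the "error" field are illustrative assumptions.

```python
import asyncio
from typing import Awaitable, Callable

SAFE_DEFAULT = {"risk": 0.0, "confidence": 0.0, "vote": "HOLD"}

Agent = Callable[[str], Awaitable[dict]]


async def run_agents(agents: dict[str, Agent], segment_id: str,
                     timeout_s: float = 5.0) -> dict[str, dict]:
    async def guarded(agent: Agent) -> dict:
        try:
            return await asyncio.wait_for(agent(segment_id), timeout=timeout_s)
        except Exception as exc:
            # Failure becomes a safe default, with the cause recorded
            # so it does not fail silently.
            return {**SAFE_DEFAULT, "error": type(exc).__name__}

    names = list(agents)
    # gather() runs all agents concurrently and preserves input order.
    results = await asyncio.gather(*(guarded(agents[name]) for name in names))
    return dict(zip(names, results))
```

Wrapping each agent individually, rather than passing return_exceptions=True to gather(), keeps the timeout and the fallback in one place.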

Graceful degradation everywhere. If the database is unreachable, events buffer in a Redis queue and drain with exponential backoff. If an agent fails, it returns a safe default. If Claude fails, attribution is null with a Slack alert. Nothing fails silently.
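The buffer-and-drain behavior can be sketched as follows. A collections.deque stands in for the Redis queue, write is a hypothetical async function that persists one event, and the delay and attempt limits are assumed values.

```python
import asyncio
from collections import deque
from typing import Awaitable, Callable


async def drain(buffer: deque, write: Callable[[object], Awaitable[None]],
                base_delay_s: float = 0.5, max_delay_s: float = 30.0,
                max_attempts: int = 6) -> int:
    """Write buffered events in order; back off exponentially on failure."""
    written = 0
    attempt = 0
    while buffer and attempt < max_attempts:
        event = buffer[0]  # peek: the event stays queued until it is stored
        try:
            await write(event)
        except Exception:
            # Delay doubles each consecutive failure: 0.5s, 1s, 2s, ... capped.
            delay = min(max_delay_s, base_delay_s * 2 ** attempt)
            attempt += 1
            await asyncio.sleep(delay)  # non-blocking sleep
            continue
        buffer.popleft()
        written += 1
        attempt = 0  # a success resets the backoff
    return written
```

Because an event is only removed after a successful write, a crash mid-drain can at worst cause a duplicate write, never a lost event.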

Lessons Learned

Multi-agent consensus is harder than single-model inference. The correlation correction between agents was the most impactful improvement — without it, correlated agents would double-count evidence.
Rate limit management is an engineering problem, not an afterthought. Building it into the agent abstraction from the start saved significant debugging later.
Causal attribution changes how operators interact with the system. A number like "0.78" means little. "Wet road surface combined with reduced visibility and 23% speed reduction on NH-66 near a school zone active period" — that is actionable.