DEV Community

Dre
Dre

Posted on

We Open-Sourced Cerberus — Runtime Security for Agentic AI

I’ve been following the [un]prompted conference agenda this week — one of the most practitioner-focused AI security events out there. Two things jumped out at me.
Stripe has a talk called “Breaking the Lethal Trifecta.” Google’s talk describes the same problem as the “Perfect Storm” — sensitive data, untrusted content, external execution, all in the same execution turn.
I’ve been building a tool that catches exactly this. Seeing it on the agenda confirmed we were working on the right problem. So today we’re open-sourcing Cerberus.
What is the Lethal Trifecta?
Three conditions that make agentic AI exploitable in a single execution turn:
1. Privileged data access — the agent can read secrets, configs, or sensitive context
2. Untrusted content injection — an adversarial payload reaches the model’s input
3. Outbound exfiltration path — the agent can write to an external destination
When all three are present simultaneously, a single injected sentence can exfiltrate secrets, poison memory for future sessions, or pivot across tool calls — no human in the loop.
Existing tools check each leg in isolation. Nobody was correlating all three in real time. That’s the gap Cerberus closes.
How Cerberus Works
Cerberus wraps your LLM calls and monitors each execution turn as a complete unit — inputs, tool calls, outputs, and memory state — not individual signals.
Four detection layers:
∙ L1 — Pattern matching (fast, low false-positive rate)
∙ L2 — Semantic analysis (catches obfuscated payloads)
∙ L3 — Behavioral heuristics (unusual tool call sequences)
∙ L4 — Correlation engine (are all three Trifecta legs present?)
Plus a SQLite-backed memory contamination graph for cross-session taint tracking.
The Numbers
∙ 326 tests, 99.7% coverage
∙ 21-payload attack harness across 5 attack categories
∙ 100% attack detection validated before shipping any detection layer
∙ Multi-model validation against Claude, GPT-4o, and Gemini in progress

Top comments (0)