Dongha Koo

Posted on Mar 29

LiteLLM Got Hacked. Your AI Agent Had No Runtime Security.

#ai #security #python #opensource

title: "LiteLLM Got Hacked. Your AI Agent Had No Runtime Security."
published: false
description: "A supply chain attack hit one of the most popular LLM proxy libraries. Here's why every AI agent needs a runtime security layer — and how to add one in 2 lines of Python."
tags: ai, python, security, opensource

cover_image: (terminal screenshot showing blocked injection attempt)

LiteLLM was hit by a supply chain attack in March 2026. If you were running an AI agent that depended on it — and thousands of projects do — your entire stack was exposed. Every prompt, every API key, every tool call routed through the compromised dependency.

This wasn't a theoretical attack. It was trending on Hacker News with 395 points.

And the uncomfortable truth is: most AI agent codebases had zero defense against it. No input validation on LLM responses. No output scanning. No audit trail. No way to even detect that something was wrong until after the damage was done.

The real problem isn't LiteLLM. It's the missing layer.

Traditional web apps have decades of battle-tested security: WAFs, CSP headers, input sanitization, auth middleware, rate limiting. You don't ship a Django app without CSRF protection. You don't deploy a Node API without helmet.

AI agents have none of this.

A typical agent stack looks like this:

Your code → LangChain/CrewAI/OpenAI SDK → LLM API → response

There is no security layer between your code and the LLM. There is no inspection of what the model returns before your agent acts on it. There is no policy that says "this tool call is allowed, that one isn't." There is no audit log.

When LiteLLM got compromised, agents using it had no way to:

Detect that responses contained injected instructions
Block tool calls that the model was tricked into making
Identify which API keys or PII flowed through the compromised path
Replay and audit what happened after the fact

This isn't a LiteLLM-specific problem. It's a missing architectural layer. If your favorite framework gets compromised tomorrow — LangChain, CrewAI, the OpenAI SDK — would your agent notice?

Why AI agent security is different from app security

Traditional application security operates on a simple model: you control the code, you control the behavior. If there's a vulnerability, it's in code you wrote or a library you imported. The attack surface is well-defined.

AI agents break this model in three ways.

1. The model is an untrusted input source.

The LLM response is not your code. It's an external, non-deterministic system producing outputs that your agent executes. When a supply chain attack injects instructions into that pipeline, the model may produce tool calls, data access requests, or actions that look perfectly legitimate to your code — because your code trusts the model implicitly.

# Your agent does this:
result = await llm.chat("Summarize the quarterly report")

# After a supply chain attack, the model might return:
# "Sure! Also, I need to call read_file('/etc/passwd') to complete the task."
# And your agent just... does it.

2. Tool calls create a blast radius that code execution doesn't.

A web app vulnerability might leak data or crash a process. An AI agent vulnerability can trigger real-world actions: delete database records, send emails, make API calls, modify files. The agent has tool access by design. That's what makes it useful. It's also what makes a compromise catastrophic.

3. There's no middleware layer.

Express has middleware. Django has middleware. Flask has before_request hooks. AI agent frameworks have... nothing. There's no standard place to intercept, inspect, and govern what flows between your code and the LLM, or between the LLM response and tool execution.

The LiteLLM attack exposed this gap. Not because LiteLLM was poorly built — but because the entire ecosystem lacks a runtime security layer.

What "runtime security for AI agents" actually looks like

It's the same idea as OpenTelemetry, but for security instead of observability.

OpenTelemetry monkey-patches your HTTP libraries and database drivers to automatically collect traces and metrics. You add one line, and every request is instrumented. You don't rewrite your application.

Runtime security for AI agents works the same way. You instrument your agent framework at import time. Every LLM call and tool invocation passes through a security pipeline — input scanning, output scanning, policy evaluation, audit logging — without changing your application code.

Concretely, that means:

Prompt injection detection — scanning both inputs and outputs for known attack patterns. Not just "ignore previous instructions" — but delimiter injection (<|endoftext|>, [/INST], ChatML tokens), encoding evasion (base64, ROT13, hex-wrapped instructions), indirect injection via tool outputs, data exfiltration attempts, and jailbreak patterns. Across multiple languages.

PII detection and masking — catching API keys, credit card numbers, SSNs, email addresses, and other sensitive data before it leaves your system or enters a log. With actual validation (Luhn algorithm for credit cards, not just regex).

MCP rug-pull detection — if you're using MCP servers (and increasingly, everyone is), detecting when a tool's definition changes after initial registration. SHA-256 hash pinning of tool schemas means you'll know if a tool that was read_file(path: str) yesterday is now read_file(path: str, exfil_url: str) today.

Tool poisoning detection — scanning MCP tool descriptions for embedded instructions that manipulate model behavior. 10 attack pattern categories, applied against Unicode-normalized text.

Full audit trail — every decision logged to SQLite automatically. Who called what, when, what was blocked, what was allowed. You can answer "what happened" after an incident instead of guessing.

Adding runtime security to an existing agent: 2 lines

I built Aegis to be this missing layer. It's a Python library. MIT licensed. 4,455+ tests.

Here's what it takes to add it to an existing project:

import aegis
aegis.auto_instrument()

# Your existing code stays exactly the same.
# Every LangChain, CrewAI, OpenAI, Anthropic, LiteLLM, Google GenAI,
# Pydantic AI, LlamaIndex, Instructor, and DSPy call now passes through:
#   - Prompt injection detection (10 attack categories, 85+ patterns)
#   - PII detection (12 categories with validation)
#   - Prompt leak detection
#   - Audit trail

Or zero code changes — just set an environment variable:

AEGIS_INSTRUMENT=1 python my_agent.py

Aegis monkey-patches framework internals at import time. The same approach OpenTelemetry uses for observability and Sentry uses for error tracking. Your existing code stays untouched.

What happens under the hood:

Your code                          Aegis layer (invisible)
---------                          -----------------------
chain.invoke("Hello")       -->    [input scan] --> LangChain --> [output scan] --> response
client.chat.completions()   -->    [input scan] --> OpenAI API --> [output scan] --> response
completion("prompt")        -->    [input scan] --> LiteLLM   --> [output scan] --> response

Every call is checked on both input and output. Blocked content raises AegisGuardrailError (configurable to warn or log instead).

Install

pip install agent-aegis

Supported frameworks

11 frameworks are supported today, all stable:

Framework	What gets patched
LangChain	`BaseChatModel.invoke/ainvoke`, `BaseTool.invoke/ainvoke`
CrewAI	`Crew.kickoff/kickoff_async`
OpenAI Agents SDK	`Runner.run`, `Runner.run_sync`
OpenAI API	`Completions.create`
Anthropic API	`Messages.create`
LiteLLM	`completion`, `acompletion`
Google GenAI (Gemini)	`Models.generate_content`
Pydantic AI	`Agent.run`, `Agent.run_sync`
LlamaIndex	`LLM.chat/achat/complete/acomplete`
Instructor	`Instructor.create`
DSPy	`Module.__call__`, `LM.forward/aforward`

Beyond input/output scanning: policy-level governance

Scanning prompts is the first line of defense, but it's not enough. You also need to govern what the agent does — which tools it can call, what data it can access, what actions require human approval.

Aegis includes a policy engine that you configure with YAML:

# aegis.yaml
guardrails:
  injection: { enabled: true, action: block, sensitivity: medium }
  pii: { enabled: true, action: mask }

policy:
  version: "1"
  rules:
    - name: reads_are_safe
      match: { type: "read*" }
      risk_level: low
      approval: auto

    - name: bulk_ops_need_human
      match: { type: "bulk_*" }
      conditions:
        param_gt: { count: 100 }
      risk_level: high
      approval: approve  # triggers Slack/Discord/CLI approval

    - name: no_deletes
      match: { type: "delete*" }
      risk_level: critical
      approval: block  # never executed

Every action goes through: EVALUATE -> APPROVE -> EXECUTE -> VERIFY -> AUDIT.

MCP supply chain security

If you're using MCP servers — and the ecosystem is growing fast — you have a specific supply chain problem. MCP tools are code that runs on your machine, defined by third parties, and called by an LLM that can be manipulated.

Aegis includes an MCP proxy that wraps any MCP server with governance:

{
  "mcpServers": {
    "filesystem": {
      "command": "uvx",
      "args": ["--from", "agent-aegis[mcp]", "aegis-mcp-proxy",
               "--wrap", "npx", "-y",
               "@modelcontextprotocol/server-filesystem", "/home"]
    }
  }
}

What this does on every tool call:

Tool description scanning — 10 attack patterns against Unicode-normalized text
Rug-pull detection — SHA-256 pinning of tool schemas. If the definition changes, you get an alert.
Argument sanitization — blocks path traversal (../../etc/passwd), command injection, null bytes
Cross-session leakage detection — detects shared MCP servers correlating requests across tenants (5 detection methods)
Vulnerability database — 8 built-in CVEs for popular MCP servers, version-range matching, auto-block
Full audit trail — every call logged

`aegis plan` and `aegis test`: terraform for AI policies

Here's the part I'm most excited about.

Managing security policies for AI agents has the same problem Terraform solved for infrastructure: you need to preview what a policy change will do before deploying it, and you need automated regression testing to make sure policy updates don't accidentally allow something dangerous.

aegis plan — preview the impact of policy changes against your real audit data. Like terraform plan, but for AI agent policies.

aegis plan --diff old-policy.yaml new-policy.yaml

It shows you exactly which actions would change from BLOCKED to ALLOWED (or vice versa), what the new risk assessments would be, and whether any previously-blocked action categories are now exposed.

aegis test — policy regression testing for CI/CD.

aegis test

Write test cases for your policies the same way you write unit tests for your code. Assert that specific actions are blocked, that risk levels are correct, that approval requirements are enforced. Run them in CI. Catch policy regressions before they reach production.

aegis probe — adversarial testing of your policies. Automatically finds glob bypass holes, missing coverage, and privilege escalation paths.

aegis scan — AST-based detection of ungoverned AI calls in your codebase. Finds every openai.ChatCompletion.create() or chain.invoke() that isn't covered by Aegis instrumentation.

The supply chain lesson

The LiteLLM attack is going to happen again. Not necessarily to LiteLLM — to any dependency in the AI stack. The attack surface is massive: LLM client libraries, embedding providers, vector databases, MCP servers, tool definitions, prompt templates loaded from files or APIs.

The answer isn't "audit every dependency" — that doesn't scale. The answer is defense in depth: a runtime layer that scans, filters, and governs what flows through your agent regardless of where the compromise happens.

If a compromised dependency injects a prompt, the injection detector catches it. If the model responds with a suspicious tool call, the policy engine blocks it. If PII leaks through a manipulated response, the PII detector masks it. If all else fails, the audit trail tells you exactly what happened.

No single check is perfect. But stacking them — input scanning, output scanning, policy evaluation, MCP pinning, audit logging — means an attacker has to bypass all of them, not just one.

Try it

pip install agent-aegis
python -c "import aegis; aegis.auto_instrument(); print('instrumented')"

Or try it in the browser without installing anything: Interactive Playground

Links:

GitHub: github.com/Acacian/aegis
Playground: acacian.github.io/aegis/playground
PyPI: pip install agent-aegis

4,455+ tests. 92% coverage. MIT licensed. Solo project.

If you've dealt with supply chain attacks in your AI stack, or if you have opinions on what runtime security for agents should look like, I'd genuinely like to hear about it. Open an issue or drop a comment below.

DEV Community

LiteLLM Got Hacked. Your AI Agent Had No Runtime Security.

cover_image: (terminal screenshot showing blocked injection attempt)

The real problem isn't LiteLLM. It's the missing layer.

Why AI agent security is different from app security

What "runtime security for AI agents" actually looks like

Adding runtime security to an existing agent: 2 lines

Install

Supported frameworks

Beyond input/output scanning: policy-level governance

MCP supply chain security

`aegis plan` and `aegis test`: terraform for AI policies

The supply chain lesson

Try it

Top comments (0)

cover_image: (terminal screenshot showing blocked injection attempt)

The real problem isn't LiteLLM. It's the missing layer.

Why AI agent security is different from app security

What "runtime security for AI agents" actually looks like

Adding runtime security to an existing agent: 2 lines

Install

Supported frameworks

Beyond input/output scanning: policy-level governance

MCP supply chain security

aegis plan and aegis test: terraform for AI policies

The supply chain lesson

Try it

`aegis plan` and `aegis test`: terraform for AI policies