DEV Community

Aman Pandey

I found a critical CVE in a top AI agent framework. Here's what it taught me about how we're all building agents wrong.

Nobody told me the scariest part of building AI agents isn't the hallucinations.
It's the attack surface you're quietly shipping to production while obsessing over your prompt.

I found out the hard way.

The vulnerability that should not have existed

While contributing to OpenHands (one of the top open-source AI agent frameworks), I discovered a path traversal vulnerability, now officially CVE-2025-68146, sitting quietly in production. The kind of bug that makes you go silent for a second before typing into Slack.

The core issue: the agent runtime wasn't properly sanitizing file paths during execution. An attacker could craft a request that escaped the intended sandbox and read arbitrary files from the host. In an agentic system that calls tools, writes code, and accesses filesystems, this is catastrophic.

It was patched and shipped in OpenHands v1.8.2.

Why AI agent codebases are a security nightmare

I run a LangGraph-based agentic system handling 1,000+ executions daily at 99.9% uptime. Here's what that experience teaches you: the agentic paradigm introduces an entirely new class of security problems the community is not talking about enough.

1. Tool use expands your attack surface exponentially

Every tool you hand your agent is a potential injection vector. Shell tools, filesystem tools, HTTP tools — each needs its own threat model. Most teams bolt these on without thinking about what happens when the LLM's output is adversarial. Prompt injection from web content the agent reads is real. It's happening right now in production systems.

# What your "harmless" shell tool looks like to an attacker
tool_input = llm_output["command"]  # <- never trust this directly
subprocess.run(tool_input, shell=True)  # <- shell injection waiting to happen
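As a sketch of the safer direction, here's roughly what that tool could look like with a command allowlist and no shell interpretation. The command set and function name are illustrative, not from any particular framework:

```python
import shlex
import subprocess

# Hypothetical allowlist -- adapt to the commands your agent actually needs.
ALLOWED_COMMANDS = {"ls", "cat", "grep"}

def run_tool_command(llm_output: str) -> str:
    # Parse with shlex instead of handing the raw string to a shell, so
    # metacharacters like ';' or '&&' become literal arguments, not syntax.
    argv = shlex.split(llm_output)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"Command not allowed: {argv[:1]}")
    # shell=False: no shell ever interprets the LLM's output.
    result = subprocess.run(argv, shell=False, capture_output=True,
                            text=True, timeout=10)
    return result.stdout
```

This doesn't make shell tools safe, but it collapses "arbitrary command execution" into "one of three commands with literal arguments," which is a very different threat model.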

2. Sandboxing is not optional — it's the whole point

The CVE I found was a sandbox escape. The system thought it was contained.
If you're running agent-generated code without a hard container boundary (Docker,
gVisor, Firecracker), you're not running a sandbox. You're running a prayer.
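What a "hard container boundary" can mean in practice: a minimal sketch that shells out to Docker with a locked-down flag set. This assumes Docker is installed and a stock `python:3.12-slim` image is available; the flags are a starting point, not a complete hardening guide:

```python
import subprocess

def build_sandbox_cmd(code: str) -> list[str]:
    # Construct the docker invocation; split out from execution so the
    # flag set is testable without Docker installed.
    return [
        "docker", "run", "--rm",
        "--network", "none",    # no network access for agent code
        "--read-only",          # immutable root filesystem
        "--memory", "256m",     # memory cap
        "--pids-limit", "64",   # fork-bomb protection
        "--cap-drop", "ALL",    # drop all Linux capabilities
        "python:3.12-slim",
        "python", "-c", code,
    ]

def run_in_sandbox(code: str, timeout: int = 30) -> str:
    result = subprocess.run(build_sandbox_cmd(code),
                            capture_output=True, text=True, timeout=timeout)
    return result.stdout
```

Even this is weaker than gVisor or Firecracker (a shared kernel is still a shared kernel), but it's the floor, not the ceiling.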

The fix we shipped: strict path normalization + allowlist validation before any file operation. Three lines. Months of exposure.

# Before (🤦)
def execute_file_op(path: str):
    return open(path).read()

# After (🔒)
import pathlib

ALLOWED_ROOT = pathlib.Path("/workspace").resolve()

def execute_file_op(path: str):
    resolved = (ALLOWED_ROOT / path).resolve()
    # is_relative_to avoids the classic startswith pitfall, where a
    # sibling like "/workspace-evil" would pass a naive prefix check
    if not resolved.is_relative_to(ALLOWED_ROOT):
        raise PermissionError(f"Path escape attempt: {path}")
    return resolved.read_text()

Simple. Boring. Critical.

3. You're probably logging things you shouldn't be

Agent frameworks love verbose logs. Trajectory dumps, tool call outputs, intermediate LLM responses, all sitting in your observability stack. I've seen teams accidentally log API keys, PII, and internal file contents through their agents. Your agent's "brain" is leaky by default.
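One cheap mitigation is a redaction filter on the logger itself, so secrets never reach the observability stack. A minimal sketch with a few illustrative patterns (the regexes are assumptions — tune them to the secrets your stack actually handles):

```python
import logging
import re

# Hypothetical patterns; extend for your own secret formats.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),            # OpenAI-style API keys
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._\-]+"),  # bearer tokens
    re.compile(r"AKIA[0-9A-Z]{16}"),               # AWS access key IDs
]

class RedactingFilter(logging.Filter):
    """Scrub known secret patterns out of every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()  # render %-args before scrubbing
        for pattern in SECRET_PATTERNS:
            msg = pattern.sub("[REDACTED]", msg)
        record.msg, record.args = msg, ()
        return True  # keep the (now scrubbed) record
```

Attach it with `logger.addFilter(RedactingFilter())` on every logger that touches agent trajectories. Regex redaction is best-effort, not a substitute for keeping secrets out of the agent's context in the first place.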

The uncomfortable truth

We're in a gold rush for AI agents. Teams are shipping to production faster than the security community can audit. The "move fast" energy is real; I get it, I live it. But the CVE database doesn't care about your demo day deadline.

The OpenHands team moved fast once I reported it. Patched within the coordinated disclosure window. But the vulnerability existed for a while before anyone caught it. That's a structural problem, not a team problem.


If you're building with AI agents in production: do a 30-minute threat model on your tool layer. What can your agent read? Write? Execute? What happens if its input is adversarial? The answers will probably surprise you.


Drop a comment; I'm genuinely curious:

  • Have you thought about prompt injection through your agent's tool layer?
  • Are you sandboxing code execution? What's your setup?
  • What's the scariest thing you've seen an agent do in production?
