The Problem
AI agents are being deployed at scale — in customer support, code generation, data analysis, and autonomous workflows. But the code that powers these agents is rarely security-audited.
Consider this pattern, common in production agent codebases:
user_input = request.json()["prompt"]
prompt = f"You are a helpful assistant. {user_input}"
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}]
)
This is a prompt injection vulnerability. A user can override the system prompt and manipulate the agent's behavior. It is the AI equivalent of SQL injection — and it is everywhere.
OWASP ASI Top 10
The OWASP Agentic Security Initiative published a Top 10 list of risks specific to AI agent systems:
| ID | Risk |
|---|---|
| ASI01 | Prompt Injection |
| ASI02 | Tool Abuse / Unintended Tool Use |
| ASI03 | Data Exfiltration / Sensitive Data Leakage |
| ASI04 | Unauthorized Actions / Excessive Agency |
| ASI05 | Supply Chain / Untrusted Components |
| ASI06 | Insecure Output Handling |
| ASI07 | Credential / Secret Exposure |
| ASI08 | Context Window Manipulation |
| ASI09 | Agent Loop Exploitation |
| ASI10 | Trust Boundary Violation |
Most of these have no coverage in traditional SAST tools. Semgrep and CodeQL were built for a world without LLMs.
AgentGuard
AgentGuard is an open-source static analysis tool that scans AI agent codebases for all 10 OWASP ASI categories.
Install
pip install dfx-agentguard
Usage
# Scan current directory
agentguard .
# JSON output for CI/CD
agentguard src/ --format json
# SARIF for GitHub code scanning
agentguard . --format sarif
What It Detects
Prompt Injection (ASI01) — f-string prompt construction, string concatenation with user input, system prompt overrides.
# Vulnerable
prompt = f"You are a helpful assistant. {user_input}"
# AgentGuard flags this as ASI01-PROMPT-INJECTION
Tool Abuse (ASI02) — os.system(), subprocess with user input, eval()/exec() in agent tool functions.
# Vulnerable
def run_command(query):
return os.system(f"echo {query}")
# AgentGuard flags this as ASI02-TOOL-ABUSE
Data Exfiltration (ASI03) — requests.post() to external URLs, fetch() calls, webhook configurations, DNS-based exfiltration patterns, subprocess curl/wget calls.
# Vulnerable
requests.post("https://analytics-server.com/collect", json=agent_data)
# AgentGuard flags this as ASI03-DATA-EXFIL
Credential Exposure (ASI07) — hardcoded API keys (sk-proj-*, AKIA*, ghp_*), private keys, connection strings with passwords, wallet seed phrases, Slack tokens, Google API keys.
# Vulnerable
OPENAI_API_KEY = "sk-proj-Tq8m2X4vN7bR1wK9pL3hY6jD5cF0aZ8s"
# AgentGuard flags this as ASI07-CREDENTIAL-LEAK
Plus: Excessive Agency (ASI04), Supply Chain (ASI05), Insecure Output Handling (ASI06), Context Manipulation (ASI08), Agent Loop Exploitation (ASI09), Trust Boundary Violations (ASI10).
Integration Options
CLI
agentguard . --format text
agentguard . --format json --exit-code
agentguard . --format sarif
Pre-commit Hook
# .pre-commit-config.yaml
repos:
- repo: https://github.com/dockfixlabs/agentguard
rev: v0.2.2
hooks:
- id: agentguard
GitHub Action
# .github/workflows/security.yml
name: Agent Security Scan
on: [pull_request]
jobs:
scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dockfixlabs/agentguard@v0.2.2
with:
path: src/
format: sarif
MCP Server Mode
AgentGuard can run as a Model Context Protocol server, letting AI coding assistants (Claude Code, Cursor) scan code in real-time:
agentguard --mcp
VS Code Extension
Inline diagnostics, scan-on-save, and a findings tree view. Available as a VSIX on the releases page.
Benchmark Suite
The AgentGuard Benchmark provides 28 vulnerable code samples across 5 OWASP ASI categories, plus clean code for false-positive testing.
git clone https://github.com/dockfixlabs/agentguard-benchmark
cd agentguard-benchmark
python benchmark.py
Roadmap
- Semantic analysis — AST-based detection beyond regex patterns
- Multi-language support — Rust, Go, Java, Ruby
- GitHub Code Scanning integration — native SARIF upload
- MCP scanner — audit MCP server configurations for malicious tools
- Real-time IDE feedback — deeper VS Code integration
Full roadmap on GitHub.
Project Structure
| Repository | Description |
|---|---|
| agentguard | Core scanner + CLI + MCP server |
| mcp-scanner | MCP server configuration scanner |
| agentguard-app | GitHub App for automated PR reviews |
| agentguard-vscode | VS Code extension |
| agentguard-benchmark | Benchmark suite with 28 samples |
Getting Started
pip install dfx-agentguard
agentguard . --format text
If you find this useful, star the repo on GitHub. Contributions welcome — see CONTRIBUTING.md.
AgentGuard is MIT-licensed and built by Dockfix Labs.
Top comments (0)