Dockfix Labs

Posted on Jun 28 • Originally published at github.com

AgentGuard: Open-Source Security Scanning for AI Agent Code

#security #ai #opensource #python

The Problem

AI agents are being deployed at scale — in customer support, code generation, data analysis, and autonomous workflows. But the code that powers these agents is rarely security-audited.

Consider this pattern, common in production agent codebases:

user_input = request.json()["prompt"]
prompt = f"You are a helpful assistant. {user_input}"
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}]
)

This is a prompt injection vulnerability. A user can override the system prompt and manipulate the agent's behavior. It is the AI equivalent of SQL injection — and it is everywhere.

OWASP ASI Top 10

The OWASP Agentic Security Initiative published a Top 10 list of risks specific to AI agent systems:

ID	Risk
ASI01	Prompt Injection
ASI02	Tool Abuse / Unintended Tool Use
ASI03	Data Exfiltration / Sensitive Data Leakage
ASI04	Unauthorized Actions / Excessive Agency
ASI05	Supply Chain / Untrusted Components
ASI06	Insecure Output Handling
ASI07	Credential / Secret Exposure
ASI08	Context Window Manipulation
ASI09	Agent Loop Exploitation
ASI10	Trust Boundary Violation

Most of these have no coverage in traditional SAST tools. Semgrep and CodeQL were built for a world without LLMs.

AgentGuard

AgentGuard is an open-source static analysis tool that scans AI agent codebases for all 10 OWASP ASI categories.

Install

pip install dfx-agentguard

Usage

# Scan current directory
agentguard .

# JSON output for CI/CD
agentguard src/ --format json

# SARIF for GitHub code scanning
agentguard . --format sarif

What It Detects

Prompt Injection (ASI01) — f-string prompt construction, string concatenation with user input, system prompt overrides.

# Vulnerable
prompt = f"You are a helpful assistant. {user_input}"

# AgentGuard flags this as ASI01-PROMPT-INJECTION

Tool Abuse (ASI02) — os.system(), subprocess with user input, eval()/exec() in agent tool functions.

# Vulnerable
def run_command(query):
    return os.system(f"echo {query}")

# AgentGuard flags this as ASI02-TOOL-ABUSE

Data Exfiltration (ASI03) — requests.post() to external URLs, fetch() calls, webhook configurations, DNS-based exfiltration patterns, subprocess curl/wget calls.

# Vulnerable
requests.post("https://analytics-server.com/collect", json=agent_data)

# AgentGuard flags this as ASI03-DATA-EXFIL

Credential Exposure (ASI07) — hardcoded API keys (sk-proj-*, AKIA*, ghp_*), private keys, connection strings with passwords, wallet seed phrases, Slack tokens, Google API keys.

# Vulnerable
OPENAI_API_KEY = "sk-proj-Tq8m2X4vN7bR1wK9pL3hY6jD5cF0aZ8s"

# AgentGuard flags this as ASI07-CREDENTIAL-LEAK

Plus: Excessive Agency (ASI04), Supply Chain (ASI05), Insecure Output Handling (ASI06), Context Manipulation (ASI08), Agent Loop Exploitation (ASI09), Trust Boundary Violations (ASI10).

Integration Options

CLI

agentguard . --format text
agentguard . --format json --exit-code
agentguard . --format sarif

Pre-commit Hook

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/dockfixlabs/agentguard
    rev: v0.2.2
    hooks:
      - id: agentguard

GitHub Action

# .github/workflows/security.yml
name: Agent Security Scan
on: [pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dockfixlabs/agentguard@v0.2.2
        with:
          path: src/
          format: sarif

MCP Server Mode

AgentGuard can run as a Model Context Protocol server, letting AI coding assistants (Claude Code, Cursor) scan code in real-time:

agentguard --mcp

VS Code Extension

Inline diagnostics, scan-on-save, and a findings tree view. Available as a VSIX on the releases page.

Benchmark Suite

The AgentGuard Benchmark provides 28 vulnerable code samples across 5 OWASP ASI categories, plus clean code for false-positive testing.

git clone https://github.com/dockfixlabs/agentguard-benchmark
cd agentguard-benchmark
python benchmark.py

Roadmap

Semantic analysis — AST-based detection beyond regex patterns
Multi-language support — Rust, Go, Java, Ruby
GitHub Code Scanning integration — native SARIF upload
MCP scanner — audit MCP server configurations for malicious tools
Real-time IDE feedback — deeper VS Code integration

Full roadmap on GitHub.

Project Structure

Repository	Description
agentguard	Core scanner + CLI + MCP server
mcp-scanner	MCP server configuration scanner
agentguard-app	GitHub App for automated PR reviews
agentguard-vscode	VS Code extension
agentguard-benchmark	Benchmark suite with 28 samples

Getting Started

pip install dfx-agentguard
agentguard . --format text

If you find this useful, star the repo on GitHub. Contributions welcome — see CONTRIBUTING.md.

AgentGuard is MIT-licensed and built by Dockfix Labs.

DEV Community