DEV Community

Kang
Kang

Posted on

Your AI Agent Has Security Holes — Here's How to Find Them

AI agents are the new attack surface. They handle user input, call external tools, access databases, and make autonomous decisions. Yet most teams ship them with zero security review.

I kept running into the same problems: prompt injection vectors hiding in tool call chains, API keys leaked in system prompts, PII flowing through logs unchecked. Traditional SAST/DAST tools don't understand agent-specific threats. So I built ClawGuard — an open source security scanner specifically for AI agents.

What ClawGuard Does

ClawGuard scans your agent code and configurations for security vulnerabilities. It currently has 285+ built-in detection patterns covering:

  • Prompt injection — vectors in tool outputs, system prompts, and user inputs
  • API key exposure — secrets leaked in agent configs, logs, or tool call arguments
  • Data leakage — PII flowing to external services without sanitization
  • Permission escalation — agents gaining capabilities beyond their intended scope
  • Supply chain risks — compromised tools, poisoned memories, malicious plugins

Quick Start

npm install -g @neuzhou/clawguard
clawguard scan ./my-agent-project
Enter fullscreen mode Exit fullscreen mode

Output looks like:

[CRITICAL] prompt-injection/tool-output-injection
  File: src/agent.ts:45
  Description: User input passed directly to tool output without sanitization

[HIGH] api-key-exposure/hardcoded-key
  File: config/agent.yaml:12
  Description: API key found in plaintext configuration

Found 3 critical, 7 high, 12 medium issues
Enter fullscreen mode Exit fullscreen mode

Plugin Ecosystem

ClawGuard uses a plugin architecture compatible with existing security ecosystems:

  • Semgrep YAML rules — if you already maintain Semgrep rules, they work in ClawGuard
  • YARA rules — for pattern matching on binary/text content
  • Custom rules — write your own in JSON or YAML
  • AI-generated rules — describe a threat in plain English, get a working detection rule
# Use community Semgrep rules
clawguard scan --rules ./my-semgrep-rules/

# Generate a rule with AI
clawguard generate-rule "Detect when an agent passes user input directly to a shell command"
Enter fullscreen mode Exit fullscreen mode

CI Integration

Drop this into your GitHub Actions:

- uses: NeuZhou/clawguard@v1
  with:
    path: ./src
    fail-on: critical,high
Enter fullscreen mode Exit fullscreen mode

Now every PR gets security-checked before merge.

What's Different from Existing Tools

Tool Focus Agent-Aware?
Semgrep General code patterns No
Snyk Dependencies No
Gitleaks Secrets in git Partial
ClawGuard Agent behavior + security Yes

ClawGuard understands agent-specific concepts: tool call chains, prompt templates, memory stores, plugin interfaces. It's not a general-purpose scanner with agent rules bolted on — it's built from the ground up for this threat model.

Numbers

  • 285+ security patterns
  • 650 tests passing
  • Plugin system with Semgrep + YARA compatibility
  • GitHub Action for CI
  • MIT licensed

Try It

npm install -g @neuzhou/clawguard
clawguard scan ./your-project
Enter fullscreen mode Exit fullscreen mode

GitHub: NeuZhou/clawguard
npm: @neuzhou/clawguard

I'd love feedback — especially from folks who are building agents in production. What security patterns are you worried about that we should add?

Top comments (0)