neuzhou

Posted on Mar 17

Your AI Agent Has Security Holes — Here's How to Find Them

#typescript

AI agents are the new attack surface. They handle user input, call external tools, access databases, and make autonomous decisions. Yet most teams ship them with zero security review.

I kept running into the same problems: prompt injection vectors hiding in tool call chains, API keys leaked in system prompts, PII flowing through logs unchecked. Traditional SAST/DAST tools don't understand agent-specific threats. So I built ClawGuard — an open source security scanner specifically for AI agents.

What ClawGuard Does

ClawGuard scans your agent code and configurations for security vulnerabilities. It currently has 285+ built-in detection patterns covering:

Prompt injection — vectors in tool outputs, system prompts, and user inputs
API key exposure — secrets leaked in agent configs, logs, or tool call arguments
Data leakage — PII flowing to external services without sanitization
Permission escalation — agents gaining capabilities beyond their intended scope
Supply chain risks — compromised tools, poisoned memories, malicious plugins

Quick Start

npm install -g @neuzhou/clawguard
clawguard scan ./my-agent-project

Output looks like:

[CRITICAL] prompt-injection/tool-output-injection
  File: src/agent.ts:45
  Description: User input passed directly to tool output without sanitization

[HIGH] api-key-exposure/hardcoded-key
  File: config/agent.yaml:12
  Description: API key found in plaintext configuration

Found 3 critical, 7 high, 12 medium issues

Plugin Ecosystem

ClawGuard uses a plugin architecture compatible with existing security ecosystems:

Semgrep YAML rules — if you already maintain Semgrep rules, they work in ClawGuard
YARA rules — for pattern matching on binary/text content
Custom rules — write your own in JSON or YAML
AI-generated rules — describe a threat in plain English, get a working detection rule

# Use community Semgrep rules
clawguard scan --rules ./my-semgrep-rules/

# Generate a rule with AI
clawguard generate-rule "Detect when an agent passes user input directly to a shell command"

CI Integration

Drop this into your GitHub Actions:

- uses: NeuZhou/clawguard@v1
  with:
    path: ./src
    fail-on: critical,high

Now every PR gets security-checked before merge.

What's Different from Existing Tools

Tool	Focus	Agent-Aware?
Semgrep	General code patterns	No
Snyk	Dependencies	No
Gitleaks	Secrets in git	Partial
ClawGuard	Agent behavior + security	Yes

ClawGuard understands agent-specific concepts: tool call chains, prompt templates, memory stores, plugin interfaces. It's not a general-purpose scanner with agent rules bolted on — it's built from the ground up for this threat model.

Numbers

285+ security patterns
650 tests passing
Plugin system with Semgrep + YARA compatibility
GitHub Action for CI
MIT licensed

Try It

npm install -g @neuzhou/clawguard
clawguard scan ./your-project

GitHub: NeuZhou/clawguard
npm: @neuzhou/clawguard

I'd love feedback — especially from folks who are building agents in production. What security patterns are you worried about that we should add?

DEV Community