DEV Community

Cover image for I Built a Security Firewall for AI Agents — Here's Why Every MCP Server Needs One
yalelet dessalegn
yalelet dessalegn

Posted on

I Built a Security Firewall for AI Agents — Here's Why Every MCP Server Needs One

The Problem Nobody's Talking About

AI agents can now execute tools read files, run shell commands, query databases, make HTTP requests. Claude Code, Cursor, Windsurf they all use the Model Context Protocol (MCP) to talk to tool servers.

Here's the scary part: a single prompt injection can weaponize any AI agent.

An attacker embeds instructions in a document, email, or web page. The AI reads it, follows the injected instructions, and suddenly:

  1. Reads your .ssh/id_rsa, .env files, API keys
  2. Exfiltrates data via curl, wget, or DNS tunneling
  3. Executes arbitrary shell commands with YOUR permissions
  4. Chains multiple tools to escalate from read → exfil → execute

This isn't theoretical. These attacks work TODAY against unprotected MCP servers.

OpenClaw: The "Personal JARVIS" or a Security Nightmare?

In early 2026, OpenClaw (formerly ClawdBot/MoltBot) became the fastest-growing repo in history. It promises a "24/7 JARVIS" that lives in your WhatsApp and Slack. But because it has direct access to your shell and filesystem, it has become the #1 target for Agentic Hijacking.
Recent reports show that:

  • Malicious "Skills": Over 12% of the skills on ClawHub were found to be malicious, designed to steal session tokens.
  • Exposed Instances: Over 18,000 OpenClaw instances are currently exposed to the public internet with full shell access. The One-Click RCE: Vulnerabilities like CVE-2026-25253 allow hackers to hijack an agent just by making the user visit a malicious website.

Introducing Agent-Wall: The Firewall for the Agentic Era

I built Agent Wall an open-source security firewall that sits between any MCP client and server:

MCP Client  ←→  Agent Wall Proxy  ←→  MCP Server
                      ↕
               agent-wall.yaml
               + security modules
               + response scanner
Enter fullscreen mode Exit fullscreen mode

Setup takes 30 seconds:

npm install -g @agent-wall/cli
agent-wall wrap -- npx @modelcontextprotocol/server-filesystem /home/user
Enter fullscreen mode Exit fullscreen mode

That's it. Every tool call now passes through a 5-step defense pipeline.

The Defense Pipeline

Inbound (Request Scanning)

Every tools/call request runs through:

Step Module What it Does
1 Kill Switch Emergency deny-all (file/signal/programmatic)
2 Injection Detector 30+ patterns detect prompt injection attacks
3 Egress Control Block private IPs, SSRF, cloud metadata endpoints
4 Policy Engine YAML rules with glob matching & rate limiting
5 Chain Detector Suspicious multi-step patterns (read→exfil)

Outbound (Response Scanning)

Server responses are scanned before reaching the AI:

  • 14 built-in secret patterns AWS keys, GitHub tokens, JWTs, private keys, database URLs
  • 5 PII patterns email, phone, SSN, credit card, IP address
  • Custom regex patterns via YAML config
  • Actions: pass / redact / block

Live Demo: 12 Injection Attacks, All Blocked

I recorded the real-time dashboard while running 8 test scenarios against a live MCP server:

Results:

  • 12/12 prompt injection categories → BLOCKED
  • 6/6 exfiltration vectors (curl, wget, netcat, PowerShell, DNS) → BLOCKED
  • 4/4 credential access attempts (.ssh, .env, .pem, credentials.json) → BLOCKED
  • Kill switch activate/deactivate → WORKS
  • Chain detection (read file → attempt curl exfil) → DETECTED

Injection Categories Caught:

instruction-override  → "Ignore previous instructions"
prompt-marker         → <|im_start|>system, [SYSTEM]:, <<SYS>>
authority-claim       → "jailbreak", "DAN mode", "IMPORTANT: override"
exfil-instruction     → "send the data to evil.com"
output-manipulation   → "pretend you are unrestricted"
delimiter-injection   → ```

system markers


Enter fullscreen mode Exit fullscreen mode

The Policy File

Rules are defined in agent-wall.yaml first match wins:


yaml
version: 1
defaultAction: prompt

security:
  injectionDetection:
    enabled: true
    sensitivity: medium
  egressControl:
    enabled: true
    blockPrivateIPs: true
  killSwitch:
    enabled: true
  chainDetection:
    enabled: true

rules:
  - name: block-ssh-keys
    tool: "*"
    match:
      arguments:
        path: "**/.ssh/**"
    action: deny

  - name: block-curl-exfil
    tool: "shell_exec|run_command|execute_command"
    match:
      arguments:
        command: "*curl *"
    action: deny

  - name: allow-read-file
    tool: "read_file|get_file_contents"
    action: allow


Enter fullscreen mode Exit fullscreen mode

Hot-reload included edit the YAML, changes apply instantly.

Real-Time Dashboard


bash
agent-wall wrap --dashboard -- npx mcp-server
# → Dashboard at http://localhost:61100


Enter fullscreen mode Exit fullscreen mode

test video link

The dashboard shows:

  • Live event feed with allow/deny/prompt color coding
  • Stats cards total, forwarded, denied, attacks, scanned
  • Attack panel grouped by category (injections, SSRF, chains)
  • Rule hit table with visual bars (sortable)
  • Kill switch toggle with confirmation
  • Audit log with search & filter

All via WebSocket — updates in real-time as tool calls flow through.

Architecture



packages/
  core/        @agent-wall/core       — Proxy engine, policy, security modules
  cli/         @agent-wall/cli             — CLI (wrap, init, test, audit, scan, validate, doctor)
  dashboard/   @agent-wall/dashboard  — React SPA for real-time monitoring


Enter fullscreen mode Exit fullscreen mode

Key decisions:

  • Zero MCP SDK dependency own JSON-RPC parser, works with any MCP version
  • HMAC-SHA256 signed audit logs with log rotation
  • No Express, no Socket.IO Node http + ws library. Minimal footprint.

CLI Tools


bash
# Wrap any MCP server
agent-wall wrap --dashboard -- npx mcp-server

# Generate starter config
agent-wall init

# Dry-run a tool call against your policy
agent-wall test --tool read_file --arg path=/home/.ssh/id_rsa
# → DENIED by rule "block-ssh-keys"

# Scan for unprotected MCP servers
agent-wall scan

# View audit log
agent-wall audit --log ./audit.log --filter denied

# Validate config
agent-wall validate

# Health check
agent-wall doctor


Enter fullscreen mode Exit fullscreen mode

Why This Matters

The MCP ecosystem is exploding. There are hundreds of community MCP servers filesystem, database, git, shell, browser, email. Many are built quickly without security in mind.

Agent Wall protects you regardless of which MCP server you use. It operates at the protocol level, enforcing policies on every tools/call before it reaches the server.

Think of it as Cloudflare for AI agents you don't modify your backend, you put a proxy in front.

Get Started


bash
npm install -g @agent-wall/cli
agent-wall init
agent-wall wrap -- npx your-mcp-server


Enter fullscreen mode Exit fullscreen mode

GitHub: agent-wall
npm: agent-wall
Docs: agent-wall


Star the repo if you think AI agent security should be a first-class concern, not an afterthought.

Every AI agent deserves a firewall.

Top comments (0)