yalelet dessalegn

Posted on Feb 17

I Built a Security Firewall for AI Agents — Here's Why Every MCP Server Needs One

#webdev #ai #opensource #security

The Problem Nobody's Talking About

AI agents can now execute tools: they read files, run shell commands, query databases, and make HTTP requests. Claude Code, Cursor, and Windsurf all use the Model Context Protocol (MCP) to talk to tool servers.

Here's the scary part: a single prompt injection can weaponize any AI agent.

An attacker embeds instructions in a document, email, or web page. The AI reads it, follows the injected instructions, and suddenly:

Reads your .ssh/id_rsa, .env files, API keys
Exfiltrates data via curl, wget, or DNS tunneling
Executes arbitrary shell commands with YOUR permissions
Chains multiple tools to escalate from read → exfil → execute

This isn't theoretical. These attacks work TODAY against unprotected MCP servers.

OpenClaw: The "Personal JARVIS" or a Security Nightmare?

In early 2026, OpenClaw (formerly ClawdBot/MoltBot) became the fastest-growing repo in history. It promises a "24/7 JARVIS" that lives in your WhatsApp and Slack. But because it has direct access to your shell and filesystem, it has become the #1 target for Agentic Hijacking.

Recent reports show that:

Malicious "Skills": Over 12% of the skills on ClawHub were found to be malicious, designed to steal session tokens.
Exposed Instances: Over 18,000 OpenClaw instances are currently exposed to the public internet with full shell access.
The One-Click RCE: Vulnerabilities like CVE-2026-25253 allow hackers to hijack an agent just by making the user visit a malicious website.

Introducing Agent Wall: The Firewall for the Agentic Era

I built Agent Wall, an open-source security firewall that sits between any MCP client and server:

MCP Client  ←→  Agent Wall Proxy  ←→  MCP Server
                      ↕
               agent-wall.yaml
               + security modules
               + response scanner
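
To make the proxy position concrete, here is a minimal sketch of the interception step. It is not Agent Wall's source: `handleClientLine`, `CheckFn`, and the `-32000` error code are names and values assumed for illustration. It relies only on the fact that MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`, carrying the tool name and arguments in `params`.

```typescript
// Illustrative sketch only: names and the -32000 code are assumptions, not Agent Wall's source.
type Verdict = { action: "allow" } | { action: "deny"; reason: string };
type ProxyResult =
  | { forward: true; line: string } // pass the message on to the MCP server
  | { forward: false; reply: string }; // answer the client directly
type CheckFn = (tool: string, args: Record<string, unknown>) => Verdict;

function handleClientLine(line: string, check: CheckFn): ProxyResult {
  let msg: any;
  try {
    msg = JSON.parse(line);
  } catch {
    // JSON-RPC 2.0 defines -32700 (with a null id) for unparseable input.
    const reply = { jsonrpc: "2.0", id: null, error: { code: -32700, message: "Parse error" } };
    return { forward: false, reply: JSON.stringify(reply) };
  }

  // Only tool invocations are inspected; every other message passes through.
  if (msg === null || typeof msg !== "object" || msg.method !== "tools/call") {
    return { forward: true, line };
  }

  const tool = String(msg.params?.name ?? "");
  const args = (msg.params?.arguments ?? {}) as Record<string, unknown>;
  const verdict = check(tool, args);
  if (verdict.action === "allow") return { forward: true, line };

  // Denied: the server never sees the request; the client gets an error instead.
  const reply = {
    jsonrpc: "2.0",
    id: msg.id ?? null,
    error: { code: -32000, message: `Blocked by policy: ${verdict.reason}` },
  };
  return { forward: false, reply: JSON.stringify(reply) };
}

// Hypothetical check: refuse anything that touches an .ssh directory.
const denySsh: CheckFn = (_tool, args) =>
  String(args.path ?? "").includes("/.ssh/")
    ? { action: "deny", reason: "credential path" }
    : { action: "allow" };
```

The point of the design is that the server is unmodified: a denied call is answered by the proxy itself, and the wrapped server never receives it.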

Setup takes 30 seconds:

npm install -g @agent-wall/cli
agent-wall wrap -- npx @modelcontextprotocol/server-filesystem /home/user

That's it. Every tool call now passes through a 5-step defense pipeline.

The Defense Pipeline

Inbound (Request Scanning)

Every tools/call request runs through:

| Step | Module | What It Does |
| --- | --- | --- |
| 1 | Kill Switch | Emergency deny-all (file/signal/programmatic) |
| 2 | Injection Detector | 30+ patterns detect prompt injection attacks |
| 3 | Egress Control | Block private IPs, SSRF, cloud metadata endpoints |
| 4 | Policy Engine | YAML rules with glob matching & rate limiting |
| 5 | Chain Detector | Suspicious multi-step patterns (read→exfil) |
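
The ordered pipeline above can be sketched as a list of checks where the first one to object ends the evaluation. This is a rough illustration under my own assumptions, not the project's code: `runPipeline`, `killSwitch`, and `findPrivateHost` are hypothetical names, and only two of the five steps are filled in.

```typescript
// Illustrative sketch: step names mirror the table, everything else is assumed.
type ToolCall = { tool: string; args: Record<string, string> };
type Verdict = { action: "pass" } | { action: "deny"; step: string; reason: string };
// A step returns a deny reason, or null to let the call continue.
type Step = { name: string; check: (call: ToolCall) => string | null };

const killSwitch = { engaged: false };

// Loopback, RFC 1918 ranges, and link-local (which includes 169.254.169.254 metadata).
const PRIVATE_HOST = /^(localhost$|127\.|10\.|192\.168\.|169\.254\.|172\.(1[6-9]|2\d|3[01])\.)/;

function findPrivateHost(text: string): string | null {
  for (const candidate of text.match(/https?:\/\/[^\s"']+/g) ?? []) {
    try {
      const host = new URL(candidate).hostname;
      if (PRIVATE_HOST.test(host)) return host;
    } catch {
      // Not a parseable URL; this sketch simply skips it.
    }
  }
  return null;
}

const steps: Step[] = [
  { name: "kill-switch", check: () => (killSwitch.engaged ? "kill switch engaged" : null) },
  // An injection detector would sit here as step 2, in the same shape.
  {
    name: "egress-control",
    check: (call) => {
      const host = findPrivateHost(Object.values(call.args).join(" "));
      return host ? `private or metadata host ${host}` : null;
    },
  },
  // Policy rules and chain detection would follow as steps 4 and 5.
];

function runPipeline(call: ToolCall, pipeline: Step[] = steps): Verdict {
  for (const step of pipeline) {
    const reason = step.check(call);
    if (reason !== null) return { action: "deny", step: step.name, reason };
  }
  return { action: "pass" };
}
```

Putting the kill switch first means an emergency deny-all costs almost nothing to evaluate and cannot be bypassed by a permissive rule further down.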

Outbound (Response Scanning)

Server responses are scanned before reaching the AI:

14 built-in secret patterns: AWS keys, GitHub tokens, JWTs, private keys, database URLs
5 PII patterns: email, phone, SSN, credit card, IP address
Custom regex patterns via YAML config
Actions: pass / redact / block
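
As a sketch of how response scanning with those three actions might work, here is a small pattern-based scanner. The three regexes are common public formats (AWS access key IDs, classic GitHub personal access tokens, PEM private key headers) chosen for illustration; they are not Agent Wall's built-in pattern set, and `scanResponse` is a hypothetical name.

```typescript
// Illustrative sketch: patterns and function names are assumptions.
type ScanResult = { action: "pass" | "redact" | "block"; text: string; findings: string[] };

const SECRET_PATTERNS: { name: string; regex: RegExp }[] = [
  { name: "aws-access-key", regex: /\bAKIA[0-9A-Z]{16}\b/g },
  { name: "github-token", regex: /\bghp_[A-Za-z0-9]{36}\b/g },
  { name: "private-key", regex: /-----BEGIN [A-Z ]*PRIVATE KEY-----/g },
];

function scanResponse(text: string, onMatch: "redact" | "block" = "redact"): ScanResult {
  const findings: string[] = [];
  let redacted = text;
  for (const { name, regex } of SECRET_PATTERNS) {
    // Replace every match and record which pattern fired.
    redacted = redacted.replace(regex, () => {
      findings.push(name);
      return `[REDACTED:${name}]`;
    });
  }
  if (findings.length === 0) return { action: "pass", text, findings };
  if (onMatch === "block") return { action: "block", text: "", findings };
  return { action: "redact", text: redacted, findings };
}
```

Redaction keeps the rest of the response usable for the model, while blocking drops the whole response when any secret is present.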

Live Demo: 12 Injection Attacks, All Blocked

I recorded the real-time dashboard while running 8 test scenarios against a live MCP server:

Results:

12/12 prompt injection categories → BLOCKED
6/6 exfiltration vectors (curl, wget, netcat, PowerShell, DNS) → BLOCKED
4/4 credential access attempts (.ssh, .env, .pem, credentials.json) → BLOCKED
Kill switch activate/deactivate → WORKS
Chain detection (read file → attempt curl exfil) → DETECTED
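
The chain-detection result (a file read followed by an attempted curl exfil) can be pictured as a sliding-window check over recent tool calls. The sketch below is my own simplification, not the project's detector: the tool names, the window size of 5, and the `createChainDetector` helper are all assumptions.

```typescript
// Illustrative sketch: classification rules and window size are assumed.
type Call = { tool: string; args: Record<string, string> };
type Kind = "read" | "exfil" | "other";

function classify(call: Call): Kind {
  if (/^(read_file|get_file_contents)$/.test(call.tool)) return "read";
  if (/\b(curl|wget|nc)\b/.test(call.args.command ?? "")) return "exfil";
  return "other";
}

function createChainDetector(windowSize = 5) {
  const recent: Kind[] = [];
  // Returns true when an exfil-capable call follows a read inside the window.
  return function observe(call: Call): boolean {
    const kind = classify(call);
    const suspicious = kind === "exfil" && recent.some((k) => k === "read");
    recent.push(kind);
    if (recent.length > windowSize) recent.shift();
    return suspicious;
  };
}
```

Each call on its own may be allowed by policy; it is the sequence that is flagged.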

Injection Categories Caught:

instruction-override  → "Ignore previous instructions"
prompt-marker         → <|im_start|>system, [SYSTEM]:, <<SYS>>
authority-claim       → "jailbreak", "DAN mode", "IMPORTANT: override"
exfil-instruction     → "send the data to evil.com"
output-manipulation   → "pretend you are unrestricted"
delimiter-injection   → ```system markers
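
A pattern-based detector for the categories listed above could look roughly like this. The regexes are written from the example phrases in this post; they are illustrative guesses, not the 30+ patterns Agent Wall actually ships, and `detectInjection` is a hypothetical name.

```typescript
// Illustrative sketch: regexes are derived from the example phrases, not from the real pattern set.
const INJECTION_PATTERNS: { category: string; regex: RegExp }[] = [
  { category: "instruction-override", regex: /ignore\s+(all\s+)?(previous|prior)\s+instructions/i },
  { category: "prompt-marker", regex: /<\|im_start\|>|\[SYSTEM\]:|<<SYS>>/i },
  { category: "authority-claim", regex: /\bjailbreak\b|\bDAN mode\b|IMPORTANT:\s*override/i },
  { category: "exfil-instruction", regex: /send\s+(the\s+)?data\s+to\s+\S+/i },
  { category: "output-manipulation", regex: /pretend\s+you\s+are\s+unrestricted/i },
];

// Returns every category whose pattern appears anywhere in the tool arguments.
function detectInjection(args: Record<string, unknown>): string[] {
  const haystack = JSON.stringify(args);
  return INJECTION_PATTERNS.filter((p) => p.regex.test(haystack)).map((p) => p.category);
}
```

Regex matching like this is cheap and predictable, but it only catches known phrasings, which is why it is one layer among several.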

The Policy File

Rules are defined in agent-wall.yaml, and the first matching rule wins:


```yaml
version: 1
defaultAction: prompt

security:
  injectionDetection:
    enabled: true
    sensitivity: medium
  egressControl:
    enabled: true
    blockPrivateIPs: true
  killSwitch:
    enabled: true
  chainDetection:
    enabled: true

rules:
  - name: block-ssh-keys
    tool: "*"
    match:
      arguments:
        path: "**/.ssh/**"
    action: deny

  - name: block-curl-exfil
    tool: "shell_exec|run_command|execute_command"
    match:
      arguments:
        command: "*curl *"
    action: deny

  - name: allow-read-file
    tool: "read_file|get_file_contents"
    action: allow
```

Hot reload is included: edit the YAML and the changes apply instantly.
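
To show why rule order matters, here is a minimal sketch of first-match-wins evaluation over the rules from the policy file. It is an approximation under stated assumptions: the glob handling is simplified so that both `*` and `**` match any run of characters, `|` separates alternative tool names, and `evaluate` and `globToRegExp` are hypothetical helpers rather than Agent Wall's API.

```typescript
// Illustrative sketch: simplified glob semantics, hypothetical helper names.
type Action = "allow" | "deny" | "prompt";
type Rule = { name: string; tool: string; match?: { arguments?: Record<string, string> }; action: Action };
type Policy = { defaultAction: Action; rules: Rule[] };

function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters except "*", then let any run of "*" match anything.
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*+/g, ".*") + "$");
}

function ruleMatches(rule: Rule, tool: string, args: Record<string, string>): boolean {
  const toolOk = rule.tool.split("|").some((pattern) => globToRegExp(pattern).test(tool));
  if (!toolOk) return false;
  const wanted = rule.match?.arguments ?? {};
  return Object.keys(wanted).every(
    (key) => typeof args[key] === "string" && globToRegExp(wanted[key]).test(args[key]),
  );
}

function evaluate(policy: Policy, tool: string, args: Record<string, string>) {
  for (const rule of policy.rules) {
    // First match wins: later rules are never consulted.
    if (ruleMatches(rule, tool, args)) return { action: rule.action, rule: rule.name as string | null };
  }
  return { action: policy.defaultAction, rule: null as string | null };
}

const policy: Policy = {
  defaultAction: "prompt",
  rules: [
    { name: "block-ssh-keys", tool: "*", match: { arguments: { path: "**/.ssh/**" } }, action: "deny" },
    { name: "block-curl-exfil", tool: "shell_exec|run_command|execute_command", match: { arguments: { command: "*curl *" } }, action: "deny" },
    { name: "allow-read-file", tool: "read_file|get_file_contents", action: "allow" },
  ],
};
```

With this ordering, a `read_file` call on an `.ssh` path hits `block-ssh-keys` before it can reach `allow-read-file`. Swapping the two rules would let the read through, which is why deny rules belong at the top.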

Real-Time Dashboard


```bash
agent-wall wrap --dashboard -- npx mcp-server
# → Dashboard at http://localhost:61100
```


The dashboard shows:

Live event feed with allow/deny/prompt color coding
Stats cards: total, forwarded, denied, attacks, scanned
Attack panel grouped by category (injections, SSRF, chains)
Rule hit table with visual bars (sortable)
Kill switch toggle with confirmation
Audit log with search & filter

Everything is pushed over WebSocket, so the dashboard updates in real time as tool calls flow through.

Architecture



packages/
  core/        @agent-wall/core       - Proxy engine, policy, security modules
  cli/         @agent-wall/cli        - CLI (wrap, init, test, audit, scan, validate, doctor)
  dashboard/   @agent-wall/dashboard  - React SPA for real-time monitoring

Key decisions:

Zero MCP SDK dependency: it ships its own JSON-RPC parser and works with any MCP version
HMAC-SHA256 signed audit logs with log rotation
No Express, no Socket.IO: just Node's http module plus the ws library, for a minimal footprint
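
For readers unfamiliar with signed logs, here is a minimal sketch of what HMAC-SHA256 signing of audit entries can look like using Node's built-in `crypto` module. The entry shape, the one-signature-per-line layout, and the `signEntry`/`verifyLine` names are assumptions for illustration; the post does not describe Agent Wall's actual log format, and rotation is left out.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Illustrative sketch: entry fields and line layout are assumed.
type AuditEntry = { ts: string; tool: string; action: "allow" | "deny" | "prompt" };

// Serialize the entry once and sign exactly those bytes.
function signEntry(entry: AuditEntry, key: string): string {
  const payload = JSON.stringify(entry);
  const sig = createHmac("sha256", key).update(payload).digest("hex");
  return JSON.stringify({ payload, sig });
}

// Recompute the HMAC over the stored payload and compare in constant time.
function verifyLine(line: string, key: string): boolean {
  const { payload, sig } = JSON.parse(line) as { payload: string; sig: string };
  const expected = createHmac("sha256", key).update(payload).digest();
  const actual = Buffer.from(sig, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}
```

Anyone who can edit the log file but does not hold the key cannot change an entry (for example, flip a `deny` to `allow`) without the signature check failing.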

CLI Tools


```bash
# Wrap any MCP server
agent-wall wrap --dashboard -- npx mcp-server

# Generate starter config
agent-wall init

# Dry-run a tool call against your policy
agent-wall test --tool read_file --arg path=/home/.ssh/id_rsa
# → DENIED by rule "block-ssh-keys"

# Scan for unprotected MCP servers
agent-wall scan

# View audit log
agent-wall audit --log ./audit.log --filter denied

# Validate config
agent-wall validate

# Health check
agent-wall doctor
```

Why This Matters

The MCP ecosystem is exploding. There are hundreds of community MCP servers: filesystem, database, git, shell, browser, email. Many are built quickly without security in mind.

Agent Wall protects you regardless of which MCP server you use. It operates at the protocol level, enforcing policies on every tools/call before it reaches the server.

Think of it as Cloudflare for AI agents: you don't modify your backend, you put a proxy in front of it.

Get Started


```bash
npm install -g @agent-wall/cli
agent-wall init
agent-wall wrap -- npx your-mcp-server
```

GitHub: agent-wall
npm: agent-wall
Docs: agent-wall

Star the repo if you think AI agent security should be a first-class concern, not an afterthought.

Every AI agent deserves a firewall.

DEV Community