Ted Murray

Your AI Agent Has Your API Keys (And So Does Every Other Agent)

Open your Claude Code settings.json. Look at the env blocks under your MCP servers. Every API key, every database token, every webhook URL you've put there — your agent has all of them, right now, in its process environment.

That might sound obvious. You configured it that way. But think about what it actually means.

You've got an MCP server for file operations and one for notifications. The notification server needs a webhook URL. The file server doesn't. But Claude Code doesn't scope credentials to individual servers — it loads the full environment and passes it to the session. Your agent has the webhook URL even if it never sends a notification. It has database tokens for backends it never queries. It holds the Grafana service account token whether or not it ever touches a dashboard.

This is fine if you trust the agent completely and nothing ever goes wrong. But "nothing ever goes wrong" is a strange assumption to build on. A hallucinated tool call, a prompt injection in a tool response, a confused agent that decides to "help" by writing to a backend it shouldn't know about — the blast radius isn't one credential. It's every credential you've configured across every MCP server.

And that's with a single agent. Add more and it gets worse.


It gets worse at scale

I was designing the build layer of homelab-agent — a platform where Claude Code agents run durable, multi-phase infrastructure builds. The design called for agent pools: multiple instances of the same agent type running in parallel, each working on a different build phase.

The single-agent credential problem multiplied immediately. Every agent in the pool holds every credential. But new problems appeared too:

Tool visibility. A read-only research agent sees write tools for infrastructure backends it has no business touching. Every agent carries the full tool surface, including everything that can cause damage if called incorrectly.

Resource collisions. No boundaries between agent workspaces. Agent A can read files Agent B wrote. Two agents running in parallel can overwrite each other's working data.

Audit fragmentation. Tool calls are scattered across logs from a dozen server processes, if they're logged at all. Reconstructing what a specific agent did is manual work.

Token overhead. Every agent session loads tool schemas from every configured MCP server. With 12 servers contributing their full tool lists, you're burning 15–30K tokens per session before the agent does anything. At 20 concurrent agents, that's 300–600K tokens of pure initialization overhead — just so each agent can be told about tools it'll never use.

I looked at what existed. Aggregation gateways combine servers but don't scope anything. Access control proxies filter which tools an agent can call, but filtering a tool doesn't prevent Agent A from reading Agent B's files through the tools it is allowed to use. Enterprise gateways solve governance at scale, but they assume cloud deployment and a team — not a single operator running a homelab.

Nothing combined all four: tool filtering + resource scoping + credential isolation + unified audit logging.


Building the fix with the thing it fixes

I asked Claude what a proper tool management framework for multi-agent setups should look like. It immediately understood the scope of the problem and what solving it completely would require.

That conversation became scoped-mcp.

Here's the part that still feels slightly recursive: I built it using the same multi-agent pattern it's designed to protect. A research agent evaluated the problem space — existing MCP gateways, scoping patterns, credential isolation approaches. A dev agent implemented the code. Each agent ran with scoped access to only the resources it needed for its role.

The tool was built by agents operating under the exact constraints it enforces.


How it works

One scoped-mcp process per agent, started at session time. The agent connects to it over stdio the same way it connects to any MCP server.

Agent process (AGENT_ID=build-01, AGENT_TYPE=build)
    │
    ▼
┌────────────────────────────────────────┐
│  scoped-mcp                            │
│                                        │
│  ① Load manifest for AGENT_TYPE        │
│  ② Register only the allowed modules   │
│  ③ Inject credentials into modules     │
│  ④ Every tool call:                    │
│     → enforce resource scope           │
│     → execute tool logic               │
│     → write audit log entry            │
└────────────────────────────────────────┘
    │           │           │
    ▼           ▼           ▼
 filesystem   sqlite      ntfy
 (scoped)    (scoped)   (scoped)

Manifests declare what an agent type is allowed to do. A YAML file per agent role. Nothing outside the manifest loads — tools that aren't listed don't exist from the agent's perspective.

# manifests/research-agent.yml
agent_type: research
description: "Read-only research agent"

modules:
  filesystem:
    mode: read
    config:
      base_path: /data/agents

  sqlite:
    mode: read
    config:
      db_dir: /data/sqlite

  ntfy:
    config:
      topic: "research-{agent_id}"
      max_priority: high

credentials:
  source: env

Set mode: read and only read tools register. The agent can't call write_file or execute because those tools were never mounted. It's not access control layered on top — the tools literally don't exist in the agent's session.
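
A minimal sketch of that registration rule (the function and dict shapes here are illustrative; the real framework wires modes through tool decorators):

```python
# Hypothetical sketch of mode-based registration: tools tagged "write"
# are simply never mounted when the manifest says mode: read.
MODE_RANK = {"read": 0, "write": 1}

def mount_tools(tools: dict[str, str], manifest_mode: str) -> list[str]:
    """tools maps tool name -> the mode that tool requires."""
    allowed = MODE_RANK[manifest_mode]
    return sorted(name for name, mode in tools.items() if MODE_RANK[mode] <= allowed)
```

With `manifest_mode="read"`, a module exposing `{"read_file": "read", "write_file": "write"}` mounts only `read_file`; there is no denied-tool error path because the write tool was never registered.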

Compare what two different manifests produce:

research-agent.yml          →   4 tools registered
  filesystem: read              read_file, list_dir
  sqlite: read                  query
  ntfy                          ntfy_send

build-agent.yml             →   8 tools registered
  filesystem: write             read_file, list_dir, write_file, delete_file
  sqlite: write                 query, execute
  ntfy                          ntfy_send
  slack_webhook                 slack_send

Same framework, same codebase — completely different tool surfaces. The research agent has no way to write files, execute SQL, or post to Slack. Those capabilities don't exist in its session.

Resource scoping is automatic. The filesystem module applies PrefixScope — every path resolves under agents/{agent_id}/. Path traversal attacks (../) are caught by resolving to absolute paths before comparing. Symlink escapes are caught by walking each component and checking whether any symlink target resolves outside the agent root.
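
The traversal check can be sketched with stdlib `pathlib` (the function name is hypothetical; the real PrefixScope additionally walks each symlink component):

```python
from pathlib import Path

def resolve_scoped(base: Path, agent_id: str, user_path: str) -> Path:
    """Hypothetical sketch: confine every path under agents/{agent_id}/."""
    root = (base / "agents" / agent_id).resolve()
    # resolve() collapses ../ segments and follows symlinks BEFORE the
    # comparison, so an escape attempt lands outside root and is rejected
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"path escapes agent scope: {user_path}")
    return candidate
```

Because `resolve()` follows symlinks, a link pointing outside the agent root fails the same check as a literal `../`, and an absolute path like `/etc/passwd` is rejected for the same reason.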

The SQLite module gives each agent its own database file at {db_dir}/agent_{agent_id}.db. Two agents can't read or write each other's data regardless of what SQL they construct. The module also parses SQL with sqlglot to block PRAGMA, ATTACH, DETACH, DROP, and multi-statement batches.
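
A rough sketch of both halves, using a crude keyword-and-semicolon check as a stand-in for the actual sqlglot parse (function names are illustrative, and the heuristic would misfire on string literals containing `;` — the real parser does not):

```python
import re
import sqlite3

# Stand-in for the sqlglot-based validator: block dangerous statement
# types by leading keyword, and reject multi-statement batches.
BLOCKED_STMT = re.compile(r"^\s*(PRAGMA|ATTACH|DETACH|DROP)\b", re.IGNORECASE)

def open_agent_db(db_dir: str, agent_id: str) -> sqlite3.Connection:
    # one database file per agent: {db_dir}/agent_{agent_id}.db
    return sqlite3.connect(f"{db_dir}/agent_{agent_id}.db")

def check_sql(sql: str) -> str:
    if BLOCKED_STMT.match(sql):
        raise ValueError("blocked statement type")
    # any ; left after stripping a trailing one means a second statement
    if ";" in sql.rstrip().rstrip(";"):
        raise ValueError("multi-statement batches are blocked")
    return sql
```

The per-agent file is the isolation guarantee; the statement validation is defense in depth on top of it.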

Credential injection happens at the proxy layer. API keys, tokens, webhook URLs — loaded once by the scoped-mcp process from environment variables or a secrets file. Modules receive credentials through their context. The agent process never sees them. If you try to read INFLUXDB_TOKEN from the agent's environment, it won't be there.
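
The injection side is conceptually tiny. A hedged sketch of the `source: env` case (names illustrative; the real loader also supports a secrets file):

```python
import os

def load_credentials(required: list[str]) -> dict[str, str]:
    # The proxy reads secrets from its OWN environment at startup; they are
    # handed to modules in-process and never exported into the agent's env.
    missing = [name for name in required if name not in os.environ]
    if missing:
        raise RuntimeError(f"missing credentials: {', '.join(missing)}")
    return {name: os.environ[name] for name in required}
```

The point is where this runs: in the scoped-mcp process, after the agent process has already been spawned with a clean environment.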

Audit logging produces two structured JSONL streams: one for what agents did (every tool call, every scope check), one for what the server did (startup, config, errors). Credentials are automatically redacted — any key ending in _TOKEN, _PASSWORD, _SECRET, or _KEY is replaced with <redacted> before it hits the log.
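
The redaction pass can be sketched as a recursive walk over each log entry (illustrative; the real logger applies this before serializing):

```python
SENSITIVE_SUFFIXES = ("_TOKEN", "_PASSWORD", "_SECRET", "_KEY")

def redact(entry: dict) -> dict:
    # Mask any value whose key ends in a sensitive suffix, recursing into
    # nested dicts so credentials buried in tool arguments are caught too.
    redacted = {}
    for key, value in entry.items():
        if key.upper().endswith(SENSITIVE_SUFFIXES):
            redacted[key] = "<redacted>"
        elif isinstance(value, dict):
            redacted[key] = redact(value)
        else:
            redacted[key] = value
    return redacted
```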


Seeing it work

Three infrastructure modules, one agent, one workflow:

┌─ ops-agent (AGENT_ID=ops-01) ─────────────────────────────────────┐
│                                                                    │
│  1. influxdb_query(bucket="metrics",                              │
│       filters=[{"field": "_measurement",                          │
│                 "op": "==", "value": "docker_cpu"}])              │
│     → discovers container X averaging 94% CPU                     │
│                                                                    │
│  2. grafana_create_dashboard(                                      │
│       title="Container Health",                                   │
│       panels=[{"title": "CPU by Container", ...}])                │
│     → dashboard created in folder agent-ops-01/                   │
│                                                                    │
│  3. ntfy_send(title="High CPU: container X",                      │
│       message="Averaging 94% over last hour.")                    │
│     → operator receives push notification                         │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘

The agent queried metrics in the buckets it was allowed to see. It built a Grafana dashboard in its own folder — it can't touch operator dashboards or other agents' dashboards. It sent an alert through the ntfy topic assigned in its manifest.

At no point did it hold an InfluxDB token, see a Grafana API key, or know the ntfy server URL. A second agent running in parallel has a completely separate scope. They can't collide.


The audit that proved it needed to exist

I ran a security audit against v0.1.0 the same day it shipped. 18 findings. One critical, three high, eight medium, six low.

The critical finding: SQLite isolation was broken. The original design used schema-level scoping in a shared database file. An agent could issue an unqualified table reference and resolve against another agent's schema. The fix was simple and total — give each agent its own database file. No shared state, no schema tricks.

The high findings included:

  • Flux injection in InfluxDB — raw Flux query strings accepted from agents. Replaced with structured {field, op, value} filter dicts, validated and escaped before rendering.
  • SSRF gaps in the HTTP proxy — the blocklist missed IPv6-mapped IPv4, link-local, CGNAT, and NAT64 ranges. DNS rebinding attacks could bypass the allowlist between init and invocation. Fixed with per-request re-resolution.
  • A decorator that lied — the @audited wrapper was documented as enforcing scope but never actually called enforce(). The fix was honest: remove the false claim, make the contract explicit — modules are responsible for calling enforce() in every tool method.
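
The SSRF fix in the second finding can be sketched with the stdlib: re-resolve the hostname on every request and compare against blocked ranges, unwrapping IPv6-mapped IPv4 so both rule sets apply (the ranges and function name here are illustrative, not the project's exact list):

```python
import ipaddress
import socket

BLOCKED = [ipaddress.ip_network(n) for n in (
    "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
    "169.254.0.0/16",   # link-local
    "100.64.0.0/10",    # CGNAT
    "::1/128",
    "::ffff:0:0/96",    # IPv6-mapped IPv4
    "64:ff9b::/96",     # NAT64
)]

def resolve_and_check(host: str) -> str:
    # re-resolve on EVERY request so DNS rebinding can't swap the target
    # between an allowlist check at init time and the actual connection
    addr = socket.getaddrinfo(host, None)[0][4][0]
    ip = ipaddress.ip_address(addr)
    mapped = ip.ipv4_mapped if isinstance(ip, ipaddress.IPv6Address) else None
    for net in BLOCKED:
        if ip in net or (mapped is not None and mapped in net):
            raise PermissionError(f"{host} resolves to blocked address {ip}")
    return addr
```

Connecting to the returned address (rather than re-resolving the hostname again in the HTTP client) is what actually closes the rebinding window.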

All 18 findings were remediated and v0.2.0 shipped the same day. v0.2.1 and v0.3.0 audits came back clean.

I publish the full audit history in docs/security-audit.md. Not because it makes the project look polished — it doesn't. It makes it look honest. When a tool's core value is security, showing the receipts matters more than showing a clean record.


What ships with it

Ten built-in modules:

Storage: filesystem (read, write, list, delete within a scoped directory tree), sqlite (per-agent database with SQL validation)

Notifications: ntfy, smtp, matrix, slack_webhook, discord_webhook (write-only by design — agents send alerts, they never see webhook URLs or SMTP passwords)

Infrastructure: http_proxy (allowlisted outbound HTTP with SSRF prevention), grafana (dashboard CRUD scoped to an agent-owned folder), influxdb (time-series query/write restricted to an allowlisted bucket set)

Writing a custom module is about 20 lines:

from scoped_mcp.modules._base import ToolModule, tool
from scoped_mcp.scoping import NamespaceScope

class RedisModule(ToolModule):
    name = "redis"
    scoping = NamespaceScope()
    required_credentials = ["REDIS_URL"]

    def __init__(self, agent_ctx, credentials, config):
        super().__init__(agent_ctx, credentials, config)
        import redis.asyncio as aioredis
        self._redis = aioredis.from_url(credentials["REDIS_URL"])

    @tool(mode="read")
    async def get_key(self, key: str) -> str | None:
        """Get a value (scoped to agent namespace)."""
        scoped_key = self.scoping.apply(key, self.agent_ctx)
        return await self._redis.get(scoped_key)

    @tool(mode="write")
    async def set_key(self, key: str, value: str, ttl: int = 0) -> bool:
        """Set a key-value pair (scoped to agent namespace)."""
        scoped_key = self.scoping.apply(key, self.agent_ctx)
        return await self._redis.set(scoped_key, value, ex=ttl or None)

Add it to a manifest with mode: read and only get_key registers. set_key doesn't exist from the agent's perspective.


Try it

pip install scoped-mcp

Point Claude Code at it:

{
  "mcpServers": {
    "tools": {
      "command": "scoped-mcp",
      "args": ["--manifest", "manifests/research-agent.yml"],
      "env": {
        "AGENT_ID": "research-01",
        "AGENT_TYPE": "research"
      }
    }
  }
}

The repo includes a 5-minute isolation verification walkthrough — you can confirm filesystem scoping and credential non-exposure without reading a line of source code.

github.com/TadMSTR/scoped-mcp — MIT licensed, Python 3.11+, on PyPI.

If you're running a single Claude Code session, you probably don't need this yet. If you're running multiple agents with defined roles and they're all sharing the same tool surface — the access problem is already there. You just might not have looked at it yet.


Read the full series:

Part 1: I Built an Agentic Infrastructure Platform in 42 Days — the origin story

Part 2: I Built an AI Memory System Because My Brain Needed It First — the memory deep dive

Part 3: How to Give Claude Code a Memory — the practical how-to

Part 4: I'm Designing a Platform I Can't Build Alone — cognitive augmentation

Part 5: What Actually Survived: A Memory System Retrospective — 10 weeks in production
