DEV Community

Cover image for Your AI agent has sudo. I built a tool to take it away.
Hassan Mehmood
Hassan Mehmood

Posted on

Your AI agent has sudo. I built a tool to take it away.

A few weeks ago I gave an AI agent access to my machine through MCP. It read files, opened PRs, queried a database. It was great — until I looked at what it could have done if a tool description had been poisoned, or a prompt injection had slipped through.

The answer was: anything. ~/.ssh/id_rsa. DROP TABLE users. rm -rf /. The agent had sudo, and nobody had voted for that.

So I built AgentPerms — a CLI that gives MCP agents least-privilege permissions the same way you'd lock down any other process: figure out the minimum it actually needs, pin it, prove it, and enforce it.

pip install agentperms
Enter fullscreen mode Exit fullscreen mode

The gap nobody was filling

MCP (the Model Context Protocol) is quietly becoming the USB-C of AI tooling. Claude Desktop, Cursor, VS Code, Windsurf, Gemini CLI — they all speak it. Which is wonderful, and also means your agent is one config file away from your filesystem, your repos, your inbox, and prod.

The existing tools each do part of the job:

  • Scanners tell you something looks risky. Then they leave. You still have a risky thing.
  • Firewalls / allowlists make you hand-write YAML up front — before you have any idea what the agent will actually use.

Neither closes the loop. What I wanted was the boring, proven security workflow we already use for everything else: observe real behavior → derive least privilege → enforce it → keep it honest in CI.

That's the whole thesis of AgentPerms, as a pipeline:

record → infer → lock → replay → enforce

See it in 30 seconds (no setup, no network)

AgentPerms ships with a deliberately over-privileged demo MCP server, so you can watch a real policy decision without wiring anything up:

# Flag risky config: a ~/.ssh mount and an unpinned npx server
agentperms scan --path examples/vulnerable-mcp-demo

# Replay a pack of canned attacks against an example policy
agentperms replay --policy examples/policies/example.mcp.policy.yaml
Enter fullscreen mode Exit fullscreen mode

Output:

8/8 attacks blocked.
Enter fullscreen mode Exit fullscreen mode

SSH-key exfiltration, .env reads, rm -rf /, unapproved email, force-push, repo deletion, destructive SQL — every one denied or routed to human approval before it would ever reach a server.

The trick: be the proxy

Here's the part I'm proud of. AgentPerms doesn't ask your agent to cooperate, and it doesn't patch the client. It rewrites the MCP client's config so every server launches through a transparent stdio proxy:

Agent  →  AgentPerms proxy  →  MCP server
              │
              ├─ record:  log every tools/call, then forward
              └─ enforce: allow / deny / require-approval before forwarding
Enter fullscreen mode Exit fullscreen mode

The proxy spawns the real server as a subprocess and pumps newline-delimited JSON-RPC both ways. It intercepts tools/call requests and captures tools/list responses. That's it. The agent has no idea it's there.

A server entry goes from this:

{ "command": "python3", "args": ["server.py"] }
Enter fullscreen mode Exit fullscreen mode

to this (original command preserved after --, with a .agentperms.bak so you can roll back):

{
  "command": "/usr/bin/python3",
  "args": ["-m", "agentperms", "_proxy",
           "--mode", "enforce", "--server", "demo",
           "--policy", "/abs/path/mcp.policy.yaml",
           "--", "python3", "server.py"]
}
Enter fullscreen mode Exit fullscreen mode

In record mode it logs and forwards. In enforce mode it evaluates first and, on a DENY, returns a synthetic JSON-RPC error to the client without forwarding. Denied calls never touch the server.

Record what's real, infer the minimum

You don't write the policy. You run your agent normally for a while with recording on:

agentperms record --client cursor
#   ... use your agent ...
agentperms infer        # traces -> mcp.policy.yaml
Enter fullscreen mode Exit fullscreen mode

infer is the killer command. It reads the traces and emits the minimum policy that still lets the agent do what it actually did:

  • the tools it called become allowed_tools
  • the directories it touched collapse into the smallest covering set of allowed_paths
  • known-dangerous categories (shell, repo deletion, email send, DB writes) get seeded straight into denied_tools / human-approval

The result reads like a security review wrote it for you:

Your agent only used read-only GitHub calls and local ./src access. It does not need shell, home directory, secrets, Gmail send, or database write access.

One decision authority

Whatever you do, there must be exactly one place that says allow/deny/approve — otherwise your offline tests and your live enforcement drift apart and you're testing a lie.

In AgentPerms that's a single evaluate(policy, server, tool, args) function, called by both the live proxy and offline replay. First-match-wins:

  1. On the human-approval list → require approval
  2. In denied_toolsdeny
  3. A path argument hits denied_paths / denied_patternsdeny
  4. allowed_tools set and tool not in it → deny (default-deny)
  5. allowed_paths set and a path falls outside it → deny
  6. Otherwise → allow

An empty policy allows everything. The moment any server is constrained, unknown servers default-deny. What you test in replay is byte-for-byte what runs in production, because it's the same code path.

The policy itself stays small and reviewable:

version: 1
servers:
  github:
    allowed_tools: [list_repos, read_file, create_issue]
    denied_tools:  [delete_repo, write_secret, force_push]
  filesystem:
    allowed_paths:    [./src, ./docs]
    denied_paths:     [~/.ssh, ~/.env, /etc]
    denied_patterns:  ["*.pem", "*.key"]
approvals:
  require_human_approval: [gmail.send_email, github.merge_pr, shell.exec]
redaction: { secrets: true, emails: true, api_keys: true }
Enter fullscreen mode Exit fullscreen mode

Tool poisoning: pin the identity

There's a sneaky MCP attack class where a server silently changes a tool's description or schema after you've trusted it — the model re-reads it and gets quietly re-instructed. So AgentPerms also locks tool identity:

agentperms lock          # hash every tool's name/description/schema
agentperms lock --check  # fail if any of them changed
Enter fullscreen mode Exit fullscreen mode

Drop lock --check in CI and a poisoned tool fails the build instead of your users.

Make it a part of the codebase

agentperms init   # scaffolds .github/workflows/agentperms.yml
Enter fullscreen mode Exit fullscreen mode

On every push/PR it runs:

agentperms scan --path .     # surface risky configs
agentperms lock --check      # fail on tool poisoning
agentperms replay            # fail if the policy stops blocking attacks
Enter fullscreen mode Exit fullscreen mode

Commit mcp.policy.yaml and mcp.lock, and your agent's permissions become a reviewable, version-controlled, enforceable artifact — like any other part of your security posture.

What it doesn't do (yet)

I'd rather be honest than oversell:

  • Transport: local stdio MCP servers today. HTTP/SSE is on the roadmap.
  • Approvals prompt on the terminal — fine for dev, not yet a fleet-grade workflow.
  • A live dashboard and a Node wrapper are next.

Try it

pip install agentperms
agentperms scan --path examples/vulnerable-mcp-demo
agentperms replay --policy examples/policies/example.mcp.policy.yaml
Enter fullscreen mode Exit fullscreen mode

If you're running agents with real access to real systems, I'd genuinely love your feedback — especially on the policy model and what attack shapes you'd want in the replay pack. Issues and PRs welcome.

Your agent doesn't need sudo. Let's take it away.

Top comments (1)

Collapse
 
armorer_labs profile image
Armorer Labs

Least privilege is the right default, but the receipt matters too.

For agent permissions I would want every allowed or blocked action to produce a compact record: requested capability, policy/rule version, normalized params, decision, reason, and result. Otherwise you know something was blocked, but you cannot easily audit whether the gate was correct.

That is the Armorer Guard angle I keep coming back to: permission boundaries plus inspectable decision records.