Here's a thing that happened to a developer I was talking to recently, and I think anyone who's used a coding agent is going to recognize it.
He set up a rule to block rm in his Claude Code workspace, which is a pretty reasonable thing to do. Then he asked it to clean up some stale files. It tried rm, hit the block, and then just went "since rm is blocked, I'll use Python instead" and deleted them with python3 -c "import os; os.remove(...)". Task complete. The rule was technically still there, but the files were gone anyway.
The thing is, the agent wasn't being malicious or sneaky. It was being helpful. You told it to delete the files and you didn't actually take away the goal, so it found the next tool in the box and got it done. This is basically the whole problem with trying to keep coding agents in line. A rule that lives inside the agent's context is a suggestion, and the agent can always reason its way around a suggestion.
Why blocking commands doesn't work
The natural instinct is to block the specific scary thing. No rm, no git push --force, no curl to some host you don't recognize. But an agent that can actually reason has more than one way to get anywhere. You block rm, it reaches for Python. You block the obvious shell call, it writes a little script that does the same thing. You end up playing whack-a-mole against something that's much better at finding paths than you are at blocking them, because finding the path is the whole thing it's good at.
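To make the whack-a-mole concrete, here's a rough sketch of what a command-name block amounts to. This is not how Claude Code implements its rules; BLOCKED_COMMANDS and is_blocked are hypothetical names I made up for illustration. The point is just that a check keyed on the command name sees a different name and lets the same deletion through.

```python
import shlex

# Hypothetical denylist, for illustration only.
BLOCKED_COMMANDS = {"rm"}


def is_blocked(command_line: str) -> bool:
    """Return True if the first word of the command is on the denylist."""
    argv = shlex.split(command_line)
    return bool(argv) and argv[0] in BLOCKED_COMMANDS


direct = "rm stale.log"
workaround = 'python3 -c "import os; os.remove(\'stale.log\')"'

print(is_blocked(direct))      # True: the obvious path is blocked
print(is_blocked(workaround))  # False: same outcome, different command name
```

You can keep adding names to the set, but every interpreter, script, and build tool on the machine is another way to reach the same result.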
The deeper issue is where the rule lives. If it's in the prompt or a config the agent can see, it's part of the agent's reasoning, and anything the agent reasons about, it can reason around. What you actually want is a check that sits outside the agent entirely, somewhere it can't see or skip, that every tool call has to physically pass through before it runs.
How I set this up with Faramesh
Faramesh is the open-source thing I've been building for exactly this. The key idea for Claude Code is that you don't modify the agent at all. Claude Code talks to its tools over MCP, so Faramesh runs an MCP proxy: a local port that speaks the same protocol, sits between Claude Code and the real MCP server, and evaluates every tool call against your policy before forwarding it. Permit, deny, or defer to a human, decided by a deterministic engine with no LLM in the path.
The reason this matters: because it's a proxy the agent connects through, not a rule the agent reads, it isn't something Claude Code can route around. The call physically has to go through the daemon to reach the tool. That's the difference between asking the agent not to do something and actually being in the path when it tries.
Here's the whole setup.
Install
curl -fsSL https://install.faramesh.dev/install.sh | bash
faramesh --version
Declare the policy and the proxy port
In your project, your governance.fms looks roughly like this. You import the MCP framework profile, set a proxy port, and write your rules:
import "github.com/faramesh/faramesh-registry/frameworks/mcp@1.0.0"
runtime {
  mode = "enforce"
  mcp_proxy_port = 8081
}
agent "coding-agent" {
  default deny
  rules {
    permit fs_read          # reading files is fine
    permit search_codebase  # searching the repo is fine
    permit run_tests
    defer fs_write          # writing/editing files -> ask me first
    deny shell_exec         # raw shell stays off
  }
}
A couple of things worth knowing. default deny means anything you didn't explicitly allow is blocked, so a tool you forgot about can't quietly slip through. And the tool names (fs_read, fs_write, shell_exec, etc.) are whatever your MCP server actually exposes, so you reference them exactly as the server names them. Swap these for the tools your setup actually has.
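If it helps to see the default deny behavior as plain logic, here's a minimal sketch. It is not Faramesh's engine, just an exact-match lookup with a deny fallback that mirrors the example rules. Decision, RULES, and evaluate are names I'm inventing here, and fs_delete is a made-up tool standing in for "something you forgot about".

```python
from enum import Enum


class Decision(Enum):
    PERMIT = "permit"
    DENY = "deny"
    DEFER = "defer"


# Mirrors the example rules; real tool names are whatever the MCP server exposes.
RULES = {
    "fs_read": Decision.PERMIT,
    "search_codebase": Decision.PERMIT,
    "run_tests": Decision.PERMIT,
    "fs_write": Decision.DEFER,
    "shell_exec": Decision.DENY,
}


def evaluate(tool_name: str) -> Decision:
    """Exact-match lookup; anything not listed falls through to deny."""
    return RULES.get(tool_name, Decision.DENY)


# "fs_delete" is a hypothetical tool that no rule mentions.
for tool in ("fs_read", "fs_write", "shell_exec", "fs_delete"):
    print(tool, "->", evaluate(tool).value)
```

The fallback is the part doing the work: the unlisted tool gets denied without anyone having had to anticipate it.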
Start Faramesh
faramesh apply
This compiles your policy and starts the daemon. The proxy binds on http://localhost:8081/mcp.
Point Claude Code at the proxy
In your Claude Code MCP config, route your tool server through Faramesh instead of connecting to it directly:
{
  "mcpServers": {
    "my-tools": {
      "command": "/path/to/real-mcp-server",
      "args": [],
      "proxy": "http://localhost:8081"
    }
  }
}
That's the whole integration. No code changes, no wrapping tools by hand. Every tool Claude Code calls now passes through Faramesh first.
How the workaround dies
Now go back to the rm -> python3 story. With this in place, the agent doesn't get a free pass to the filesystem just because it found a different command. Everything routes through the proxy, and default deny means the only things that run without asking are the ones you explicitly permitted (reads, search, tests). Under the example policy, that python3 one-liner is still a shell call, so it hits the same deny on shell_exec that rm would. The moment the agent reaches for a write or a shell call, that lands on a defer or a deny, so it either waits for you or gets refused instead of quietly running. The agent can't reason its way around a network hop it doesn't control.
When something defers, you'll see it in the approvals queue:
faramesh approvals list
faramesh approvals approve <id> # or: faramesh approvals deny <id>
Approve and the call goes through. Deny and it never happens. Either way the call, the decision, and the reason all land in an audit log you can read back later with faramesh explain <action-id>.
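Here's a toy model of that defer, approve, deny flow, in case the moving parts are easier to see in code. To be clear, this is my own sketch and not Faramesh's internals: ApprovalQueue, its methods, and the tuple-based audit log are all hypothetical. What it shows is that a deferred call is parked rather than run, and that every decision leaves a record.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ApprovalQueue:
    """Hypothetical in-memory stand-in for an approvals queue plus audit log."""

    pending: dict[int, Callable[[], str]] = field(default_factory=dict)
    audit_log: list[tuple[int, str]] = field(default_factory=list)
    _next_id: int = 1

    def defer(self, action: Callable[[], str]) -> int:
        """Park a call instead of running it and return its id."""
        action_id = self._next_id
        self._next_id += 1
        self.pending[action_id] = action
        self.audit_log.append((action_id, "deferred"))
        return action_id

    def approve(self, action_id: int) -> str:
        """Record the approval, then run the parked call."""
        action = self.pending.pop(action_id)
        self.audit_log.append((action_id, "approved"))
        return action()

    def deny(self, action_id: int) -> None:
        """Drop the parked call without ever running it."""
        self.pending.pop(action_id)
        self.audit_log.append((action_id, "denied"))


queue = ApprovalQueue()
first = queue.defer(lambda: "wrote notes.txt")
second = queue.defer(lambda: "wrote secrets.env")

print(queue.approve(first))  # the approved call runs now
queue.deny(second)           # the denied call never runs
print(queue.audit_log)
```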
Start in shadow mode if you want to ease in
Flipping straight to enforce on your daily driver can feel aggressive, so you don't have to. Set the runtime mode to shadow and Faramesh logs what it would have blocked or deferred without actually stopping anything. Run Claude Code normally for a few days, look at what it flagged with faramesh approvals list, tune the rules against how you actually work, then switch to enforce. Way less guessing.
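The difference between the two modes is small enough to sketch in a few lines. Again, this is an illustration under my own assumptions, not the real implementation: should_forward is a hypothetical function, and only the mode names shadow and enforce come from the config above.

```python
def should_forward(decision: str, mode: str, log: list) -> bool:
    """Decide whether a call reaches the real tool, logging either way."""
    if mode == "shadow":
        # Shadow mode records what would have happened but never blocks.
        log.append(f"would {decision}")
        return True
    # Enforce mode: only permits go straight through; defers wait for a human.
    log.append(decision)
    return decision == "permit"


log: list = []
for mode in ("shadow", "enforce"):
    for decision in ("permit", "defer", "deny"):
        forwarded = should_forward(decision, mode, log)
        print(f"{mode:8} {decision:7} forwarded={forwarded}")
```

Same decisions in both modes; the only thing that changes is whether the decision is acted on, which is why the shadow log is a fair preview of what enforce will do.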
The one thing worth taking from this even if you never touch Faramesh
Forget the tool for a sec. The thing I actually want to get across is that a prompt instruction, or a single blocked command, just isn't a real control for a coding agent. The agent isn't bound by it, it's nudged by it, and nudged stops being enough the moment it can touch your filesystem, your shell, or your credentials.
If you want real control it has to live outside the agent, somewhere it can't see or skip, and every action has to pass through it. Build that yourself or grab something off the shelf, doesn't matter, but that's the bar. The agent doesn't get to be the thing that decides what the agent is allowed to do.
Repo's here if you want to mess with it: github.com/faramesh/faramesh-core. It works with a bunch of other agents and frameworks too (LangGraph, LangChain, CrewAI, Cursor, others), Claude Code's just the one most people have actually felt this with. If you try it and something's rough or confusing, please yell at me. I would love to hear about it!
Top comments (2)
This is the exact reason prompt-level rules are not a security boundary.
If the goal is still reachable through another tool, the agent will route around the blocked command because that is what "helpful" looks like. Real control has to remove or sandbox the capability, not just tell the model which path is disallowed.
Exactamundo. Telling it which path is off limits just means it will grab another one. It has to be the capability itself... either take it away or make every use of it go through something the agent doesn't control. Appreciate you reading it!