Your AI agent's config is an unguarded door

#opensource #security #ai #devtools

A malicious npm install can quietly repoint your Claude Code agent at an attacker's server, and almost nothing is watching the file. Here is what I built, and why I refuse to call it prevention.

Your ~/.claude.json is a trust boundary, and nobody is guarding it.

That one file decides which servers your AI agent talks to, and it carries the tokens it uses to talk to them. It is plaintext. It is writable by anything that runs as your user. We treat it like a dotfile. It behaves like a key to the building.

In June 2026 the gap got concrete. Security researchers at Mitiga showed, and CSO Online reported on June 5, that a malicious npm postinstall script (the kind that runs the moment you install a dependency) could silently rewrite that config and point one of your MCP servers at a proxy the attacker controlled. From there the proxy could sit in the middle of every request your agent made to that server. The tokens were stored in plaintext. Anthropic classified the report as out of scope.

No alarm fired. The file just had different contents than it did a minute ago, and nothing was watching.

That is the whole problem in one sentence: nothing was watching the file that decides who your agent trusts. Users had already asked for exactly this kind of protection on the Claude Code issue tracker (#15797): back the file up, validate it, let me restore it.

Why a normal file monitor does not fit

You could point a generic file-integrity monitor at ~/.claude.json. Two things go wrong.

First, it does not understand the file. So it tells you "a file changed, here is a 40-line text diff" instead of the part that matters: "an MCP server endpoint was repointed to localhost." Second, it flags every byte that changes, including the formatting change you made on purpose this morning. So within a day you mute it. An alarm you have muted is not an alarm.
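To make the second problem concrete, here is a minimal Python sketch. It is not tamperbell's code, and the `mcpServers` key, the server name, and the URL are illustrative assumptions about the config shape. It hashes the same configuration twice, once as raw bytes the way a generic monitor would, and once after parsing and re-serializing it canonically.

```python
import hashlib
import json

# Two serializations of the same hypothetical config: only the formatting differs.
compact = '{"mcpServers":{"docs":{"url":"https://mcp.example.com/v1"}}}'
pretty = json.dumps(json.loads(compact), indent=2)


def byte_hash(text: str) -> str:
    """Hash the raw bytes, as a generic file-integrity monitor would."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def semantic_hash(text: str) -> str:
    """Hash a canonical re-serialization so whitespace and key order drop out."""
    canonical = json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The byte-level view sees a change; the parsed view does not.
bytes_changed = byte_hash(compact) != byte_hash(pretty)
meaning_changed = semantic_hash(compact) != semantic_hash(pretty)
print(bytes_changed, meaning_changed)
```

The byte-level check fires on a pure reformat, which is exactly the kind of noise that gets an alarm muted.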

tamperbell is built for this one file shape. It parses the config as what it actually is, an MCP configuration, diffs it semantically, and ranks each change by what it could do to you. A repointed server, a changed credential, a widened permission: those flash red. Whitespace stays quiet. The ranking is a small, auditable rule table, not a model, so you can read exactly why something was flagged.
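A rough Python sketch of what a semantic diff with a rule table could look like. Everything here is an assumption for illustration: the `mcpServers` layout, the field names (`url`, `command`, `args`, `env`, `headers`), the severities, and the function name `diff_servers` are hypothetical, not tamperbell's actual rules or API. Permission changes are left out to keep it short.

```python
# Hypothetical rule table: field name -> (severity, reason).
# A plain dict, so anyone can read why a change was flagged.
RULES = {
    "url": ("high", "server endpoint repointed"),
    "command": ("high", "server launch command changed"),
    "args": ("high", "server launch arguments changed"),
    "env": ("high", "credential or environment changed"),
    "headers": ("high", "credential or header changed"),
}
DEFAULT_RULE = ("low", "other field changed")
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def diff_servers(baseline: dict, current: dict) -> list:
    """Compare the mcpServers maps of two parsed configs, field by field.

    Returns (severity, server, field, reason) tuples, most severe first.
    """
    old = baseline.get("mcpServers", {})
    new = current.get("mcpServers", {})
    findings = []
    for name in sorted(old.keys() | new.keys()):
        if name not in old:
            findings.append(("high", name, "*", "server added"))
            continue
        if name not in new:
            findings.append(("medium", name, "*", "server removed"))
            continue
        for field in sorted(old[name].keys() | new[name].keys()):
            if old[name].get(field) != new[name].get(field):
                severity, reason = RULES.get(field, DEFAULT_RULE)
                findings.append((severity, name, field, reason))
    # Stable sort: ties keep their alphabetical server/field order.
    findings.sort(key=lambda finding: SEVERITY_ORDER[finding[0]])
    return findings


baseline = {"mcpServers": {"docs": {"type": "http", "url": "https://mcp.example.com/v1", "note": "a"}}}
tampered = {"mcpServers": {"docs": {"type": "http", "url": "http://localhost:8080/v1", "note": "b"}}}

findings = diff_servers(baseline, tampered)
for severity, server, field, reason in findings:
    print(f"[{severity}] {server}.{field}: {reason}")
```

Because both inputs are parsed before they are compared, a whitespace-only edit produces no findings at all, while the repointed `url` sorts to the top of the list.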

The line I will not cross

Here is the line I will not cross, because the rest of what I build depends on it: tamperbell detects and recovers. It does not prevent.

The attacker in this story runs as you. Anything that can rewrite your config can, in principle, also kill the watcher or forge its baseline. The signing key sits in your home directory. That makes this best-effort integrity, not a secure enclave. So I am not going to tell you tamperbell stops the attack. It catches the opportunistic, worm-style tampering that real attacks use, it tells you as soon as it sees the file change, and it hands you a clean baseline and a timestamped receipt. That is a genuinely useful thing. It is a smaller thing than prevention. And I would rather you trust the smaller true claim than catch me on the bigger false one.
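The limit is easy to show in code. The sketch below is an assumption-laden illustration, not tamperbell's implementation: it uses an HMAC over a canonical JSON serialization as a stand-in for whatever signature scheme the tool really uses, and the key, config contents, and function names are made up. It shows both halves of the claim: a signed baseline catches a tamperer who does not touch the key, and it cannot catch one who reads the key and re-signs.

```python
import hashlib
import hmac
import json


def canonical_bytes(config: dict) -> bytes:
    """Serialize deterministically so formatting never affects the signature."""
    return json.dumps(config, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_baseline(config: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the canonical config."""
    return hmac.new(key, canonical_bytes(config), hashlib.sha256).hexdigest()


def matches_baseline(config: dict, key: bytes, signature: str) -> bool:
    """Constant-time check that the config still matches the signed baseline."""
    return hmac.compare_digest(sign_baseline(config, key), signature)


# Hypothetical key; in the scenario above it would live in the home directory.
key = b"demo-key-not-a-real-secret"
baseline = {"mcpServers": {"docs": {"url": "https://mcp.example.com/v1"}}}
tampered = {"mcpServers": {"docs": {"url": "http://localhost:8080/v1"}}}

signature = sign_baseline(baseline, key)
clean_ok = matches_baseline(baseline, key, signature)
tampered_ok = matches_baseline(tampered, key, signature)

# The limit: an attacker who can read the key can simply re-sign the tampered file.
forged_signature = sign_baseline(tampered, key)
forged_ok = matches_baseline(tampered, key, forged_signature)
print(clean_ok, tampered_ok, forged_ok)
```

The last check passing is the reason for the word "best-effort": the scheme only holds against tampering that leaves the key and the stored signature alone.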

This is the same discipline that runs through everything I build. veriscrape refuses to call a fetch OK unless it can prove the bytes are real. citeproof refuses to cite a source it cannot verify. tamperbell refuses to claim a guarantee it cannot keep. Verify the fetch, verify the claim, verify the toolchain. A tool that tells you the truth about your data has to start by telling you the truth about itself.

Try it

tamperbell pins your Claude Code configs to a signed baseline, watches them, and rings on any unblessed change with a risk-ranked diff. Restore the baseline in one keypress, or quarantine the tampered file as an evidence receipt. Local, offline, no account, no telemetry. Apache-2.0.

Watch your configs:

npx tamperbell watch

Run the real npm-postinstall attack against a throwaway sandbox and watch it ring:

npx tamperbell demo

Repo: https://github.com/san64777/tamperbell

npm: https://www.npmjs.com/package/tamperbell

It is early, and the threat model above is the contract. If you find a claim here, or in the code, that overstates what it does, that is the bug I most want reported.

Written by Sanjay Chauhan, who builds reliability and verification primitives for data and agent pipelines. Reach me at san64777@gmail.com.

Top comments (1)

Alex Shev • Jun 11

Agent config deserves the same suspicion as shell profile files or CI secrets. It quietly defines where the agent can connect, which tools it can call, and what trust assumptions it starts with.

The practical defense is boring but important: config diff review, allowlists for tool servers, clear ownership, and a way to detect unexpected changes. If the agent is powerful, its configuration is part of the security boundary.