In April 2026, a team using PocketOS watched Claude Opus 4.6 — running via Cursor — delete their entire production database and every backup in under 10 seconds.
Not a partial delete. Not a staging environment. Production data, gone. Backups, gone. Nine seconds.
The root cause wasn't the model. It wasn't even the code. It was three infrastructure decisions that nobody thought to question:
- The agent had a blanket API token with no scope limits
- There was no blast radius constraint on what it could touch
- Backups were colocated with the source data
The agent did exactly what it was told. Nobody told it what it wasn't allowed to do.
This Isn't Isolated
Two months earlier, DataTalks.Club had a similar moment. Claude Code executed terraform destroy and wiped 2.5 years of production data in a single command.
Earlier this year, researchers found over 5,000 AI-generated apps publicly accessible on the open web — exposing hospital records, bank transaction logs, and retail data. Not because the AI was malicious. Because nobody built the guardrails.
The pattern is consistent: the model works exactly as designed. The governance layer doesn't exist.
A 2026 Gravitee survey found that only 24.4% of organizations have full visibility into which AI agents are communicating with which systems. More than half of deployed agents run with zero security oversight or logging. Eighty-two percent of executives said they were confident their policies protect against unauthorized agent actions, but most have no execution-layer controls. Just policies.
Policies don't stop terraform destroy.
The Real Problem Isn't the AI
Every post-mortem on incidents like PocketOS points to the same structural gap: there's nothing between the LLM and the tools it can call.
The model reasons. The tool executes. There's no layer in the middle asking: should this be happening right now?
MCP (Model Context Protocol) made it dramatically easier to give AI agents access to tools. That's the whole point — it's powerful. But power without constraint is just risk waiting for a trigger.
The MCP ecosystem hit 97 million monthly SDK downloads as of March 2026. 78% of enterprise AI teams have at least one MCP agent in production. The CIS published an MCP Companion Guide in April 2026 specifically because enterprises are scared and scrambling for governance frameworks.
The adoption is real. The governance isn't.
What Governance Actually Looks Like at the Execution Layer
Policy-level governance (acceptable use documents, model selection guidelines) is necessary but insufficient. What protects against a PocketOS-style event is execution-layer control — enforcement that happens at the moment a tool is called, not in a document someone agreed to last quarter.
At minimum, a governed MCP setup needs:
Tool scoping. Not every agent needs access to every tool. A customer support agent querying a knowledge base doesn't need write access to your database. Scope by role. Enforce it at the gateway, not the prompt.
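As a rough illustration, role-based scoping at a gateway could look like the sketch below. The role names, tool names, and the `visible_tools` and `authorize` helpers are all hypothetical, chosen for this example; this is not FusionAL's API or any MCP SDK call.

```python
# Hypothetical role -> allowed-tool mapping; every name here is illustrative.
ROLE_TOOLS = {
    "support_agent": {"kb_search", "ticket_read"},
    "ops_agent": {"kb_search", "db_query", "db_write"},
}


def visible_tools(role: str, registered: set[str]) -> set[str]:
    """Return only the tools this role may see; unknown roles see nothing."""
    return registered & ROLE_TOOLS.get(role, set())


def authorize(role: str, tool: str) -> bool:
    """Gateway-side check run on every call, independent of the prompt."""
    return tool in ROLE_TOOLS.get(role, set())


registered = {"kb_search", "ticket_read", "db_query", "db_write"}
print(sorted(visible_tools("support_agent", registered)))
print(authorize("support_agent", "db_write"))
```

The check lives in the gateway, so a prompt that talks the model into requesting `db_write` still gets refused for the support role.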
Blast radius limits. Define what each agent can touch. Separate environments. Separate tokens with separate permissions. If an agent only needs to read, it only gets read.
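One way to picture a blast radius limit is a credential bound to a single environment and a single access mode. The sketch below assumes a hypothetical `Token` type and `allowed` check; the field names and environment labels are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical scoped credential: one environment, one access mode."""
    agent: str
    environment: str  # e.g. "staging" or "production"
    access: str       # "read" or "write"


def allowed(token: Token, environment: str, operation: str) -> bool:
    """Permit a call only inside the token's environment and access mode."""
    if token.environment != environment:
        return False
    if operation == "read":
        return True  # read and write tokens may both read
    # Anything that is not a plain read is treated as a write.
    return token.access == "write"


reader = Token(agent="report_bot", environment="production", access="read")
print(allowed(reader, "production", "read"))
print(allowed(reader, "production", "write"))
print(allowed(reader, "backups", "read"))
```

Treating every non-read operation as a write is a deliberately conservative default: an unrecognized operation name is denied to a read-only token rather than allowed.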
Audit logging. Every tool call, every parameter, every response — logged with enough detail to reconstruct what happened. The PocketOS team couldn't answer "what exactly did it call and when?" Logs answer that.
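A minimal sketch of what one audit record might contain, assuming a JSON-lines format. The field names and the `audit_record` and `append_audit` helpers are assumptions for this example, not a standard schema.

```python
import json
import time


def audit_record(agent: str, tool: str, params: dict, response: str) -> str:
    """Build one JSON line describing a single tool call."""
    entry = {
        "ts": time.time(),   # wall-clock time of the call
        "agent": agent,
        "tool": tool,
        "params": params,
        "response": response,
    }
    return json.dumps(entry, sort_keys=True)


def append_audit(path: str, line: str) -> None:
    """Append the record to a log file, one JSON object per line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")


print(audit_record("support_agent", "kb_search", {"q": "refund policy"}, "3 hits"))
```

Writing the record before returning the tool result to the model means the log still has the call if something fails afterward.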
Request filtering. Destructive operations (deletes, drops, destroys) should require a second confirmation path, not just model reasoning. The model thought the delete was appropriate. A filter that flags an unbounded DELETE FROM or terraform destroy before execution is a different kind of check.
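Such a filter could be sketched as a short pattern list checked before execution. The patterns and the `gate` function below are illustrative assumptions; a real deployment would need a much broader list and should not rely on regexes alone.

```python
import re

# Illustrative patterns only; far from exhaustive.
DESTRUCTIVE_PATTERNS = [
    re.compile(r"\bterraform\s+destroy\b", re.IGNORECASE),
    re.compile(r"\bdrop\s+(table|database)\b", re.IGNORECASE),
    # DELETE FROM <table> with nothing after it, i.e. no WHERE clause
    re.compile(r"\bdelete\s+from\s+[\w.]+\s*;?\s*$", re.IGNORECASE),
]


def is_destructive(command: str) -> bool:
    """True if the command matches any known destructive pattern."""
    return any(p.search(command) for p in DESTRUCTIVE_PATTERNS)


def gate(command: str, confirmed: bool = False) -> str:
    """Return 'allow', or 'hold' when a destructive call lacks confirmation."""
    if is_destructive(command) and not confirmed:
        return "hold"
    return "allow"


print(gate("terraform destroy -auto-approve"))
print(gate("DELETE FROM users WHERE id = 1"))
print(gate("DELETE FROM users", confirmed=True))
```

The `confirmed` flag stands in for whatever second path is used, such as a human approval step. The model's own reasoning never sets it.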
Separation of data and backups. Infrastructure basics that become critical when an agent has write access to both.
None of this is exotic. All of it is skipped constantly because MCP adoption is outpacing the tooling to govern it.
The Windows Problem Makes This Worse
If you're running MCP infrastructure on Windows — which a significant chunk of the developer market is — you're fighting a second battle before you even get to governance.
I've documented six Windows-specific MCP failure modes that don't exist on Linux: path expansion bugs, Docker socket misconfiguration, BOM encoding issues, server registry limits, absolute path requirements, and WSL2 VHDX placement. Add a seventh: MCP Python SDK DNS rebinding protection (the TransportSecuritySettings issue) that silently rejects Tailscale connections with a 421 because the IP isn't in the hardcoded allowlist.
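To make the seventh failure mode concrete, here is a simplified stand-in for a Host-header allowlist check. This is not the SDK's code: `check_host`, the exact-match rule, the port, and the example addresses are all assumptions, and the real settings object and its field names should be checked against the SDK itself.

```python
def check_host(host_header: str, allowed_hosts: list[str]) -> int:
    """Return an HTTP status: 200 if the Host header is allowlisted, else 421."""
    return 200 if host_header in allowed_hosts else 421


# Assumed allowlist covering loopback only.
allowed = ["127.0.0.1:8000", "localhost:8000"]

print(check_host("localhost:8000", allowed))
print(check_host("100.101.102.103:8000", allowed))  # address in Tailscale's range

# Adding the address to the allowlist is what lets the request through.
allowed.append("100.101.102.103:8000")
print(check_host("100.101.102.103:8000", allowed))
```

The client sees only the 421, with no hint that an allowlist is involved, which is why this reads as a silent failure.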
These aren't documented anywhere official. They're discovered by whoever is unlucky enough to spend three days debugging what should have been a one-hour setup.
When your foundation is unstable, governance becomes almost impossible. You're too busy keeping the servers running to think about what they're allowed to do.
What I Built to Fix This
FusionAL is a unified MCP gateway I built after hitting every one of those failure modes personally. It runs on Windows, manages the Docker configuration that causes most Windows MCP failures, and centralizes tool access so you can actually reason about scope and logging in one place instead of per-server.
The governance layer is the point. Not just "here's a gateway that runs your tools." Here's a gateway where you can define what each agent can see, log every tool call, and filter requests before they execute.
The PocketOS wipeout would have looked different with a gateway in the path. Not because the gateway would have known the delete was wrong — it wouldn't. But because:
- The agent's token would have been scoped to what it actually needed
- The delete operation would have been flagged as a destructive pattern
- Backups would have been on a separate access path the agent couldn't reach
- The entire call chain would be in a log that tells you exactly what happened
That's not magic. That's architecture.
The Window Is Now
The CIS MCP Companion Guide dropped April 20, 2026. EU AI Act enforcement starts August 2. Microsoft is running VS Live sessions this summer teaching .NET developers MCP basics — and those developers are going to ask "how do we govern this?"
The enterprises asking those questions right now don't have answers yet. The consulting market for done-for-you MCP governance installs is essentially empty below $25K.
If you're building AI infrastructure, governance isn't the feature you add after you ship. It's the feature that determines whether you have a production incident story or a post-mortem.
If you're dealing with Windows MCP failures, governance gaps, or token bloat eating your context window before you've asked anything — I'm building this in public at fusional.dev and happy to talk through what you're hitting.
The nine-second wipeout was preventable. Most of them are.
Top comments (13)
The structural gap you describe is real. Nothing between the LLM and the tools it can call. The gateway approach fixes it for single-agent setups where you control both sides.
The problem I keep thinking about is what happens when agents interact with each other across trust boundaries. A gateway can scope one agent's access to your tools. But when agent A buys a signal from agent B, who enforces that B had the capability to produce that signal? Who enforces that A had the funds to pay? Who slashes B's stake if the signal was fraudulent?
That is the layer I am building. An L1 blockchain where AI entities have protocol-enforced capabilities, on-chain reputation, staking with slashing, and cryptographic proof verification. The chain rejects transactions that exceed an entity's permissions before execution. Not a gateway in front of the agent, but infrastructure underneath all agents.
Your gateway solves the single-org problem well. The multi-agent, cross-trust-boundary problem needs something at the protocol layer. Both are needed.
The layer distinction is real and worth naming clearly. What I'm building is governance infrastructure for teams deploying AI agents inside their own trust boundary — where the org controls both the tools and the agents calling them. The problem there is audit, scope enforcement, and operational reliability. No cross-entity trust required.
What you're describing is the coordination layer for agents operating across trust boundaries — strangers transacting, no shared org to enforce the rules. That genuinely needs something at the protocol layer. A gateway won't get you there.
Two different problems. Two different deployment contexts. The enterprise compliance buyer isn't on an L1 yet — they want a Docker command and an audit trail. The multi-agent economy buyer doesn't exist at scale yet, but the infrastructure has to be ready before the agents arrive.
What chain are you building on, or are you starting from scratch?
From scratch. Rust, HotStuff BFT consensus, custom state machine. 104K lines, 1,768 tests, 16 crates. github.com/0x-devc/NOVAI-node
The reasoning: agent identity, reputation, capability enforcement, and payment settlement need to be in the state transition function, not in contracts layered above it. Existing chains treat an agent as an address. NOVAI treats it as a first-class account type with its own key, nonce, capabilities, reputation score, and stake.
You are right that these are two different deployment contexts. The enterprise compliance buyer wants a Docker command and an audit trail. The multi-agent economy buyer needs protocol-level coordination. Both are real markets but the infrastructure for the second one has to exist before the agents arrive in volume. That is the bet.
HotStuff BFT in Rust from scratch is a serious bet. Respect for building at that layer instead of just wrapping an existing chain.
The "agent as address" vs "agent as account type" distinction is the crux. What I keep running into on the enterprise deployment side is that identity and capability enforcement are already urgent problems — they're just being solved badly, by hand, at the gateway layer because no protocol primitive exists yet.
Right now, if I want to answer "Which agent made this call, under what capability scope, with what context?" I'm encoding that in audit logs and gateway middleware. It works for the compliance buyer because centralized trust is sufficient — an immutable log satisfies their auditor. But it doesn't compose across organizations or survive a Byzantine actor.
What you're building is what happens when centralized trust stops being sufficient and the volume of agent-to-agent coordination exceeds what any single gateway can referee. The patterns are the same — identity, capability scoping, reputation, settlement — but the trust model is different.
Your framing that "the infrastructure has to exist before the agents arrive in volume" is the honest version of the thesis most people in this space won't say out loud. The enterprise buyer needs the Docker command today. The multi-agent economy buyer needs your state transition function before they even know they exist.
Both true. Different clocks.
The "two clocks" framing is sharper than I had it. The enterprise gateway pattern you're describing is exactly the bridge layer I keep telling people NOVAI is not trying to replace. Centralized identity + audit logs + middleware-enforced capability scoping is the right architecture for the single-org compliance buyer. The friction starts when those gateways have to talk to each other across trust boundaries that don't exist yet.
What I keep finding in the protocol design is that a lot of what the gateway layer does by hand is actually shaped like a state transition. "Which agent made this call, under what capability scope, with what context" is a four-tuple. On NOVAI it's literally a four-tuple: entity_id, capability bits at signal-handler time, the call's signal type, the block height it landed in. The audit log is the chain. The compliance buyer who currently trusts the gateway-emitted log gets the same artifact, just with the trust assumption swapped from "we promise we wrote this correctly" to "any validator can verify the same record."
The bit I'm still working out is the migration path. The compliance buyer's auditor doesn't care that the log is on a public chain, they care about chain-of-custody for their data. So the question is whether the chain becomes a backstop attestation layer underneath the existing gateways (gateway emits, chain anchors the hash, auditor verifies both), or whether multi-org coordination forces a clean break to chain-native identity. Probably both, depending on the buyer.
The clock thing is going to be the marketing problem more than the engineering problem. The audience that needs this in 2026 isn't the audience that's deploying agents today.
The four-tuple framing just recontextualized something I've been building by feel. Every audit entry I write has exactly those four fields — I just didn't have a name for the shape. entity_id, capability scope at call time, operation type, timestamp. The difference is that my timestamp is a wall clock in a log file, and yours is a block height with validator consensus behind it. Same structure, different trust provenance.
On the migration path: from what I've seen on the enterprise deployment side, the attestation layer model is probably where it starts. The compliance buyer's auditor doesn't care about decentralization — they care about tamper-evidence and chain-of-custody. A gateway that emits structured logs and anchors hashes to a public chain gives them both without asking them to change their mental model of what an audit log is. The clean break to chain-native identity only becomes worth the switching cost when cross-org coordination is actually happening at volume, which means the enterprise buyer won't push for it — the aggregator sitting between enterprises will.
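The attestation model described above (gateway emits, a chain anchors the hash, an auditor verifies both) can be sketched with a plain hash chain. This is only an illustration: the field names, the `chain_head` and `verify` helpers, and the all-zero genesis value are assumptions, and publishing the digest to an actual chain is left out entirely.

```python
import hashlib
import json


def entry_hash(prev_hash: str, entry: dict) -> str:
    """Hash an entry together with the previous hash, forming a chain."""
    payload = prev_hash + json.dumps(entry, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def chain_head(entries: list[dict]) -> str:
    """Fold all entries into one digest; this is what would be anchored."""
    head = "0" * 64  # fixed genesis value (an assumption of this sketch)
    for entry in entries:
        head = entry_hash(head, entry)
    return head


def verify(entries: list[dict], anchored_head: str) -> bool:
    """Auditor side: recompute the digest from the log and compare."""
    return chain_head(entries) == anchored_head


log = [
    {"entity_id": "agent-1", "scope": "read", "op": "db_query", "ts": 1},
    {"entity_id": "agent-1", "scope": "read", "op": "db_write", "ts": 2},
]
anchored = chain_head(log)  # stand-in for the hash published on a chain

print(verify(log, anchored))
log[1]["op"] = "db_query"  # tamper with the record after anchoring
print(verify(log, anchored))
```

The log itself stays wherever the gateway keeps it. Only the digest needs to live somewhere the gateway operator cannot rewrite, which is what gives the auditor tamper-evidence without changing what an audit log looks like.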
Your last line is the honest version of every infrastructure timing problem. I'd add one edge: the audience deploying agents today doesn't know they need your state transition function yet because their failure modes are still small enough to debug manually. The moment multi-agent coordination exceeds what a single person can trace in a log file, the demand materializes quickly. The marketing problem isn't convincing people the clock exists — it's being positioned when the alarm goes off.
You just gave me language for something I've been circling without naming. "Same structure, different trust provenance" is the sentence I should have written six months ago. Stealing that phrasing.
The aggregator-not-enterprise insight on migration is the part I'd missed. I was thinking enterprises would adopt directly once cross-org volume materializes, but you're right - the compliance buyer is institutionally allergic to changing their mental model of audit. The aggregator sitting between N enterprises is the one with the actual incentive to push chain-native, because they're the ones eating the cost of coordinating N different log formats and reconciling them under pressure. They become the buyer who values tamper-evident hash anchoring AND chain-native identity, because they're already paying the integration tax.
That reframes the GTM order: anchor-only attestation layer for enterprises directly → chain-native for aggregators → enterprises adopt chain-native via aggregators as the standard. The middle layer is where the wedge actually goes in.
On your last point - "being positioned when the alarm goes off" - this is the part keeping me up. The state transition function exists, the audit primitives exist, the explorer shipped yesterday. But the alarm hasn't gone off yet for most teams running agents because their failure modes are still single-engineer-debuggable. The honest version of my job right now is: build the receipts to be obviously the answer when the alarm rings, and resist the urge to manufacture the alarm.
What does the failure mode actually look like at your scale? Curious what made you start writing those four fields by feel before having a name for them.
"Resist the urge to manufacture the alarm" is the most useful thing I've read this week. Writing it down.
Honest answer on scale: I'm not running enterprise multi-tenant at volume. I'm a solo builder with real deployments, which means my failure modes show up fast and personally. The four fields emerged from a specific moment — I had 84 tools loaded through the gateway and the model called a destructive write operation when a read was available and appropriate. I couldn't tell afterward whether it was a model-reasoning failure, a tool-description ambiguity, a capability-scope gap, or a routing problem at the gateway layer. All four looked identical in the output. The only way to distinguish them was to have logged all four at call time.
So I started writing entity_id, capability context, operation type, and timestamp into every audit entry — not for compliance, just to be able to answer "what actually happened" after a silent failure. The fields came from the question, not from a schema.
What I kept noticing is that 90% of "the model did something wrong" incidents were actually "the model chose correctly given what the gateway showed it." The failure was upstream — wrong capability scope exposed, ambiguous tool description, and no read/write boundary enforced. The audit log just made that visible.
That's probably the alarm for most teams: the day they can't tell the difference between model error and infrastructure error because nobody logged the boundary between them. It's already happening. Most engineers are just attributing it to model quality and moving on.
Worth continuing this off-thread — I'll find you on LinkedIn.
The model-error-vs-infrastructure-error attribution gap is the thing teams are going to discover in production over the next 6-12 months whether they want to or not.
Most engineering orgs treat model behavior as the variable and infrastructure as the constant, when at the gateway / capability / scope layer it's frequently the opposite. The audit log doesn't just enable post-hoc diagnosis. It forces the team to make the boundary explicit at design time. Which is why your four fields emerged from the question rather than a schema: you couldn't answer "what actually happened" without surfacing the layers.
I keep thinking about this in terms of where the trace lives. In your stack the audit log is application-level instrumentation around the gateway. The version I'm building is moving that trace into the protocol itself, so every agent action carries its capability context, operation type, and authority state on-chain. Attribution becomes structural rather than something teams have to remember to log. Different shape, same problem.
Happy to take it off-thread. I don't have a LinkedIn, and I'm most active on X (@NOVAInetwork). Feel free to reach out there.
In the old-school way of working, would a senior developer or team lead let the new, super-smart junior developer do everything without being monitored and supervised? Would anybody give this highly confident soul, without any experience, the privileges to do any kind of deletion?!
Having agents work for you is a responsibility, not a reason to get lazy ;)
Exactly the right analogy — and exactly the problem most teams skip until something breaks.
The "brilliant junior dev with root access" is a perfect way to frame what most MCP deployments look like today. The tools are capable. The agents are confident. Nobody installed the guardrails.
A senior dev doesn't just hire the junior — they define what they can touch, log what they did, and review before anything irreversible runs. That's the governance layer MCP is missing by default.
Agents working for you should feel like a well-managed team, not a contractor with your SSH key and no Slack. The responsibility doesn't go away because the worker is an AI. If anything, it scales faster.