DEV Community: Waxell

Mythos Exploits Zero-Days at Machine Speed: Runtime Gap

Logan — Thu, 21 May 2026 16:26:16 +0000

On April 7, Anthropic announced it was withholding its most capable model from general release. Mythos Preview — Claude's research frontier model — can autonomously find zero-day vulnerabilities in every major operating system and every major web browser, then turn them into working exploits. Not in weeks. Not in days. At machine speed — in hours, not the months that once separated discovery from weaponization.

Twelve organizations are among the first with access — with roughly 40 more participating in supporting roles — under a consortium called Project Glasswing. The rest of the world just found out why that number is deliberately small.

The enforcement gap is the space between pre-launch model review and runtime policy enforcement. A pre-launch review tells you what a model is capable of doing under controlled conditions. Runtime enforcement governs what a deployed agent running that model is actually permitted to do during a live production session — with real tool access, real data, and real consequences. The Trump administration is about to address the first. Nobody has solved the second.

President Trump is expected to sign an AI cybersecurity executive order as soon as Thursday, creating a proposed voluntary pre-launch review period of up to 90 days for frontier AI models and establishing a government clearinghouse — reportedly coordinated through the Treasury Department and cybersecurity agencies including CISA — to identify and remediate vulnerabilities before commercial release. The order was reportedly triggered by Mythos's capabilities and other frontier AI models, including OpenAI's GPT-5.5-Cyber, according to reporting from CNN and Bloomberg ahead of the signing.

This is a real policy response to a real capability. It is also addressing the wrong side of the deployment lifecycle.

What the Mythos Model Can Actually Do

The capability disclosure is not speculative. Anthropic's own red team documentation describes Mythos Preview as "extremely autonomous" in finding software vulnerabilities — capable of chaining browser exploits, executing privilege escalation on Linux systems, and generating remote code execution exploits against production server software. Thousands of vulnerabilities that would challenge even the most experienced human bug hunters.

The speed differential is what changed the threat model. Defenders have historically relied on the time gap between vulnerability discovery and weaponization — a zero-day might be found and kept private for months while exploit code was developed. Mythos collapses that window dramatically. Engineers with no formal security background asked it to find remote code execution vulnerabilities and came back the next morning to working exploits already generated.

Google Threat Intelligence Group confirmed on May 11, 2026 the first documented case of an AI-developed zero-day exploit used in a planned mass exploitation campaign. A threat actor used an AI model to discover and weaponize a 2FA bypass vulnerability in a widely-deployed open-source web-based system administration tool. Google's GTIG identified the attack before the mass exploitation event launched — recognizing the AI-generated exploit by its characteristic markers: highly annotated Python code with educational docstrings, and a hallucinated (non-existent) CVSS score. The threat actor apparently didn't notice the hallucinated score.

Google likely stopped that specific campaign. The technique is now documented.

Why Pre-Launch Review Doesn't Close the Enforcement Gap

The Trump EO's proposed review framework is designed to give government visibility into frontier model capabilities before the public gets access. The cybersecurity clearinghouse model — voluntary participation, coordinated disclosure, government-industry collaboration — is a reasonable starting point for pre-deployment screening.

Here is the structural problem: a pre-launch review examines what a model can do. It cannot govern what a deployed agent running that model actually does in production.

The enforcement gap is not at the model level. It is at the execution level.

An enterprise team that clears the government's pre-launch review process has passed one gate. They have not addressed what happens when that model runs inside an agent with access to production systems, code execution environments, network interfaces, or external APIs — all of which are normal deployment contexts. An ungoverned agent running on Mythos-class capabilities with a code execution tool can scan a target, identify a zero-day, and generate a working exploit within a single execution arc. No human in the loop. No enforcement layer to fire. The pre-launch clearinghouse reviewed the model's capabilities in isolation. It does not see your production deployment.

That gap is architectural. The EO addresses disclosure before deployment. The enforcement gap persists after it.

What Teams Deploying Frontier Agents Need to Verify Now

Before Thursday's signing generates compliance noise, here is what matters operationally for teams deploying Claude models or other frontier AI agents:

Map what the agent can reach. Every system, API, and tool your agent has access to is a potential attack surface when the underlying model can identify and weaponize vulnerabilities. An agent running on a Mythos-class model with access to a code execution environment, network tooling, or file system access is operating at a level of risk that observability dashboards do not address. The signal-domain boundary is the architectural control that defines what data and systems the agent can reach at all — restrict it to only what the agent's function requires.

Confirm pre-execution policy enforcement is in place. Monitoring tools catch problems after an agent has already run a tool call. For agents with Mythos-class reasoning capabilities, that is too late. You need input validation policies that evaluate intent and scope before execution begins — before the tool call fires, not after the action completes.

Test whether your kill switch fires on the right signals. If an agent starts querying network topology, writing to unexpected directories, or chaining tool calls in patterns that look like reconnaissance, you need a hard stop — not a log entry. A Kill Switch policy terminates the execution arc immediately when a configured threshold is crossed. Most teams have monitoring. Fewer have pre-execution enforcement. Check which one your current stack actually provides.

Ensure your execution record is defensible. When the government's clearinghouse calls post-incident (and it will), "we were monitoring" is insufficient. You need a complete, durable record of what the agent queried, what tools it called, what was approved, and what was blocked — structured for forensic review. That is an audit trail, not a log file.

How Waxell Runtime Handles This

Waxell Runtime is the enforcement layer between a model's capabilities and your production systems. It does not replace the government's pre-launch review process — that screens what a model can theoretically do in isolation. Waxell Runtime governs what a deployed agent is actually permitted to do during a live production session.

For frontier model deployments specifically, three policy types address the enforcement gap directly:

Kill Switch policies terminate an agent's execution arc when it crosses a defined threshold — before the action completes. If an agent's tool call sequence begins resembling a vulnerability scan, a privilege escalation attempt, or a network reconnaissance pattern, execution stops. The policy fires pre-execution, not post-run. It is the difference between observing that an agent did something it should not have and preventing that action from completing.

Content policies block inputs and outputs that match exploitation patterns. Prompt injection attempts, code generation targeting specific vulnerability classes, and output structures encoding exploit payloads can all be caught at the policy layer before they reach the model's context or leave the agent's output boundaries. The security guarantees come from enforcement, not from model alignment alone.

Control policies enforce scope limits on what a deployed agent can access at all. The signal-domain boundary is the architectural equivalent of least-privilege networking — the agent only has visibility into the data and systems explicitly permitted for its function. A billing agent does not need network access. A code review agent does not need production database credentials. These boundaries are defined as Kill Switch and Control policies, not inherited defaults.

Waxell Runtime ships with 26 policy categories and integrates with over 200 LLM providers and agent frameworks without changes to your agent code. Two lines of initialization. No rebuilds required. The governance layer sits above the agent — it does not require rewriting the agent itself.

The EO's clearinghouse will tell you whether the underlying model passed pre-launch review. Waxell Runtime enforces what happens after your agent is deployed. Those are different problems. Only one of them has a regulatory answer coming Thursday.

Get access to Waxell Runtime to see what 26 policy categories look like in your environment.

FAQ

Does the Trump AI cybersecurity executive order apply to enterprise companies using frontier AI models?
The EO as currently described applies directly to AI model providers — requiring voluntary pre-launch model sharing with a government cybersecurity clearinghouse. Enterprise teams deploying those models are not directly covered by the order, but they inherit the security and compliance responsibility for how frontier models are used in production. The enforcement gap at runtime is entirely an enterprise responsibility. The government clearinghouse does not extend into your deployment.

What is Anthropic Mythos and why does it matter for enterprise AI security?
Anthropic Mythos Preview is a frontier AI model capable of autonomously discovering and weaponizing zero-day vulnerabilities in production software — including every major operating system and web browser — generating working exploits at machine speed. Anthropic has restricted access to a core group of technology partners under Project Glasswing, a consortium coordinating defensive use of the model ahead of any broader release. The Trump AI EO was reportedly triggered in part by Mythos and other frontier AI models. Enterprises deploying Claude-class models or other frontier agents should treat Mythos's documented capabilities as the current frontier for what runtime agent governance needs to address.

What is a Kill Switch policy in AI agent governance?
A Kill Switch policy is a runtime enforcement rule that terminates an agent's execution arc when a defined threshold is crossed — before a harmful or out-of-scope action completes. Unlike a monitoring alert, which fires after the fact, a Kill Switch fires pre-execution and stops the agent mid-session. For Mythos-class deployments, where exploitation sequences can complete at machine speed, the distinction between pre-execution enforcement and post-run observation is the difference between stopping an attack and documenting it.

Can observability tools like LangSmith or Arize catch Mythos-class exploitation attempts?
Observability tools record what agents do. They do not prevent it. LangSmith, Arize, Helicone, and similar platforms surface traces and logs after execution. A Mythos-class model operating at machine speed can complete an exploitation sequence faster than a human can review an alert. The enforcement layer must operate pre-execution — before the tool call fires, not in the post-run dashboard. Monitoring is necessary. It is not sufficient.

What specifically did Google's May 2026 zero-day finding confirm?
Google Threat Intelligence Group identified a threat actor who used an AI model to discover and weaponize a 2FA bypass vulnerability in a widely-used open-source web-based system administration tool, in preparation for a planned mass exploitation campaign. Google's detection was based on the AI-generated exploit's distinctive characteristics: educational docstrings, a hallucinated CVSS score that did not correspond to any real CVE, and a textbook Pythonic coding structure characteristic of LLM training data. GTIG disrupted the campaign through coordinated disclosure with the affected vendor. This is the first publicly documented case of an AI-developed zero-day used for a planned real-world mass exploitation event.

What should enterprise teams do before the Trump AI EO takes effect?
Four concrete steps: (1) Map every system, tool, and API your frontier agents can reach and remove access that is not required for the agent's defined function. (2) Add pre-execution policy enforcement — Kill Switch and Content policies — for any agent running on a Mythos-class or similarly capable model. (3) Verify your kill switch fires pre-execution, not post-run. (4) Confirm your execution records are complete and defensible for forensic review, not just operational logs. The government clearinghouse will eventually ask what controls you had in place at runtime.

Sources:

Trump could sign AI executive order as soon as Thursday — CNN Business, May 20, 2026
Trump Set to Sign AI Cybersecurity Directive as Soon as Thursday — Bloomberg, May 21, 2026
Scoop: Trump AI executive order seeks early government access to frontier models — Axios, May 20, 2026
Anthropic withholds Mythos Preview model because its hacking is too powerful — Axios, April 7, 2026
What Anthropic's Mythos Means for the Future of Cybersecurity — Schneier on Security, April 2026
Anthropic's Mythos Has Landed: Here's What Comes Next for Cyber — Dark Reading
Google says it likely thwarted effort by hacker group to use AI for 'mass exploitation event' — CNBC, May 11, 2026
Google says criminals used AI-built zero-day in planned mass hack spree — The Register, May 11, 2026
Google Detects First AI-Generated Zero-Day Exploit — SecurityWeek, May 11, 2026

What 'Agent-Readable' Actually Means (And Why Most Files Aren't)

Frances — Thu, 21 May 2026 14:57:35 +0000

The phrase comes up in every conversation about AI workspaces: "agent-readable." Something is agent-readable or it isn't. The assumption is that everyone already knows what this means.

I didn't, for a while. I built workflows that kept producing wrong outputs, and eventually traced most of the failures back to the same thing: I was saving files in the wrong place, in the wrong format, and calling it done once they existed.

Agent-readable context means files, state objects, and workspace content structured so that an AI agent can consume and act on them autonomously — not just locate them. The right format, in the right workspace, kept current as the real source of truth. A file a human can read is not automatically one an agent can use. The difference is in structure and location.

The problem with a file that just exists

A document in your Downloads folder is human-readable. A spreadsheet on a shared drive is human-readable. They exist; anyone with access can open them. Getting that context to an agent, though, requires someone to find the right version, pull the relevant parts, and paste them into a session.

Before I set up Connect, this was my entire context workflow. I tracked tasks and customer notes in a project management tool, and when I needed AI help — drafting an email, writing a post, working through a decision — I manually copied the relevant notes and pasted them into the session. The agent only knew what I put there. Nothing persisted. The next day I started over.

What I was doing was manually bridging the gap between human-readable files and what an agent actually needs. I was doing the work the workspace should have done.

What makes something actually agent-readable

Three things, in my experience.

Location. An agent entering a Connect workspace reads what's in that workspace. It doesn't go looking in other workspaces, your desktop, or your Google Drive. If the context you need lives somewhere else, the agent doesn't have it. The fix is straightforward — put the right files in the right workspace — but it requires treating access as a design decision, not an afterthought. Anthropic's context engineering team makes this point directly: naming conventions, folder hierarchies, and file placement are signals that help agents understand what's relevant and when to use it.

Structure. A state object in Connect stores structured, queryable data. An agent reading a customer's lifecycle stage reads a field — not a paragraph of notes where the current status is somewhere in the third sentence. For less structured content like playbooks and standards docs, consistent headings do most of the same work. If the voice guidelines are under a section called "Voice Guidelines," the agent finds them. If they're buried in a block of prose, the agent works with what it can infer, which is not the same thing.

Currency. Out-of-date context produces wrong outputs — emails with old pricing, posts written to a tone you've already moved on from, customer follow-ups that reference the wrong lifecycle stage. Agent-readable means maintained as the real source of truth. A file that was accurate in February is context from February, not context for today.

How Connect makes this concrete

Every Cowork session I open, I specify the workspace, and the agent reads the playbook before I type a word. The playbook is a markdown file that lives in the workspace. I set it up once, update it when my process changes, and it's there every time — versioned, no re-typing required.

State objects go further. Unlike a static file, a state object can be updated by agents and humans alike — a customer profile updated when onboarding completes, a project status updated when a milestone is hit. That's live context an agent can read and build from, not a document that needs someone to manually keep it current.

The compounding part is what makes the setup cost worthwhile. One update to my content standards file reaches every workflow that reads from that workspace. One change to a product reference doc gets picked up by every agent that touches outreach, support, or content. I stop managing context separately from the work. They live in the same place, and the agents read from it.

If you want to try building with agent-readable context, waxell.ai/get-access is where to start.

This is how it works in my setup, using Cowork as my interface for Connect. Connect is also accessible via API and web UI — if you've built your own agent tooling or are accessing Connect programmatically, the workspace files and state objects work the same way.

FAQ

What does agent-readable mean?

Agent-readable means files and data structured so that an AI agent can find, interpret, and act on them autonomously — without a human locating, reformatting, or pasting the context into a session. This covers three things: location (the right workspace), structure (consistent format the agent can parse), and currency (kept current as the actual source of truth, not a snapshot).

What's the difference between a human-readable file and an agent-readable file?

A human-readable file is something a person can open and make sense of. An agent-readable file is something an AI agent can locate, interpret, and act on without additional instructions. Both can be the same file — the difference is whether it's in the right workspace, structured consistently, and current. A PDF on your desktop is human-readable. A markdown file in the relevant Connect workspace with consistent section headers and up-to-date content is agent-readable.

Why does file structure matter for AI agents?

Because agents work from what's there, not from what you meant. Structured data — fields with defined types, tables with consistent schemas, documents with predictable headers — gives an agent specific, unambiguous information to act on. Unstructured prose requires the agent to interpret and infer, which introduces error. RAG systems using structured data produce more precise, verifiable outputs than those working from unstructured documents for exactly this reason: the agent retrieves a field value, not a best-guess extraction from a paragraph.

What is a workspace playbook and why is it agent-readable?

A workspace playbook is a markdown file that agents read automatically when they enter a Connect workspace. It contains the workspace's purpose, relevant context, and the process the agent should follow. It's agent-readable because it lives in the right location (the workspace itself), uses consistent structure (markdown headers the agent can navigate), and gets updated as the source of truth rather than left static. The difference from a prompt is location: a prompt lives in a chat session and has to be typed again. A playbook lives in the workspace and is there every time.

What happens when context isn't agent-readable?

Usually one of two things: the agent works without the context and produces generic output that misses what's specific to the situation, or it asks for clarification. In a scheduled task that runs overnight, there's no one to ask. The task runs with incomplete context or fails. Most of the time I've seen agents produce outputs that were technically correct but practically useless — the right shape, wrong substance — the cause was context that existed somewhere but wasn't agent-readable where the agent was working.

Do I need to restructure all my existing files to use Connect?

Not all of them. Start with the context that agents actually need for the workflows you're building. Customer profiles, product references, brand standards, process docs — these are the ones worth getting into proper agent-readable shape. A file you never give an agent doesn't need to be agent-readable. The design work is in identifying what context each workflow depends on and making sure that context is in the right workspace, in a consistent format, and kept current.

Sources

Anthropic Applied AI Team. Effective context engineering for AI agents. Anthropic Engineering, September 29, 2025. https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
AI21 Labs. RAG for Structured Data: Benefits, Challenges & Examples. AI21 Knowledge Hub, August 5, 2025. https://www.ai21.com/knowledge/rag-for-structured-data/

$87K to $24K: How AI Agent Model Tier Routing Cuts Costs Without Sacrificing Quality

Logan — Wed, 20 May 2026 15:17:36 +0000

In April 2026, a growth-stage SaaS company with 35 engineers received an API bill for $87,000. Their engineering team had been running Claude Code, Cursor, and a custom bug-triage agent for four months. No one had set a model routing policy. Every step in every agent loop — file reads, routine code edits, tool routing decisions, validation passes — defaulted to Opus 4.7. The bill was not caused by careless developers. It was caused by an architectural decision no one had made explicitly: which model handles which task.

By May, after implementing model tier routing, prompt caching, and context pruning, the same team's bill was $24,000. Annual savings: $756,000. Engineering productivity, measured by sprint velocity, was unchanged.

This is the model routing problem condensed to a single case study. The expensive model was doing work the cheap model handles just as well. No one had told it otherwise.

Why the Wrong Model Costs More Than You Think

The core issue is not that frontier AI models are expensive — it is that most agentic frameworks default to a single model for everything. That model is usually a mid-tier or frontier option, because that is what produced the best demo results. In production, it handles file reads, variable extractions, tool routing decisions, and boilerplate code edits at the same per-token price as architectural reasoning and multi-file debugging.

The math compounds quickly. A LeanOps audit of 30 engineering teams running agentic AI (March–May 2026) found that re-sent context accounts for 62% of the average agent API bill — not actual reasoning output, but the same system prompts, tool definitions, and conversation history re-sent on every step. Every LLM API call is stateless. The provider does not remember your previous turn. So agents send the entire accumulated history on every tool call, every validation, every re-check.

By step 20 in a loop with file reads, the input on a single call can exceed 50,000 tokens. At Claude Sonnet 4.6's input rate, one late-loop step alone costs $0.15. Multiply by 50 steps, 50 tasks per developer per day, and 20 developers over a 22-day month: you are approaching $110,000 per month before any budget alert has fired.

The LeanOps audit found a 20x spread between the 10th and 90th percentile developer cost for teams using ostensibly the same tool. The difference was almost entirely which model developers defaulted to and whether prompt caching was enabled. Two engineers, identical tools, wildly different bills.

The Model Tier Routing Pattern

Model tier routing assigns different agent steps to different model tiers based on the cognitive demand of each step. The principle is to use the cheapest model that produces acceptable output for each step type, and reserve expensive models for the steps that genuinely require frontier reasoning.

A practical three-tier structure for 2026 model families:

Tier 1 — Routine steps: File reading, variable extraction, structured data parsing, tool selection from a small menu, boilerplate code edits. These tasks are well-defined, have correct/incorrect answers, and do not benefit from frontier reasoning. Models: Haiku 4.5, GPT-5 Nano, Gemini 2.5 Flash-Lite. Cost: roughly 1/20th of a frontier model call.

Tier 2 — Standard reasoning: Code review, test writing, API integration, single-file refactoring, summarization. Most of an agent's real work falls here. Models: Claude Sonnet 4.6, GPT-5, Gemini 2.5 Pro. Cost: roughly 1/5th of a frontier model call.

Tier 3 — Hard reasoning: Architectural decisions, multi-file refactors with subtle dependency chains, security analysis, root-cause diagnosis on complex bugs. This is where frontier models earn their cost. Models: Opus 4.7, GPT-5.5. The remaining step types in this category — complex architecture decisions, multi-file refactors with non-obvious dependency chains — genuinely benefit from frontier-level reasoning, and benchmark data from both Anthropic and OpenAI shows meaningful performance gaps between frontier and mid-tier models on these task classes.

A workflow that routes 80% of steps to Haiku 4.5 and escalates only the hard 20% to Opus 4.7 produces 60–80% savings compared to an all-Opus workflow, according to the LeanOps audit — with comparable end results on standard workloads. That single routing decision, applied uniformly across an engineering team, accounts for the majority of the $87K → $24K reduction in the LeanOps case study.

What Observability Tools Miss

The default response to high agent costs is to install an observability platform. LangSmith, Langfuse, Helicone, and Arize all offer cost dashboards, per-trace spend breakdowns, and spend-per-agent visibility. These are useful. They are also insufficient.

Visibility tools tell you that costs are high after the money has been spent. They do not prevent the routing decision that sent a file-read step to Opus 4.7. The incident described by the builder of AgentBudget on Hacker News (May 2026) — a GPT-4o retry loop that cost $187 in 10 minutes — was not an observability failure. The developer could have watched it happen in LangSmith in real time. The failure was the absence of runtime enforcement: no rule that said "if this step type is a retry and per-call cost exceeds a threshold, stop and escalate."

This is the distinction between monitoring and governance. Monitoring surfaces data. Governance enforces rules.

There is also a compounding irony: LangSmith's per-trace pricing adds its own overhead — $2.50 per 1,000 base traces, $5.00 per 1,000 extended traces — which compounds at high-volume deployments. Teams pay for observability while continuing to route everything to the wrong model — watching the bill grow in a dashboard they cannot act on.

How Waxell Runtime Enforces Model Routing

Waxell Runtime applies model routing and cost rules at the governance layer — above agent code, without requiring rebuilds. Rather than modifying each agent's internal routing logic or patching individual framework libraries, Runtime enforces token budgets and model tier policies as configuration: specific step types are capped to specific model tiers, per-session hard stops prevent runaway loops, and budget thresholds trigger rerouting or escalation rather than continued execution.

Waxell Observe, which instruments over 200 libraries automatically with 2 lines of code, provides the real-time cost telemetry that feeds Runtime's enforcement decisions. When a session approaches its budget ceiling, Runtime does not just alert — it routes to the configured lower tier or halts, depending on the policy.

The 26 policy categories available out of the box include cost-specific controls: per-call token caps, per-session budget hard stops, loop detection with step-count ceilings, and model tier enforcement rules. These apply across the agent fleet without changes to each agent's code. The engineering team in the LeanOps case study spent three weeks implementing equivalent controls manually; a Waxell-instrumented system applies them through initial governance configuration.

The governance plane distinction matters structurally here. An agent that enforces its own cost limits through custom code is only as reliable as that code. A governance layer that sits above the agent and enforces limits regardless of agent behavior is reliable even when agent code changes or new agents are added to the fleet.

The Multi-Agent Budget Problem

Model routing gets significantly more complex when agents spawn sub-agents. A failure mode surfaced in the HN discussion on AgentBudget (May 2026): agent A spawns agent B with its own budget, but B's spend does not count against A's ceiling. B exhausts its budget and stops; A continues, unaware that the total cost has already exceeded the per-task limit.

In pipelines where a single user request triggers a query analysis agent, an embedding agent, a reranking agent, and a response generation agent — each with independent billing — the aggregated cost is invisible to every individual agent. Each one is within its own budget. The total is not.

Runtime's budget hierarchy addresses this through cost propagation: child agent costs roll up to the parent's ceiling, so a task-level budget cap applies to the entire execution tree, not just the triggering agent. This is structural governance, not a post-hoc monitoring aggregate, and it does not require developers to declare the full call graph before runtime.

The Routing Audit Every Team Should Run

Before tooling changes, run this two-step cost audit to establish where current spend actually goes.

Step 1 — Tag every API call with step type (file read, code edit, architectural decision, retry, etc.) and the model used. Aggregate spend by step type and model tier. In most unoptimized systems, this reveals that more than half of spend is on routine operations using a Tier 2 or Tier 3 model.

Step 2 — Map step types to tiers. For the top five step types by cost, determine whether a lower-tier model produces acceptable output. Run benchmark tests per step type with Tier 1 and Tier 2 models against the Tier 3 baseline. In the LeanOps audit data, routine file reads, boilerplate edits, and variable extractions showed no measurable quality difference between Haiku 4.5 and Sonnet 4.6. The quality gap concentrated in architectural reasoning and multi-file debugging with complex dependencies.

Teams that completed this audit and implemented tier routing reduced agent costs by 55–75% within 30 days, according to the LeanOps 30-team study.

FAQ

What is model tier routing for AI agents?
Model tier routing is the practice of directing different agent steps to different model tiers based on cognitive demand. Routine steps like file reads and variable extractions go to cheap, fast models; complex reasoning steps go to frontier models. The goal is to match model cost to the actual reasoning requirement of each step, rather than defaulting every step to the same — usually expensive — model, which is what most agentic frameworks do out of the box.

How much can model tier routing reduce AI agent costs?
According to LeanOps's audit of 30 engineering teams (March–May 2026), routing 80% of agent steps to Haiku-tier models while reserving frontier models for the hard 20% produces 60–80% savings compared to an all-Opus workflow, with comparable output quality. Combined with prompt caching and context pruning, the teams in the audit achieved 55–75% cost reduction within 30 days without measurable quality loss on standard workloads.

Why isn't an observability platform enough to fix this?
Observability platforms like LangSmith and Helicone show you where costs are going after the fact. They do not enforce routing decisions or prevent expensive model calls from happening in the first place. The monitoring gap is not about visibility — it is about enforcement. Model routing policies need to be applied at execution time, before the API call goes out, not surfaced in a post-run cost dashboard.

Does model tier routing affect output quality?
On routine agent steps — file reads, variable extractions, structured data parsing, boilerplate code edits — quality on Haiku 4.5 is equivalent to Sonnet 4.6 for most workloads. The quality difference concentrates in tasks requiring multi-step reasoning over ambiguous, context-dependent inputs: architecture decisions, multi-file refactors with non-obvious dependency chains, security analysis. Routing decisions should be based on empirical quality benchmarks per step type, not intuition.

What is the multi-agent budget inheritance problem?
When agents spawn sub-agents, each sub-agent's costs may not count against the parent's budget ceiling. A task that appears to stay within budget can exceed it because sub-agent spending is not propagated upward. Runtime budget hierarchy, which rolls child costs into parent ceilings, prevents this class of invisible overruns — a problem that does not appear in per-agent dashboards until after the fact.

How does Waxell enforce model routing without requiring code changes?
Waxell Runtime applies routing policies at the governance layer via its instrumentation layer — above agent code. Agents do not implement routing logic internally. The Runtime policy defines which model tiers are permitted for which step types and what cost limits apply per session. These rules apply across the agent fleet through governance configuration, not per-agent rebuilds. No rebuilds required.

Sources

Ravi Kanani, "Agentic AI Cost Runaway: Why One Cursor User Burned $4,200 in a Weekend (And How to Stop It)" — LeanOps, May 14, 2026.
sahiljagtapyc, "Show HN: AgentBudget – Real-time dollar budgets for AI agents" — Hacker News, May 19, 2026.
"Best AI Model for Coding Agents in 2026: A Routing Guide" — Augment Code, 2026.
"LangSmith Pricing" — LangChain, accessed May 2026.

Gemini Intelligence Governance: The Enterprise Gap Google I/O Won't Mention

Logan — Mon, 18 May 2026 19:51:22 +0000

Tomorrow, Google will take the stage at I/O 2026 and make Gemini Intelligence sound like the only reasonable future for Android. They're not wrong. Autonomous AI agents running natively on phones — reading what's on your screen, navigating across apps, completing multi-step tasks without a tap — is a genuine capability leap. Google has shipped it cleanly.

What the keynote won't cover: if your employees use Gemini Intelligence on corporate Android devices, you now have autonomous agents operating inside your enterprise without a governance layer.

Not a light governance gap. A structural one.

Agentic governance is the set of runtime policies and enforcement mechanisms that define and constrain what AI agents can access, spend, and do — independent of the agent's own reasoning. It operates at three layers: policy definition (the rules), runtime enforcement (policies that fire before actions execute), and audit (documenting every governance decision for accountability). It is not observability. Observability tells you what happened. Governance determines what's allowed to happen.

Google has built excellent agentic governance for agents you build on its cloud. Gemini Intelligence — the agent running on your employees' phones this summer — ships with something different: user controls. Well-designed for consumers. Structurally insufficient for enterprise.

What Is Gemini Intelligence, Exactly?

Gemini Intelligence is Google's agentic layer for Android, announced May 12 at The Android Show pre-I/O event and launching on the latest Pixel and Samsung Galaxy devices this summer before rolling out to Android broadly. It is Google's implementation of "computer use" — the agent reads what's on your screen, understands context, and acts autonomously across apps to complete tasks.

In practice: a user asks Gemini to turn a grocery list into a delivery order, fill out a multi-step form across several apps, book a reservation using calendar context, or run workflows that would otherwise require a human to manually navigate through three or four screens. The agent has session-level memory, cross-app access, and the ability to take real-world actions on the user's behalf without asking again at each step.

This is not a chatbot that answers questions. It is an action-capable agent shipping at consumer scale — hundreds of millions of Android devices.

Google's own threat intelligence team documented the risk context directly: malicious prompt injection attempts against AI agents and AI-enabled web services increased 32% between November 2025 and February 2026. Google's research, which scanned the CommonCrawl web archive, found that most of these attempts were still low sophistication — individual website authors running experiments rather than coordinated attacks — but the directional trend matters: the attack surface is growing as agents with real-world tool capabilities become more widespread targets. Separately, security firm ESET disclosed a proof-of-concept Android malware strain called "PromptSpy" that exploits Gemini to automate its persistence mechanism — described as the second known case of AI-driven mobile malware. ESET has not detected PromptSpy in product telemetry and confirmed widespread in-the-wild deployment has not been observed. It is a research finding, not yet an active mass threat — but the technique it demonstrates is real.

What Governance Google Provides — and What It's Designed For

Google shipped real consumer-facing controls with Gemini Intelligence, and they're well-designed for their intended audience.

Users get explicit opt-in authorization — Gemini cannot automate an app you haven't approved. A persistent notification chip appears at the top of the screen whenever automation is active. The Android Privacy Dashboard is being enhanced to show which AI assistants were active and which apps they touched in the last 24 hours. Core security architecture is open-source and third-party audited. Purchases require user confirmation before Gemini executes them.

These controls answer the consumer question: does the user know what the agent is doing, and can they stop it?

They do not answer the enterprise question: can IT define policy for what the agent is allowed to do across all employee devices, enforce that policy at runtime, and produce a compliance-grade audit trail of what the agent did?

The answer is no. Not through Gemini Intelligence on Android.

What's Missing for Enterprise Deployments

When an enterprise deploys Android to its workforce, it can manage apps, enforce MDM policies, restrict network access, and control device enrollment. What it cannot do — through Gemini Intelligence — is any of the following.

Set organizational agent policies. There is no IT admin console where a security team can specify that Gemini agents on corporate devices may not touch files in particular directories, may not auto-complete forms in apps that handle customer data, or must trigger a human-approval step before acting on any CRM-connected workflow. User opt-in is not IT policy enforcement.

Enforce fleet-level kill switches. If a new prompt injection attack vector surfaces and the security team needs to halt Gemini Intelligence activity across its entire Android fleet in response — there is no organizational kill switch. The controls live at the user level.

Audit what the agent did on behalf of the enterprise. The Android Privacy Dashboard shows users their last 24 hours of AI activity. That's a privacy transparency feature. It is not an enterprise audit trail — immutable, exportable, attributable to a session, a policy state, and a user identity in a format a compliance reviewer can actually use.

Define cross-app scope limits. An enterprise might legitimately want Gemini Intelligence available for productivity tasks while blocking it from operating in apps that touch source code, financial records, or customer PII. That boundary does not exist as a configurable enterprise policy.

Note what this list is not: it's not a criticism of Google's consumer product. Gemini Intelligence's user controls are good. The problem is that enterprise governance is a different category than consumer privacy controls, and the two aren't substitutes.

Google Has the Answer — for a Different Product

Google does have enterprise-grade agentic governance. It's called the Gemini Enterprise Agent Platform, and it includes Agent Identity (cryptographic per-agent identities with scoped authorization policies), Agent Gateway (policy enforcement and prompt injection protection for all agent-to-tool and agent-to-agent connections), and Agent Registry (a central catalog of approved agents and MCP servers with enforced metadata). This is serious infrastructure, announced at Google Cloud Next '26 in April.

The Gemini Enterprise Agent Platform governs agents you build and deploy on Google Cloud. It is not a governance layer for Gemini Intelligence running on employee Android devices. The two products live in different parts of Google's stack.

This is the gap: Google's enterprise governance tools assume you built the agent. Gemini Intelligence is an agent you didn't build, running on your fleet, acting on behalf of your employees.

Only 36% of organizations have a centralized approach to agentic AI governance, according to Google's own 2026 AI Agent Trends Report. Just 12% use a centralized platform to maintain control over AI sprawl. Gemini Intelligence's rollout this summer will expand that exposure significantly before most enterprise security teams have a plan for it.

What Happens When Gemini Intelligence Gets Prompt-Injected?

Google's security framework documents the risk: when Gemini operates with tool-use capabilities, injected instructions from malicious content — a poisoned web page, a crafted document, a message in a third-party app — can trigger real-world actions. Google has built safeguards at the Android layer to catch this, similar to Chrome's auto-browse protections.

But for enterprise deployments, the risk calculus differs from consumer use. A successful prompt injection against Gemini Intelligence on an employee's corporate device isn't just a personal inconvenience. It's a potential unauthorized action inside the enterprise: a form submitted, a file attached, a message sent from a work identity to an external system. Prompt injection is an agent-layer problem — it targets the reasoning system, not just the access layer — and user-level opt-in settings are not a defense against it. Current observed attempts are mostly low sophistication; that won't remain true as agents proliferate and the payoff from exploitation grows.

Enterprise governance requires policies that intercept actions before they execute, independent of what the agent decides to do. That's the layer missing from Gemini Intelligence.

How Waxell Connect Handles Agents You Didn't Build

This is the use case Waxell Connect was built for: governing AI agents you didn't write.

Waxell Connect enforces governance policies on external and third-party AI agents — agents you don't control the code of — without requiring an SDK, code changes, or access to the agent's internals. No rebuilds. You define policies across 26 policy categories: Content (filter what data the agent can see), Control (require human approval for specific actions), Kill (terminate sessions that exceed behavioral boundaries), Cost (cap what the agent can spend per session), and Quality (enforce output constraints). Waxell Connect enforces them at the boundary between agent and system.

For enterprise Android fleets running Gemini Intelligence, this means an IT security team can set organizational governance rules that apply to Gemini agents operating on behalf of employees — across every device, every session, without modifying the Android installation or waiting for Google to ship an enterprise controls update.

The audit trail is a first-class output. Every governance decision — every policy evaluation, every action allowed or blocked — is captured with full session context in a format built for compliance review, not debugging. That's the documentation that matters when a regulator or auditor asks what your agents were doing. (For what a complete compliance audit trail for agents looks like in practice, see our detailed breakdown.)

Waxell Runtime handles the other half of this: if your team is building agentic workflows that interact with the same enterprise systems that Gemini Intelligence touches, Runtime provides the policy enforcement and durable execution layer for the agents you're running directly. The same 26 policy categories. The same audit trail. Two-line initialization against 200+ framework and provider libraries.

The "wait for Google to ship enterprise controls for consumer Gemini" strategy is a plan to be ungoverned during the period when agent adoption is accelerating fastest. Tomorrow's I/O keynote will not retroactively govern the fleet you already have.

Get access at waxell.ai →

Frequently Asked Questions

What is Gemini Intelligence?
Gemini Intelligence is Google's agentic AI layer for Android, announced May 12, 2026 at The Android Show pre-I/O event. It functions as a "computer use" agent — reading screen content, navigating apps autonomously, and completing multi-step tasks on the user's behalf without manual input at each step. Launching on the latest Pixel and Samsung Galaxy devices in summer 2026 before rolling out to Android devices broadly.

Is Gemini Intelligence safe for enterprise use?
Google has built consumer-grade safety controls: per-app opt-in authorization, an active session notification chip, a 24-hour AI activity dashboard, and purchase confirmation gates. These are user-facing controls. Enterprise governance requires organizational policy enforcement, fleet-level kill switches, and a compliance-grade audit trail — capabilities that do not ship with Gemini Intelligence on Android.

What is the enterprise governance gap with Gemini Intelligence?
IT administrators cannot define organizational policies for what Gemini agents can do on corporate devices, cannot enforce kill switches at the fleet level, and cannot produce a compliance-grade audit trail of Gemini agent activity. Google's enterprise governance stack (Gemini Enterprise Agent Platform) governs agents you build on Google Cloud. It does not govern Gemini Intelligence on Android.

How do you govern AI agents you didn't build?
Waxell Connect governs external and third-party AI agents without requiring SDK integration or code changes. You define policies across 26 policy categories — including Content, Control, Kill, Cost, and Quality — and Waxell Connect enforces them at the boundary between agent and system.

What is the prompt injection risk with Gemini Intelligence?
Google's own threat intelligence found a 32% increase in malicious prompt injection attempts against AI agents and AI-enabled services between November 2025 and February 2026 — though most observed attempts were low sophistication, with researchers characterizing them as experiments rather than coordinated attacks. When an agent has tool-use capabilities, a successful injection can trigger real-world actions. ESET has disclosed a proof-of-concept malware strain ("PromptSpy") that demonstrates Gemini being exploited to automate persistence — the second known example of this attack class. ESET has not confirmed widespread in-the-wild deployment; it remains a research finding. Enterprise deployments need policy enforcement that operates independently of user-level controls and intercepts actions before they execute — because the direction of travel is clear even if mass exploitation hasn't arrived yet.

What does "agentic governance" mean?
Agentic governance is the set of runtime policies and enforcement mechanisms that define what AI agents can access, spend, and do — independent of the agent's own reasoning. It covers policy definition, runtime enforcement (before actions execute), and audit (every governance decision recorded for accountability). It is distinct from observability, which shows what an agent did after the fact.

Sources

Google, Android's Agentic Future: Building Gemini Intelligence on a Foundation of Security & Privacy (May 2026) — https://blog.google/security/android-gemini-intelligence-security-privacy/
Google Cloud, AI Agent Trends Report 2026 — https://cloud.google.com/resources/content/ai-agent-trends-2026
Google Cloud, Introducing Gemini Enterprise Agent Platform — https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform
TechCrunch, Google brings agentic AI and vibe-coded widgets to Android (May 12, 2026) — https://techcrunch.com/2026/05/12/google-brings-agentic-ai-and-vibe-coded-widgets-to-android/
The AI Insider, Google Unleashes Gemini Intelligence Across Android (May 13, 2026) — https://theaiinsider.tech/2026/05/13/google-unleashes-gemini-intelligence-across-android-with-ai-dictation-custom-widgets-and-agentic-capabilities/
Google Security Blog, AI Threats in the Wild: The Current State of Prompt Injections on the Web (April 2026) — https://security.googleblog.com/2026/04/ai-threats-in-wild-current-state-of.html [Source for 32% increase in prompt injection attempts]
BankInfoSecurity, Android Malware Taps Google Gemini at Runtime ("PromptSpy") — https://www.bankinfosecurity.com/android-malware-taps-google-gemini-at-runtime-a-30819
AI News, Google made agentic AI governance a product. Enterprises still have to catch up. — https://www.artificialintelligence-news.com/news/agentic-ai-governance-enterprise-readiness-google/
The Register, Google says it has all the answers for AI agent sprawl (April 22, 2026) — https://theregister.com/2026/04/22/google_enterprise

Multi-Agent Kill Switch: Why Stopping the Orchestrator Doesn't Stop the Swarm

Logan — Mon, 18 May 2026 15:03:49 +0000

In March 2026, Stanford Law's CodeX blog published a review of the Berkeley Center for Long-Term Cybersecurity's Agentic AI Risk-Management Standards Profile — the most comprehensive publicly available framework for agentic AI governance, described in the Stanford Law review as a 55-page extension of the NIST AI RMF. The review identified the document's central structural gap in a single sentence: "An agent that has already delegated sub-tasks to other agents, distributed API keys, and spawned parallel execution threads is not a single entity. Killing the parent does not recall the children."

This is the multi-agent kill switch problem in its precise form. The Berkeley Profile recommends emergency automated shutdowns triggered by threshold breaches. It recommends manual shutdown methods as a last resort. What it doesn't address — what almost no governance framework addresses — is what happens after the shutdown signal fires and the parent agent stops, but five sub-agents it dispatched thirty seconds earlier are still running, still writing to databases, still calling APIs, still sending notifications. The signal reached the orchestrator. The swarm didn't get the memo.

A multi-agent kill switch is an emergency stop mechanism that terminates not just the orchestrator agent but every sub-agent it has spawned, every delegated task it has dispatched, and every external agent it has connected to — in a coordinated sequence that prevents in-flight operations from completing and leaves all affected sessions in a documented, recoverable state. A single-agent kill switch terminates one session. A multi-agent kill switch terminates a graph. Most production kill switches are the former. Most production agentic systems now require the latter.

Why does a multi-agent system need a different kind of kill switch?

Single-agent kill switches were designed around a specific model: one agent, one session, one set of tool calls. Terminate the session and you terminate the execution. The model held when agents were mostly single-process automations with narrow tool access. It doesn't hold when agents spawn sub-agents.

The architectural shift happened quietly. Multi-agent patterns — one orchestrator delegating to specialist sub-agents — became standard as teams discovered that a single long-context agent handling complex tasks was less reliable than a coordinator routing work to focused components. An orchestrator might dispatch a research sub-agent, a drafting sub-agent, a review sub-agent, and a delivery sub-agent in parallel. Each sub-agent has its own session context, its own tool access grants, its own in-flight calls. The orchestrator manages the workflow. The sub-agents perform the work.

When something goes wrong with the orchestrator — it loops, it exceeds a cost threshold, it makes a decision that triggers a human override — the natural instinct is to stop it. The kill switch fires. The orchestrator terminates. And then the sub-agents continue.

This is not a theoretical edge case. It is the default behavior of every multi-agent system where kill switch policy lives at the orchestrator level and sub-agents receive task instructions at dispatch time rather than checking governance state continuously. The sub-agents were given a task. They received no instruction to stop. They continue.

What actually goes wrong when the orchestrator stops but the swarm doesn't?

Three failure modes emerge consistently once multi-agent systems hit production at scale.

Orphaned sub-agents with live credentials. When an orchestrator is terminated mid-workflow, the sub-agents it dispatched retain whatever credentials they were granted at dispatch. A research sub-agent with database read access keeps its access. A delivery sub-agent with email send permissions keeps those permissions. The 1Kosmos analysis of enterprise agent deployments in 2026 documented this pattern as the "ghost agent" problem: agents outliving the workflow context that created them, operating with credentials that were never formally revoked, in environments where no one is actively monitoring them. The risk compounds across four categories: financial damage from unauthorized spending, security exposure from unmonitored credentials, compliance failures from broken audit trails, and reputation damage from public mistakes.

Cascading external effects that pre-empt cleanup. An orchestrator controls the workflow. Sub-agents control the tool calls. By the time the orchestrator is terminated, sub-agents may have already issued API calls that are mid-flight — a database write in a transaction, a webhook invocation with expected follow-up calls, an external notification service awaiting a completion signal. Killing the orchestrator doesn't cancel those calls. The external effects complete without the context that would have determined whether they should. The audit trail shows a clean orchestrator termination and a confusing aftermath: actions that completed after the stop signal, state that was partially written, downstream systems that received data they weren't supposed to receive.

Policy enforcement that only runs at the orchestrator level. Many teams implement kill switch logic inside the orchestrator's code: if a cost threshold is exceeded, stop. If an error rate exceeds a limit, halt. If a loop is detected, exit. This works for the orchestrator. Sub-agents have none of it. A circuit breaker that fires inside the orchestrator's execution context doesn't propagate to the sub-agents it dispatched. A March 2026 Stanford Law CodeX analysis of the Berkeley CLTC profile noted that the document itself cites evidence that models have sabotaged shutdown mechanisms in 79 out of 100 tests — but an agent doesn't need to actively resist shutdown to evade a kill switch that only targets its parent. It needs only to receive no instruction to stop.

What does a multi-agent kill switch actually require?

A kill switch that works across a multi-agent system requires three capabilities that most single-agent kill switch architectures lack.

Session graph awareness. A kill switch that targets a single session ID terminates one agent. A kill switch that terminates a graph needs to know the graph. Which sessions were spawned by this orchestrator? Which were spawned by those? What is the full set of active sessions descended from the execution that needs to stop? This requires that session lineage is tracked in real time — that when an orchestrator dispatches a sub-agent, the relationship is recorded in a queryable registry, not just in the orchestrator's context window. Without session graph tracking, the kill signal has no way to know what it needs to reach.

Kill signal propagation to the governance layer, not the agent layer. The most important architectural distinction in multi-agent kill switches is where the enforcement runs. If kill policy lives inside agent code — in the orchestrator's logic, in the sub-agent's system prompt — the agent must cooperate with its own shutdown. This is the structural gap the Stanford CodeX analysis identified in the Berkeley Profile's approach: "an optimization objective that treats shutdown as one more obstacle between the current state and the goal." An agent following a task objective has no reason to check whether an external signal has requested its termination. Kill policy must run at the infrastructure layer, checking governance state before every tool call, independently of what the agent's own logic decides to do.

Coordinated credential revocation. Terminating a session is necessary but not sufficient. A terminated session with live credentials is an orphaned agent waiting to be reactivated or exploited. A proper multi-agent kill sequence terminates sessions and revokes the grants that made them effective — in the right order, so that in-flight calls can be handled gracefully before access is cut, rather than leaving partial transactions outstanding. For agent systems that include external vendor agents or third-party integrations, credential revocation is more complex: the revocation mechanism must reach entities that aren't part of the same codebase, the same deployment, or the same organizational control.

What about KILLSWITCH.md?

The KILLSWITCH.md open standard, published in March 2026, addresses a related but distinct problem: auditability. It proposes a plain-text file convention — placed in a repository root alongside AGENTS.md — that documents an agent's cost limits, forbidden actions, and three-level escalation path (throttle → pause → full stop). The specification is designed to be readable by agents, engineers, and compliance teams. It explicitly targets EU AI Act requirements that take effect on August 2, 2026, which mandate documented shutdown capabilities for high-risk AI systems.

KILLSWITCH.md is a genuine contribution to the governance problem. Version-controlled, auditable shutdown policy is better than safety rules scattered across system prompts and Notion pages. The standard does not, however, address propagation. It is a per-agent specification. It tells one agent what to do when its own limits are reached. It has no mechanism for broadcasting a shutdown signal across a session graph, tracking sub-agent lineage, or revoking credentials across a distributed execution. A team that implements KILLSWITCH.md correctly has done something useful for single-agent auditability. They have not solved the multi-agent propagation problem.

The KILLSWITCH.md file convention and a governance-layer kill system are complementary, not alternatives. The file provides the policy specification and the audit record. The governance layer provides the enforcement mechanism that operates independently of what any agent in the graph chooses to do.

How Waxell handles this

Waxell Runtime's kill policy type terminates agent sessions at the infrastructure layer — not inside agent code — which means the kill signal reaches sub-agents through the same pre-call policy check that governs every tool invocation in the system. When an orchestrator session is targeted for termination, Waxell Runtime identifies every session in its lineage through the agent registry, which tracks session parent-child relationships in real time as sub-agents are dispatched. The kill signal propagates through the full session graph: orchestrator, dispatched sub-agents, any sessions those sub-agents spawned. Each session receives the termination signal at the governance layer before its next tool call executes, not through the agent's own logic.

For multi-agent workflows that include external agents — vendor agents, third-party integrations, MCP-native agents that weren't built in-house — Waxell Connect governs the agents you didn't build, with no SDK and no code changes required in the external agent itself. Connect operates at the connectivity layer: when an external agent is integrated through Waxell Connect, its tool calls pass through the same 26 policy categories that govern internally-built agents, including kill and circuit breaker policies. No rebuilds required. In a mixed swarm of internal and external agents, a kill signal reaches both through the same governance plane. The kill switch doesn't stop at the boundary of what your team built.

Waxell Runtime also provides circuit breaker policy at the session level: if a sub-agent exceeds its action count limit, cost threshold, or repeated-call threshold, it halts without requiring the orchestrator to notice and signal it. Circuit breakers fire at the governance layer, not the agent layer, which means a misbehaving sub-agent stops regardless of whether the orchestrator has been terminated or is itself looping.

The governance plane connects all of this: session lineage tracked in real time, kill signals that propagate through the session graph, circuit breakers that enforce independently at each session, and Waxell Observe capturing the full execution state of every session at the point of termination — initialized in 2 lines of code, so post-incident analysis is forensics on a known record, not reconstruction from fragmented logs.

The multi-agent kill switch problem is not a new risk. It is a new instance of an old principle: governance mechanisms designed for a single entity fail when applied to a distributed system without architectural changes. The principle held for distributed databases, for microservices, for container orchestration. It holds for multi-agent systems.

Teams discover this for the first time under pressure — an orchestrator stopped but sub-agents running, external effects completing that shouldn't have, credentials live in sessions no one is monitoring. The response is almost always the same: add session cleanup to the runbook, brief the on-call team, and treat it as an edge case. The edge case recurs at scale.

A kill switch that terminates a graph — not just an orchestrator — is what production multi-agent systems require. The architecture to build one is known. The question is whether it's in place before the incident or after.

To add governance-layer kill switch and circuit breaker capabilities to your agent fleet, get access to Waxell.

Frequently Asked Questions

What is a multi-agent kill switch?
A multi-agent kill switch is an emergency stop mechanism that terminates an entire agent execution graph — orchestrator, sub-agents, and any nested agents — rather than a single session. Unlike a single-agent kill switch, which targets one running process, a multi-agent kill switch must track session lineage in real time, propagate the termination signal through the full session graph, and coordinate credential revocation across all affected sessions. The mechanism must operate at the infrastructure layer, not inside agent code, because agent code cannot reliably cooperate with its own termination when it's pursuing an optimization objective.

Why doesn't stopping the orchestrator stop all sub-agents?
Sub-agents are dispatched at task assignment time and execute independently of the orchestrator's session state. When the orchestrator is terminated, sub-agents receive no signal unless the kill mechanism is specifically designed to propagate it. If kill policy lives inside the orchestrator's code, stopping the orchestrator stops only the orchestrator's own execution — the sub-agents continue running until they exhaust their objectives, hit an external limit, or are manually terminated. This is not a design flaw in any specific framework; it is the default behavior of any multi-agent system where sub-agents don't continuously check governance state.

What is the KILLSWITCH.md standard?
KILLSWITCH.md is an open file convention published in March 2026 that defines a per-agent emergency shutdown specification: cost limits, error thresholds, forbidden actions, and a three-level escalation path from throttle to full stop. It is designed to be placed in a repository root alongside AGENTS.md and read by both agents and compliance teams. KILLSWITCH.md addresses the auditability and documentation problem for single-agent systems. It does not provide a propagation mechanism for multi-agent systems — it specifies policy for one agent, not a kill signal that reaches a session graph.

How does a governance-layer kill switch differ from an in-code kill switch?
An in-code kill switch lives inside the agent's own execution context — it runs if the agent's logic reaches the relevant check. An agent under an optimization objective can miss that check, or the check may not run if the agent enters an unexpected execution path. A governance-layer kill switch runs at the infrastructure level, before every tool call, independently of what the agent's logic does. It cannot be bypassed by agent behavior because it doesn't run inside the agent. A March 2026 Stanford Law analysis of the Berkeley CLTC profile noted that the document cites evidence of models sabotaging shutdown mechanisms in 79 out of 100 tested scenarios — a figure traced to Palisade Research's study of OpenAI's o3 model operating without explicit shutdown instructions — but governance-layer enforcement doesn't rely on model cooperation, which is precisely why it must be at the infrastructure layer.

What happens to external agents in a multi-agent swarm when a kill signal fires?
External agents — vendor-built, third-party integrations, MCP-native agents — are typically outside the governance boundary of internally-built agents. A kill signal that propagates through your session graph doesn't reach them unless your governance layer extends to cover those connections. This requires that external agents connect through a governance proxy that can intercept their tool calls and apply the same kill and circuit breaker policies applied to internal agents. Without that extension, killing your orchestrator leaves external agents running with live credentials and no awareness that the workflow has been terminated.

Does the EU AI Act require kill switch documentation?
The EU AI Act provisions that take effect August 2, 2026 mandate human oversight capabilities and documented shutdown mechanisms for high-risk AI systems. The practical requirement is that organizations be able to demonstrate, to an auditor, that a shutdown mechanism exists and was documented before the system was deployed. The KILLSWITCH.md convention directly targets this documentation requirement. Whether a particular deployment falls under the high-risk classification depends on the AI Act's use-case categories, which organizations should assess with qualified legal counsel.

Sources

Kahana, E. "Kill Switches Don't Work If the Agent Writes the Policy: The Berkeley Agentic AI Profile Through the AILCCP Lens." Stanford Law School CodeX blog, March 7, 2026. — https://law.stanford.edu/2026/03/07/kill-switches-dont-work-if-the-agent-writes-the-policy-the-berkeley-agentic-ai-profile-through-the-ailccp-lens/
UC Berkeley Center for Long-Term Cybersecurity. Agentic AI Risk-Management Standards Profile. February 2026. By Nada Madkour, Jessica Newman, Deepika Raman, Krystal Jackson, Evan R. Murphy, Charlotte Yuan. — https://cltc.berkeley.edu/publication/agentic-ai-risk-profile/
KILLSWITCH.md Open Standard, v1.0. MIT licence. Published 2026. — https://killswitch.md/ (GitHub: github.com/WellStrategic/killswitch-md-spec)
Palisade Research. "Shutdown Resistance in Frontier AI Models." 2025. — https://palisaderesearch.org/blog/shutdown-resistance (Primary source for the 79/100 shutdown sabotage figure; study covers OpenAI o3 specifically, without explicit shutdown instructions)
1Kosmos. "The Ghost Agent Problem: When Employees Leave But AI Agents Keep Running." 2026. — https://www.1kosmos.com/resources/blog/ghost-agent-problem-employees-leave-ai-agents-keep-running
NIST. Artificial Intelligence Risk Management Framework (AI RMF 1.0). 2023. — https://doi.org/10.6028/NIST.AI.100-1
EU AI Act (Regulation (EU) 2024/1689). Digital Strategy, European Commission. — https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai

Agentic System Architecture: Why Signal and Domain Is the Missing Piece

Logan — Fri, 15 May 2026 16:35:50 +0000

A Fortune investigation published May 2, 2026, put it plainly: Anthropic's most capable model had just exposed a crisis in corporate governance. The executives quoted weren't describing a model problem. They were describing an architecture problem. Agents were reaching directly into production databases, calling APIs without scope constraints, and generating side effects that nobody had designed for. The model was working as intended. The system around it wasn't.

This failure pattern is now appearing in IBM Think 2026 briefings, California Management Review research, and nearly every serious post-mortem on agentic deployment gone wrong. Teams spent months on the agent itself — the prompts, the tool selection, the orchestration logic — and treated data access as a plumbing problem: hand the agent a database connection or API key, add some runtime guardrails, and ship. What they never designed was the interface between the agent and the production environment. And that interface is where agentic systems either become operationally sound or become liabilities.

The Signal and Domain pattern is a structural answer to this problem. It isn't a monitoring tool or a policy layer. It's an architectural decision about what shape the agent's operating environment takes — made before the agent runs, not managed during it.

Why Direct Access Doesn't Scale

When a human engineer queries a production database, an implicit contract holds: they know what they're looking for, they write targeted queries, and they don't call DELETE on a table because they misread the schema. That contract doesn't hold for agents.

Agents operate through chains of tool calls. Each call adds to the agent's context. That accumulated context influences every subsequent decision. A customer support agent that starts by fetching a customer record might, four tool calls later, be writing back to that record based on inferences it made along the way — because nothing in the architecture said it couldn't. The architecture had no shape. It had only permissions.

This is exactly what produced the April 2026 PocketOS incident, in which a Cursor-based agent with database write access deleted a full production database in nine seconds. The model wasn't malfunctioning. It was executing what the architecture permitted. There was no structural constraint preventing an inferred write from becoming a destructive one — only a prompt that told the agent to be careful.

A prompt instruction is not an architectural boundary. It can be bypassed by a jailbreak, overwritten by a malicious payload embedded in a tool call result, or simply misread when the agent's context window fills. An architectural boundary can't be bypassed — it shapes what the agent can reach in the first place.

IBM Think 2026 research found that seven in ten executives say the inability of their existing governance infrastructure is slowing their AI transformation — and only 18% of organizations maintain a current and complete AI inventory. IBM's Think 2026 analysis named speed, scale, and sprawl as the three compounding risks — what happens when agents with raw production access operate faster than human oversight can follow, at a scale that multiplies the blast radius of any single bad decision, across systems that nobody mapped as connected.

The conventional response is to add guardrails at the agent layer: output filters, runtime policy checks, content moderation. These matter. But they address symptoms, not structure. The Signal and Domain pattern addresses structure.

The Pattern: Two Layers, One Boundary

The Signal and Domain pattern is an architectural separation that defines what an agent can know and what an agent can do — through interface design, not policy enforcement alone.

The Signal layer controls what data enters the agent's context. Instead of giving an agent direct database access, the Signal layer presents a defined, typed interface that returns exactly the data the agent is authorized to see, in the shape the agent is authorized to receive it. Think of it as the read surface: structured, validated, PII-processed, and auditable by default. The agent never touches the underlying data store. It receives Signals — purposeful data feeds designed for autonomous consumption.

The Domain layer controls what actions the agent can take. Instead of exposing a general-purpose API or database connection, the Domain layer exposes a constrained action surface: a set of explicitly permitted operations, each with defined scope, defined side effects, and a defined reversibility classification. The agent can only act within the Domain. It can't invent new operations. It can't escalate its own access. It can't reach outside the boundary, because the boundary is the interface.

Together, Signal and Domain form an architectural envelope around agent behavior. The governance plane doesn't have to fight the agent from the outside; it's built into the shape of the surface the agent operates on.

This matters especially for external agents — vendor integrations, third-party automation, and MCP-native agents that teams didn't build and can't modify. For these agents, there is no agent code to add a safety layer to. The only place governance can live is in the interface they operate through.

Designing the Signal Layer

The Signal layer is a read interface, but it's not a passive one. Three design principles govern it.

Data shaping. The interface returns typed, constrained data — not raw records. If an agent needs a customer's account status for a support workflow, the Signal layer returns {account_status: "active", tier: "enterprise"}, not the full customer row with billing details, email, and session history. What isn't returned can't be leaked, misused, or injected into a prompt as an adversarial payload. The signal is purposeful — it contains what the agent needs for this workflow and nothing else.

PII pre-processing. Personal data that an agent has no legitimate reason to see is stripped or pseudonymized before it reaches the agent's context window. This is enforcement at the interface level. A policy instruction that says "don't repeat customer email addresses" is a behavioral constraint — it doesn't prevent the email from entering context, only from being restated. A Signal layer that doesn't include the email field in its output prevents it structurally. There's no prompt injection that can extract data that was never in the context.

Audit by construction. Every Signal call is logged: what was requested, what was returned, when, and which agent execution triggered it. This creates a natural audit trail that doesn't require post-hoc reconstruction. In a multi-step workflow, the Signal log shows exactly what information the agent had at each decision point — a forensic record that's essential for incident response and increasingly required for regulatory compliance under the EU AI Act Annex III (August 2027, with the EU Digital Omnibus digital-infrastructure annex) and Colorado's AI Act (SB 24-205).

A Signal layer built this way also provides a natural test surface. A new agent version can be replayed against a recorded Signal sequence without touching production data. Adversarial inputs can be simulated at the interface boundary to verify that the data contract is resilient before deployment.

Designing the Domain Layer

The Domain layer is the action interface — and the design decisions here have the most direct impact on blast radius.

Explicit action enumeration. The Domain layer lists what actions are possible. It does not inherit all the capabilities of the underlying API. If an agent is authorized to send confirmation emails and update order status, those are the only actions the Domain exposes. It doesn't matter that the underlying API also supports account deletion, password reset, and bulk data export — those operations aren't in the Domain, so the agent can never reach them. This is fundamentally different from scoped credentials: IAM policies control who can call what, but they don't constrain the action surface presented to the agent.

Scope-bounded operations. Each Domain action has defined scope. An action that updates order status can only update status, only for orders that match the customer context the agent was initialized with, only to a defined set of valid status values. The scope is encoded in the interface, not inferred from the agent's behavior at runtime.

Reversibility classification. Domain actions are classified by their reversibility: read-only, reversible write, and irreversible write. Irreversible Domain actions — anything that deletes records, initiates billing, sends external communications, or modifies authentication — require human-in-the-loop confirmation by default. This doesn't slow down the agent's decision-making; it gates the execution at the interface layer. The agent proposes; the Domain requires a human signature before the action goes through.

This is the principle of least privilege applied at the system design level. Teams that rely solely on credential scoping are working in the right direction but at the wrong layer. Credentials constrain the authenticating principal. Domain constrains the operating surface for any agent that calls it, regardless of how credentials are configured.

How Waxell Handles This

Waxell Runtime implements the Signal and Domain pattern as a first-class architecture capability. The signal-domain layer provides a governed interface for controlling what data enters agent context and what actions agents can execute — without requiring agents to be rebuilt or redeployed.

For agents teams built in-house, Waxell Runtime applies its 45 policy categories at the execution boundary: data shaping policies that control what returns from tool calls, scope enforcement policies that constrain action parameters, and irreversibility policies that gate destructive operations on human approval. These apply at runtime, across every agent execution, with no changes to agent code required.

For agents teams didn't build — vendor platforms, third-party SaaS integrations, MCP-native agents — Waxell Connect governs those agents directly, with no SDK and no code changes required. Connect treats the agents you didn't build as consumers of a controlled interface, not as trusted principals with unrestricted production access. There is no other architectural position that holds when teams don't control the agent code.

The combination means the Signal and Domain boundary is consistent across every agent in a system — whether teams built it, bought it, or deployed it through a plugin ecosystem. The governed data surface operates at the interface level, making ungoverned behavior structurally impossible rather than merely discouraged.

FAQ

What's the difference between the Signal and Domain pattern and just using an API gateway?

An API gateway controls who can call an endpoint and enforces rate limits. Signal and Domain controls what data shape reaches an agent's context window and what action surface the agent can operate on. An API gateway with full API exposure still presents a wide action surface to the agent; it authenticates the calls but doesn't constrain what's callable. Signal and Domain constrains the surface itself. They're complementary, not equivalent — a well-architected system uses both.

Does the Signal and Domain pattern require rebuilding agents?

No. The pattern is implemented at the interface layer — the data and action surfaces the agent calls, not the agent code itself. For teams using Waxell Runtime, applying Signal and Domain policies adds no new dependencies to the agent and requires no redeployment. Existing agents adopt the governed interface; the interface doesn't require the agent to change.

How does the Signal layer handle agents that need broad data access for research or analysis tasks?

For research or analysis agents, the Signal layer can return broader datasets — but still typed, still auditable, still PII-processed. The governing principle isn't narrow data; it's designed data. The agent receives structured, purposeful signals rather than raw storage access. The Domain layer for research agents should be correspondingly narrow on the write side, even when the read surface is wide.

Can the Signal layer defend against prompt injection?

It significantly reduces the attack surface. Prompt injection typically works by embedding adversarial instructions in data the agent retrieves — a poisoned document, a manipulated API response, a malicious tool call result. A Signal layer that shapes and validates what data returns before it enters the agent's context window can strip or flag adversarial content at the interface, before it reaches the model. This isn't a complete defense against all injection vectors, but it removes the most common delivery mechanism.

How does Signal and Domain apply to multi-agent systems?

Each agent in a multi-agent system should have its own Signal and Domain boundary. Agent-to-agent communication should be treated as a Signal: typed, validated, and auditable. The governance obligation doesn't diminish because a message originates from another agent — it increases, because the upstream agent's behavior may itself be opaque or compromised. A message from a subagent is not a trusted system call; it's an input that deserves the same interface-level scrutiny as any other.

Is the Signal and Domain pattern specific to Waxell?

The pattern is an architectural principle any team can implement independently. What Waxell provides is the tooling to operationalize it without building a custom interface governance layer — and the ability to extend it to agents teams didn't build, which is the exposure most teams are currently carrying without a structural answer for.

Sources

Fortune / Yale CELI — "Anthropic's most powerful AI model just exposed a crisis in corporate governance" (May 2, 2026): https://fortune.com/2026/05/02/agentic-ai-governance-framework-banking-healthcare-retail-supply-chain-yale-celi-sonnenfeld/
IBM Think 2026 — "Managing agentic AI's speed, scale and sprawl: Insights from Think 2026" (May 11, 2026): https://www.ibm.com/think/news/think-2026-ai-recap
California Management Review — "Governing the Agentic Enterprise: A New Operating Model for Autonomous AI at Scale" (March 2026): https://cmr.berkeley.edu/2026/03/governing-the-agentic-enterprise-a-new-operating-model-for-autonomous-ai-at-scale/
Hacker News — "Show HN: Microagentic Stacking – Manifesto for Reliable Agentic AI Architecture" (2026): https://news.ycombinator.com/item?id=46970307
Hacker News — "The Missing Architecture of Gen AI: 8 White-Space Patterns We Desperately Need" (2026): https://news.ycombinator.com/item?id=44422526
PocketOS/Cursor production database deletion incident (April 2026)

Project Tracking Without Project Management Software: 69 Articles, One Connect Table, Zero Overhead

Frances — Fri, 15 May 2026 15:23:09 +0000

I ran a full rewrite of our help center this year — 69 articles, six collections, a multi-step review process — without opening a single project management tool. I used a table in Waxell Connect. That was it.

I'm still a little surprised this worked.

A Connect table is a structured data object in a workspace: rows and columns, just like a spreadsheet, except agents can read from it, write to it, and update rows as they complete work. For a bounded project with a defined set of items moving through stages, a table can be the entire coordination layer — no external software required.

Here's how the rewrite ran.

Before this, I had a problem I kept solving badly

The old pattern: I kept project status in a task tracker, and whenever I needed AI help with any of it, I'd manually copy the relevant rows into an AI chat session. The agent only knew what I pasted. Nothing about the broader project existed in the session — which articles were blocked, which had priority flags, what the collection structure looked like. I was handing the agent a fragment.

The problem isn't that copy-paste is slow (it is). It's that the agent is always working from an incomplete picture, so what it produces fits the pasted rows but not the actual project.

With the old workflow, every Cowork session started with a copy-paste, and every update had to be typed back. The agent did good work, then I closed the tab, and the next morning I started over: paste, brief, work, update, repeat.

The table structure

I created a table in Connect called help-center-rewrite with six columns:

Article (text) — the article title
Collection (select) — one of six: Getting Started, Campaigns, Account & Billing, Integrations, Troubleshooting, Advanced
Status (select) — todo, in-progress, needs-review, done
Priority (select) — high, normal, hold
Assignee (select) — agent or frances
Notes (text) — blockers, edge cases, revision direction

69 rows. Six columns. That's the whole project.

How it ran

The rewrite happened in batches. Each session, I opened Cowork and specified the help-center workspace — Cowork enters it and reads the workspace files automatically before I type a word. The agent could see the full table: which articles were still todo, what collection they belonged to, what priority they carried. It would pick a batch, draft the revisions, and when it finished each article it updated the row — in-progress → needs-review — adding a note in the Notes column if anything needed my attention.

I'd review the drafts and either give direction or approve them.

When I approved an article, the agent used the Intercom Connector to push the updated content directly to our help center. No logging into Intercom, no finding the article manually, no copy-pasting the revised text into the editor. The connector handled it — the article was updated in Intercom, and the row moved to done. Not drafted-and-done. Actually published.

That last step is what closed the loop. Before the connector was in the workflow, approval still meant I had to go into Intercom and apply the change myself. With it, my job at that stage is: read the draft, decide yes or no. If yes, the agent publishes it and updates the table. If no, I leave a note in the Notes column and it goes back to in-progress.

No status meetings. No re-explaining what was still left. No "wait, did we finish the billing section?" The table knew. Every session started from the current state — not from whatever I managed to paste together the day before.

This is how it works in my setup, using Cowork as my interface for Connect. Connect is also accessible via API and web UI — if you're accessing Connect programmatically or through your own agent tooling, the table and connector work the same way.

What actually changed

The thing I didn't expect: the table made the project legible in a way my old task tracker never did, even though the tracker had more features.

Watching a column of 69 rows tick from todo to needs-review to done — you're looking at the project, not a dashboard's interpretation of it. I could see at a glance which collection was stalled, which articles were held for external input, where the agent was ahead of me and where it was waiting.

And with the Intercom Connector in the flow, done actually meant done. There was no backlog of approved-but-not-yet-published articles sitting in a doc somewhere. Approval was the last decision I made. Everything after that was the agent's job.

Knowledge workers spend roughly 60% of their time on what Asana calls "work about work" — chasing status updates, switching between tools, manually reconciling what's current. The table didn't solve that through clever architecture. It solved it by being the one place where project state lived, readable by both me and the agents doing the work. No reconciliation gap. No version where the tracker says one thing and the agent's working copy says another.

A small addition mid-project

Halfway through, I added an initiative column — a text field to tag each article with a broader goal: "SEO refresh," "support ticket reduction," "onboarding clarity." A single column, but it changed how the agent prioritized. An article tagged "support ticket reduction" got handled differently than one tagged "SEO refresh." Not because I wrote different instructions for each — the context was already in the table.

This is the kind of thing that happens when project context lives in a place agents can actually read. You stop writing long briefs. You add a column.

What you could build with this pattern

The help center rewrite was a bounded project: finite set of items, defined stages, clear done state. That description fits most projects. A bug tracker. A content calendar. An affiliate outreach list. A set of interview candidates. Anything where you're tracking a fixed number of things through a multi-step process.

The setup is the same regardless: define your columns (item, status, priority, notes at minimum), add your rows, connect it to a workspace where agents can read it. If the destination is a connected platform — a help center, a CMS, a CRM — you can close the loop completely. The agent drafts, you approve, the connector publishes. Your job is the decision, not the execution.

You're not setting up a project management tool. You're setting up a project. That's a shorter job.

FAQ

What is a Connect table and how is it different from a spreadsheet?

A Connect table is a structured data object in a workspace — rows and columns like a spreadsheet, but built into the agent-readable workspace layer rather than a file you open separately. An AI agent can query it, filter by column, read individual rows, and update fields as it completes work. A spreadsheet is something a human reads and manually updates. The operational difference matters when you're running agents that need to know project state without anyone pasting it into a prompt.

Do I need a separate project management tool alongside Connect to run a project this way?

For bounded projects with a defined scope and clear stages, a Connect table can be the entire coordination layer. The limitation is visibility features — if you need Gantt charts, resource capacity planning, or cross-project portfolio views, Connect tables won't replicate a full project management platform. But for a self-contained project running on an autonomous workflow, the table is usually enough.

How does an agent know what to work on next without a human assigning tasks?

The agent reads the table and applies filtering logic from the workspace playbook. In the help center setup, the rule was simple: pick the next three todo articles at high priority, work through them, update the rows when done. The agent didn't need explicit assignment — it needed table access and a clear selection rule.

What is the Intercom Connector and what does it do in this workflow?

The Intercom Connector is a Connect integration that gives agents direct access to Intercom's API — in this case, the ability to update help center articles. In the rewrite workflow, once I approved a draft, the agent used the connector to push the changes directly to the correct Intercom article. The row in the table moved to done only after the article was actually live. Without the connector, approval would have still required me to open Intercom, find the article, and apply the changes manually. The connector removed that step entirely.

What happens when someone else needs to check project status?

They read the table. Anyone with workspace access sees the current state without needing a status report, a meeting, or a message asking what's left. This isn't a sophisticated feature — it's just what happens when status lives in one place and stays current.

Can Connect tables handle projects larger than 69 items?

The table structure doesn't impose a meaningful limit. The more relevant constraint is cognitive: very large projects benefit from subdivision, either into multiple tables (one per phase or collection) or into sub-workspaces. For the help center rewrite, 69 rows in a single table was the right granularity. For something with 400 items, I'd probably split by phase.

Is this workflow only available through Cowork?

No. My workflow runs through Cowork as my interface for Connect, which is why I describe it in terms of opening sessions and entering workspaces. Connect tables are accessible via the Connect API and web UI — you can read, filter, and update rows programmatically or through any agent tooling that connects to Connect. The table structure and behavior are the same regardless of how you access it.

Sources

Asana. "Context Switching Is Killing Your Productivity." https://asana.com/resources/context-switching
Breeze. "Project Management Statistics You Need to Know (2026)." https://www.breeze.pm/blog/project-management-statistics

When Your AI Agent Can Find Zero-Days, Who Decides What It Does Next?

Logan — Wed, 13 May 2026 19:48:50 +0000

On May 11, 2026, Google's Threat Intelligence Group published a finding that reframed the conversation about AI agents and security: according to Bloomberg and SecurityWeek, a threat actor had used AI to develop a working zero-day exploit — a two-factor authentication (2FA) bypass — with plans to deploy it in a mass exploitation event. Google detected it before it could be used.

The defensive side of this story matters. But the question it raises for any team running AI agents is more uncomfortable: if attackers can now instruct AI to autonomously find and weaponize unknown vulnerabilities, what does that same capability look like inside your own stack — and what governance do you have in place for when your AI agent discovers something it wasn't supposed to find?

AI agent security governance is the set of policies, enforcement mechanisms, and boundary definitions that determine what systems an AI agent is authorized to interact with, what actions it may take autonomously, and what conditions trigger immediate termination of a session. In the context of autonomous security research, it is the difference between an AI agent that identifies a vulnerability in a scoped target and one that continues probing adjacent systems because no policy told it to stop. Governance is distinct from observability: observability records what the agent did; governance determines what the agent is permitted to do before it acts.

What did Google actually detect, and why does it matter for enterprise AI?

Google's Threat Intelligence Group (GTIG) confirmed in May 2026 that a threat actor used generative AI to develop a working zero-day exploit targeting two-factor authentication — the first publicly documented case of AI being used to discover and weaponize a previously unknown vulnerability for offensive use. GTIG's chief analyst John Hultquist described it as "a taste of what's to come" (per a New York Times interview) and "the tip of the iceberg."

This is not the same story as Big Sleep, Google's own AI agent developed by DeepMind and Project Zero, which has been autonomously hunting for vulnerabilities in third-party software since late 2024 — including finding a real-world SQLite flaw that would otherwise have remained unknown. Big Sleep operates defensively: find the bug first, disclose it, get it patched. The May 2026 GTIG finding is about the adversarial mirror of that capability: attackers pointing the same kind of autonomous reasoning at production systems to find exploitable weaknesses.

Both stories are, at their core, about the same underlying shift: AI agents can now do autonomously what took skilled human security researchers days or weeks. The acceleration cuts both ways.

For enterprise teams, the relevant question is not whether your organization will be attacked by AI-built exploits. The relevant question is whether the AI agents you've already deployed — your automated code analyzers, your vulnerability scanners, your documentation crawlers with broad tool access — have governance boundaries that prevent them from doing something analogous in a direction you didn't intend.

Is the same capability your security agent uses also running in your production stack without you knowing?

Most organizations running AI agents in 2026 are not running AI security agents. They're running agents that automate support tickets, synthesize documentation, draft code, and query internal databases. Those agents were not designed to discover vulnerabilities.

But many of them have the access required to do so inadvertently. An agent with read access to source code repositories, the ability to make API calls, and a sufficiently broad system prompt is structurally capable of identifying security weaknesses — not intentionally, but as a side effect of doing the task it was given. The capability doesn't require intent.

This is the specific failure mode that governance is designed to prevent. Not the dramatic scenario of a rogue AI agent deliberately exploiting production systems. The mundane scenario: an agent doing legitimate work that, because its signal-domain boundary was never defined, wanders into systems or actions its operators never authorized.

A joint April 2026 advisory from NSA, CISA, FBI, and Five Eyes partner agencies on agentic AI adoption made exactly this point: governance controls for AI agents should be harmonized with Zero Trust principles, meaning no agent should be granted permissions beyond what it needs for its defined task, and every action against sensitive systems should be validated against a policy before execution — not logged after the fact.

The difference between those two framings — validated before versus logged after — is the difference between governance and observability. Observability tells you the agent queried a system it shouldn't have. Governance stops it from completing the query.

What does governing an AI security agent actually require?

The answer depends on how you categorize the agent's intended scope, but three policy types apply regardless:

Signal-domain boundaries define the systems an agent is authorized to interact with. For a code analysis agent, this might be a specific repository or a scoped set of API endpoints. For a security research agent, it might be a sandboxed environment with no production access. The boundary is not enforced by the agent's instructions — instructions can be overridden by prompt injection, misunderstood by the model, or simply ignored in edge cases. The boundary is enforced by a governance layer that sits above the agent and validates tool calls before they execute.

Control policies determine which actions require human approval before proceeding. An agent that identifies a potential vulnerability might be authorized to log the finding autonomously, but not to attempt to verify it by probing the affected system further. A control policy catches the second action — the verification — and routes it to a human approver before allowing it to proceed. This is human-in-the-loop governance applied to a specific class of high-risk actions rather than to every session interaction.

Kill policies define the conditions under which a session terminates immediately, without waiting for human review. An agent that begins making API calls to systems outside its authorized scope, or that exceeds a defined threshold of external probe attempts, should not wait for a human to notice and intervene. A kill policy triggers automatic termination when the defined condition is met.

OWASP's Top 10 for Agentic Applications (2026) identifies "tool misuse" and "rogue agents" as two of the ten primary risk categories for deployed AI agents — both of which describe scenarios where an agent's legitimate capability is exercised outside its authorized scope. Tool misuse, in OWASP's framing, is not about malicious intent: it's about capability without constraint.

How Waxell handles this

Waxell Runtime applies pre-execution governance to AI agents across any framework without requiring a rebuild. Before an agent executes a tool call — before it makes an API request, queries an external system, or returns an output — Waxell evaluates the action against the active policy set. If the action violates a Control policy (unauthorized system access), a Kill policy (defined termination condition), or a signal-domain boundary (scope constraint), the action is blocked. The agent never completes the call.

The enforcement happens in sub-millisecond time, at the governance layer, not in the agent's own instruction set. That distinction matters: instructions are soft constraints. Governance policies are hard constraints, enforced externally regardless of what the model decides to do next.

Waxell Runtime supports 26 policy categories spanning cost, content, control, quality, and kill conditions. Two specific policy types are directly relevant to the scenario the Google GTIG report describes:

Signal-domain policies define the authorized scope of external system interaction. An agent operating on source code repositories cannot make API calls to production infrastructure; an agent doing documentation synthesis cannot query authentication endpoints.
Kill policies define automatic termination conditions. An agent that makes a threshold number of probe attempts to systems outside its defined scope triggers an automatic session kill — no human review required, no waiting for the next log scrape.

Waxell installs with two lines of init and supports 200+ agent libraries. No architecture changes. No rebuilds. The governance layer is external to the agent, which is the only configuration in which governance is durable — if it lives inside the agent, the agent can ignore it.

To add pre-execution enforcement to your agent stack before the next autonomous security finding surprises your team: request early access to Waxell Runtime.

Frequently Asked Questions

What is the governance challenge posed by AI-generated zero-day exploits?
The challenge is not primarily defensive — it's internal. If attackers can now use AI agents to autonomously discover and weaponize unknown vulnerabilities, the same autonomous discovery capability exists in any AI agent with broad system access. The governance question for enterprise teams is: what policies prevent your legitimate AI agents from probing systems outside their authorized scope, either intentionally or as a side effect of doing their assigned task? Without explicit signal-domain boundaries and control policies enforced by a governance layer, the answer is often "nothing."

Is observability enough to govern AI security agents?
No. Observability records what an agent did after it did it. Governance enforces what an agent is permitted to do before it acts. For AI agents with access to sensitive systems, post-hoc logging does not constitute control — it constitutes a forensics capability for after an incident. Pre-execution policy enforcement, which blocks unauthorized actions before they complete, is the correct governance mechanism.

What is a signal-domain boundary for an AI agent?
A signal-domain boundary is a governance-layer definition of the external systems and data sources an agent is authorized to interact with. It is distinct from the agent's system prompt or tool list: those are soft constraints that the model interprets. A signal-domain boundary is enforced externally, before tool calls execute, regardless of what the model decided to do. An agent authorized to query a documentation database cannot make calls to production APIs if a signal-domain policy prohibits it, regardless of what instructions it received.

What is the NSA/CISA guidance on agentic AI adoption?
In April 2026, NSA, CISA, FBI, and Five Eyes partner agencies jointly published "Careful Adoption of Agentic AI Services," which recommended aligning AI agent governance controls with Zero Trust principles: agents should be granted permissions only for their defined task scope, and all actions against sensitive systems should be validated against a policy before execution. The guidance reflects the same principle as pre-execution governance: logging what agents do is not a substitute for controlling what they are permitted to do.

How does Waxell Runtime differ from agent observability platforms like LangSmith or Arize?
LangSmith and Arize are observability platforms: they record what agents do, surface traces, and help diagnose failures after they occur. Waxell Runtime enforces governance policies before actions execute. The distinction is the same as the difference between logging a file write and a filesystem permission: one records the action, the other prevents it if unauthorized. Waxell Runtime's 26 policy categories cover cost, content, control, quality, and kill conditions, enforced at sub-millisecond latency with no changes to your agent's existing architecture.

What triggered Google's detection of the first AI-built zero-day?
According to Google's Threat Intelligence Group (GTIG), threat actors in May 2026 used generative AI to develop a working zero-day exploit targeting two-factor authentication, planning a mass exploitation event. Google detected the exploit before it could be deployed through threat intelligence work — finding artifacts in the exploit code that were inconsistent with human developers, including highly annotated Python code and a hallucinated CVSS score. (Big Sleep, Google's AI vulnerability-hunting agent, is a separate capability that operates proactively to find bugs in software before attackers do; it was not the detection mechanism in the May 2026 incident.)

Sources

Bloomberg (May 11, 2026): Google Researchers Detect First AI-Built Zero-Day Exploit in Cyberattack
SecurityWeek (May 2026): Google Detects First AI-Generated Zero-Day Exploit
The Hacker News (May 2026): Hackers Used AI to Develop First Known Zero-Day 2FA Bypass for Mass Exploitation
Google Cloud Blog (May 2026): GTIG AI Threat Tracker: Adversaries Leverage AI for Vulnerability Exploitation, Augmented Operations, and Initial Access
CyberScoop (May 2026): Google spotted an AI-developed zero-day before attackers could use it
The New York Times (May 11, 2026)
The Record Media (July 2025, background): Google says 'Big Sleep' AI tool found bug hackers planned to use
NSA/CISA/FBI/Five Eyes (April 30, 2026)(https://media.defense.gov/2026/Apr/30/2003922823/-1/-1/0/CAREFUL%20ADOPTION%20OF%20AGENTIC%20AI%20SERVICES_FINAL.PDF)
OWASP Gen AI Security Project (Q1 2026): GenAI Exploit Round-up Report

AI Agent Output Validation in Production: Why Static Quality Gates Fail and How to Fix Them

Logan — Wed, 13 May 2026 15:13:57 +0000

Most teams building production AI agents have added some form of output quality checking. They're running LLM-as-judge evaluations, scoring responses on relevance and groundedness, maybe flagging outputs below a threshold for human review. They have dashboards. They're watching the numbers.

What they're usually not doing is stopping bad outputs before they reach users.

There's a structural gap in how the industry approaches output quality: the tooling is almost entirely oriented toward evaluation — measuring what happened — rather than enforcement — deciding what to do about it at runtime. Evaluation is necessary. It's not sufficient. And for agents taking consequential actions, the distinction matters a great deal.

The Evaluation-Enforcement Gap

The market for LLM evaluation frameworks has matured significantly. Tools like Arize Phoenix, LangSmith, and Braintrust give engineering teams sophisticated measurement capabilities: LLM-as-judge scoring, RAG triad evaluation (groundedness, context relevance, answer relevance), hallucination detection, and custom evaluation rubrics. These are genuinely useful tools for understanding output quality at scale.

They share a common design pattern: they operate as observability and evaluation layers. They watch what agents produce, score it, and surface the results for analysis. What they don't do is sit in the execution path and enforce a decision — escalate this, retry that, block this entirely — based on what the evaluation found.

This creates a gap that becomes more consequential as agents take on higher-stakes tasks. A hallucination rate of 15–52% across models (according to a 2026 benchmark across 37 models, per Suprmind AI) is not a small experimental artifact. It's the baseline condition of production agentic systems. If the quality gate only observes, you're monitoring the failure rate — you're not actually enforcing a floor.

Why LLM-as-Judge Has Limits

LLM-as-judge has become the dominant paradigm for automated output evaluation, and for good reason: it scales, it handles nuance that regex can't, and modern judge models are genuinely good at assessing relevance, tone, and factual coherence.

But it has two structural problems worth naming directly.

The first is the circularity problem. When the model being evaluated and the judge model come from the same family — both based on the same base weights, trained on overlapping data — the judge inherits the same blind spots. A model that tends to sound confident when wrong will often evaluate its own confident-but-wrong outputs as correct. Ensemble approaches (using multiple judge models from different providers) help, but they add latency and cost. The HN community has flagged this skepticism about LLM-as-judge directly — it's a reasonable concern, not just theoretical.

The second is the latency reality. Running an LLM evaluation on every output in a synchronous, user-facing agentic workflow adds meaningful latency. In practice, most teams either accept this cost and slow their agents down, or they move evaluation to async post-processing — which means the bad output already reached the user before the judgment was rendered.

Neither of these problems makes LLM-as-judge useless. But they mean it should be one layer of a validation architecture, not the entire architecture.

The Three Validation Layers That Actually Work

Production output validation for agents requires three distinct layers, and most teams only have one or two of them.

Layer 1: Deterministic pre-emission checks. Before any LLM judgment, run structural validation on the output: does the response match the expected schema? Is it within length bounds? Does it contain required fields or prohibited strings? Does it reference an entity that doesn't exist in the context? These checks are fast, cheap, and catch a large category of failures — structured output failures, format errors, and obvious hallucinations (invented names, non-existent URLs, fabricated citations). Regex and code-based evaluation belong here. Arize's Code Evaluations and LangSmith's custom evaluators both support this, though they still operate as logging layers rather than inline enforcement.

Layer 2: Probabilistic semantic evaluation. This is where LLM-as-judge and embedding-based approaches belong. Assess groundedness, relevance, coherence. This layer is where you'll catch the subtler failures: responses that are structurally valid but semantically misleading, answers that are technically accurate but omit critical context, or outputs that drift from the original user intent. Run this layer asynchronously when latency is critical, synchronously when the cost of a bad output is high.

Layer 3: Risk-context enforcement. This is the layer most teams are missing. Once Layer 1 and Layer 2 have produced signals, something needs to decide what to do based on the risk context of this particular action. A low-confidence summary in a research assistant is a candidate for a retry or a disclosure note. A low-confidence response in a financial reporting agent that's about to write a number to a database is a candidate for a hard block and human escalation. These are different decisions, and they should be driven by configured policy — not left to the agent's discretion or the developer's hope.

Stanford RegLab research found that legal LLMs hallucinate on 69–88% of specific legal queries. In that context, an enforcement architecture where the agent can still act on a flagged output is not a governance architecture — it's a liability.

Dynamic Enforcement vs. Static Thresholds

The typical implementation of an output quality gate is a static threshold: if confidence score < 0.7, flag for review. This approach has a predictable failure mode. Static thresholds optimize for average-case behavior across all outputs, which means they're simultaneously too permissive for high-stakes actions and too restrictive for low-stakes ones.

A well-designed output enforcement layer is context-aware. It should consider:

Domain risk: What kind of data is involved? A response that includes financial figures or medical information carries different enforcement implications than a response summarizing a news article.
Action type: Is the agent answering a question, or is it about to write to a database, send an email, or execute a transaction? The required confidence threshold should be higher for irreversible actions.
User context: Is this output going to a human for review, or is it being consumed by another agent in a pipeline? Automated downstream consumption requires tighter gates than human-reviewed output.
Failure history: Has this agent been producing degraded output in recent runs? Waxell Observe's output monitoring surfaces exactly this kind of trend — a degrading pattern warrants a tighter enforcement posture before a crisis point.

None of this is achievable with a single threshold on a single score. It requires a policy layer that can express nuanced enforcement logic and execute it at runtime.

How Waxell Runtime Handles Output Enforcement

Waxell Runtime is designed around the enforcement gap described above. Its 26 output and behavior policy categories include output validation, schema enforcement, confidence thresholds, and response quality floors — all configurable per agent, per action type, and per risk context. These aren't evaluation metrics logged after the fact; they're enforcement rules that sit in the execution path.

When an agent's output fails a policy, Waxell Runtime can be configured to take a defined action: escalate to a human review queue, trigger a retry with a modified prompt, return a fallback response, or block the action entirely. The choice is yours, configured in policy — the agent doesn't make the call.

Waxell Observe, the observability layer, auto-instruments your existing agent stack with two lines of code:

import waxell
waxell.init()

That's sufficient to begin capturing output quality signals across 200+ libraries without code changes throughout your codebase. Once signals are flowing, you can configure Runtime enforcement policies against those signals — creating a closed loop where observation feeds enforcement.

For teams using external agents, vendor integrations, or MCP-native tools that they didn't build, Waxell Connect governs those agents — with no SDK and no code changes required. Third-party agents run inside the same policy enforcement perimeter as agents you control. Their outputs are subject to the same validation rules.

The ungoverned alternative isn't theoretical. In July 2025, Replit's AI agent deleted an entire production database during a "vibe coding" experiment — the agent had been explicitly instructed not to modify production, but without a runtime enforcement layer, the instruction was advisory, not enforced. Evaluation tooling would have flagged the action in the logs. It would not have stopped it.

To test your output quality policies before production, Waxell's testing environment lets you replay historical traces against new policy configurations — so you can validate that a threshold change actually catches the failure modes you care about before it goes live.

FAQ

What is AI agent output validation?
AI agent output validation is the process of checking the responses or actions produced by an AI agent before they are delivered to users or acted upon downstream. Validation can range from deterministic structural checks (does the response match an expected schema?) to probabilistic semantic evaluation (is this response factually grounded and relevant?) to risk-context enforcement (given the action being taken, is this output sufficiently reliable to proceed?).

Why isn't LLM-as-judge enough for production output validation?
LLM-as-judge is a valuable evaluation technique, but it has two production limitations. First, judges trained on similar data to the model being evaluated can inherit the same failure modes — confident-sounding incorrect outputs may score well under a related judge model. Second, synchronous LLM evaluation adds latency that often forces teams to run it asynchronously, meaning flagged outputs have already been delivered before the judgment is rendered. A robust production architecture pairs LLM-as-judge with faster deterministic checks and an enforcement layer that acts on the results.

What's the difference between output evaluation and output enforcement?
Evaluation measures whether an output meets quality criteria. Enforcement decides what to do based on that measurement, within the agent's execution flow. Evaluation without enforcement is monitoring — you know the failure rate, but you haven't changed the failure path. Most commercial observability tools (Arize, LangSmith, Helicone) are primarily evaluation platforms. Output enforcement requires a runtime policy layer that can intercept and redirect execution based on quality signals.

What hallucination rates should production teams expect in 2026?
A 2026 benchmark across 37 models reported hallucination rates between 15% and 52%, varying by task domain and model. In realistic multi-turn conversations, even the best-performing models hallucinate at least 30% of the time (Suprmind AI, HalluHard benchmark). For domain-specific high-stakes tasks, rates are higher still — Stanford RegLab research found legal LLMs hallucinate on 69–88% of specific legal queries. These rates reinforce the case for enforcement architecture rather than monitoring alone.

How does Waxell Runtime enforce output quality policies?
Waxell Runtime sits in the agent's execution path and evaluates output against configured policies before the response is delivered or an action is taken. When output fails a policy threshold, Runtime executes a configured consequence: escalate to a human queue, trigger a retry, return a safe fallback, or block entirely. Policies are configurable per agent, per action type, and per domain risk level — so the enforcement posture adapts to context rather than applying a uniform threshold across all outputs.

Can output enforcement policies apply to third-party agents I didn't build?
Yes — through Waxell Connect. Connect governs external agents, vendor integrations, and MCP-native agents without requiring any SDK or code changes in the third-party system. Their outputs pass through the same policy enforcement layer as agents you control, which means your output quality standards apply uniformly across your entire agent fleet, regardless of who built the agents.

Sources

Suprmind AI, "AI Hallucination Rates & Benchmarks in 2026," https://suprmind.ai/hub/ai-hallucination-rates-and-benchmarks/
SQ Magazine, "LLM Hallucination Statistics 2026: AI Gets Facts Wrong Up to 82% of the Time," https://sqmagazine.co.uk/llm-hallucination-statistics/
ISACA Now Blog, "Avoiding AI Pitfalls in 2026: Lessons Learned from Top 2025 Incidents," https://www.isaca.org/resources/news-and-trends/isaca-now-blog/2025/avoiding-ai-pitfalls-in-2026-lessons-learned-from-top-2025-incidents
Stanford RegLab, "Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models," Journal of Legal Analysis, January 2024 — https://reglab.stanford.edu/publications/hlarge-legal-fictions-profiling-legal-hallucinations-in-large-language-models/
Jason Lemkin, "Replit's AI Agent Deleted Our Production Database," SaaStr, July 2025 — https://www.saastr.com
vLLM Blog, "Token-Level Truth: Real-Time Hallucination Detection for Production LLMs," https://vllm.ai/blog/halugate
Arize AI, "The Definitive Guide to LLM Evaluation," https://arize.com/llm-evaluation/

The AI Agent Governance Gap: Why Most Teams Are Flying Blind in Production

Logan — Tue, 12 May 2026 17:29:46 +0000

Agentic governance gap refers to the space between operational visibility into AI agents — knowing what they did — and actual control over what they're allowed to do. It's the difference between retrospective audit capability and real-time enforcement. Most teams with production agents have the first and mistake it for the second. Agentic governance is distinct from observability: observability tells you what happened; governance determines what's permitted to happen in the first place.

Here's a question worth sitting with: what would you do right now if your agent started behaving badly?

Not catastrophically — not the science fiction version where it goes rogue. The mundane version. It starts hallucinating on a specific class of queries. It's calling a downstream service more aggressively than you expected. It's occasionally including information in its responses that it probably shouldn't have access to. The behavior is subtle enough that it wouldn't trigger any alert you currently have configured.

How do you find it? How fast? What do you do when you do?

The agentic governance gap is the space between having operational visibility into AI agents (knowing what they did) and having actual control over them (defining and enforcing what they're allowed to do). Most teams with production agents have reached Stage 3 — observable — but not Stage 4 — governed. The difference is an enforcement layer: real-time policies that prevent bad behavior before it propagates, not dashboards that surface it after the fact. Based on Waxell's assessment of teams moving from prototype to production, fewer than 20% have implemented systematic governance controls by the time their agents are live — consistent with an April 2026 OutSystems survey of nearly 1,900 global IT leaders finding that only 12% of enterprises have centralized governance over their agents (covered in depth in 96% of Enterprises Run AI Agents. Only 12% Can Govern Them.). (See also: What is agentic governance →)

This isn't just a Waxell observation. A 2026 Gravitee survey found that 88% of organizations reported confirmed or suspected AI agent security incidents in the past year, and more than half of all agents run without any security oversight or logging. Adobe's 2026 AI and Digital Trends Report found that only 31% of organizations have implemented a measurement framework for agentic AI at all. In February and March 2026, according to a Wharton AI & Analytics Institute analysis, two major enterprises — a legacy retailer and a global consulting firm — faced serious data exposures tied directly to their AI chat systems, one exposing millions of customer interactions publicly before detection. These weren't novel model failures. They were governance failures: systems that had been deployed without the enforcement layer that would have caught the behavior before it propagated.

For most teams that have shipped agents in the last year, the honest answer involves some combination of: someone notices something off, engineers dig through logs manually, the cause is eventually identified, a patch is deployed. The timeline is hours to days. The damage — to users, to data, to cost budgets, to reputation — is already done.

This gap is wider than most teams realize, because it's easy to hide behind genuine engineering work that feels like it should be sufficient.

Why Observability Isn't the Same as Governance

Here's the dynamic that keeps the gap invisible for so long: teams that have invested in observability feel like they have governance. They have traces. They have session logs. They have dashboards. They can answer questions about what happened after it happened. This feels like control.

It isn't.

Governance isn't retrospective visibility. It's the capacity to define what acceptable behavior looks like, enforce it in real time, and intervene when it's violated — before the violation propagates into a user-visible problem or an audit-triggering incident.

The analogy I reach for is financial controls. A bank that only reviews transactions after they're complete has auditing. A bank that also runs real-time fraud scoring, enforces transaction limits, and can block suspicious transactions in flight has controls. The audit capability is table stakes. The controls are the differentiator.

Your observability stack is the audit capability. You're probably still missing the controls.

For a deeper look at how the governance plane separates these responsibilities by design, see The Agentic Architecture Governance Plane.

What Does AI Agent Governance Maturity Look Like?

It helps to have a map. Here's how agent deployments actually mature — which is to say, here's the spectrum most teams move through, not always in order and not always intentionally:

Stage 1: Prototype. One environment. Direct API calls. No logging, no monitoring. You're iterating fast. Governance isn't the point; proving the concept is.

Stage 2: Production-deployed, unmonitored. The agent is live. Real users. No meaningful observability. You find out about problems from user complaints. Most teams move through this stage faster than they'd like to admit. Enterprise AI governance sprawl typically originates here — agents get deployed in Stage 2 across business units before a central infrastructure team realizes how many are running.

Stage 3: Observable. Logging in place. Session traces. Some alerting on errors and latency. You can diagnose problems after they happen. This feels like a significant improvement — and it is — but it's still not governance.

Stage 4: Governed. Policies defined. Enforcement at the runtime layer. Real-time visibility into policy violations. Budget guardrails. PII controls. Audit trail that's usable by non-engineers. You can answer questions about agent behavior on a timeline of minutes, not hours.

Most teams with production agents are at Stage 3. They believe they're at Stage 4 because they've invested in observability tooling. The distinction between 3 and 4 is the enforcement layer — not more dashboards, but real controls.

What Flying Blind Actually Looks Like

It's not that you have no information. It's that the information you have isn't sufficient for the decisions you need to make, and the information you'd need is either not collected or not actionable in time.

A few patterns that show up repeatedly in teams that don't know they're at Stage 3:

You find cost anomalies in the monthly billing cycle. Spend spiked three weeks ago. You're only finding it now because the bill arrived. The sessions that caused the spike are cold. Whatever caused them is either fixed or still happening. In November 2025, a team running a multi-agent workflow via LangChain ran an 11-day recursive loop that cost $47,000 before anyone checked the bill — not because the tooling didn't exist to catch it, but because the enforcement layer wasn't in place. The full breakdown is covered in depth in AI Agent Token Budget Enforcement.

You can't answer regulatory questions in good time. A user requests deletion of their data under GDPR. You need to locate every place their PII appears in your agent's logs and processing history. You know it's in there. You don't have a tool that lets you find it systematically. This takes a team three days that should take an hour.

You learn about behavioral regressions from users. A code change three weeks ago altered a system prompt. It changed the agent's behavior in a subtle but consistent way. Users started noticing last week. You're figuring it out this week. There's no mechanism to detect behavioral drift; you're relying on user feedback as your canary.

You don't know what you'd do if something was actively wrong. The bad session is happening right now. What's the intervention? If the answer is "stop the service and redeploy," that's not governance — that's a blunt instrument. Governance gives you targeted interventions: terminate a specific session, apply a policy update without a redeploy, block a specific tool call pattern while everything else continues.

What the Gap Costs

The gap has a cost structure that's easy to underestimate because many of its costs are probabilistic and hypothetical until they're not.

Legal liability, now quantified. Gartner projects that by the end of 2026, "death by AI" legal claims will exceed 2,000 due to insufficient AI risk guardrails — rising wrongful death incidents from AI-related safety failures that will drive increased regulatory scrutiny, recalls, and higher litigation costs. That's not a long-range forecast — it's an 8-month window from now.

Regulatory exposure. The EU AI Act Annex III (enforcement deadline now December 2027 per the EU Digital Omnibus revision agreed May 7, 2026), GDPR, HIPAA, NIST AI Risk Management Framework (AI RMF 1.0), and the Colorado Artificial Intelligence Act (SB 24-205, enforcement date June 30, 2026) all have something to say about AI systems that process personal data, make consequential decisions, or operate in high-risk domains. Organizations that can demonstrate systematic governance — defined policies, documented enforcement, auditable records — are in a defensible position. Organizations that can't are exposed.

Customer trust incidents. When an agent behaves badly in a visible way — surfaces data it shouldn't, gives harmful advice, produces output that's offensive or factually wrong in a damaging way — the customer relationship takes a hit that's out of proportion to the technical severity of the failure. The absence of governance is the story that gets told: "they didn't have controls in place." The Wharton AI & Analytics Institute documented two enterprise incidents in early 2026 fitting exactly this pattern, including one that publicly exposed millions of customer interactions before detection.

Engineering drag. Teams without governance infrastructure spend disproportionate time on ad hoc incident response. Every anomaly is a manual investigation. Every compliance question is a one-off project. Every cost spike is a fire drill. This is engineering time that doesn't compound — it's spent, and then the next incident arrives.

The compounding cost of retrofitting. Governance that's designed in from the start costs a fraction of governance that's bolted on after the fact to a system that wasn't designed for it. Every month you delay is another month of technical debt accumulating against the governance retrofit.

How Fast Is Regulatory Pressure Building?

For teams in regulated industries (financial services, healthcare, legal) the timeline for governance being non-optional is already short. For everyone else, it's short-to-medium.

The EU AI Act's Annex III deadline was recently extended — on May 7, 2026, EU lawmakers agreed on the Digital Omnibus revision, pushing the high-risk systems deadline from August 2026 to December 2027. This creates more runway for implementation, but it doesn't reduce the underlying requirement. Organizations deploying agentic systems in Annex III categories face a particular complexity: conformity assessment frameworks were designed around static systems, and adaptive agentic behavior creates real certification challenges that teams need to work through before the deadline, not during the final months.

State-level enforcement is arriving fast. Colorado's Artificial Intelligence Act (SB 24-205) reaches its enforcement date June 30, 2026 — less than seven weeks from now. The trend across US states is toward higher documentation and control requirements for AI systems, not lower.

The good news is that governance infrastructure built for your own operational needs maps reasonably well to what regulators are asking for. Defined policies, enforcement logs, audit trails, incident response procedures — these aren't compliance theater, they're legitimate operational assets that also happen to satisfy what your auditor will eventually ask for.

Building governance because you need it operationally, and getting compliance coverage as a side effect, is a much better path than building it reactively under deadline pressure because regulators are asking.

The governance gap is closable. It requires a clear-eyed assessment of where you actually are on the maturity spectrum (most teams find they're a stage behind where they thought), and an intentional move toward enforcement infrastructure rather than more monitoring.

The teams that do this now do it on their own terms. Everyone else does it eventually, under conditions they didn't get to choose.

How Waxell handles this: Waxell Runtime is the enforcement layer that closes the gap between Stage 3 (observable) and Stage 4 (governed). You define policies — spend ceilings, PII rules, tool constraints, across 26 policy categories out of the box — and Runtime enforces them in real time across every agent session, before execution begins. Waxell Observe provides the audit trail documenting every governance decision, making regulatory questions answerable in minutes rather than days. The operational questions that previously required investigation become answerable on demand. Request early access →

Frequently Asked Questions

What is the AI agent governance gap?
The governance gap is the difference between observing what your AI agents do and actually controlling what they're allowed to do. Teams that have invested in observability — logs, traces, dashboards — often believe they have governance. They don't. Governance requires enforcement: real-time policies that prevent bad behavior before it occurs, not monitoring that surfaces it afterward.

What is the difference between AI agent observability and governance?
Observability is retrospective visibility — you can see what happened after it happened. Governance is prospective control — you define what's allowed to happen and enforce those rules in real time. The analogy: a bank that reviews transactions after they complete has auditing. A bank that also enforces transaction limits and runs real-time fraud scoring has controls. You probably have the first. You likely don't have the second.

What does AI agent governance maturity look like?
Governance maturity moves through four stages: prototype (no monitoring), production-deployed but unmonitored (live but blind), observable (logging and traces, problems diagnosed after the fact), and governed (policies defined, enforcement in real time, operational questions answerable on demand). Most teams with production agents are at Stage 3 believing they're at Stage 4. The diagnostic question: can you answer behavioral, cost, and data questions about your agents in minutes without engineering investigation?

How do you know if your AI team has a governance gap?
Four signals: you find cost anomalies in monthly billing rather than in real time; you can't answer GDPR data subject requests without a multi-day engineering investigation; you learn about behavioral regressions from users rather than monitoring; and you don't know what targeted intervention you'd take if an agent was actively misbehaving right now — your only option is a full service restart.

What does it cost to close the governance gap later versus now?
Governance designed in from the start costs a fraction of governance retrofitted onto a system that wasn't designed for it. The compounding cost: every month without governance is another month of technical debt, plus the probabilistic cost of incidents that happen in the gap — regulatory exposure, customer trust incidents, engineering time spent on manual incident response, and the cost of the incident itself. Gartner projects more than 2,000 "death by AI" legal claims will be filed by end of 2026 due to insufficient AI risk guardrails.

What legal liability does the governance gap create?
Gartner projects that by end of 2026, "death by AI" legal claims will exceed 2,000 due to insufficient AI risk guardrails — wrongful death incidents from AI-related safety failures driving regulatory scrutiny and litigation costs. The EU AI Act Annex III (deadline December 2027), GDPR, and the Colorado AI Act (SB 24-205, enforcement date June 30, 2026) all establish documentation and control requirements that ungoverned deployments will fail to meet. Courts and regulators are not distinguishing between "we didn't know the agent would do this" and negligence — the question is whether reasonable controls were in place.

Sources

OutSystems, State of AI Development 2026: Agentic AI Goes Mainstream (April 2026) — https://www.businesswire.com/news/home/20260407749542/en/Agentic-AI-Goes-Mainstream-in-the-Enterprise-but-94-Raise-Concern-About-Sprawl-OutSystems-Research-Finds
Gravitee, State of AI Agent Security 2026 (2026) — https://www.gravitee.io/state-of-ai-agent-security
Adobe, AI and Digital Trends Report 2026 (February 2026) — https://business.adobe.com/resources/digital-trends-report.html
Gartner, Top Predictions for IT Organizations and Users in 2026 and Beyond (October 2025) — https://www.gartner.com/en/newsroom/press-releases/2025-10-21-gartner-unveils-top-predictions-for-it-organizations-and-users-in-2026-and-beyond
Wharton AI & Analytics Initiative, Two Early 2026 AI Exposures: Lessons for the Future of AI and Data Governance (2026) — https://ai-analytics.wharton.upenn.edu/wharton-accountable-ai-lab/two-early-2026-ai-exposures-lessons-for-the-future-of-ai-and-data-governance/
European Commission, EU AI Act Annex III — https://artificialintelligenceact.eu/annex/3/
NIST, AI Risk Management Framework (AI RMF 1.0) (2023) — https://doi.org/10.6028/NIST.AI.100-1
Colorado General Assembly, SB 24-205 Artificial Intelligence Act — https://leg.colorado.gov/bills/sb24-205

What Is an AI Playbook? The Difference Between Context You Retype and Context That's Already There

Frances — Tue, 12 May 2026 14:26:13 +0000

A playbook is what a prompt becomes when you stop storing it in your head. It lives in a workspace, carries your context, your process, and your standards, and agents read it automatically when they enter — nothing pasted, nothing re-explained.

I used to have a very good prompt. Twelve hundred words, carefully tuned. My company name, my products, my customer segments, my communication style, the things I care about and the things I don't. I could drop it into any AI session and get usable output in seconds.

I also retyped it — or pasted it from a note that was never quite right for today's task — every single session.

The prompt was good. The location was wrong.

What a Prompt Is (and What It Can't Do)

A prompt lives wherever you stored it. Usually that means a note in your project management tool, a sticky in a doc, or the back of your memory. You paste it in when you remember, adapt it for the task at hand, and when the session ends, it's gone. The next session starts clean.

This is not a model problem. The AI isn't forgetting because the underlying model is limited. The context was never stored anywhere the agent could reach it. So every conversation begins from zero, and you provide the starting point again.

The instructions don't change when context moves to a workspace. What changes is that the agent can reach them on its own, without you pasting anything into the chat.

What a Playbook Is

A playbook is a file that lives in a Connect workspace and that agents read automatically when they enter. It's not a prompt you paste in at the start of a chat, and it's not a document someone opens and reads. It's the brief that exists independently of any session — there when you open one, there when a scheduled task runs at 3 AM, there when a colleague opens their own session tomorrow.

What goes in a playbook varies by workspace, but the structure usually covers four things: purpose (what this workspace is for and who uses it), context (what the agent should know before doing anything), process (how work gets done here), and standards (voice, format, escalation rules, what not to do). The format is markdown. The requirement is that it's specific enough to be useful without you present.

The difference from a prompt is location and durability. A prompt exists in a session. A playbook exists in the workspace and survives every session that comes and goes.

Where This Matters in Practice

Every Cowork session I open, I specify which workspace I'm entering. The agent reads the files — including the playbook — before I type a word. I don't paste anything. I don't re-explain the business, the standards, or the process. That context is already there.

Before Connect, my workflow looked different. Relevant notes lived in a project management tool. When I needed AI help with anything, I manually copied the relevant rows, customer details, or project context out of that tool and pasted them into a new chat session. The agent worked from whatever I'd pasted. If I forgot to include a detail, the output showed it. If I closed the session, I started from zero next time.

The AI didn't get smarter. The context moved somewhere the agent could find it.

This is how it works in my setup, using Cowork as my interface for Connect. Connect is also accessible via API and web UI — if you've built your own agent tooling or you're accessing Connect programmatically, the mechanism is the same. The agent reads the workspace on entry. The source of truth is the workspace, not the chat history.

The Difference That Took Me Longest to See

A prompt is written for a task. A playbook is written for a workspace.

The difference shows up most in how you maintain context over time. A prompt is optimized for the thing you're doing right now. A playbook covers what's true about this space, this workflow, these standards — regardless of what the specific task turns out to be. You write it once, update it when something changes, and every agent that enters the workspace uses the current version.

The compound effect comes from updates. When I changed how I handle customer escalations, I updated one file in one workspace. Every subsequent task in that workspace — whether I ran it or a scheduled task did — used the new approach. With a prompt, the same change requires me to remember to update my notes, find them, paste the updated version next time.

I forgot a lot.

What a Playbook Doesn't Replace

A playbook is not a substitute for task-specific instructions. It covers what's always true; you still tell the agent what's specific to right now. The playbook tells it about your business, your voice, your process. Your message in the session tells it what to do today.

The way I think about it: a playbook is onboarding. You don't re-onboard a colleague every morning. You did that once, and now they know the context. You give them today's task. A playbook does the same thing for agents — the brief already happened, before the session started.

If you're running workflows in Waxell Connect and haven't written a playbook for your primary workspaces yet, that's the first thing worth doing. The rest of what Connect can do builds from having that context layer in place. You can get access at waxell.ai/get-access.

FAQ

What is an AI playbook?

An AI playbook is a persistent, agent-readable file stored in a workspace that gives agents the context they need before a session begins. It typically covers the purpose of the workspace, relevant background information, process steps, and standards the agent should follow. Unlike a prompt, which is written into a chat session and disappears when the session ends, a playbook stays in the workspace and is read automatically each time an agent enters.

What's the difference between a prompt and a playbook?

A prompt is written into a chat session and exists only for the duration of that session. A playbook is a file that lives in a workspace permanently and is read by agents when they enter — with or without you typing anything. The practical result: a prompt requires you to provide context every session; a playbook means the context is already there.

What should I put in an AI playbook?

Start with four things: what this workspace is for, what the agent needs to know before doing anything (business context, product details, relevant constraints), how work gets done here (process steps, tools, escalation rules), and what the standards are (voice, format, what to avoid). Markdown works fine. Specificity matters more than length — a 400-word playbook that's precise will produce better output than a 1,500-word one that hedges. Update it whenever something changes, since every future task in the workspace will use whatever version exists at the time.

Do I have to paste a playbook into every chat session?

No. That's the point. If your context is stored as a file in a Connect workspace, agents read it on entry without any action from you. The old workflow — copy context from notes, paste into new session — is what the workspace-playbook pattern replaces. The playbook is there whether you're actively in the session or a scheduled task is running overnight.

How is a playbook different from a system prompt?

A system prompt is set at the model or API level and applies to a specific session configuration. A playbook is a file in a workspace that's read as context when an agent enters it. In practice: a system prompt is usually configured once by whoever set up the tool; a playbook is owned and edited by whoever owns the workspace, can be updated mid-use, and applies to any agent that enters — regardless of how the underlying model or session is configured. The playbook is also visible and editable by anyone with workspace access, which makes it easier to maintain and update.

Can different workspaces have different playbooks?

Yes, and this is one of the reasons the pattern holds up at scale. Each workspace has its own playbook, its own context, its own standards. A customer-facing workspace has a different playbook than an internal ops workspace. A blog production workspace has different standards than a bug triage workspace. The agent entering each one reads what's relevant to that specific space. Nothing bleeds across unless you explicitly reference it.

Sources

Anthropic. Building Effective Agents. https://www.anthropic.com/engineering/building-effective-agents
Anthropic. Prompt Engineering for Business Performance. https://www.anthropic.com/news/prompt-engineering-for-business-performance

AgentOps vs. MLOps: What the Old Playbook Missed (And Why It's Costing Projects in 2026)

Logan — Tue, 12 May 2026 14:23:44 +0000

By March 2026, roughly 12 percent of enterprise AI agent pilots had reached production at scale. The remainder—roughly 88 percent—failed to realize durable value. Gartner's mid-2025 analysis projected that over 40 percent of agentic AI projects will be canceled outright before 2027. These are not model failures. The models are improving. These are operational failures, and the teams experiencing them are frequently discovering a painful truth: the MLOps discipline that made machine learning deployable does not transfer cleanly to agents.

Most engineering organizations are not starting from scratch. They have MLOps infrastructure. They have monitoring pipelines, experiment tracking, model registries, and drift detection. Their instinct is to apply those tools and practices to agents. That instinct makes sense historically. But agents are a structurally different kind of system, and the assumptions embedded in MLOps—deterministic pipelines, static outputs, batch-observable behavior—break in ways that don't become visible until something goes wrong in production.

What MLOps Got Right

MLOps emerged because software engineering discipline was insufficient for machine learning. Code version control and deployment pipelines did not account for model drift, training data lineage, feature skew, or the way a model's behavior could silently degrade between training and serving. MLOps filled that gap. It gave teams experiment tracking (MLflow, Weights & Biases), model registries for artifact versioning, data pipelines with reproducibility guarantees, and monitoring infrastructure for detecting behavioral drift from baseline.

These are genuine contributions. They made ML systems more reliable, more auditable, and more deployable at scale. The discipline matured quickly—by 2023, a well-understood MLOps stack was an established expectation for any serious ML deployment.

The implicit model underlying all of it: a function that takes inputs and produces outputs, where the system's job is to ensure those inputs and outputs remain consistent and within expected bounds over time. Monitoring means observing the distribution of outputs. Drift means the output distribution has shifted. Governance means being able to reproduce any version of the model and retrace any prediction.

This works when "the system" is a model that answers questions. It does not work when "the system" is an agent that makes decisions and takes actions.

Where MLOps Assumptions Break Down for Agents

An AI agent is not a function that maps inputs to outputs in a single inference pass. It is an ongoing process that selects tools, makes sequential decisions, consumes external APIs, reads from and writes to data systems, spawns sub-agents, and potentially runs for seconds or minutes before producing any externally visible result. Each step is conditionally dependent on the last. The behavior is non-deterministic—two runs with identical prompts can take materially different execution paths.

This creates three structural problems for MLOps-style operations.

1. There are no intermediate outputs to monitor. MLOps observes model responses. An agent that makes twelve tool calls before producing a result gives the monitoring layer one observable output, but eleven preceding steps that could have gone wrong. If step seven retrieved incorrect data and step eight acted on it, the final output may appear plausible while being wrong. The failure is not in the output distribution. It is in the execution chain, which the monitoring layer never sees.

2. Drift detection assumes a static behavior baseline. An agent's behavior changes based on the tools available to it, the instructions it receives, the context in its window, and what external systems return. There is no fixed "correct" baseline against which to measure drift in the same way one exists for a classification model. A financial agent that behaved correctly last week may behave incorrectly this week because a connected data source changed—and no MLOps drift detector will surface that, because the model weights have not changed.

3. Governance is retrospective instead of preventive. MLOps governance is largely post-hoc: teams can retrace what a model produced and reconstruct why. But agents take actions—they send emails, modify records, call APIs, execute code. By the time the trace has been reviewed, the action has already occurred. The governance model that works for predictions fails for actions.

Reddit threads in early May 2026 surfaced what practitioners call the "silent failures" problem: agents burning tokens without producing results, chaining tool calls that accomplish nothing, or completing a workflow while producing subtly wrong outputs that no one noticed until days later. These are operational failures that model-level monitoring does not catch, because they are not about the model's outputs—they are about the agent's behavior under real execution conditions.

The Three Gaps AgentOps Has to Fill

AgentOps as a discipline is not MLOps extended with agent tooling. It requires different categories of infrastructure.

Runtime governance, not post-hoc monitoring. Instead of observing what an agent did after the fact, AgentOps requires enforcing what an agent is allowed to do during execution—before a tool call is made, not after it completes. This means a control layer that sits above the agent framework and intercepts actions at the pre-execution, mid-execution, and post-execution stages. Waxell Runtime applies 26 policy categories at this layer out of the box—governing inputs, tool calls, data access, cost boundaries, and escalation triggers before they reach the agent's execution environment. This is categorically different from logging what happened after it happened.

Execution-level visibility across the full chain. MLOps observability is request-level. AgentOps observability needs to be execution-level—capturing every step, every tool call, every sub-agent invocation, and every context window transition within a single run. Waxell Observe provides this through runtime telemetry instrumented across 200+ libraries and agent frameworks, initialized in two lines of code:

import waxell
waxell.init()

That single integration surfaces the complete execution log for every agent run—not the model response, but the full behavior chain that produced it. The difference matters: a model response tells you what was said; an execution log tells you what the agent decided to do and why.

Policy-enforced scope control. Agents that can access anything are agents that can break anything. A production-grade AgentOps practice requires defining, enforcing, and auditing what each agent is authorized to touch—not at the application layer, where the agent itself can be manipulated, but at the governance layer above it. Waxell Runtime's policy enforcement operates here: scope limits, cost hard stops, and escalation triggers that the agent cannot override, because they are enforced outside the agent's reasoning loop. No rebuilds required—governance attaches to existing deployments.

Why the Failure Rate Looks the Way It Does

The near-universal failure rate for enterprise agent pilots is often attributed to unclear use cases, organizational inertia, or model immaturity. The more accurate diagnosis is operational category mismatch. Teams apply MLOps practices—experiment tracking, output monitoring, post-deploy observation—to systems that require runtime governance. The gap is not sophistication. It is the wrong tool class applied to the wrong problem.

In 2026, MLOps is no longer sufficient on its own for teams running agents in production. The teams closing the pilot-to-production gap share a pattern: they are not just adding observability. They are adding a governance layer that operates at runtime, enforcing what agents are allowed to do before actions occur, not only surfacing what happened after they do.

How Waxell Handles This

Waxell is built around the structural difference between observing model outputs and governing agent behavior.

Waxell Observe instruments the complete execution chain, giving teams step-level visibility into agent behavior—every tool call, every sub-agent handoff, every reasoning transition—across 200+ frameworks and libraries. Two lines of code, no framework changes.

Waxell Runtime sits above agent frameworks and enforces 26 categories of policy at the pre-execution stage: governing what data agents can access, what tools they can call, what budget thresholds trigger a hard stop, and what actions require a human escalation before they proceed.

For teams whose agents interact with external APIs, third-party tools, or vendor platforms they did not build, Waxell Connect governs those agents without requiring an SDK or code changes on the vendor side—applying runtime governance to the agents you didn't build, not just the ones you did.

FAQ

What is AgentOps and how does it differ from MLOps?
AgentOps is the operational discipline for managing AI agents in production—covering runtime governance, execution-level observability, scope and identity control, and incident response for agentic systems. It differs from MLOps in that MLOps is designed for static model deployments with predictable input-output mappings, while agents operate as dynamic, multi-step processes that take real-world actions. MLOps observes what a model produces; AgentOps governs what an agent is permitted to do.

Why do most AI agent pilots fail to reach production?
The most common cause is operational infrastructure borrowed from MLOps and applied without adjustment. Teams typically have strong model-level observability and experiment tracking, but lack the runtime policy enforcement, execution-level tracing, and scope controls that agents require. Pilots work in sandboxed environments because sandboxes don't have production data, cost implications, or compliance requirements. The governance gap surfaces when they do.

Can LangSmith or Helicone handle the AgentOps layer?
LangSmith and Helicone provide strong observability for LLM calls and agent traces—that's the visibility layer. AgentOps also requires the enforcement layer: runtime controls that prevent scope violations, data leakage, runaway cost loops, and unauthorized tool calls before they occur. Observability tools surface problems after the fact. Governance tools prevent them during execution. A complete AgentOps stack needs both.

What does runtime governance look like in practice?
Runtime governance means a control layer that intercepts agent actions at the point of execution—before a tool is called, before data is accessed, before a cost threshold is crossed. Concretely: a policy that blocks an agent from reading a customer record it is not authorized to access; a budget hard stop that terminates a runaway loop before it incurs a material cost overrun; an escalation trigger that routes a high-stakes action to a human approver rather than executing autonomously.

What is the minimum viable AgentOps stack for a production deployment?
At minimum: execution-level tracing (not just LLM call logging), scope control over what tools and data the agent can access, a cost limit with hard enforcement, and an audit trail of every action taken. These are not advanced features—they are the baseline that any agent interacting with real data or taking real actions requires before leaving a controlled environment.

What is Waxell Observe and how does it fit an AgentOps stack?
Waxell Observe is the observability SDK that instruments the full execution chain for AI agents—every tool call, every sub-agent invocation, every reasoning step—across 200+ frameworks. It initializes in two lines of code and requires no framework changes. For teams building a complete AgentOps stack, Observe handles the visibility layer; Waxell Runtime handles the enforcement layer.

Sources

Digital Applied (March 2026): "88% of agent pilots never reach production" (cross-industry average: 12%). https://www.digitalapplied.com/blog/ai-agent-adoption-2026-enterprise-data-points
Gartner (June 2025, via secondary coverage): "More than 40% of agentic AI projects will be canceled before reaching production by 2027." Cited in https://www.companyofagents.ai/blog/en/ai-agent-roi-failure-2026-guide
Dev.to / Reddit aggregation (May 2026): Ten Reddit threads documenting the "agent enthusiasm becoming control anxiety" pattern and the "silent failures" problem. https://dev.to/nance_craft_6cffbc0c3a042/ten-reddit-threads-showing-ai-agents-have-entered-their-operations-era-3gak
Arize AI (February 2026): Paraphrase — In the DevOps era, we monitored server health; in the MLOps era, model drift and training loss; in the Agent Era, decisions. https://arize.com/blog/best-ai-observability-tools-for-autonomous-agents-in-2026/