정상록

Anthropic Managed Agents Architecture: Decoupling Brain from Hands for Scalable AI Agents

Anthropic published "Scaling Managed Agents: Decoupling the brain from the hands" on April 8, 2026. It's the most detailed look at production AI agent infrastructure I've read this year. Here's the architecture breakdown.

The Problem: Monolithic Agents Don't Scale

The initial approach was obvious — put everything in one container. Session state, the orchestration loop, and code execution all ran in a single process. It worked for prototypes.

In production, it didn't. A container crash meant lost session data. Debugging was impossible. Scaling required dedicated containers per agent, wasting resources during idle time.

The Solution: 3 Virtualized Components

Anthropic separates the agent into three independent components, each with distinct lifecycle characteristics:

1. Session — Durable Memory

Append-only event log
Lives outside Claude's context window
Supports: getEvents(), rewind, slice, positional access

Session is the single source of truth. Everything that happens gets logged here. The key insight: it's separate from context engineering. The Harness decides what to pull from Session into Claude's context window.
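
To make the Session idea concrete, here is a minimal in-memory sketch in Python. The names SessionLog, Event, append, get_events, at, and rewind are hypothetical: the post names getEvents(), rewind, slice, and positional access but gives no signatures, so the shapes below are assumptions. A real Session would sit on durable storage rather than a Python list.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class Event:
    position: int   # index of this event in the log
    kind: str       # e.g. "user_message", "tool_call", "tool_result"
    payload: Any


@dataclass
class SessionLog:
    """Hypothetical append-only log; an in-memory stand-in for durable storage."""
    _events: list = field(default_factory=list)

    def append(self, kind: str, payload: Any) -> Event:
        # Events are only ever added at the end; nothing is edited in place.
        event = Event(position=len(self._events), kind=kind, payload=payload)
        self._events.append(event)
        return event

    def get_events(self, start: int = 0, end: Optional[int] = None) -> list:
        # Slice access; returns a copy so callers cannot mutate the log.
        return list(self._events[start:end])

    def at(self, position: int) -> Event:
        # Positional access to a single event.
        return self._events[position]

    def rewind(self, position: int) -> "SessionLog":
        # A new log holding only the events before `position`;
        # the original log is left untouched.
        return SessionLog(list(self._events[:position]))
```

Returning a new log from rewind, instead of truncating in place, is a design choice of this sketch that keeps the original history intact.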

2. Harness — Stateless Orchestrator

Calls Claude API → routes tool calls → writes to Session
Stateless by design → crash and recover from Session
Transforms Session events → Claude's context window

Statelessness is the critical design choice. Any Harness instance can pick up any Session and continue from where it left off. This is what makes horizontal scaling trivial.
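
A rough sketch of what "stateless" means in practice, assuming a hypothetical event shape (plain dicts with a "kind" key) and stand-in functions for the Claude call and the tool executor. None of these names come from the post. The point is only that harness_step keeps nothing between calls except what is in the log.

```python
from typing import Callable

Event = dict  # hypothetical shape: {"kind": str, ...}


def harness_step(log: list,
                 model: Callable[[list], Event],
                 execute: Callable[[str, str], str]) -> bool:
    """Run one loop iteration using only the log as state.

    Returns False once the model emits a final answer.
    """
    action = model(log)          # stand-in for a Claude API call
    log.append(action)           # the decision is logged before acting on it
    if action["kind"] == "final":
        return False
    result = execute(action["name"], action["input"])
    log.append({"kind": "tool_result", "data": result})
    return True


def fake_model(log: list) -> Event:
    # Toy policy: request two tool calls, then finish.
    done = sum(1 for e in log if e["kind"] == "tool_result")
    if done < 2:
        return {"kind": "tool_call", "name": "echo", "input": f"step {done}"}
    return {"kind": "final", "data": f"finished after {done} tool calls"}


def fake_execute(name: str, input: str) -> str:
    return f"{name}: {input}"


session = [{"kind": "user_message", "data": "run two steps"}]

# First harness instance runs one step, then "crashes".
harness_step(session, fake_model, fake_execute)

# A fresh instance resumes from the same log, with no other state handed over.
while harness_step(session, fake_model, fake_execute):
    pass
```

This sketch ignores a crash between logging a tool call and logging its result; a real harness would need a policy for that case.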

3. Sandbox — Disposable Execution

Container-based isolated execution
"Cattle, not pets" — disposable, replaceable
Lazy initialization — provisioned only when needed

Sandboxes spin up only when the agent actually needs to execute code. No idle resource consumption.
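
Lazy initialization can be sketched in a few lines. LazySandbox and its provision callback are hypothetical names used for illustration; real provisioning would start a container, which this sketch replaces with a plain function.

```python
from typing import Callable, Optional

Executor = Callable[[str, str], str]


class LazySandbox:
    """Hypothetical wrapper that provisions a sandbox only on first use."""

    def __init__(self, provision: Callable[[], Executor]):
        self._provision = provision
        self._executor: Optional[Executor] = None
        self.provision_count = 0   # exposed only to make the behavior visible

    def execute(self, name: str, input: str) -> str:
        if self._executor is None:
            # Nothing is provisioned until the agent actually runs something.
            self._executor = self._provision()
            self.provision_count += 1
        return self._executor(name, input)

    def discard(self) -> None:
        # Cattle, not pets: drop the instance; the next call provisions a new one.
        self._executor = None


def provision_fake_container() -> Executor:
    # Stand-in for starting a real container.
    return lambda name, input: f"ran {name}({input})"


sandbox = LazySandbox(provision_fake_container)
```

Until execute is called, provision_count stays at 0, which is the "no idle resource consumption" property in miniature.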

Brain vs Hands: The Core Abstraction

This is where the architecture gets elegant:

  • Brain = Claude + Harness (reasoning)
  • Hands = Sandbox + Tools (execution)
  • Session = Event Log (memory)

The interface between Brain and Hands is:

execute(name, input) → string

That's the entire contract. The Harness doesn't know whether the Sandbox is a container, a phone, or a Pokémon emulator. Any tool that implements this interface is a valid "Hand."

Brains can even pass Hands to one another — the foundation for multi-agent coordination.
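
One way to express that contract in Python is a structural Protocol. Only the execute(name, input) → string shape comes from the post; the class names Hand, ContainerHand, and EmulatorHand and the call_tool helper are made up for illustration, and the backends here just echo what they would run.

```python
from typing import Protocol


class Hand(Protocol):
    # `input` shadows the Python builtin; it is kept to mirror the post's signature.
    def execute(self, name: str, input: str) -> str: ...


class ContainerHand:
    """Stand-in for a sandboxed container backend."""

    def execute(self, name: str, input: str) -> str:
        return f"[container] {name}: {input}"


class EmulatorHand:
    """Stand-in for any other backend, such as a game emulator."""

    def execute(self, name: str, input: str) -> str:
        return f"[emulator] {name}: {input}"


def call_tool(hand: Hand, name: str, input: str) -> str:
    # The caller depends only on the execute contract, not on the backend type.
    return hand.execute(name, input)
```

Because Protocol matching is structural, neither backend has to inherit from Hand; having a matching execute method is enough.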

Performance Impact

The numbers after decoupling are significant:

| Metric | Improvement |
| --- | --- |
| p50 TTFT | ~60% reduction |
| p95 TTFT | >90% reduction |
| Resource utilization | Lazy init → pay for what you use |

The p95 improvement is the real story. Tail time to first token (TTFT) dropped by over 90%, which means a consistent user experience even under load.

Security: Architecture-Level Enforcement

The Brain/Hands separation naturally enforces security:

  • Credentials never reach Sandbox — fundamental design principle
  • Git tokens: wired into local remotes during init, agent never handles them
  • MCP tools: accessed via dedicated proxy with session-scoped tokens
  • OAuth tokens: stored in secure vault, not in sandbox environment

Even if a Sandbox is compromised, credentials remain safe. Security isn't a layer — it's a consequence of the architecture.
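
The proxy pattern in the list above can be sketched as follows. Everything here is an assumption made for illustration: the CredentialProxy class, its method names, and the in-memory vault are hypothetical and not Anthropic's implementation. The sketch only shows the shape of the idea: the sandbox side holds a session-scoped token, and the real credential is looked up outside it.

```python
import secrets


class CredentialProxy:
    """Hypothetical proxy that runs outside the sandbox and holds the real secrets."""

    def __init__(self, vault: dict):
        self._vault = vault      # service name -> real credential
        self._sessions = {}      # session token -> set of allowed services

    def issue_session_token(self, services: set) -> str:
        # The token is random and carries no credential material.
        token = secrets.token_hex(16)
        self._sessions[token] = set(services)
        return token

    def call(self, session_token: str, service: str, request: str) -> str:
        allowed = self._sessions.get(session_token)
        if allowed is None or service not in allowed:
            raise PermissionError(f"token not valid for {service}")
        credential = self._vault[service]
        # A real proxy would forward the request upstream with the credential
        # attached; this sketch only reports that one was available.
        return f"{service} handled {request!r} (credential attached: {bool(credential)})"

    def revoke(self, session_token: str) -> None:
        # Ending the session invalidates the token without touching the vault.
        self._sessions.pop(session_token, None)


proxy = CredentialProxy({"github": "dummy-real-secret"})   # dummy value
sandbox_token = proxy.issue_session_token({"github"})      # all the sandbox sees
response = proxy.call(sandbox_token, "github", "list repos")
```

A compromised sandbox in this sketch could leak sandbox_token, which is scoped to one session and can be revoked, but it never holds the vault entry itself.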

Session as External Context

For long-running agents, context management is the biggest challenge. Claude's context window is finite, but agent execution can span hours or days.

The solution: Session as external context. The Session log exists outside Claude's context window. The Harness transforms events before passing them to Claude, selecting only what's relevant.

Session (all events) 
  → Harness transforms 
    → Claude context window (relevant subset)

This cleanly separates recoverable storage from context engineering.
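
As a toy illustration of that transform step, the function below keeps the first event (assumed to be the task) plus as many of the most recent events as fit in a budget. The selection policy, the build_context name, the "text" field, and the use of character counts as a stand-in for tokens are all assumptions; the post does not describe which strategy the Harness actually uses.

```python
def build_context(events: list, max_chars: int) -> list:
    """Hypothetical transform: task event plus the most recent events that fit."""
    if not events:
        return []
    task, rest = events[0], events[1:]
    budget = max_chars - len(task["text"])
    kept = []
    # Walk backwards from the newest event and stop at the first one that
    # does not fit, so the kept events form a contiguous recent window.
    for event in reversed(rest):
        cost = len(event["text"])
        if cost > budget:
            break
        kept.append(event)
        budget -= cost
    return [task] + kept[::-1]


events = [
    {"text": "task: fix bug"},
    {"text": "a" * 50},   # old, large tool output
    {"text": "b" * 10},
    {"text": "c" * 10},
]
context = build_context(events, max_chars=40)
```

The full events list is never modified, so a different transform can be applied to the same Session later without losing anything.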

Many Brains, Many Hands

The ultimate payoff of decoupling:

  • Multiple Brains: stateless Harness instances scale horizontally
  • Multiple Hands: each tool implements execute(name, input) → string
  • Brain-to-Brain handoff: agents can share tools and pass work to each other
  • Any MCP server: standard protocol support out of the box

Business Context

  • Pricing: Standard API token pricing + $0.08 per session-hour
  • Early adopters: Notion, Rakuten, Asana (10x faster deployment reported)
  • Status: Public beta (April 8, 2026)
  • Coming soon: Multi-agent coordination, self-evaluation (research preview)
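
The pricing bullet works out to simple arithmetic. The sketch below assumes the session-hour charge scales linearly with fractional hours and takes the token cost as an input, since token prices depend on the model and are not given here; session_cost is a hypothetical helper name.

```python
SESSION_HOUR_RATE_USD = 0.08  # per session-hour, from the pricing bullet above


def session_cost(token_cost_usd: float, session_hours: float) -> float:
    # Total = standard token charges + a flat rate per session-hour.
    return token_cost_usd + SESSION_HOUR_RATE_USD * session_hours


# Example: a 6-hour session that used $1.50 of tokens.
total = session_cost(token_cost_usd=1.50, session_hours=6)
print(f"${total:.2f}")
```

For that example the session-hour component is 6 × $0.08 = $0.48, so the total comes to about $1.98.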

Key Takeaways for Developers

  1. Externalize state — Make your agent loop stateless. Put all state in a durable event log.
  2. Simple interfaces win — execute(name, input) → string is all you need for tool abstraction.
  3. Treat containers as cattle — If your agent can't survive a container restart, your architecture is wrong.
  4. Security follows architecture — Separate credentials from execution environments by design, not by policy.

The OS analogy is apt: just as operating systems virtualized hardware to enable software innovation, agent infrastructure needs the same abstraction layer. Managed Agents is Anthropic's answer to that challenge.

Full post: https://www.anthropic.com/engineering/managed-agents
