
RAXXO Studios

Originally published at raxxo.shop

Claude Managed Agents: Build and Deploy AI Agents at Scale

  • Managed Agents is Anthropic's new hosted agent infrastructure, now in public beta

  • Architecture splits brain, hands, and session into independent pieces for 60% faster cold starts

  • Four API calls get you from zero to a running agent with bash, file ops, and web access

  • Custom tools and MCP servers extend agents beyond built-in capabilities

  • Best for long-running async work where you don't want to build your own orchestration layer


Anthropic shipped something on April 8th that isn't a model upgrade or a chat feature. Claude Managed Agents is a fully hosted infrastructure for building, deploying, and running autonomous AI agents. The announcement pulled 12 million views on X in under 24 hours. 39,000 bookmarks. Developers have clearly been waiting for this.

I've been building agent workflows with the Messages API for months. Custom loops, manual tool execution, state management headaches. Managed Agents takes all of that off your plate. You define what the agent does, Anthropic handles how it runs. Here's everything I've learned after digging through the docs and testing the beta.

What Claude Managed Agents Actually Is

The simplest explanation: it's a hosted agent runtime. Instead of writing your own agent loop (call Claude, parse tool use, execute tool, send result back, repeat), you configure an agent once and let Anthropic's infrastructure handle the orchestration.
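
To make concrete what that hand-rolled loop looks like, here's a minimal sketch. Everything in it is a stand-in: `call_model` and `TOOLS` are hypothetical placeholders for a real model client and real tool implementations, not SDK functions.

```python
# Sketch of the DIY agent loop that a managed runtime replaces.
# call_model and TOOLS are hypothetical stand-ins, not a real SDK.

def call_model(messages):
    # Pretend the model requests one tool call, then finishes.
    if not any(m["role"] == "tool" for m in messages):
        return {"type": "tool_use", "name": "bash", "input": "ls"}
    return {"type": "text", "text": "done"}

TOOLS = {"bash": lambda cmd: f"ran: {cmd}"}

def agent_loop(user_prompt):
    messages = [{"role": "user", "content": user_prompt}]
    while True:
        reply = call_model(messages)
        if reply["type"] == "text":     # model is finished
            return reply["text"]
        # Model requested a tool: execute it, feed the result back.
        result = TOOLS[reply["name"]](reply["input"])
        messages.append({"role": "tool", "content": result})

result = agent_loop("list files")
```

Every iteration of that `while` loop is code you own, crash-handle, and retry yourself; that's the part Managed Agents absorbs.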

The system gives Claude access to real tools inside a cloud container. Bash execution, file read/write/edit, glob and grep for searching, web fetch and web search for pulling external data. These aren't simulated. The agent runs actual shell commands in a real Linux environment.

You interact through four concepts:

  • Agent: The configuration. Model choice, system prompt, which tools are available, any MCP servers you want to connect.

  • Environment: The container template. What packages are installed, what network access is allowed, what files are mounted.

  • Session: A running instance. One agent config can spawn thousands of sessions. Each session has its own filesystem and conversation history.

  • Events: The communication layer. You send user messages as events. Claude streams back responses, tool calls, and status updates via server-sent events (SSE).
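
Since events arrive over SSE, a consumer is essentially a parse-and-dispatch loop keyed on the event `type`. Here's a stdlib-only sketch; the event names match the ones above, but the exact wire shape of each payload is an assumption for illustration.

```python
import json

# Sketch: dispatching server-sent events by type. The payload shapes
# here are assumed for illustration, not a documented wire format.

def parse_sse(raw: str):
    """Yield one JSON event per SSE 'data:' line."""
    for line in raw.splitlines():
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):])

stream = (
    'data: {"type": "agent.tool_use", "name": "bash"}\n'
    'data: {"type": "agent.message", "text": "All checks passed."}\n'
    'data: {"type": "session.status_idle"}\n'
)

log = []
for event in parse_sse(stream):
    if event["type"] == "agent.message":
        log.append(event["text"])
    elif event["type"] == "agent.tool_use":
        log.append(f"[tool: {event['name']}]")
    elif event["type"] == "session.status_idle":
        log.append("done")
```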

The beta launched April 8, 2026, and it's enabled by default for every API account. No waitlist. You add the managed-agents-2026-04-01 beta header to your requests and you're in.

The Architecture That Makes It Fast

This is where Managed Agents gets interesting from an engineering perspective. Anthropic published a detailed breakdown of how they decoupled the system into three independent layers. They call them the Brain, the Hands, and the Session.

The Brain is Claude plus the agent harness. It handles reasoning, tool selection, and response generation. It runs as stateless instances that can scale horizontally. No container needs to be running for Claude to think.

The Hands are the execution environments. Containers, custom tools, MCP servers. They're treated as interchangeable resources with a simple interface: execute(name, input) returns a string. The key insight is that containers only spin up when Claude actually needs to run something. Not before. This alone cut time-to-first-token by 60% at p50 and over 90% at p95.
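
That execute(name, input) contract plus lazy provisioning is easy to picture in code. A toy sketch (the class and its internals are illustrative, not Anthropic's implementation):

```python
# Sketch of the "hands" contract: everything that can act is reduced
# to execute(name, input) -> str, and the backing container is
# provisioned lazily, only on the first real call. Illustrative only.

class LazyContainer:
    def __init__(self):
        self.started = False

    def execute(self, name: str, input: str) -> str:
        if not self.started:        # spin up only when actually needed
            self.started = True     # (placeholder for provisioning)
        return f"{name} ran with {input!r}"

hands = LazyContainer()
assert not hands.started            # no container yet: the brain can
                                    # reason without one running
out = hands.execute("bash", "echo hi")
assert hands.started                # provisioned on first use
```

The latency win falls out of the second assertion: all the thinking that happens before the first `execute` call pays no container cost.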

The Session is an append-only event log stored outside both the brain and the hands. If the harness crashes, the session survives. If a container dies, the session survives. Agents can query their own history through getEvents() to rebuild context. Anthropic described the shift as turning every component into "cattle rather than pets." Replaceable without losing state.
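
The append-only log is the simplest of the three pieces to sketch. This toy version (my own illustration, not Anthropic's code) shows why a restarted harness loses nothing:

```python
# Sketch of the append-only session log: events live outside both the
# brain and the hands, so either can crash and be replaced while the
# session survives. get_events mirrors the getEvents() idea above.

class SessionLog:
    def __init__(self):
        self._events = []           # append-only, never mutated in place

    def append(self, event: dict):
        self._events.append(event)

    def get_events(self):
        return list(self._events)   # full history for rebuilding context

log = SessionLog()
log.append({"type": "user.message", "text": "review auth.ts"})
log.append({"type": "agent.tool_use", "name": "grep"})

# A freshly restarted harness rebuilds its context from the log:
context = log.get_events()
```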

The security model follows directly from this separation. Your API credentials, vault tokens, and secrets never exist inside the sandbox where Claude's generated code runs. Authentication happens at a layer the agent can't touch. That's a real improvement over most DIY agent setups where you end up mounting .env files into containers and hoping for the best.

Compare this to what building agents looked like six months ago. You'd spin up a container, install dependencies, drop in your API keys, start an agent loop, and pray that nothing crashed between tool calls. If the process died mid-task, you lost everything. With Managed Agents, the session log persists independently. The harness restarts. The container reprovisions. Your work picks up where it stopped. That reliability gap is the difference between a demo and something you'd actually run in production.

Building Your First Agent in Four Steps

The API surface is clean. Four calls and you have a running agent. I'll show both Python and TypeScript since those are the two SDKs most developers reach for.

Step 1: Create an agent.


```python
from anthropic import Anthropic

client = Anthropic()

agent = client.beta.agents.create(
    name="Code Reviewer",
    model="claude-sonnet-4-6",
    system="You review code for bugs, security issues, and style problems. Be direct.",
    tools=[
        {"type": "agent_toolset_20260401"},
    ],
)
```

The agent_toolset_20260401 type enables all built-in tools at once. If you want to restrict access (say, no web browsing), you disable specific tools through a configs array:


```python
tools=[
    {
        "type": "agent_toolset_20260401",
        "configs": [
            {"name": "web_fetch", "enabled": False},
            {"name": "web_search", "enabled": False},
        ],
    },
]
```

Step 2: Create an environment.


```typescript
const environment = await client.beta.environments.create({
  name: "reviewer-env",
  config: {
    type: "cloud",
    networking: { type: "unrestricted" },
  },
});
```

The environment is your container template. You choose whether it gets network access, what packages are pre-installed, and what files are available. An unrestricted network lets the agent fetch URLs and install packages on the fly. Lock it down for sensitive workloads.

Step 3: Start a session.


```python
session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=environment.id,
    title="Review PR #847",
)
```

One agent config, many sessions. Each session gets its own container and conversation state.

Step 4: Send a message and stream the response.


```typescript
const stream = await client.beta.sessions.events.stream(session.id);

await client.beta.sessions.events.send(session.id, {
  events: [
    {
      type: "user.message",
      content: [
        {
          type: "text",
          text: "Clone the repo at github.com/example/app and review src/auth.ts for security issues",
        },
      ],
    },
  ],
});

for await (const event of stream) {
  if (event.type === "agent.message") {
    for (const block of event.content) {
      process.stdout.write(block.text);
    }
  } else if (event.type === "agent.tool_use") {
    console.log(`\n[Tool: ${event.name}]`);
  } else if (event.type === "session.status_idle") {
    console.log("\nDone.");
    break;
  }
}
```

The agent runs autonomously from here. It decides which tools to use, executes them in the container, and streams results back. You can also send additional events mid-execution to steer the agent or interrupt it entirely.

Beyond the built-in toolset, you can define custom tools that Claude calls but your application executes. The pattern works like client-side tool use in the Messages API. You define the schema, Claude emits structured requests, your code handles execution, and you send results back as events. This opens up database queries, internal API calls, Slack notifications, or anything else your system can do.
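
The dispatch side of that pattern is a small registry keyed by tool name. A sketch, with assumed event shapes and a hypothetical `lookup_user` handler (neither is part of any documented API):

```python
import json

# Sketch of client-side custom-tool dispatch: the agent emits a
# structured tool request, your code executes it, and you send the
# result back as an event. Event shapes and handler are assumptions.

HANDLERS = {
    "lookup_user": lambda args: json.dumps({"id": args["id"], "plan": "pro"}),
}

def handle_tool_use(event: dict) -> dict:
    """Turn an agent.tool_use event into a tool_result event."""
    handler = HANDLERS.get(event["name"])
    if handler is None:
        return {"type": "tool_result", "is_error": True,
                "content": f"unknown tool: {event['name']}"}
    return {"type": "tool_result", "content": handler(event["input"])}

result = handle_tool_use(
    {"type": "agent.tool_use", "name": "lookup_user", "input": {"id": 7}}
)
```

Because handlers run in your process, this is also where database connections, internal credentials, and Slack clients live, never inside the agent's sandbox.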

MCP servers are also supported at the agent level. If you already have MCP-compatible tool providers running, point the agent at them and Claude picks up the available tools automatically.

The CLI tool (ant) gives you the same capabilities from the terminal. Install via Homebrew on macOS or download the binary on Linux. Useful for testing agents interactively before wiring them into application code.

Here's something I appreciate about the design: agents are versioned. Every time you update an agent's configuration (new system prompt, different tools, model change), it gets a new version number. Running sessions keep using the version they started with. New sessions pick up the latest. No surprise behavior changes mid-task. No "why did my agent start acting different" debugging sessions at 2am.
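
The pinning behavior is worth internalizing, so here's a toy model of it (my own illustration of the semantics described above, not Anthropic's implementation):

```python
# Sketch of agent versioning semantics: each config update bumps the
# version; running sessions stay pinned to the version they started
# with, while new sessions pick up the latest. Illustrative only.

class Agent:
    def __init__(self, system: str):
        self.versions = {1: system}
        self.latest = 1

    def update(self, system: str):
        self.latest += 1
        self.versions[self.latest] = system

class Session:
    def __init__(self, agent: Agent):
        self.agent = agent
        self.pinned = agent.latest      # pin at creation time

    def system_prompt(self) -> str:
        return self.agent.versions[self.pinned]

agent = Agent("Be direct.")
running = Session(agent)                # starts on v1
agent.update("Be verbose.")             # config change bumps to v2
fresh = Session(agent)                  # new sessions pick up v2

assert running.system_prompt() == "Be direct."   # unchanged mid-task
assert fresh.system_prompt() == "Be verbose."
```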

When Managed Agents Makes Sense (and When It Doesn't)

Managed Agents isn't a replacement for the Messages API. It's a different tool for a different problem.

Use Managed Agents when:

  • Your tasks run for minutes or hours, not seconds. Code reviews, data analysis, report generation, multi-step research.

  • You need a real execution environment. If the agent needs to install packages, run scripts, or manipulate files, managed containers handle that.

  • You don't want to build orchestration infrastructure. The agent loop, tool execution, state management, and error recovery all come built in.

  • Your sessions are stateful. Files persist across interactions within a session. Conversation history is maintained server-side.

Stick with the Messages API when:

  • You need sub-second response times for chat interfaces. Managed Agents adds overhead from container provisioning. For snappy conversational UX, direct API calls are still faster.

  • You want full control over the agent loop and tool execution. Some teams need custom retry logic, specific error handling, or non-standard tool patterns that a managed harness can't accommodate.

  • Your tools are all client-side and you don't need a server environment.

  • You're doing simple request/response work that doesn't require autonomy.

There's also a middle ground worth mentioning. You can use Managed Agents with the built-in toolset disabled and only custom tools enabled. This gives you the session management, event streaming, and state persistence without giving Claude direct access to a container. Useful if you want the orchestration but need to keep all execution on your own infrastructure.

Rate limits are reasonable for most workloads: 60 create operations per minute, 600 read operations per minute. Organization-level spend limits still apply.
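
If you're spawning sessions in a burst, a client-side sliding-window limiter keeps you under that create cap without waiting for 429s. A minimal sketch (the limiter is my own, not part of any SDK; timestamps are injected to keep it deterministic):

```python
from collections import deque

# Sketch of a client-side sliding-window limiter for staying under a
# 60-creates-per-minute cap. Not part of any SDK; timestamps are
# passed in explicitly so the behavior is deterministic.

class RateLimiter:
    def __init__(self, max_calls: int, window_s: float):
        self.max_calls = max_calls
        self.window_s = window_s
        self.calls = deque()            # timestamps of recent calls

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False

limiter = RateLimiter(max_calls=60, window_s=60.0)
allowed = sum(limiter.allow(now=0.0) for _ in range(70))
```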

Three features are still in research preview and worth watching. Outcomes let you define success criteria so the agent knows when it's actually done (not just when it runs out of things to try). Multiagent orchestration allows multiple agents to collaborate on a task, each with their own specialization. Persistent memory carries learned context across sessions so agents don't start from scratch every time. You can request access to these through Anthropic's form, but expect them to evolve before going GA.

The SDK support is unusually broad. Python, TypeScript, Java, Go, C#, Ruby, and PHP all have first-class support. That's seven languages at beta launch, which signals Anthropic is serious about this being production infrastructure, not a demo.

Pricing follows the standard Claude API model. You pay for the tokens Claude processes plus compute time for the container environment. No separate Managed Agents fee. The built-in prompt caching and compaction optimizations help keep costs predictable for long-running sessions.

Bottom Line

Managed Agents takes the infrastructure problem off your desk. The three-layer split (brain, hands, session) produces real performance numbers. 60% faster cold starts at p50 isn't a slide deck claim when you can trace it directly to containers not spinning up until they're needed.

Four API calls to a running agent. Seven SDKs at launch. A CLI for quick testing. Custom tools and MCP servers when you need to go beyond the defaults. And your credentials stay out of the sandbox entirely.

If you've been building agent workflows with raw API calls and duct-tape orchestration, give this a look. The beta is open to every API account. Add the header and start building.

The 39,000 bookmarks on that announcement tell you developers have been waiting for exactly this. I've already started migrating two of my longer-running workflows from custom agent loops to Managed Agents. The initial results are promising: less code to maintain, faster execution, and no more container babysitting.

If you're building anything that involves Claude making decisions and taking actions over multiple steps, this is the infrastructure to evaluate right now. Not next quarter. Right now. The beta is live, the docs are solid, and the API is stable enough to build on.

What I'm watching now is whether Anthropic ships the research preview features (multiagent, outcomes, memory) before competitors catch up with their own managed agent offerings. That timeline matters more than any feature list.
