
Jangwook Kim

Posted on • Originally published at jangwook.net


agent = client.beta.agents.create(
    name="code-review-agent",
    model="claude-sonnet-4-6",
    system="You review Python code for security and performance.",
    tools=[{"type": "agent_toolset_20260401"}],
)

That's it. Agent created.

If that one line were truly all there is to it, this post wouldn't need to exist. The real questions come next — environment setup, session management, streaming event handling, and most importantly: what does this actually cost in production?

Anthropic launched Claude Managed Agents in public beta on April 8, 2026. It's a fully managed service where you don't build your own agent runtime. If you've ever hand-rolled an agent loop, you know how much that means. I'm one of those people, so I went ahead and wired up the API directly.

What Happens When You Build Agent Loops From Scratch

Routing tool execution results back to the model, retry logic on failure, deciding what to do when the context window fills up, timeout handling, sandboxing. This is all infrastructure you solve before you write the actual agent logic. In my experience, this surrounding code takes longer than the agent itself.
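To make that concrete, here is a minimal sketch of the plumbing a hand-rolled loop needs before any agent logic shows up. Everything in it is hypothetical scaffolding — `call_model` and `execute_tool` are stand-ins for your model client and tool executors, not a real SDK — but the retry/backoff, tool routing, and crude context trimming are exactly the kind of code I mean:

```python
import time

def run_agent_loop(prompt, call_model, execute_tool,
                   max_turns=10, max_retries=3, context_limit=8000):
    """The boilerplate around an agent: retries, tool routing, context trimming."""
    history = [{"role": "user", "content": prompt}]
    for _ in range(max_turns):
        # Retry the model call with exponential backoff
        for attempt in range(max_retries):
            try:
                reply = call_model(history)
                break
            except Exception:
                if attempt == max_retries - 1:
                    raise
                time.sleep(2 ** attempt)
        if reply["type"] == "final":
            return reply["text"]
        # Route the tool result back into the conversation
        result = execute_tool(reply["tool"], reply["input"])
        history.append({"role": "assistant", "content": repr(reply)})
        history.append({"role": "tool", "content": str(result)})
        # Crude context-window management: drop oldest non-prompt turns
        while len(history) > 1 and sum(len(m["content"]) for m in history) > context_limit:
            history.pop(1)
    raise RuntimeError("agent did not finish within max_turns")
```

And this sketch still skips timeouts and sandboxing entirely.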

I covered the orchestrator-subagent pattern in 5 Claude Code Agentic Workflow Patterns, but even when the pattern is clear, implementing and maintaining it in production code is a different challenge.

Managed Agents is Anthropic saying: we'll take that part off your plate.

How It Works: Three Concepts Are All You Need

The core model has three pieces: Agent, Environment, and Session.

An agent is a reusable bundle of system prompt + allowed tools. An environment is the isolated sandbox where the agent runs. A session is the actual execution unit.

from anthropic import Anthropic

client = Anthropic()

# Step 1: Define your agent (create once, reuse)
agent = client.beta.agents.create(
    name="code-review-agent",
    model="claude-sonnet-4-6",
    system="You review Python code for security issues and performance problems.",
    tools=[{"type": "agent_toolset_20260401"}],
)

# Step 2: Create an execution environment
environment = client.beta.environments.create(
    name="prod-env",
    config={
        "type": "cloud",
        "networking": {"type": "unrestricted"},
    },
)

# Step 3: Start a session (the actual execution unit)
session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=environment.id,
    title="Review PR #482",
)

Once a session is live, you send messages and receive responses over an SSE stream.

with client.beta.sessions.events.stream(session.id) as stream:
    client.beta.sessions.events.send(
        session.id,
        events=[{
            "type": "user.message",
            "content": [{"type": "text", "text": "Review this Python file: ..."}],
        }],
    )
    for event in stream:
        if event.type == "agent.message":
            print(event.content)

All endpoints require the managed-agents-2026-04-01 beta header, though the Python SDK sets this automatically.
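For anyone hitting the REST endpoints without the SDK, the header set looks roughly like this. The `anthropic-beta` value comes from the announcement; the other entries are the standard Anthropic API headers, so verify the version date against the current API reference:

```python
def managed_agents_headers(api_key: str) -> dict:
    """Headers for raw HTTP calls to the beta endpoints (sketch; verify
    the anthropic-version value against the current API reference)."""
    return {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": "managed-agents-2026-04-01",
        "content-type": "application/json",
    }
```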

What struck me when I actually wired this up: the interface is cleaner than I expected. The agent_toolset_20260401 built-in toolset activates file reading, web search, and code execution in one shot. Compare that to defining tools one by one in a custom loop — the difference is noticeable.
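For comparison, this is what a single tool looks like when you define it by hand in the classic Messages API tools format — a hypothetical `read_file` tool, and the executor behind it is still yours to write — next to the one managed toolset entry:

```python
# Hand-written schema for one hypothetical tool, Messages-API style.
# You'd repeat this (plus an executor) for web search, code execution, ...
read_file_tool = {
    "name": "read_file",
    "description": "Read a UTF-8 text file and return its contents.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string", "description": "File path"}},
        "required": ["path"],
    },
}

# The managed equivalent: one entry activates the whole bundle
managed_tools = [{"type": "agent_toolset_20260401"}]
```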

The Costs: You Have to Do the Math Yourself

$0.08 per hour of environment runtime sounds cheap at first glance. Run it 24 hours and that's $1.92. Run it all month and that's $57.60 — before token costs.

A code review agent handling 10 sessions per day looks like this:

  • 5 minutes average per session × 10 sessions = 50 minutes/day
  • Monthly runtime: ~25 hours → $2
  • Sonnet 4.6 token cost per session: ~$0.05–$0.15
  • Estimated monthly total: $20–50
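The arithmetic behind those bullets, as a quick sanity check (the $0.08/hour runtime rate and per-session token range are the figures above; the 30-day month is my assumption):

```python
RUNTIME_RATE = 0.08  # $ per environment-hour

def monthly_cost(minutes_per_session, sessions_per_day,
                 token_cost_per_session, days=30):
    """Return (runtime $, token $, total $) for a month of sessions."""
    hours = minutes_per_session * sessions_per_day * days / 60
    runtime = hours * RUNTIME_RATE
    tokens = token_cost_per_session * sessions_per_day * days
    return runtime, tokens, runtime + tokens

low = monthly_cost(5, 10, 0.05)    # 25 h runtime: about ($2, $15, $17)
high = monthly_cost(5, 10, 0.15)   # about ($2, $45, $47)
always_on = 24 * 30 * RUNTIME_RATE # $57.60/month before any tokens
```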

That's reasonable. The problem is always-on agents. 24/7 means $58/month just in runtime, with token costs stacking on top. If your workload isn't batch-predictable, design for event-driven sessions — open them only when needed.

Two Things That Still Bother Me

I think this is useful, but two things keep nagging at me.

Vendor lock-in. Your agent config, session format, and environment container specs are all tied to Anthropic's implementation. Migrating to another model or infrastructure later means re-implementation. This month alone, Anthropic changed policy to cut off third-party tool access for Claude Pro/Max subscribers. Handing over your infrastructure means you're also accepting their future decisions.

The actual scope of the public beta. The most interesting features in the announcement — multi-agent coordination and self-evaluation — aren't in the public beta. They require a separate research preview request. Compared to directly configuring and running agent teams yourself, what you get today in Managed Agents is essentially managed execution for single agents.

When It Makes Sense

If your team doesn't have engineering bandwidth to invest in agent infrastructure, this is worth trying now. A two-person team spending time maintaining an agent loop is paying a high opportunity cost — managed is the right call.

On the other hand, if you're running a multi-model strategy or already have custom orchestration logic, there's no rush. Wait until the service reaches GA and the multi-agent features ship, then evaluate again.

Wiring up the API takes 30 minutes. Existing API customers get free beta access. The production decision can wait.
