EU managed sandboxes for AI agents, in private beta

#ai #webdev #productivity #programming

Your agent suggested rm -rf node_modules && rm -rf .git to fix a build error. You laughed, but only because you weren't running it. The thing about agent-generated shell commands is that they look reasonable until they don't, and most of them shouldn't run on your laptop or your production box - they should run inside something disposable, where the worst case is "the sandbox crashes" and not "I'm git-forensicing my own repo on a Friday night."

So we built managed sandboxes for AI agents. Private beta opening, waitlist live today at orkestr.eu/sandboxes. EU-hosted.

What this is

An HTTP API. Your agent calls POST /v1/sandboxes, gets back a fresh sandbox ID, then calls /exec to run shell commands, /files to read or write files, /pause and /resume to snapshot a session and bring it back later. When the agent's done, it calls DELETE and the whole thing - kernel, filesystem, processes - is gone.
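To make the call sequence concrete, here is a minimal sketch of the lifecycle as plain (method, path) pairs. Only POST /v1/sandboxes, DELETE, and the endpoint names come from the description above. Nesting the per-sandbox routes under /v1/sandboxes/{id}, the verbs used for exec, files, pause and resume, and the sandbox ID format are assumptions for illustration, not the documented API.

```python
def lifecycle_calls(sandbox_id: str) -> list[tuple[str, str]]:
    """List the HTTP calls an agent makes after creating a sandbox, in order."""
    # Assumption: per-sandbox routes nest under the sandbox ID.
    base = f"/v1/sandboxes/{sandbox_id}"
    return [
        ("POST", f"{base}/exec"),    # run a shell command (verb assumed)
        ("POST", f"{base}/files"),   # write a file (verb assumed)
        ("POST", f"{base}/pause"),   # snapshot the session (verb assumed)
        ("POST", f"{base}/resume"),  # bring it back later (verb assumed)
        ("DELETE", base),            # kernel, filesystem, processes: gone
    ]


# From the post: creating a sandbox returns a fresh sandbox ID.
CREATE = ("POST", "/v1/sandboxes")

# "sbx_123" is a made-up ID used only to show the shape of the paths.
for method, path in [CREATE, *lifecycle_calls("sbx_123")]:
    print(method, path)
```

An agent framework would typically wrap each of these pairs as one tool call, so the agent never sees the sandbox host itself.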

Each sandbox is a dedicated VM. Not a container. Not a V8 isolate. Each one boots its own kernel, mounts its own rootfs, and lives in its own slice of hardware. Cold start is around 150 ms; from a warm pool, under 30 ms. When two agents are running in two sandboxes on the same physical host, they cannot reach each other - the boundary is hardware, not a namespace the kernel decides to trust.

Why an EU sandbox, now

Anthropic shipped Managed Agents, which lets Claude orchestrate an agent loop while tool execution happens in a sandbox provider you configure. The launch listed four supported providers: Cloudflare, Daytona, Modal, Vercel. All four are headquartered in the US.

That's a problem if you're an EU company sending agent-generated code somewhere. Standard Contractual Clauses cover transfers in principle, but if you're a Berlin fintech, a Paris insurance startup, or a German healthcare company explaining to a procurement team why your agent's working data is processed by a US entity, you have a real conversation ahead of you. An EU alternative had to exist.

So we built one. The orkestr legal entity is in the EU. The hardware is in the EU. Your sandbox snapshots, environment variables, and runtime memory never leave the EU. GDPR DPA on request, signed by the same company that runs the runtime - not a US parent's subsidiary three layers down.

How it fits next to what's out there

If you've used E2B, Daytona, Modal sandboxes, or Cloudflare Sandboxes, the shape is familiar: REST API, Python and JS SDKs, exec / files / snapshot primitives. Here's what the Python SDK looks like:

```python
from orkestr import Sandbox

with Sandbox.create(template="python-3.12") as sbx:
    sbx.files.write("/workspace/main.py", "print(sum(range(1_000_000)))")
    result = sbx.exec("python /workspace/main.py")
    print(result.stdout)  # 499999500000
```

Long-running agent sessions can pause and resume across requests, even across workers:

```python
sbx = Sandbox.create(template="node-22", network="restricted")
snapshot_id = sbx.pause()
# ...minutes or hours later, from any process:
sbx = Sandbox.resume(snapshot_id)
```
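One way to use pause and resume across stateless workers is to keep only the snapshot ID between requests. The sketch below assumes the SDK calls shown above (create, resume, exec, and pause returning a snapshot ID). The run_turn helper, the session_id key, and the dict-like store are hypothetical names introduced for illustration; in practice the store would be something shared, such as a database table.

```python
def run_turn(api, store, session_id, command, template="node-22"):
    """Run one command for a session, resuming its sandbox if one was paused.

    `api` is anything exposing create(template=...) and resume(snapshot_id),
    for example the SDK's Sandbox class (assumed shape, see above).
    `store` is any dict-like mapping of session_id -> snapshot_id.
    """
    snapshot_id = store.get(session_id)
    if snapshot_id is None:
        sbx = api.create(template=template)  # first turn: fresh sandbox
    else:
        sbx = api.resume(snapshot_id)        # later turns: restore state
    result = sbx.exec(command)
    # Pause again and remember the new snapshot for the next worker.
    store[session_id] = sbx.pause()
    return result
```

With the real SDK this would be called as run_turn(Sandbox, store, "chat-42", "node -v"). Any worker that can read the store can pick up the session.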

There's also an MCP server you can drop into Claude Code or Cursor, and the REST API works fine as a configured sandbox provider for Claude Managed Agents. If your agent can call a tool, it can call this.

What's in the beta

Four templates at launch: python-3.12, python-3.12-bare, node-22, ubuntu-24.04. Each sandbox is sized at 1 vCPU and 1 GB of RAM by default; quick one-shot calls cost fractions of a cent. Three network modes: off (no egress, the safe default for LLM-generated code), restricted (allowlist for package registries and common APIs), and open (full egress, gated behind verified payment).
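As a mental model of the three network modes, here is a small sketch of the egress decision. The mode names come from the list above. The allowlist entries, the egress_allowed function, and checking payment at request time are assumptions made for illustration; the post does not publish the real allowlist or describe where the checks happen.

```python
from enum import Enum


class NetworkMode(Enum):
    OFF = "off"
    RESTRICTED = "restricted"
    OPEN = "open"


# Hypothetical allowlist: a few well-known package registries.
EXAMPLE_ALLOWLIST = {"pypi.org", "files.pythonhosted.org", "registry.npmjs.org"}


def egress_allowed(mode: NetworkMode, host: str, payment_verified: bool = False) -> bool:
    """Decide whether a sandbox in `mode` may open a connection to `host`."""
    if mode is NetworkMode.OFF:
        return False  # no egress at all
    if mode is NetworkMode.RESTRICTED:
        return host in EXAMPLE_ALLOWLIST  # only allowlisted hosts
    # OPEN: full egress, but only for accounts with verified payment.
    return payment_verified


print(egress_allowed(NetworkMode.RESTRICTED, "pypi.org"))
print(egress_allowed(NetworkMode.OFF, "pypi.org"))
```

Starting from off and widening only when a task needs it keeps the blast radius of an unexpected command small.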

Snapshots are native. Resuming on the same host is under half a second; cross-host is a few seconds because the memory file has to fly between data centres.

The compute runs on dedicated bare-metal hardware in Germany and Finland. Hardware virtualisation on every sandbox, no exceptions. Containers are fine for trusted workloads, but the moment an LLM is generating shell commands you haven't seen yet, sharing the host kernel becomes an awkward conversation. We didn't want to have it.

What's missing in private beta

The honest list:

GPU sandboxes. Not in v1. The CPU product has to work first.
Persistent volumes across sandbox lifetimes. Use pause/resume for now.
Custom Docker images as templates. Coming.
Multi-region routing. Two regions are live; we do not auto-route yet.
Detailed per-sandbox observability for end users. Building.

If you hit something not on this list that feels broken, it's probably a bug, so please tell us.

Pricing, briefly

Per-second CPU and RAM, no minimums, no per-invocation tax. We're keeping the full price list off the page until the first ten design partners have run real workloads against it. The numbers will move once we see what real usage looks like, and we'd rather quote the final price than walk one back.
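The billing shape described above reduces to a single multiplication. The sketch below shows it with the rates passed in as parameters, since no prices are published yet. The function name and the rate values in the example are placeholders in arbitrary units, not real or expected prices.

```python
def sandbox_cost(seconds: float, vcpus: float, ram_gb: float,
                 cpu_rate: float, ram_rate: float) -> float:
    """Cost of one run under per-second billing.

    No minimum duration and no per-invocation fee, so a zero-second
    run costs nothing. Rates are per vCPU-second and per GB-second.
    """
    return seconds * (vcpus * cpu_rate + ram_gb * ram_rate)


# Placeholder rates in arbitrary units, chosen only to show the arithmetic.
# A 2-second run on the default 1 vCPU / 1 GB sandbox:
print(sandbox_cost(seconds=2, vcpus=1, ram_gb=1, cpu_rate=1.0, ram_rate=0.5))
```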

The catch (because there's always one)

We're letting a small group in each week. The waitlist asks what you're building and what volume you're expecting, which is how we're triaging. Design partners get a discount on the first three months once paid usage opens, and a direct line to us for SDK feedback.

If you've been waiting for an EU sandbox for whatever agent you're building, the waitlist is at orkestr.eu/sandboxes. If you have questions before signing up, email me (stefan at orkestr dot eu). I read everything.
