Every AI agent that executes code needs a sandbox. And teams building one often end up writing the same thing: a Python wrapper around subprocess.run(["docker", "run", ...]) with a growing list of security flags they keep forgetting to set.
The Problem
Here's what a typical "sandbox" looks like in most agent codebases:
import subprocess
import json
result = subprocess.run(
    ["docker", "run", "--rm", "--network=none",
     "--memory=512m", "--cpus=1",
     "--read-only", "--security-opt=no-new-privileges",
     "--pids-limit=64",
     "python:3.12-slim",
     "python3", "-c", "print('hello')"],
    capture_output=True, text=True, timeout=300
)
print(result.stdout)
This works. Until it doesn't:
- Someone forgets --network=none and your agent starts making HTTP requests.
- The timeout handling is a mess when Docker itself hangs.
- Parsing stdout/stderr gets fragile fast.
- Cleanup on crash? Good luck.
- Want to swap Docker for Firecracker? Rewrite everything.
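The timeout and cleanup bullets are worth making concrete. When subprocess.run hits its timeout it kills the docker CLI process, but the container itself can keep running, so a hand-rolled wrapper has to remove it separately. A minimal sketch of that chore, assuming a unique container name per run; the helper name run_in_container and its injectable run parameter are illustrative, not part of any library:

```python
import subprocess
import uuid


def run_in_container(image, command, timeout_secs=300, run=subprocess.run):
    """Run `command` in a throwaway container and always try to remove it.

    `run` defaults to subprocess.run; it is injectable only so the logic
    can be exercised without Docker installed.
    """
    # A unique name lets us find the container again if the CLI is killed.
    name = f"sandbox-{uuid.uuid4().hex[:12]}"
    argv = [
        "docker", "run", "--rm", "--name", name,
        "--network=none", "--read-only",
        "--security-opt=no-new-privileges", "--pids-limit=64",
        image, *command,
    ]
    try:
        return run(argv, capture_output=True, text=True, timeout=timeout_secs)
    finally:
        # On timeout, subprocess.run kills the docker client, but the
        # container can keep running. Force-remove it by name; if --rm
        # already cleaned up, this just exits non-zero and is ignored.
        run(["docker", "rm", "-f", name],
            capture_output=True, text=True, timeout=30)
```

Even this version leaves output parsing and the cleanup command's own failure modes unhandled, which is exactly the kind of sprawl the list describes.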
What We Built
Roche is a sandbox orchestrator that replaces all of that with:
from roche_sandbox import Roche
with Roche().create(image="python:3.12-slim") as sandbox:
    result = sandbox.exec(["python3", "-c", "print('hello')"])
    print(result.stdout)
That's it. The sandbox is created with secure defaults, the command runs, and the sandbox is destroyed when the context manager exits, even if your code raises an exception.
What "Secure Defaults" Actually Means
When you call Roche().create() with no arguments, you get:
| Setting | Default | Why |
|---|---|---|
| Network | Disabled | LLM-generated code should not make HTTP calls |
| Filesystem | Read-only | No persistent writes, no dropping payloads |
| Timeout | 300 seconds | No infinite loops eating your CPU |
| PID limit | 64 | No fork bombs |
| Privileges | no-new-privileges | No privilege escalation |
Every one of these can be overridden when you need to:
roche = Roche()
sandbox = roche.create(
    image="python:3.12-slim",
    network=True,       # enable network
    writable=True,      # writable filesystem
    timeout_secs=600,   # longer timeout
    memory="1g",        # memory limit
    cpus=2.0,           # CPU limit
)
But you have to opt in. Dangerous capabilities are never on by default.
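To make the table concrete, here is a rough sketch of how such defaults could map onto the Docker flags from the hand-rolled example earlier. This is not Roche's implementation: the SandboxPolicy class and its field names are made up for illustration, and only the flags themselves are real Docker options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical policy object; defaults mirror the table above."""
    network: bool = False           # Network: disabled
    writable: bool = False          # Filesystem: read-only
    timeout_secs: int = 300         # enforced by the caller, not a docker flag
    pids_limit: int = 64            # PID limit
    no_new_privileges: bool = True  # Privileges

    def docker_flags(self) -> list[str]:
        """Translate the policy into `docker run` flags."""
        flags = [f"--pids-limit={self.pids_limit}"]
        if not self.network:
            flags.append("--network=none")
        if not self.writable:
            flags.append("--read-only")
        if self.no_new_privileges:
            flags.append("--security-opt=no-new-privileges")
        return flags
```

The point of the shape is that the restrictive flags appear unless a caller explicitly sets network=True or writable=True, so forgetting an argument fails closed.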
Async Support
If you're building an async agent (most are), there's AsyncRoche:
from roche_sandbox import AsyncRoche
async def run_code(code: str) -> str:
    roche = AsyncRoche()
    async with (await roche.create()) as sandbox:
        result = await sandbox.exec(["python3", "-c", code])
        return result.stdout
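Async pays off when an agent wants several snippets run at once. A sketch, assuming only the AsyncRoche calls shown above (an awaitable create() that returns an async context manager, and an awaitable exec()); the run_snippets helper and its max_parallel cap are illustrative:

```python
import asyncio


async def run_snippets(roche, snippets, max_parallel=4):
    """Run each snippet in its own sandbox, a few at a time.

    `roche` is any object exposing the async create()/exec() shape
    used in the example above.
    """
    gate = asyncio.Semaphore(max_parallel)  # cap concurrent sandboxes

    async def run_one(code):
        async with gate:
            async with (await roche.create()) as sandbox:
                result = await sandbox.exec(["python3", "-c", code])
                return result.stdout

    # gather returns results in the same order as the input snippets
    return await asyncio.gather(*(run_one(code) for code in snippets))
```

Calling asyncio.run(run_snippets(AsyncRoche(), ["print(1)", "print(2)"])) would then run each snippet in a separate sandbox, at most four at a time.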
Using It With Agent Frameworks
Roche doesn't care what framework you use. Here's a quick example with OpenAI Agents:
from agents import Agent, Runner, function_tool
from roche_sandbox import Roche
roche = Roche()
@function_tool
def execute_python(code: str) -> str:
    """Execute Python code in a secure sandbox."""
    with roche.create() as sandbox:
        result = sandbox.exec(["python3", "-c", code])
        if result.exit_code != 0:
            return f"Error:\n{result.stderr}"
        return result.stdout

agent = Agent(
    name="Coder",
    instructions="You can run Python code using execute_python.",
    tools=[execute_python],
)
Same pattern works with LangChain, CrewAI, Anthropic tool use, AutoGen, etc. The sandbox logic stays the same regardless of the framework.
Swapping Providers
The whole point of Roche is that provider choice is a config change, not a rewrite:
# Docker (default)
roche = Roche(provider="docker")
# Firecracker microVMs (stronger isolation)
roche = Roche(provider="firecracker")
# WebAssembly (lightweight, fast)
roche = Roche(provider="wasm")
Your create / exec / destroy calls don't change. The security defaults adjust per provider but stay safe.
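Since the provider is just a constructor argument, it can come from deployment config instead of code. A small sketch that reads it from an environment variable and rejects unknown values; the variable name SANDBOX_PROVIDER and the helper are hypothetical, and the accepted names are only the three shown above:

```python
import os

# Provider names taken from the examples above.
KNOWN_PROVIDERS = ("docker", "firecracker", "wasm")


def provider_from_env(env=None, var="SANDBOX_PROVIDER"):
    """Return the provider name from the environment, defaulting to docker."""
    env = os.environ if env is None else env
    name = env.get(var, "docker").strip().lower()
    if name not in KNOWN_PROVIDERS:
        raise ValueError(
            f"unknown sandbox provider {name!r}; expected one of {KNOWN_PROVIDERS}"
        )
    return name
```

The result would be passed straight through as Roche(provider=provider_from_env()), so switching from Docker to Firecracker in production becomes an environment change.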
Architecture (For the Curious)
The core is a Rust library (roche-core) with a SandboxProvider trait:
Your Code (Python/TS/Go)
|
v
SDK (roche-sandbox on PyPI)
|
v
CLI subprocess or gRPC daemon (roched)
|
v
roche-core (Rust)
|
v
Docker / Firecracker / WASM
The SDKs communicate with the Rust core either by shelling out to the roche CLI (zero setup) or through a gRPC daemon (roched) that adds sandbox pooling for faster acquisition.
You don't need to install Rust. pip install roche-sandbox is enough if you have Docker on your machine.
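How roched pools sandboxes internally isn't covered here, but the general idea behind pooling is easy to sketch: create a few sandboxes ahead of time so that acquiring one is a queue pop, not a cold start. The WarmPool class below is a generic illustration of that technique in Python, not Roche code; the factory argument stands in for whatever creates a sandbox.

```python
import queue
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class WarmPool(Generic[T]):
    """Keep up to `size` pre-created items ready to hand out."""

    def __init__(self, factory: Callable[[], T], size: int) -> None:
        self._factory = factory
        self._size = size
        self._ready = queue.SimpleQueue()
        self.refill()  # pay the creation cost up front

    def refill(self) -> int:
        """Top the pool back up to `size`; returns how many were created."""
        created = 0
        while self._ready.qsize() < self._size:
            self._ready.put(self._factory())
            created += 1
        return created

    def acquire(self) -> T:
        """Return a warm item if one is ready, else create one on demand."""
        try:
            return self._ready.get_nowait()
        except queue.Empty:
            return self._factory()
```

A real daemon would call refill from a background task so callers never wait on it; this sketch leaves that scheduling out.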
Getting Started
pip install roche-sandbox
from roche_sandbox import Roche
with Roche().create() as sandbox:
    out = sandbox.exec(["python3", "-c", "import sys; print(sys.version)"])
    print(out.stdout)
Requirements: Python 3.10+ and Docker.
The whole thing is Apache-2.0. Contributions welcome.