by AIO Sandbox Team
Modern AI agents are no longer just generating text—they are expected to write files, modify code, and execute commands.
But doing this directly on your local machine or production systems is risky and hard to control.
This is where AIO Sandbox comes in. It provides an isolated, programmable environment where agents can safely:
create and manipulate files
run shell commands
execute code
produce artifacts
and many more...
Unlike a typical Docker container, which often needs manual configuration before tools can be chained together, AIO Sandbox integrates a browser, a shell, and a file system into a single environment designed for AI agents. Because these components share one environment, artifacts created at one stage remain available to every later stage of an AI-driven workflow running in the sandbox.
In this first post, we’ll focus on the two most fundamental capabilities:
🧩 Filesystem (state)
⚙️ Shell (execution)
By the end, you’ll see how these combine into a complete runtime for agents.
🌐 Multi-language SDK Support
While this tutorial uses Python, AIO Sandbox is not limited to Python developers.
The agent-sandbox SDK also supports:
TypeScript / JavaScript
Go (Golang)
👉 This makes it easy to integrate AIO Sandbox into a wide range of agent frameworks, backend services, and developer stacks.
🛠️ Prerequisites
- Python 3.12+
- A running AIO Sandbox instance at http://localhost:8080, started with Docker:
  docker run --security-opt seccomp=unconfined --rm -it -p 8080:8080 ghcr.io/agent-infra/sandbox:latest
- The Python SDK installed:
  pip install agent-sandbox
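Before running the walkthrough, it can help to confirm that something is answering on the sandbox port. The following is a minimal sketch using only the Python standard library. It makes no assumption about the sandbox's own endpoints and simply treats any HTTP response from the base URL as "reachable"; `is_reachable` is a hypothetical helper name, not part of the SDK.

```python
import urllib.error
import urllib.request


def is_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if any HTTP server answers at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False


if __name__ == "__main__":
    print("Sandbox reachable:", is_reachable("http://localhost:8080"))
```

If this prints False, the Docker container is probably not running or the port mapping differs from the one shown above.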
🧠 Mental Model
Think of AIO Sandbox as a remote, disposable Linux machine that your agent controls via APIs.
Filesystem → where data and artifacts live
Shell → how actions are executed
Simple Flow:
Agent → API → Sandbox → Filesystem + Shell
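To make the mental model concrete without a running sandbox, here is a purely local analogy: a throwaway directory plays the role of the filesystem, and a subprocess plays the role of the shell. This is only a sketch of the idea. It provides none of the sandbox's isolation and does not use the agent-sandbox SDK; `run_in_scratch_dir`, `job.py`, and `out.txt` are hypothetical names chosen for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_scratch_dir(script: str, data: str) -> tuple[int, str, str]:
    """Write data and a script into a throwaway directory, run it, read the artifact."""
    with tempfile.TemporaryDirectory() as workdir:
        root = Path(workdir)
        # "Filesystem": state lives in files inside the scratch directory.
        (root / "data.txt").write_text(data)
        (root / "job.py").write_text(script)
        # "Shell": an action is a command executed against that state.
        proc = subprocess.run(
            [sys.executable, "job.py"],
            cwd=root,
            capture_output=True,
            text=True,
        )
        out_file = root / "out.txt"
        artifact = out_file.read_text() if out_file.exists() else ""
    return proc.returncode, proc.stdout, artifact


SCRIPT = (
    'total = sum(int(x) for x in open("data.txt").read().split())\n'
    'open("out.txt", "w").write(str(total))\n'
    'print("done")\n'
)

exit_code, stdout, artifact = run_in_scratch_dir(SCRIPT, "1 2 3")
print(exit_code, stdout.strip(), artifact)
```

The sandbox follows the same write, execute, read pattern, except that the directory and the process live on a separate, disposable machine reached through the API.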
Autonomous Data Processing & Validation Agent
Rather than presenting each API in isolation, we'll walk through a single end-to-end agent workflow running inside the sandbox.
This example simulates an agent that executes the following workflow:
Create some data
Read it (look at it)
Write a script file (process.py)
List files (see what exists)
Run the script
Read the output file
Check if output looks correct
Notice bad data & Fix the data
Run the script again
Read updated output
Find files created
Download final result
What this use case demonstrates
It shows a realistic agent loop built on the AIO Sandbox file and shell primitives:
Read → Execute → Read → Validate → Fix → Re-run → Export
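The control flow of that loop can be sketched in plain Python, independent of the sandbox. The example below is an illustration under stated assumptions: `summarize` and `run_until_clean` are hypothetical helpers, the "execute" step is an in-process function rather than a shell command, and the fix table is supplied by the caller rather than inferred by an agent.

```python
def summarize(lines: list[str]) -> tuple[list[int], list[str]]:
    """Split raw lines into parsed integers and the lines that failed to parse."""
    valid, rejected = [], []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        try:
            valid.append(int(text))
        except ValueError:
            rejected.append(text)
    return valid, rejected


def run_until_clean(lines, fixes, max_attempts=3):
    """Execute, validate, repair, and re-run until nothing is rejected."""
    for attempt in range(1, max_attempts + 1):
        valid, rejected = summarize(lines)  # Execute, then read the result
        if not rejected:                    # Validate
            return attempt, valid
        # Fix: swap each line that has a known replacement, keep the rest.
        lines = [fixes.get(line.strip(), line) for line in lines]
    raise RuntimeError(f"still invalid after {max_attempts} attempts: {rejected}")


attempts, numbers = run_until_clean(
    ["10", "20", "INVALID", "40", "50"],
    fixes={"INVALID": "30"},
)
print(attempts, numbers, sum(numbers) / len(numbers))
```

Bounding the number of attempts is a deliberate choice here: an agent that cannot repair its input should stop and report rather than loop indefinitely.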
This workflow makes each AIO Sandbox File & Shell primitive feel purposeful:
File primitives
- write_file creates data and code
- read_file lets the agent inspect inputs and outputs
- list_path gives workspace awareness
- replace_in_file lets the agent repair bad input
- search_in_file validates expected output
- find_files discovers generated artifacts
- download_file exports results out of the sandbox
Shell primitive
- exec_command runs the actual processing job
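Since exec_command takes a shell command string, any path interpolated into that string is interpreted by the shell. A small sketch of one way to guard against spaces or shell metacharacters in paths, using the standard library's `shlex.quote`, is shown below. It assumes the sandbox shell is POSIX-like; `build_run_command` and the example paths are hypothetical.

```python
import shlex


def build_run_command(workdir: str, script: str) -> str:
    """Build a 'cd <dir> && python3 <script>' string with both parts shell-quoted."""
    return f"cd {shlex.quote(workdir)} && python3 {shlex.quote(script)}"


# A path made only of safe characters is left untouched.
print(build_run_command("/home/user/data_agent", "process.py"))
# A path containing a space is wrapped in single quotes.
print(build_run_command("/home/user/my data", "process.py"))
```

The resulting string is what would be passed as the command argument in the walkthrough that follows.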
from agent_sandbox import Sandbox
client = Sandbox(base_url="http://localhost:8080")
# --------------------------------------------------
# 1. Setup workspace
# --------------------------------------------------
home_dir = client.sandbox.get_context().home_dir
app_dir = f"{home_dir}/data_agent"
data_path = f"{app_dir}/data.txt"
script_path = f"{app_dir}/process.py"
report_path = f"{app_dir}/report.txt"
print("Sandbox home directory:", home_dir)
print("App directory:", app_dir)
# --------------------------------------------------
# 2. Create raw input data (with an intentional error)
# --------------------------------------------------
client.file.write_file(
    file=data_path,
    content="""10
20
INVALID
40
50
""",
)
print("\nCreated raw input data.")
# --------------------------------------------------
# 3. Read and inspect input data
# --------------------------------------------------
data_preview = client.file.read_file(file=data_path)
print("\nRaw input data:")
print(data_preview.data.content)
# --------------------------------------------------
# 4. Write processing script
# --------------------------------------------------
client.file.write_file(
    file=script_path,
    content="""numbers = []
with open("data.txt") as f:
    for line in f:
        try:
            numbers.append(int(line.strip()))
        except ValueError:
            # Non-numeric lines are reported and skipped
            print("Skipping invalid line:", line.strip())
total = sum(numbers)
avg = total / len(numbers)
report = f\"\"\"Report Summary
--------------
Valid Count: {len(numbers)}
Total: {total}
Average: {avg}
\"\"\"
with open("report.txt", "w") as f:
    f.write(report)
print(report)
""",
)
print("\nCreated processing script.")
# --------------------------------------------------
# 5. List workspace contents
# --------------------------------------------------
workspace = client.file.list_path(
    path=app_dir,
    recursive=True,
)
print("\nWorkspace contents:")
for entry in workspace.data.files:
    print("-", entry.path)
# --------------------------------------------------
# 6. Execute the processing script
# --------------------------------------------------
result = client.shell.exec_command(
    command=f"cd {app_dir} && python3 process.py"
)
print("\nFirst execution output:")
print(result.data.output)
print("Exit code:", result.data.exit_code)
# --------------------------------------------------
# 7. Read generated report
# --------------------------------------------------
report = client.file.read_file(file=report_path)
print("\nGenerated report:")
print(report.data.content)
# --------------------------------------------------
# 8. Validate report contents
# --------------------------------------------------
search = client.file.search_in_file(
    file=report_path,
    regex=r"Average: .*",
)
print("\nReport validation result:")
print(search)
# --------------------------------------------------
# 9. Detect bad input and fix it
# --------------------------------------------------
data_check = client.file.read_file(file=data_path)
if "INVALID" in data_check.data.content:
    print("\nDetected invalid data. Fixing input file...")
    client.file.replace_in_file(
        file=data_path,
        old_str="INVALID",
        new_str="30",
    )
# Read input again after fix
updated_data = client.file.read_file(file=data_path)
print("\nUpdated input data:")
print(updated_data.data.content)
# --------------------------------------------------
# 10. Re-run the processing script
# --------------------------------------------------
result = client.shell.exec_command(
    command=f"cd {app_dir} && python3 process.py"
)
print("\nSecond execution output:")
print(result.data.output)
print("Exit code:", result.data.exit_code)
# --------------------------------------------------
# 11. Read final report again
# --------------------------------------------------
final_report = client.file.read_file(file=report_path)
print("\nFinal report:")
print(final_report.data.content)
# --------------------------------------------------
# 12. Find generated text artifacts
# --------------------------------------------------
artifacts = client.file.find_files(
    path=app_dir,
    glob="*.txt",
)
print("\nDiscovered artifacts:")
print(artifacts)
# --------------------------------------------------
# 13. Download final report to local machine
# --------------------------------------------------
with open("final_report.txt", "wb") as f:
    for chunk in client.file.download_file(path=report_path):
        f.write(chunk)
print("\nFinal report downloaded locally as final_report.txt")
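One detail of the sample data is worth noting. The first run skips the INVALID line and averages four values (total 120), while the fixed run averages five (total 150), so both reports show an average of 30.0. A check that only looks for an "Average:" line therefore passes in both cases. The sketch below parses the report text locally and compares the Valid Count field instead; `parse_report` is a hypothetical helper, and the two report strings are written out by hand from the script's format rather than fetched from the sandbox.

```python
import re


def parse_report(text: str) -> dict[str, float]:
    """Extract the numeric fields from the 'Label: value' lines of the report."""
    fields = {}
    pattern = r"^(Valid Count|Total|Average): (\S+)$"
    for label, value in re.findall(pattern, text, re.MULTILINE):
        fields[label] = float(value)
    return fields


first = parse_report(
    "Report Summary\n--------------\nValid Count: 4\nTotal: 120\nAverage: 30.0\n"
)
final = parse_report(
    "Report Summary\n--------------\nValid Count: 5\nTotal: 150\nAverage: 30.0\n"
)

# The averages are identical, so they cannot reveal the skipped line.
print(first["Average"] == final["Average"])
# The valid count is what differs between the two runs.
print(first["Valid Count"], final["Valid Count"])
```

In an agent loop, comparing the valid count against the number of input lines would be a more reliable signal that a re-run is needed.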
🎯 Key Insight
AIO Sandbox gives agents a safe, programmable runtime:
Files → memory/state
Shell → actions
Together, they enable real-world workflows like:
code generation and execution
data processing
automation pipelines
tool orchestration
🚀 What’s Next
Thanks for reading! Hope it was helpful! This is just the beginning. In upcoming posts, we’ll dive into topics such as:
🌐 Browser automation (CDP-based)
🔌 MCP tool integration
📓 Jupyter / notebook execution
🤖 OpenClaw integration
🎯 Reinforcement learning inside sandbox
💬 Final Thoughts
AIO Sandbox bridges the gap between:
“LLM that generates text”
and
“Agent that can actually do things”
And it does so safely, reproducibly, and programmatically.