JerryLin

Posted on Apr 30

Cube Sandbox is Now Open Source: Why We Built a Fast and Secure Sandbox for AI Agents?

#ai

With the rapid advancement of LLM capabilities, AI Agent applications are experiencing explosive growth. Recently, Anthropic unveiled its latest Managed Agent architecture, which completely decouples Agents into three core components: Session, Harness, and Sandbox. This confirms a key industry consensus: the best practice for supporting complex tool calls and code execution in Agents is to isolate them in a fully independent sandbox environment.

However, as the critical component in the Agent architecture that carries code execution, a Sandbox must simultaneously meet two extremely demanding requirements: impenetrable security isolation and extreme concurrent elasticity for "spin up and tear down on demand".

Existing infrastructure faces a severe trade-off:

Containers: Fast startup and high concurrency, but the shared kernel cannot defend against escapes by malicious LLM-generated code, resulting in extremely poor multi-tenant security.
Traditional Virtual Machines: Hardware-level isolation is secure enough, but heavy OS overhead leads to minute-level cold starts and hundreds of MB of memory per instance, making them completely unsuitable for the transient, high-density scheduling demands of Agents.

To break this trade-off between performance and security, we built Cube Sandbox from scratch: a high-performance open-source secure sandbox service based on RustVMM and KVM. Today, the project is fully open-sourced on GitHub:

GitHub - TencentCloud/CubeSandbox: Instant, Concurrent, Secure & Lightweight Sandbox for AI Agents.

What is CubeSandbox?

Cube Sandbox is a high-performance, out-of-the-box secure sandbox service built on RustVMM and KVM. It supports both single-machine deployment and easy scaling to multi-machine cluster services. It is also the industry’s first open-source sandbox service that combines hardware-level isolation with sub-100ms startup.

Instead of wrapping Docker with an extra layer, we developed Cube Sandbox based on CloudHypervisor. Through a series of innovations and tests, we have outperformed traditional industry solutions in multiple dimensions. A brief comparison is shown below:

| Metric | Value | What it means |
| --- | --- | --- |
| Cold Start | <60ms | 2.5–50x faster than traditional solutions, faster than the blink of an eye |
| Memory per Instance | <5MB | 6x lower overhead than traditional solutions |
| Isolation Level | KVM hardware-level | Each sandbox has an independent Guest OS kernel and does not share the host kernel |
| Concurrency Capacity | 2000+ per machine | P95 remains within 137ms for 50 concurrent creations |
| E2B Compatibility | Native support | No business code changes required; just change one environment variable; the OpenAI Python SDK also works seamlessly |
| Deployment Scenarios | Single-machine & Cluster | Can run on a single machine as a personal Agent assistant, or scale out into a high-concurrency cluster service |
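
The concurrency row quotes a P95 latency for 50 concurrent creations. As a rough illustration of how such a figure can be measured, here is a minimal sketch. `fake_create_sandbox` is a hypothetical stand-in that only sleeps; it is not the Cube Sandbox or E2B API, and you would swap in your own creation call.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import quantiles


def timed_call(create):
    """Run one creation call and return its wall-clock latency in milliseconds."""
    start = time.perf_counter()
    create()
    return (time.perf_counter() - start) * 1000.0


def p95_latency_ms(create, concurrency=50):
    """Fire `concurrency` creation calls in parallel and return the P95 latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: timed_call(create), range(concurrency)))
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    # method="inclusive" keeps the result inside the observed range.
    return quantiles(latencies, n=100, method="inclusive")[94]


def fake_create_sandbox():
    """Hypothetical placeholder for a real sandbox creation request."""
    time.sleep(0.01)


if __name__ == "__main__":
    print(f"P95: {p95_latency_ms(fake_create_sandbox):.1f} ms")
```

Measuring from the client side like this includes network and SDK overhead, so the numbers are only comparable with the table if the test runs on the same host as the service.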

Technical Principles: How to Achieve Speed, Lightweight, and Security?

Cube Sandbox follows a clear top-down layered architecture, divided into control plane and data plane. The core components are as follows:

CubeAPI: E2B-compatible REST API gateway. Switching from E2B Cloud to Cube Sandbox only requires changing environment variables such as URL.
CubeMaster: Orchestration scheduler that receives E2B API requests and distributes them to corresponding Cubelets, responsible for resource scheduling and cluster state maintenance.
CubeProxy: Reverse proxy and request routing component that forwards requests from SDK clients to corresponding sandbox instances by parsing the <port>-<sandbox_id>.<domain> format in the Host header.
Cubelet: Local scheduling component on compute nodes that manages the full lifecycle of all sandbox instances on a single node.
CubeVS: Kernel-level forwarding based on eBPF, providing complete network isolation mechanisms and security policy support at the network layer.
CubeHypervisor & CubeShim: Virtualization layer of Cube Sandbox. CubeHypervisor manages KVM MicroVMs, and CubeShim implements the containerd Shim v2 interface to integrate sandboxes into the container runtime.
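
To make the CubeProxy routing rule concrete, the following is a minimal sketch of parsing a `<port>-<sandbox_id>.<domain>` Host header. It is not CubeProxy's actual code: the `Route` type, the `parse_host` name, and the assumption that a sandbox ID contains no dots are all illustrative.

```python
import re
from typing import NamedTuple


class Route(NamedTuple):
    port: int
    sandbox_id: str
    domain: str


# <port>-<sandbox_id>.<domain>, optionally followed by ":<listen port>".
# Assumption: the sandbox ID itself contains no dots.
_HOST_RE = re.compile(
    r"^(?P<port>\d{1,5})-(?P<sandbox_id>[^.:]+)\.(?P<domain>[^:]+)(?::\d+)?$"
)


def parse_host(host: str) -> Route:
    """Split a Host header into target port, sandbox ID, and base domain."""
    match = _HOST_RE.match(host.strip())
    if match is None:
        raise ValueError(f"unroutable Host header: {host!r}")
    port = int(match["port"])
    if not 0 < port <= 65535:
        raise ValueError(f"port out of range in Host header: {host!r}")
    return Route(port, match["sandbox_id"], match["domain"])
```

For example, `parse_host("49999-abc123.cube.example.com")` yields port `49999`, sandbox ID `abc123`, and domain `cube.example.com` (a made-up domain), which is enough for a proxy to pick the target sandbox and the port inside it.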

Cube Sandbox Layered Architecture Diagram

Here are some core technical points in our self-developed Cube Sandbox:

1) Self-developed Lightweight VMM (CubeVM)

This is the core of the entire project. Instead of directly using Firecracker (AWS open-source MicroVM), we implemented a VMM from scratch in Rust, optimized specifically for AI Agent scenarios.

Why build our own? Because Firecracker is a general-purpose MicroVM, and its startup process includes many steps unnecessary for Agent sandbox scenarios. We performed full-link tailoring and optimization for this scenario:

Minimized device model: Only retain virtual devices essential for sandbox scenarios (virtio-net, virtio-blk, serial), removing all unnecessary peripheral simulations.
Customized Guest Kernel: A tailored Linux kernel that retains only the minimal feature set required for Agent execution, drastically shortening the kernel boot path.
User-space interrupt handling: Critical I/O paths are completed in user space, reducing kernel-mode switching overhead.

2) Resource Pool Pre-creation + Snapshot Cloning (Key to <60ms Cold Start)

Fast cold start does not come from the VM itself booting quickly (that alone only gets down to hundreds of milliseconds), but from a resource pool combined with a snapshot mechanism. Specifically:

Resource pool pre-creation: A batch of already started “blank sandboxes” are maintained in the background. When a request arrives, it is directly taken from the pool, skipping the entire startup process.
Snapshot cloning: Based on Copy-on-Write (CoW) technology, new instances are cloned instantly from a template sandbox. Memory pages are allocated physical memory only when actually written.

This is why a single instance uses <5MB memory: most memory pages are shared with the template (read-only), and only pages actually written during Agent execution occupy additional memory.
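
The two ideas above can be modeled in a few lines. This is a toy user-space analogy, not Cube Sandbox's implementation: real copy-on-write happens at the page-table level in the host kernel and hypervisor, and the `CowClone` and `WarmPool` names are hypothetical.

```python
from collections import deque
from types import MappingProxyType

PAGE_SIZE = 4096  # bytes per page in this toy model


class CowClone:
    """A clone that shares template pages until it writes to one."""

    def __init__(self, template_pages):
        self._shared = template_pages  # read-only, shared by every clone
        self._private = {}             # page number -> this clone's own copy

    def read(self, page_no):
        return self._private.get(page_no, self._shared[page_no])

    def write(self, page_no, data):
        # The first write to a page is what gives the clone a private copy.
        self._private[page_no] = data

    @property
    def private_bytes(self):
        return len(self._private) * PAGE_SIZE


class WarmPool:
    """Keeps a few clones ready so a request can skip the creation step."""

    def __init__(self, template_pages, target_size):
        self._template = MappingProxyType(template_pages)
        self._target = target_size
        self._idle = deque()
        self.refill()

    def refill(self):
        # A real service would top the pool up in the background.
        while len(self._idle) < self._target:
            self._idle.append(CowClone(self._template))

    def acquire(self):
        # Fast path: take a ready clone. Slow path: create one on demand.
        return self._idle.popleft() if self._idle else CowClone(self._template)

    @property
    def idle_count(self):
        return len(self._idle)
```

In this model a freshly acquired clone reports `private_bytes == 0`, and its footprint grows by one page per distinct page written, which mirrors why per-instance overhead depends on what the Agent actually touches rather than on the template size.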

3) Network Isolation and Security Control

Security is not just VM isolation; the network layer is also critical:

Each sandbox has an independent virtual network stack.
Kernel-level outbound traffic filtering is implemented based on eBPF, allowing fine-grained control over which external addresses each sandbox can access.
Supports dynamic delivery and updating of network policies without restarting the sandbox.
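
The decision logic behind such a policy can be sketched in user space. Cube Sandbox enforces it in the kernel with eBPF; the `EgressPolicy` class below is a hypothetical model of a default-deny allowlist that can be swapped at runtime, not the project's API.

```python
import ipaddress
import threading


class EgressPolicy:
    """Default-deny outbound policy: a destination passes only if it falls
    inside one of the allowed networks."""

    def __init__(self, allowed_cidrs=()):
        self._lock = threading.Lock()
        self._networks = [ipaddress.ip_network(c) for c in allowed_cidrs]

    def update(self, allowed_cidrs):
        """Swap in a new rule set while the sandbox keeps running."""
        networks = [ipaddress.ip_network(c) for c in allowed_cidrs]
        with self._lock:
            self._networks = networks

    def allows(self, dest_ip):
        """Return True if dest_ip is covered by the current rule set."""
        addr = ipaddress.ip_address(dest_ip)
        with self._lock:
            return any(addr in net for net in self._networks)
```

Building the new rule list before taking the lock keeps the swap itself short, which is the same property a live policy update needs: traffic is always checked against either the old rules or the new ones, never a half-updated set.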

4) E2B Protocol Compatibility

E2B is currently the de facto standard protocol in the AI Agent sandbox field, used by products such as Manus, Perplexity, and Hugging Face. Cube Sandbox is natively compatible with the E2B protocol at the API layer, which means:

Projects already using the E2B SDK only need to change the endpoint to the CubeSandbox address, with no other code changes.
No need to learn a new SDK or modify business logic.
It provides an “open-source alternative” to E2B: better performance, fully open-source, and self-hostable.

Currently, Cube Sandbox is continuously expanding its ecosystem capabilities and building compatibility and integration solutions with mainstream Agent frameworks and the open-source community.

Event-level snapshot rollback capability is also coming soon — sub-100ms state rollback, providing additional protection for the unpredictable behavior of Agents.

For detailed technical principles, please refer to: https://km.woa.com/articles/show/657197

Case Studies

Cube Sandbox has already been validated by large-scale real workloads, both in Tencent Cloud's internal production environments and with external customers. It has served tens of billions of calls and supports the stable operation of products with hundreds of millions of users, such as Tencent Yuanbao. In more demanding scenarios, it has enabled a leading domestic model application vendor to schedule hundreds of thousands of sandbox instances within minutes for Agentic RL training, effectively addressing pain points such as malicious code execution, data leakage, and resource abuse caused by Agent autonomy. Here are several typical cases:

Case 1: Leading Domestic Model Application — Leap in Code Execution Experience

Pain Point: AI Agents need to execute code in real time and return results. Traditional solutions either use Docker (shared kernel, high security risks) or VMs (startup takes seconds, users wait). Moreover, sandboxes are not recycled in time after code execution, leading to serious resource waste during peak hours.

After adopting CubeSandbox: Each request is assigned an independent micro virtual machine with hardware-level isolation. Resource pool pre-creation plus snapshot cloning reduces sandbox delivery to <60ms, and code execution latency drops from seconds to hundreds of milliseconds, so users feel that the Agent responds instantly. Meanwhile, on-demand creation and immediate destruction reduce resource usage by 95%.

Case 2: Leading Domestic Model Vendor — Qualitative Leap in RL Training Efficiency

Pain Point: Agentic RL training requires massive sandboxes for code execution experiments — each episode needs an independent isolated environment, which is destroyed and recreated after running. Traditional solutions are extremely slow to spin up training sandboxes; the cumulative waiting time for thousands of episodes is huge, causing massive GPU computing power idling.

After adopting CubeSandbox: With <5MB of memory per instance, single-machine concurrency increases by dozens of times, and <60ms startup means almost zero waiting between episodes. Training sandboxes that previously took 30 minutes to spin up are now ready in 1 minute, greatly improving training efficiency. Each episode runs in a clean, independent environment, eliminating the risk of residual files contaminating training results.

Case 3: Secure Agent Tool Invocation

Pain Point: In addition to running code, Agents also need to call various external APIs, search, and perform file operations. Preventing data leakage and unauthorized access is a critical security red line in enterprise scenarios.

CubeSandbox Solution: Each sandbox has an independent network stack, with no external network access by default. Fine-grained outbound whitelists are implemented through eBPF kernel-level traffic control — only domains required for business are allowed, others are blocked. Policies can be dynamically delivered without restarting sandboxes, and all outbound requests are fully auditable and traceable.
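
As an illustration of the domain whitelist plus audit trail described in this case, here is a minimal sketch. The function names and the audit record fields are assumptions made for the example; the source does not specify how Cube Sandbox matches domains or stores audit data.

```python
from datetime import datetime, timezone


def domain_allowed(hostname, allowlist):
    """Allow a host if it equals an allowed domain or is a subdomain of one."""
    host = hostname.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in allowlist)


def check_and_audit(sandbox_id, hostname, allowlist, audit_log):
    """Decide on one outbound request and record the verdict either way."""
    allowed = domain_allowed(hostname, allowlist)
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "sandbox_id": sandbox_id,
        "host": hostname,
        "verdict": "allow" if allowed else "block",
    })
    return allowed
```

Matching on `"." + d` rather than a bare suffix matters: with `example.com` allowed, `api.example.com` passes but `evil-example.com` does not. Logging blocked requests as well as allowed ones is what makes the trail useful for investigating attempted leaks.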

How to Deploy and Experience

Environment Requirements:

Linux system (KVM support required)
Recommended: OpenCloudOS 9
Hardware must support virtualization
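
Before installing, it can help to confirm that the host meets these requirements. The sketch below is a hypothetical preflight check, not part of the Cube Sandbox tooling. It assumes an x86_64 Linux host, where `/proc/cpuinfo` exposes the `vmx` (Intel VT-x) or `svm` (AMD-V) flag and KVM appears as `/dev/kvm`.

```python
import os
import re


def cpu_supports_virtualization(cpuinfo_text):
    """True if a 'flags' line lists Intel VT-x (vmx) or AMD-V (svm)."""
    pattern = r"^flags\s*:.*\b(vmx|svm)\b"
    return re.search(pattern, cpuinfo_text, re.MULTILINE) is not None


def kvm_ready(dev_path="/dev/kvm", cpuinfo_path="/proc/cpuinfo"):
    """Check that the KVM device is accessible and the CPU advertises virtualization."""
    if not os.access(dev_path, os.R_OK | os.W_OK):
        return False
    try:
        with open(cpuinfo_path) as f:
            return cpu_supports_virtualization(f.read())
    except OSError:
        return False


if __name__ == "__main__":
    print("KVM ready" if kvm_ready() else "KVM not available on this host")
```

If the check fails on a cloud VM, the usual cause is that nested virtualization is not enabled for that instance type.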

Deployment Process:

We have simplified the deployment process to 4 steps, and users can achieve faster access and deployment through one-click deployment scripts:

Start development VM (skip if you already have an x86_64 bare-metal Linux server)

Clone the repository and start a disposable OpenCloudOS 9 development VM:

git clone https://github.com/tencentcloud/CubeSandbox.git
cd CubeSandbox/dev-env
./prepare_image.sh
./run_vm.sh

Open a new terminal:

cd CubeSandbox/dev-env && ./login.sh

Start Cube Sandbox Service

Execute inside the logged-in VM:

curl -sL https://cnb.cool/CubeSandbox/CubeSandbox/-/git/raw/master/deploy/one-click/online-install.sh | MIRROR=cn bash

Create Code Interpreter Sandbox Template

After installation, create a code interpreter template using the pre-built image:

cubemastercli tpl create-from-image \
  --image ccr.ccs.tencentyun.com/ags-image/sandbox-code:latest \
  --writable-layer-size 1G \
  --expose-port 49999 \
  --expose-port 49983 \
  --probe 49999

Wait for the command to complete, and the template status will become READY. Record the template_id from the output for the next step.

Run Your First Agent Code

Install Python SDK:

yum install -y python3 python3-pip

pip config set global.index-url https://mirrors.ustc.edu.cn/pypi/simple

pip install e2b-code-interpreter

Set environment variables:

export E2B_API_URL="http://127.0.0.1:3000"
export E2B_API_KEY="dummy"
export CUBE_TEMPLATE_ID="<your-template-id>"
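
A common slip at this step is leaving a variable unset or keeping the `<your-template-id>` placeholder. The helper below is a small optional sketch (the `missing_settings` name is made up for this example) that reports which of the three variables still need attention.

```python
import os

REQUIRED = ("E2B_API_URL", "E2B_API_KEY", "CUBE_TEMPLATE_ID")


def missing_settings(env=None):
    """Return the names of required variables that are unset, empty,
    or still hold a '<...>' placeholder."""
    env = os.environ if env is None else env
    return [
        name for name in REQUIRED
        if not env.get(name) or env[name].startswith("<")
    ]


if __name__ == "__main__":
    problems = missing_settings()
    print("OK" if not problems else f"Please set: {', '.join(problems)}")
```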

Run code in an isolated sandbox:

import os

from e2b_code_interpreter import Sandbox

# The template ID comes from the "create-from-image" step above.
with Sandbox.create(template=os.environ["CUBE_TEMPLATE_ID"]) as sandbox:
    # The code string runs inside the isolated sandbox, not on the host.
    result = sandbox.run_code("print('Hello from Cube Sandbox, safely isolated!')")
    print(result)

For more variable descriptions and examples, see Quick Start — Step 4.

Millisecond-level Startup

Want to explore more? Check out the examples/ directory, covering code execution, Shell commands, file operations, browser automation, network policies, pause/resume, OpenClaw integration, RL training, and other scenarios.

💡 Special Recommendation: Cube Sandbox is fully adapted to OpenCloudOS 9 (OC9). We strongly recommend that internal colleagues build their ultra-fast, secure Agent execution environment on the native OC9 + Cube combination.

Conclusion & Invitation to Collaborate

We open-source Cube Sandbox completely because we firmly believe: in the era of intelligent agents, high-performance, high-security underlying infrastructure should not be monopolized by closed-source commercial APIs — it should become an open, self-hostable industry cornerstone.

The project has just been released and is still in the early stage of rapid iteration. We sincerely welcome internal architects, R&D colleagues, and product colleagues to check out our code, put forward suggestions, share ideas, and build together with us.

👉 GitHub Open Source Repository: GitHub - TencentCloud/CubeSandbox

👉 Quick Start Guide: CubeSandbox/docs/zh/guide/quickstart.md

If this project inspires or helps your business, please give it a Star 🌟 on GitHub! If you run into bugs or have feature requests while trying or integrating it, you are welcome to join our internal WeChat group for feedback and discussion, or to submit an Issue/PR in the repository. Let's build the secure foundation for the intelligent agent era together!
