Recently I’ve been experimenting with AI agents that can execute code, install packages, and run shell commands.
One thing quickly became uncomfortable: most of this code runs directly on the host machine.
If an AI agent runs something unexpected — deletes files, installs a malicious package, or misconfigures the environment — it can affect the entire system.
Containers help, but they still share the host kernel. I started wondering:
What if every AI agent ran inside its own lightweight virtual machine instead?
That idea led me to build a small project called BunkerVM.
The idea
Instead of executing agent commands on the host machine, BunkerVM launches a Firecracker microVM and runs the agent inside it.
The flow looks like this:
AI Agent
↓
BunkerVM runtime
↓
Firecracker microVM
↓
Isolated Linux environment
If anything goes wrong, the VM can simply be destroyed and the host remains untouched.
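To make the flow concrete, here is a minimal sketch of how a runtime like BunkerVM can drive Firecracker over its Unix-socket HTTP API. This is not BunkerVM's actual code; the socket path, kernel image, and rootfs paths are placeholders, and the payloads follow Firecracker's public API (machine-config, boot-source, a root drive, then `InstanceStart`).

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP over a Unix domain socket (Firecracker's API transport)."""

    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self.socket_path)
        self.sock = sock


def vm_config(kernel_path, rootfs_path, vcpus=1, mem_mib=256):
    """Build the PUT payloads Firecracker expects before InstanceStart."""
    return {
        "/machine-config": {"vcpu_count": vcpus, "mem_size_mib": mem_mib},
        "/boot-source": {
            "kernel_image_path": kernel_path,
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        },
        "/drives/rootfs": {
            "drive_id": "rootfs",
            "path_on_host": rootfs_path,
            "is_root_device": True,
            "is_read_only": False,
        },
    }


def launch(api_socket, kernel_path, rootfs_path):
    """Configure and boot a microVM through the Firecracker API socket."""
    conn = UnixHTTPConnection(api_socket)
    for path, body in vm_config(kernel_path, rootfs_path).items():
        conn.request("PUT", path, json.dumps(body),
                     {"Content-Type": "application/json"})
        resp = conn.getresponse()
        resp.read()
        assert resp.status < 300, f"PUT {path} failed: {resp.status}"
    # Boot the configured VM.
    conn.request("PUT", "/actions",
                 json.dumps({"action_type": "InstanceStart"}),
                 {"Content-Type": "application/json"})
    conn.getresponse().read()
```

Destroying the sandbox is then just killing the `firecracker` process; nothing persists on the host beyond the rootfs image you handed it.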
Why Firecracker?
Firecracker is a lightweight virtualization technology originally developed by AWS.
It powers services like AWS Lambda and AWS Fargate.
Compared to traditional VMs:
- extremely fast startup (~1–2 seconds)
- minimal resource overhead
- strong isolation
That makes it a good fit for running short-lived agent environments.
What BunkerVM does
Currently BunkerVM:
- boots a minimal Linux VM in ~2 seconds
- provides a sandbox with Python, bash, git, and curl
- communicates with the host using vsock
- allows agents to run tools safely inside the VM
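The host↔guest channel can be sketched with Python's `AF_VSOCK` support on Linux. The length-prefixed JSON framing and the port number below are my own illustration, not necessarily the wire protocol BunkerVM actually uses:

```python
import json
import socket
import struct

GUEST_PORT = 52000  # hypothetical port the agent process listens on in the VM


def encode_msg(obj):
    """Frame a JSON message with a 4-byte big-endian length prefix."""
    payload = json.dumps(obj).encode()
    return struct.pack(">I", len(payload)) + payload


def _recv_exact(sock, n):
    """Read exactly n bytes, looping over partial recv() results."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def decode_msg(sock):
    """Read one length-prefixed JSON message from a connected socket."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length))


def run_in_vm(guest_cid, command):
    """Host side: send a command to the agent inside the microVM."""
    s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    s.connect((guest_cid, GUEST_PORT))
    s.sendall(encode_msg({"cmd": command}))
    return decode_msg(s)
```

The appeal of vsock here is that it needs no network stack inside the guest: there is no IP address, no firewall rules, just a point-to-point channel between the host and that one VM.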
If something breaks or behaves unexpectedly, the VM can simply be terminated.
Current limitations
This is still an early project and there are plenty of rough edges.
Some current limitations:
- single VM instance
- requires Linux with KVM
- requires sudo privileges
- limited configuration options
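Because Firecracker depends on KVM, a preflight check for the Linux-with-KVM requirement could look like this. This is a generic check, not taken from BunkerVM itself:

```python
import os
import platform


def kvm_available():
    """Return True if this host looks able to run Firecracker:
    Linux, with /dev/kvm present and accessible to the current user."""
    if platform.system() != "Linux":
        return False
    return os.access("/dev/kvm", os.R_OK | os.W_OK)


if __name__ == "__main__":
    print("KVM available:", kvm_available())
```

If `/dev/kvm` exists but isn't accessible, adding your user to the `kvm` group is usually enough to avoid running the whole tool under sudo.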
Future improvements could include:
- persistent VM sessions
- snapshot support
- multiple sandbox instances
- remote execution API
Trying it out
If you're curious, the project is open source.
GitHub:
https://github.com/ashishgituser/bunkervm
The goal right now is simply to experiment with safer ways to run AI-generated code.
I'm curious about your approach
If you're building tools that execute AI-generated code, how are you isolating it?
Containers?
VMs?
Something else?
Would love to hear how others are approaching this problem.