Recently I’ve been experimenting with AI agents that can execute code, install packages, and run shell commands.
One thing quickly became uncomfortable: most of this code runs directly on the host machine.
If an AI agent runs something unexpected — deletes files, installs a malicious package, or misconfigures the environment — it can affect the entire system.
Containers help, but they still share the host kernel. I started wondering:
What if every AI agent ran inside its own lightweight virtual machine instead?
That idea led me to build a small project called BunkerVM.
The idea
Instead of executing agent commands on the host machine, BunkerVM launches a Firecracker microVM and runs the agent inside it.
The flow looks like this:
AI Agent
↓
BunkerVM runtime
↓
Firecracker microVM
↓
Isolated Linux environment
If anything goes wrong, the VM can simply be destroyed and the host remains untouched.
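To make the flow concrete, here is a minimal sketch of how a runtime like BunkerVM can drive Firecracker over its Unix-socket HTTP API. This is not BunkerVM's actual code; the socket path, kernel image, and rootfs paths are placeholders, and the payloads follow Firecracker's public API (machine-config, boot-source, a root drive, then `InstanceStart`).

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP over a Unix domain socket (Firecracker's API transport)."""

    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self.socket_path)
        self.sock = sock


def vm_config(kernel_path, rootfs_path, vcpus=1, mem_mib=256):
    """Build the PUT payloads Firecracker expects before InstanceStart."""
    return {
        "/machine-config": {"vcpu_count": vcpus, "mem_size_mib": mem_mib},
        "/boot-source": {
            "kernel_image_path": kernel_path,
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        },
        "/drives/rootfs": {
            "drive_id": "rootfs",
            "path_on_host": rootfs_path,
            "is_root_device": True,
            "is_read_only": False,
        },
    }


def launch(api_socket, kernel_path, rootfs_path):
    """Configure and boot a microVM through the Firecracker API socket."""
    conn = UnixHTTPConnection(api_socket)
    for path, body in vm_config(kernel_path, rootfs_path).items():
        conn.request("PUT", path, json.dumps(body),
                     {"Content-Type": "application/json"})
        resp = conn.getresponse()
        resp.read()
        assert resp.status < 300, f"PUT {path} failed: {resp.status}"
    # Boot the configured VM.
    conn.request("PUT", "/actions",
                 json.dumps({"action_type": "InstanceStart"}),
                 {"Content-Type": "application/json"})
    conn.getresponse().read()
```

Destroying the sandbox is then just killing the `firecracker` process; nothing persists on the host beyond the rootfs image you handed it.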
Why Firecracker?
Firecracker is a lightweight virtualization technology originally developed by AWS.
It powers services like AWS Lambda and AWS Fargate.
Compared to traditional VMs:
- extremely fast startup (~1–2 seconds)
- minimal resource overhead
- strong isolation
That makes it a good fit for running short-lived agent environments.
What BunkerVM does
Currently BunkerVM:
- boots a minimal Linux VM in ~2 seconds
- provides a sandbox with Python, bash, git, and curl
- communicates with the host using vsock
- allows agents to run tools safely inside the VM
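The host↔guest channel can be sketched with Python's `AF_VSOCK` support on Linux. The length-prefixed JSON framing and the port number below are my own illustration, not necessarily the wire protocol BunkerVM actually uses:

```python
import json
import socket
import struct

GUEST_PORT = 52000  # hypothetical port the agent process listens on in the VM


def encode_msg(obj):
    """Frame a JSON message with a 4-byte big-endian length prefix."""
    payload = json.dumps(obj).encode()
    return struct.pack(">I", len(payload)) + payload


def _recv_exact(sock, n):
    """Read exactly n bytes, looping over partial recv() results."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def decode_msg(sock):
    """Read one length-prefixed JSON message from a connected socket."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length))


def run_in_vm(guest_cid, command):
    """Host side: send a command to the agent inside the microVM."""
    s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    s.connect((guest_cid, GUEST_PORT))
    s.sendall(encode_msg({"cmd": command}))
    return decode_msg(s)
```

The appeal of vsock here is that it needs no network stack inside the guest: there is no IP address, no firewall rules, just a point-to-point channel between the host and that one VM.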
If something breaks or behaves unexpectedly, the VM can simply be terminated.
Current limitations
This is still an early project and there are plenty of rough edges.
Some current limitations:
- single VM instance
- requires Linux with KVM
- requires sudo privileges
- limited configuration options
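Because Firecracker depends on KVM, a preflight check for the Linux-with-KVM requirement could look like this. This is a generic check, not taken from BunkerVM itself:

```python
import os
import platform


def kvm_available():
    """Return True if this host looks able to run Firecracker:
    Linux, with /dev/kvm present and accessible to the current user."""
    if platform.system() != "Linux":
        return False
    return os.access("/dev/kvm", os.R_OK | os.W_OK)


if __name__ == "__main__":
    print("KVM available:", kvm_available())
```

If `/dev/kvm` exists but isn't accessible, adding your user to the `kvm` group is usually enough to avoid running the whole tool under sudo.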
Future improvements could include:
- persistent VM sessions
- snapshot support
- multiple sandbox instances
- remote execution API
Trying it out
If you're curious, the project is open source.
GitHub:
https://github.com/ashishgituser/bunkervm
The goal right now is simply to experiment with safer ways to run AI-generated code.
I'm curious about your approach
If you're building tools that execute AI-generated code, how are you isolating it?
Containers?
VMs?
Something else?
Would love to hear how others are approaching this problem.