Where containers stop being simple
Containers are sold as a solved abstraction. You package a filesystem, declare a process, and the world becomes reproducible. That story is mostly true - until the moment you ask the container to do something that leaks across the kernel boundary.
That moment is usually accidental.
You start by “just adding a dependency.” Maybe a browser for tooling. Maybe an emulator. Maybe a sandbox that needs stronger isolation. The Dockerfile grows a few lines. Everything still builds. Tests still pass. And then, quietly, you hit the edge of what containers can actually promise.
I hit that edge while working on a containerized IDE environment - one that wasn’t just compiling code, but running a full graphical toolchain and emulators inside a browser-accessible container. On paper, it was still “just Docker.” In practice, it forced a confrontation with an uncomfortable truth:
containers don’t virtualize the kernel; they borrow it.
Once you internalize that, a lot of container folklore collapses.
Userland is easy. The kernel is not.
The first category of problems is deceptively straightforward. You want to harden behavior inside the environment - prevent certain protocol handlers, restrict what happens when a user clicks a link, reduce accidental escape hatches. That lives squarely in userland.
You install packages.
You write config files.
You control defaults.
This feels like progress because it is progress. It’s policy expressed as files, and containers are excellent at that.
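Concretely - with file names and paths that are illustrative rather than taken from any real setup - this kind of hardening is just files landing in the image at build time:

```bash
# A minimal sketch of policy-as-files (all names and paths illustrative),
# run during the image build, e.g. from Dockerfile RUN steps.

# Route URL schemes to a no-op handler instead of whatever happens to be
# registered, so "clicking a link" can't launch an arbitrary program.
install -D -m 0644 no-op.desktop  /usr/share/applications/no-op.desktop
install -D -m 0644 mimeapps.list  /etc/xdg/mimeapps.list

# Ship hardened defaults for every shell session.
install -D -m 0755 99-hardening.sh /etc/profile.d/99-hardening.sh
```

Everything here is a file write. The container runtime enforces none of it, but it also obstructs none of it - which is exactly why this category feels easy.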
Then comes the second category of problems, which look similar but are fundamentally different.
You want acceleration.
You want virtualization.
You want isolation stronger than namespaces.
So you install QEMU. You add configuration files that reference KVM. You write the incantations that every blog post seems to recommend. The image builds fine.
And nothing actually changes.
Because at this point, you are no longer configuring the container. You are attempting to configure the host kernel from inside a process that does not own it.
No amount of Dockerfile cleverness can cross that boundary.
Nested virtualization, device access, hardware acceleration - these are not properties of images. They are properties of the execution environment. They depend on CPU flags, kernel modules, hypervisor configuration, and runtime privileges. A container can only benefit from them if the host explicitly allows it to.
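You can watch the boundary directly. Assuming an image with QEMU installed (the image name below is made up), the device node simply isn't there until the host hands it over:

```bash
# The build succeeded - QEMU is installed in the image. But:
docker run --rm my-qemu-image ls -l /dev/kvm
# ls: cannot access '/dev/kvm': No such file or directory

# The capability is granted at runtime, by the host, per invocation:
docker run --rm --device=/dev/kvm my-qemu-image ls -l /dev/kvm

# And even that only works if the host kernel has the module loaded.
# Run on the host - no Dockerfile line can change the result:
lsmod | grep -w kvm
```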
This is the moment many container designs quietly break. Not because the idea was wrong, but because the abstraction was overextended.
The same boundary shows up in agent systems
This matters far beyond IDEs or emulators.
Modern AI systems increasingly rely on agents - processes that don’t just think, but act. They run tools. They clone repositories. They install dependencies. They execute arbitrary code. Often concurrently. Often on behalf of users.
At first glance, containers seem perfect for this (a concrete sketch follows the list):
- One container per agent.
- Clean filesystem.
- Resource limits via cgroups.
- Tear down when done.
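In Docker terms, that pattern is roughly this - the image name and limits are placeholders, the flags are standard:

```bash
# One disposable, resource-limited container per agent:
# cgroup limits, a mostly read-only filesystem, and no network by default.
docker run --rm --name agent-42 \
  --memory 512m --cpus 1 --pids-limit 256 \
  --read-only --tmpfs /tmp \
  --network none \
  agent-runtime:latest
```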
This works - until you care about any of the following:
- Running untrusted code.
- Preventing lateral movement.
- Controlling outbound network behavior.
- Enforcing strict filesystem policies.
- Supporting Docker-in-Docker–like workflows.
- Providing hardware acceleration safely.
At that point, you rediscover the same boundary: containers are not security sandboxes; they are process isolation with a shared kernel.
If your agent needs to cross into host-level capabilities - starting sibling containers, accessing /dev/kvm, mounting filesystems, manipulating network namespaces - you are back in the world of privileges, devices, and kernel trust.
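Each of those needs shows up as another host-side grant on the run command. The flags below are real Docker options (the image name is a placeholder), and every one of them widens the kernel exposure you were trying to shrink:

```bash
# Sibling containers: hand the agent the host's Docker socket.
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock agent-runtime:latest

# Hardware virtualization: hand it the host's KVM device.
docker run --rm --device=/dev/kvm agent-runtime:latest

# Mounts and namespace manipulation: hand it CAP_SYS_ADMIN.
docker run --rm --cap-add SYS_ADMIN agent-runtime:latest
```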
The IDE problem and the agent problem are the same problem wearing different clothes.
Strong isolation is not a container problem
There is a recurring mistake in infrastructure design: trying to solve policy problems with packaging tools.
Containers are packaging plus lightweight isolation. They are fantastic for reproducibility and deployment. They are not a complete security boundary.
Once you accept that, architecture decisions become clearer.
If your agents run trusted code, containers may be enough.
If your agents run untrusted code, containers are probably insufficient.
That’s when other tools appear:
- MicroVMs (Firecracker, Kata).
- Sandboxed runtimes (gVisor).
- Ephemeral execution environments.
- Strict syscall filters and egress policies.
These systems are slower to spin up and harder to operate, but they draw the boundary in the right place: at the kernel interface, not inside it.
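The shift is visible in how little the image changes. With gVisor, for instance - assuming runsc is installed and registered with the Docker daemon, and reusing the placeholder image from above - isolation moves into the runtime selection, not the Dockerfile:

```bash
# Same image as before; only the runtime changes. Syscalls now terminate in
# gVisor's user-space kernel instead of going straight to the host kernel.
docker run --rm --runtime=runsc agent-runtime:latest
```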
What looks like extra complexity is often just honesty about where isolation actually comes from.
The real product is policy
The most important lesson from all of this is subtle:
the hard part is not running code - it’s deciding what that code is allowed to do.
Opening links.
Accessing the network.
Reading from disk.
Writing artifacts.
Using hardware acceleration.
Every meaningful system ends up encoding policy, whether explicitly or by accident. Containers make it easy to ship policy as configuration, but they don’t remove the need to reason about it.
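A seccomp profile is a good example of policy shipped as configuration. The sketch below is deliberately minimal - far too strict for any real workload - but the shape is the real Docker profile format:

```bash
# Deny-by-default syscall policy (fragment; a real profile allows far more).
cat > policy.json <<'EOF'
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {
      "names": ["read", "write", "exit", "exit_group", "futex", "rt_sigreturn"],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
EOF

docker run --rm --security-opt seccomp=policy.json agent-runtime:latest
```

The file is trivial to ship. Deciding what belongs in it is the actual work.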
Agent orchestration systems that scale will not be defined by clever prompts or clever scheduling. They will be defined by:
- Clear trust boundaries.
- Explicit execution contracts.
- Reproducible but constrained environments.
- Observability that maps actions back to intent.
That’s not an AI problem. That’s an infrastructure problem we’ve been solving for decades - just under different names.
Containers are still the right starting point
None of this is an argument against containers.
Containers are still the best default abstraction we have. They let us experiment cheaply, reason locally, and iterate fast. They are the right place to start.
But they are not the place to stop.
Every serious system eventually reaches the point where “just put it in Docker” stops being an answer and starts being a question. When that happens, the mistake is not hitting the limit - it’s pretending the limit isn’t there.
The moment you need kernel features, hardware guarantees, or hostile-code isolation, the architecture must change.
The good news is that this boundary is predictable. You can see it coming if you know what to look for.
The bad news is that you can’t paper over it with a Dockerfile.
Found this useful? I write about AI infrastructure, security, and the engineering challenges of building production AI systems. Connect with me on LinkedIn or Twitter/X.
*Built by Siddhant Khare*
Top comments (2)
The discussion about AI containment is unfortunately only on the lips of security engineers. None of the IDE or agent developers are talking or thinking about containment. The exception may, surprisingly, be Anthropic, who discuss and support installing Claude 2.0 in dev environments running inside virtual machines.
Cline doesn't address containment, so of course it has no security model. Neither do AI-first editors like Zed, which are silent or trivialise security when asked about isolation.
VSCode and Microsoft don't address containment, specifically because they don't want consumers separating AI from their private lives - Microsoft wants Copilot everywhere, in all aspects of people's lives.
Coming back to containers: it is possible that Incus LXC could provide adequate boundaries, but the problem becomes apparent when an agent panel in something like an IDE or editor accesses the VM- or LXC-contained MCP environment over SSH. Because the developers don't have a security model for remote dev, there is no telling what their apps allow back onto the host from an isolated guest.
I agree. Containment is mostly being treated as a security-engineering problem, not a product or architecture one.
Most IDEs & agent tools assume a trusted local environment, so isolation never really enters the design. Once agents start executing code, that assumption quietly breaks.
You’re right that VMs or LXC can help, but w/o a clear security model for remote dev & IDE integration, SSH and agent panels become a backchannel back to the host. At that point the isolation is mostly cosmetic.
Until agent and IDE developers define an explicit execution and trust boundary, any underlying containment mechanism is easy to undermine.