Max Kryvych

Posted on Apr 10

Running AI Coding Agents Safely: Sandboxing + Reproducible Dev Environments (mise)

#ai #docker #cybersecurity #privacy

Imagine giving a colleague full, unsupervised access to your entire development machine — your personal files, SSH keys, cloud credentials — just to help with some coding tasks.

You wouldn’t do that.

Yet, when we run AI coding agents locally, that’s effectively what we’re doing by default.

There’s a better way. This article is about what that looks like in practice — based on what I’ve been experimenting with so far.

Your agent is an optimizer, not a rule-follower

When working with agents you might have noticed that if an agent hits a blocker, it tries to find a way around it. Permissions help with obvious cases, but they don’t really solve the underlying problem.

Give an agent a goal and it will find a path.

In one concrete example, I tried blocking access to environment variables. The agent responded by generating a Python script to fetch them instead.

We can't block the agent from generating Python. Why would we?

This isn't malicious behavior. The agent isn't trying to attack you — it's trying to complete the task you gave it. If one path is blocked, it tries another. If that's blocked, it tries a third. It has more patience (and creativity) for this than you have for writing rules.

The result is a game of whack-a-mole you will eventually lose.

Block environment variables → it uses the filesystem
Block filesystem → it uses network calls
Block network → it finds a binary that makes them

There’s always another path.

This is why I don’t think the answer is “better guardrails.”
It’s reducing the surface area entirely.

Put it in a box

Instead of trying to outsmart the agent, isolate it.

Docker Sandboxes run your AI agent in a microVM — a lightweight virtual machine with its own kernel, filesystem, and network stack. It's not a container (which shares the host kernel). It's a separate machine boundary.

The agent runs inside. Your host is outside. Full stop.

What the agent can access:

The project folder you explicitly mount
The tools and versions you baked into the sandbox image
Network requests through a proxy you control

What it can't access:

Other projects on your machine
Your home directory, dotfiles, credentials
The host filesystem at all (except the mounted project)
Raw network — all traffic goes through a policy-enforced proxy

This isn't a permission system the agent can negotiate with. It's an isolation boundary.

How Docker Sandboxes compare

Approach	Isolation	Docker access	Use case
Sandboxes (microVMs)	Strong (VM boundary)	Isolated daemon	Autonomous agents
Container with socket mount	Partial (namespaces)	Shared host daemon	Trusted tools
Docker-in-Docker	Partial (privileged)	Nested daemon	CI/CD pipelines
Host execution	None	Host daemon	Manual development

Containers are fast but not truly isolated. VMs are isolated but slow. Sandboxes try to hit a middle ground — enough isolation to matter, without killing iteration speed.

Also worth calling out: this isn’t really about Docker specifically.
Docker Sandboxes are just one implementation. The idea is what matters — run agents inside something that enforces a real boundary.

Give the agent your exact setup

Isolation is only half the problem. The other half is reproducibility.

When the agent runs in a sandbox, it starts from a clean environment. If that environment doesn’t match yours, things break in subtle ways:

Different runtime versions
Missing CLIs
Slightly different system behavior

That leads to “works for me” — but in reverse.

This is where mise (or similar tools) comes in.

mise is a polyglot version manager — think nvm, pyenv, and others combined. You define your environment once (mise.toml), and both you and the agent get the same setup.

In this setup, mise is doing one job:

Make the sandbox environment behave like your dev environment — without copying your entire machine.

The useful trick here is baking mise into the sandbox image:

Tools are already installed
Versions are already correct
No setup time when the agent starts

The agent can immediately run, build, and test — without waiting or guessing.

Putting it together: sbx-toolkit

I built sbx-toolkit as a thin wrapper around this idea. It’s still evolving, but the goal is to make the setup composable and repeatable.

Two scripts: one for setup, one for running.

Setup — once per machine:

./sbx-setup --agent claude-code --config ~/.claude

This bakes your agent config and mise toolchain into a base image.

Runtime — per project:

Add a .sbx.toml:

[sandbox]
agent = "claude"
template = "localhost:5000/sbx-toolkit:mise-claude-code"
network_policy = "balanced"
required_secrets = ["ANTHROPIC_API_KEY"]
allowed_domains = ["api.github.com"]

Then:

sbx-start

At this point:

The sandbox spins up
The environment is already configured
Network rules are enforced
Only required secrets are injected

The payoff: less babysitting

Without isolation, you end up supervising the agent:

Watching commands
Checking access
Approving steps

That defeats the point.

With isolation + reproducible environment:

The agent has what it needs from the start
It can’t reach outside its boundary
The environment behaves predictably

You can shift from:

“Let me monitor every step”

to:

“Let me review the result”

That’s where this starts to feel useful.

Tradeoffs (and what I’m still figuring out)

This setup is not perfect. I’m still iterating on it.

Setup complexity

More moving parts:

sandbox
image
toolchain
config

Environment management

Still deciding what works best:

bake configs into the image
or inject them per project

Performance

There’s some overhead compared to running directly on the host.

Usability vs security

This is the real tradeoff.

More isolation → more friction
Less isolation → more risk

There’s no universal answer yet.

Final thought

Agents are not rule-followers — they’re optimizers.

If you give them access, they will use it in ways you didn’t expect.

For me, the direction that makes sense is:

isolate them properly
give them a clean, reproducible environment
reduce what they can touch

This isn’t a finished solution — but it’s already better than running everything directly on your machine.

DEV Community