Sandboxing AI Agent Filesystems: Containers vs Virtual FS Layers

#agents #ai #tutorial #devops

If you've ever wired up an AI agent to do real work, you've probably hit the same wall I did: filesystem access is a minefield. Give it too much rope and it'll happily rm -rf something important. Lock it down too hard and it can't actually do anything useful.

I've been bouncing between three approaches over the last year — raw FS access with allowlists, container-based isolation, and most recently a virtual filesystem layer. Each has real tradeoffs. The trending strukto-ai/mirage project pitches itself as a unified virtual filesystem for AI agents, which got me thinking about when this approach actually makes sense versus the alternatives. I'll be honest up front: I've only skimmed Mirage's repo and poked at the examples, so treat my notes on it as provisional rather than a deep review.

Why this is harder than it looks

When a coding agent says "read this file," what should that actually do? In a naive setup, the agent process can read anything the host user can read. That's fine for a throwaway VM. It's terrifying on a dev laptop with SSH keys and tokens sitting around.

The three things I want from any FS access layer:

Bounded blast radius — the agent can't escape its assigned working set
Reversibility — I can review and roll back changes before they hit disk for real
Predictable paths — the agent sees the same paths whether it's running locally, in CI, or on a remote sandbox

Most setups give you one or two of these. Getting all three is where the design choices get interesting.

Approach 1: Raw FS with allowlists

This is the baseline. You hand the agent a working directory and trust it to behave.

# Naive approach: agent gets a working dir, full access inside it
from pathlib import Path

WORK_DIR = Path("/tmp/agent-workspace").resolve()

def safe_read(rel_path: str) -> str:
    # Re-resolve every call to defeat symlink shenanigans
    target = (WORK_DIR / rel_path).resolve()
    if not target.is_relative_to(WORK_DIR):
        raise PermissionError("path escapes workspace")
    return target.read_text()

def safe_write(rel_path: str, content: str) -> None:
    target = (WORK_DIR / rel_path).resolve()
    if not target.is_relative_to(WORK_DIR):
        raise PermissionError("path escapes workspace")
    target.write_text(content)

Where this works: quick experiments, throwaway scripts, anything where the workspace is already disposable.

Where it falls over: symlinks (an agent that creates link -> /etc and then writes through it can slip past a sloppy check), TOCTOU races, and the simple fact that "undo the last 30 minutes of agent work" becomes a git stash scavenger hunt.

Approach 2: Container isolation

The next step up is putting the whole agent in a container with a bind-mounted workspace.

# Run the agent inside a container, only mount what it needs
docker run --rm \
  --network=none \
  -v "$PWD/workspace:/work:rw" \
  -v "$PWD/readonly-context:/ctx:ro" \
  --read-only \
  --tmpfs /tmp:size=512m \
  agent-image:latest

This is what I default to for anything touching real code. The blast radius is genuinely bounded — even if the agent goes off the rails, it can only mess up /work.

The downside is startup cost and the friction of getting tooling into the container. Every new language runtime, every binary the agent might invoke, has to be pre-baked into the image or installed at runtime. I've spent more time debugging "why doesn't node exist in here" than I'd like to admit.

Approach 3: A virtual filesystem layer

This is where projects like Mirage come in. The pitch, as I read it, is that the agent talks to a virtual filesystem API instead of the real FS, and the layer underneath decides what actually happens — overlay changes in memory, commit them on confirmation, expose a consistent path namespace across backends. Check the official repo before relying on specifics; the project looks early and the API surface may shift.

Conceptually, the pattern looks like this:

# Sketch of the virtual FS pattern (not Mirage's exact API)
fs = VirtualFS(
    root="./project",   # underlying real directory
    mode="overlay",     # writes go to an overlay, not the real FS
)

# Agent calls look like normal FS ops
fs.write("src/app.py", new_content)
fs.read("README.md")

# But changes are staged, not committed
diff = fs.pending_changes()  # inspect what the agent did
fs.commit()                  # apply to real FS
# or
fs.discard()                 # throw it all away

What I like about this model:

Review-before-apply is built in. The agent can do 50 file edits and I get to see the diff before any of them touch disk.
Path consistency. The agent always sees ./src/app.py, regardless of whether the backend is a local dir, an object store, or an in-memory overlay.
Cheaper than containers for the common case of "edit some files, run some checks."

What I'm cautious about:

It's another abstraction layer. When something breaks, you're now debugging the agent, the VFS, and the underlying storage.
Isolation is logical, not physical. If the agent shells out to a subprocess, that subprocess sees the real FS unless you also wrap exec calls. A container actually contains; a virtual FS doesn't, by itself.
It's new. I haven't tested Mirage thoroughly enough to vouch for edge cases like large binary files, partial writes, or concurrent agents on the same overlay.

Side by side

	Raw FS + allowlist	Container	Virtual FS layer
Setup cost	Lowest	Highest	Medium
Blast radius	Workspace dir (if careful)	Container boundary	Logical workspace
Subprocess isolation	None	Yes	None (unless wrapped)
Review before apply	Manual (git)	Manual (git)	Built into the model
Startup latency	None	Seconds	Milliseconds
Good for	Quick scripts	Real code changes	Iterative agent loops

How I'd pick today

If I'm running a coding agent against a repo I care about, I'm still reaching for containers first. The physical isolation is just too valuable when an agent decides to get creative with find -delete.

If I'm building an interactive loop — agent proposes changes, I approve, agent continues — a virtual FS layer is genuinely better. The commit/discard semantics map directly onto the workflow, and you skip the container startup tax on every iteration.

If I'm prototyping and the workspace is already disposable, raw FS with a path-resolution check is fine. Don't over-engineer it.

A migration sketch

If you're currently on raw FS and want to try a VFS layer, the migration is less invasive than you'd expect:

# Before: direct FS calls scattered through the agent's tools
def read_file_tool(path: str) -> str:
    return Path(path).read_text()

def write_file_tool(path: str, content: str) -> None:
    Path(path).write_text(content)

# After: same interface, FS calls go through the virtual layer
def read_file_tool(path: str) -> str:
    return fs.read(path)

def write_file_tool(path: str, content: str) -> None:
    fs.write(path, content)  # staged, not yet on disk

# New control surface: review/commit between agent steps
def step_complete():
    show_diff(fs.pending_changes())
    if user_approves():
        fs.commit()
    else:
        fs.discard()

The tool interface barely changes. What changes is the control loop around it — you now have a place to insert review and approval that you didn't have before.

That's the real reason I'm watching this category. Containers won the "how do we sandbox processes" question a decade ago. The "how do we sandbox an agent's intentions before they become actions" question is still wide open, and a virtual filesystem is one of the more interesting answers I've seen lately.