clearloop for CrabTalk

Posted on Mar 15 • Originally published at openwalrus.xyz

Workspace as sandbox: a simpler model for agent isolation

#ai #architecture #opensource #openwalrus

The sandbox survey found that every
production agent system either gates individual commands (Claude Code,
Cursor, Codex CLI) or gates the environment (Devin, OpenHands). Both
have real tradeoffs. Per-command approval interrupts flow. Container
isolation cuts agents off from the host resources that make them useful
— especially authenticated browser sessions.

There's a third option hiding in the operating system itself: make the
agent a real OS user, and keep the runtime completely unaware of it.

The model

One system user — walrus — is the agent's identity. All agents, all
tasks, all workspaces live under this user's home directory. The walrus
runtime runs as this user. Standard Unix file permissions enforce the
boundary. No Landlock, no seccomp, no sandbox library in the runtime
code. Zero lines of sandbox logic.

Diagram — see original post

The human user (alice) decides what the agent can see by setting ACLs
on her own files or copying resources into the workspace's shared/
directory. The agent can't read anything outside its home unless alice
explicitly grants access. This isn't a new abstraction — it's how Unix
has worked since the 1970s.
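As a rough illustration of what "standard Unix file permissions enforce the boundary" means, the sketch below checks whether a directory (alice's home, say) is closed to accounts that are neither its owner nor in its group, which is what keeps the walrus user out by default. The helper names mode_of and closed_to_others are made up for this example and are not part of walrus. Only the classic mode bits are inspected, so ACL entries are ignored.

```shell
# Hypothetical helpers, not part of walrus.

# Print the octal permission bits of a path.
# GNU stat (Linux) is tried first, BSD stat (macOS) second.
mode_of() {
    stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1"
}

# Succeed when "other" users have no read, write, or execute bit on the path
# (the last octal digit of the mode is 0).
closed_to_others() {
    case "$(mode_of "$1")" in
        *0) return 0 ;;
        *)  return 1 ;;
    esac
}

# Demo on a scratch directory standing in for a home directory.
demo_home=$(mktemp -d)
chmod 750 "$demo_home"
if closed_to_others "$demo_home"; then
    echo "closed to other users"
else
    echo "readable or traversable by other users"
fi
rm -rf "$demo_home"
```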

Why zero sandbox logic in the runtime

The sandbox is the OS, not the code

Claude Code, Cursor, and Codex CLI all embed sandbox logic in their
runtimes — generating Seatbelt profiles, configuring Landlock rules,
writing seccomp BPF programs. This means maintaining a separate
platform-specific implementation for each mechanism, debugging sandbox
policy issues, and accepting the security risk of bugs in their own
sandbox code.

The OS user model sidesteps all of this. The runtime doesn't know or
care about sandboxing. It runs as walrus, and the OS handles isolation.
File permissions, process ownership, resource limits: these are
kernel-enforced mechanisms that have been audited for decades. No sandbox code
to write means no sandbox bugs to ship.

Cross-platform for free

The basic Unix permission model (owner, group, other) works the same way
on macOS, Linux, and the BSDs. The runtime needs no platform-specific
sandbox implementation: no Seatbelt on macOS, no Landlock on Linux, no
WSL2 workarounds on Windows. Only the one-time setup commands differ
between platforms (user creation, ACL syntax); the model and the
runtime's behavior do not.

Pluggable setup, not pluggable runtime

The sandbox setup — creating the user, configuring firewall rules,
setting ACLs — is a one-time operation that happens outside the runtime.
This makes it pluggable by design: walrus sandbox init is just a
command that wraps the platform-specific setup steps. Anyone can write
their own init script, customize the user configuration, add network
rules, or skip the whole thing. The runtime doesn't change.

`walrus sandbox init`

A single command that sets up the OS user and workspace structure.
Requires sudo once, then never again.

$ walrus sandbox init
[sudo] password for alice:
Creating system user 'walrus'...
  macOS: sysadminctl -addUser _walrus -home /var/walrus -shell /bin/bash
  Linux: useradd --system --home /var/walrus --shell /bin/bash walrus
Creating home directory /var/walrus/
Creating /var/walrus/workspaces/
Creating /var/walrus/.runtimes/
Done. All walrus agents will now run as user 'walrus'.

That's it. No LaunchDaemon, no systemd unit, no firewall rules by
default. The init command does the minimum: create the user, create the
home directory. Everything else is optional and additive.
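To make the "wraps the platform-specific setup steps" idea concrete, here is a dry-run sketch of what such an init script might look like. The function name init_plan is hypothetical, and it prints the commands from the transcript above rather than executing them, so it needs no sudo. Handing ownership of the new directories to the account is left out because the transcript doesn't show it.

```shell
# Hypothetical dry run: prints the setup steps instead of executing them.
# Usage: init_plan OS   (OS is the output of `uname -s`, e.g. Linux or Darwin)
init_plan() {
    os=$1
    home=/var/walrus
    case "$os" in
        Darwin)
            echo "sysadminctl -addUser _walrus -home $home -shell /bin/bash" ;;
        Linux)
            echo "useradd --system --home $home --shell /bin/bash walrus" ;;
        *)
            echo "unsupported platform: $os" >&2
            return 1 ;;
    esac
    # Workspace layout, identical on every platform.
    echo "mkdir -p $home/workspaces $home/.runtimes"
}

init_plan "$(uname -s)"
```

Anyone writing their own init script would swap in different steps here. Nothing in the runtime depends on how the account was created.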

Diagram — see original post

What init does NOT do

No network firewall rules (opt-in via walrus sandbox network init)
No runtime service/daemon installation (opt-in via walrus sandbox service init)
No Chrome profile copying (the user does this manually or via a share command)
No Landlock/seccomp/Seatbelt configuration (the OS user is the sandbox)

Each of these is a separate, optional command. The runtime works the
same regardless of which you've run. This is the "less code, more skills"
principle applied to infrastructure: the runtime is minimal, the setup is
extensible.

Without init

If the user never runs walrus sandbox init, the runtime runs as the
current user — same as Claude Code or Aider today. No isolation, no
friction. The sandbox is purely opt-in. The runtime code path is
identical either way.
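Since the runtime itself doesn't check, a user who wants to know which situation they are in can ask the OS directly. A minimal sketch, assuming the account is named walrus (the macOS transcript above uses _walrus); account_exists and isolation_status are hypothetical helper names:

```shell
# Hypothetical helpers, not part of walrus.

# Succeed when the named account exists on this machine.
account_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Describe the isolation a process running as user $1 would get,
# given that the sandbox account is named $2.
isolation_status() {
    current=$1
    sandbox=$2
    if ! account_exists "$sandbox"; then
        echo "no sandbox user: runs as $current, no isolation"
    elif [ "$current" = "$sandbox" ]; then
        echo "running as $sandbox: OS permissions apply"
    else
        echo "sandbox user exists, but running as $current"
    fi
}

isolation_status "$(id -un)" walrus
```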

Sharing host resources

The human user and the agent need to exchange files. The mechanisms
depend on the resource type.

Project files: ACLs

The user grants the walrus user access to a project directory:

# macOS (the account created by init is named _walrus)
chmod +a "_walrus allow read,write,execute,delete,add_file,add_subdirectory" \
    ~/projects/my-app

# Linux
setfacl -R -m u:walrus:rwx ~/projects/my-app

No root needed: the file owner sets ACLs on their own files. The agent
can then read and write the project directory, provided it can also
traverse (execute permission) every parent directory on the way there.
getfacl (Linux) or ls -le (macOS) shows exactly what's shared.
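One thing an ACL on the project directory alone doesn't cover: another account can only reach it if it may traverse every directory above it. The sketch below lists the ancestors whose "other" execute bit is missing. The helper names are hypothetical, and only classic mode bits are checked, so a directory that grants traversal through an ACL entry would still be reported.

```shell
# Hypothetical helpers, not part of walrus.

# Print the octal permission bits of a path (GNU stat first, BSD stat second).
mode_of() {
    stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1"
}

# Print every ancestor directory of $1 that "other" users cannot traverse.
# Usage: untraversable_parents ~/projects/my-app
untraversable_parents() {
    dir=$(cd "$(dirname "$1")" && pwd -P) || return 1
    while [ "$dir" != "/" ]; do
        # The last octal digit holds the "other" bits; an odd digit means
        # the execute (search) bit is set.
        case "$(mode_of "$dir")" in
            *[1357]) ;;
            *) printf '%s\n' "$dir" ;;
        esac
        dir=$(dirname "$dir")
    done
}
```

An empty result means the mode bits alone already let other accounts walk down to the project. Any directory printed would need its own traverse-only grant.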

A convenience wrapper:

$ walrus sandbox share ~/projects/my-app
Granting walrus read/write access to /Users/alice/projects/my-app...
Done. The agent can now access this directory.

Credentials: read-only ACLs

$ walrus sandbox share --read-only ~/.ssh/id_ed25519
Granting walrus read-only access to /Users/alice/.ssh/id_ed25519...
Done.

The agent can use the SSH key but can't modify or delete it.

Browser profiles: copy into workspace

For resources that can't be safely shared concurrently, copy them:

$ walrus sandbox share --copy ~/.config/google-chrome/Profile\ 1 \
    --into workspaces/task-42/chrome-profile

Copying Chrome profile into /var/walrus/workspaces/task-42/chrome-profile/...
Using reflink (copy-on-write)... Done.

# The agent launches Chrome with its own copy:
chrome --headless --user-data-dir=/var/walrus/workspaces/task-42/chrome-profile/

The agent starts with the user's session state (cookies, saved logins)
but changes are isolated. Two agents get independent copies. On
copy-on-write filesystems the copy is near-instant: cp --reflink=auto on
Btrfs or XFS (Linux), cp -c on APFS (macOS).
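The copy-on-write flag is spelled differently per platform: GNU cp takes --reflink=auto, while the macOS cp takes -c to request clonefile(2). A minimal sketch of a portable copy helper, with the hypothetical name clone_copy, that falls back to an ordinary recursive copy elsewhere:

```shell
# Hypothetical helper, not part of walrus.
# Usage: clone_copy SOURCE_DIR DEST_DIR   (DEST_DIR must not exist yet)
clone_copy() {
    if cp --version 2>/dev/null | grep -q 'GNU coreutils'; then
        # GNU cp: use a reflink where the filesystem supports it,
        # otherwise copy normally.
        cp -R --reflink=auto "$1" "$2"
    elif [ "$(uname -s)" = "Darwin" ]; then
        # macOS cp: -c asks for clonefile(2).
        cp -Rc "$1" "$2"
    else
        # Anything else: plain recursive copy.
        cp -R "$1" "$2"
    fi
}
```

Either way the result is an independent copy: writes to the agent's profile never reach the original.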

Listing and revoking

$ walrus sandbox shared
/Users/alice/projects/my-app        read-write
/Users/alice/.ssh/id_ed25519        read-only
(copied) chrome-profile → task-42   isolated copy

$ walrus sandbox unshare ~/projects/my-app
Revoking walrus access to /Users/alice/projects/my-app...
Done.

Where it breaks

Network isolation is separate

Unix file permissions don't restrict network access. The walrus user
can curl anything by default. For network control, you need additional
setup:

$ walrus sandbox network init
Setting up per-user firewall rules for walrus...
  Linux: iptables -A OUTPUT -m owner --uid-owner walrus -j DROP
  macOS: adding pf rule for user _walrus
Default: deny all outbound. Configure allowlist in ~/.walrus/network.toml

This is a separate, optional init step. The runtime doesn't enforce
network policy — it delegates to the OS firewall. If the user hasn't
run walrus sandbox network init, network is unrestricted.
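For the Linux side, rule order matters once an allowlist is involved: iptables walks a chain from top to bottom, so the ACCEPT rules have to sit above the final DROP. The sketch below is a dry run that prints the rules rather than installing them. The function name network_plan is hypothetical, and taking destinations as arguments is an assumption, since the format of network.toml isn't specified here.

```shell
# Hypothetical dry run: prints iptables commands instead of running them.
# Usage: network_plan USER DEST...
network_plan() {
    user=$1
    shift
    # Allowlisted destinations first, so they match before the DROP below.
    for dest in "$@"; do
        echo "iptables -A OUTPUT -m owner --uid-owner $user -d $dest -j ACCEPT"
    done
    # Everything else sent by this user is dropped.
    echo "iptables -A OUTPUT -m owner --uid-owner $user -j DROP"
}

# Example with documentation-only addresses (RFC 5737).
network_plan walrus 192.0.2.10 198.51.100.0/24
```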

Process-level resources

Some host resources aren't files. Display access, GPU, audio, D-Bus
sessions — a separate OS user doesn't get these automatically.

For headless browser automation (CDP), this is fine — headless Chrome
doesn't need a display. For visual Computer Use, the user would need to
grant display access:

# X11
xhost +SI:localuser:walrus

# Or run a virtual framebuffer under the walrus user
Xvfb :99 &
export DISPLAY=:99

This is additional setup, not something the runtime handles.

The sudo prompt

Creating an OS user requires root. Every developer tool that creates
service accounts does this — Docker, Postgres, MySQL — but it's still
a friction point for "download and run" tooling.

The design makes this explicitly opt-in: walrus sandbox init is a
separate command, not part of walrus install. Without it, walrus runs
as the current user with no isolation. The sudo prompt only appears when
the user actively chooses isolation.

Kernel isolation is shallow

Like every other local sandbox approach (Landlock, Seatbelt, user
namespaces), the OS user shares the host kernel. A kernel exploit
escalates to root. For local developer tooling the threat model is
"agent does something unintended" — acceptable. For multi-tenant
platforms running untrusted agent code, not acceptable.

The design principle

The walrus runtime has zero sandbox logic. The sandbox is the operating
system. Setup is a pluggable command that runs once. Every walrus sandbox share
and walrus sandbox unshare command is just a thin wrapper around setfacl /
chmod +a. The runtime doesn't check ACLs, doesn't enforce policies,
doesn't generate sandbox profiles. It runs as whatever user launched it
— if that user is walrus, isolation exists. If not, it doesn't.

This means:

No sandbox bugs in the runtime. The attack surface is the OS kernel, not our code.
No platform-specific sandbox code paths. The same runtime code works on macOS and Linux.
No configuration to get wrong. The sandbox is either set up (user exists) or it isn't. No SBPL profiles, no BPF programs, no TOML policy files that the runtime interprets.
Full user control. The human user decides what to share using standard Unix tools. They can inspect, modify, or revoke permissions at any time without touching the walrus runtime.

Prior art

Sandvault is the clearest prior art: a macOS tool that creates a
per-human-user agent account and runs commands via
ssh sandvault-$USER@localhost, adding Seatbelt restrictions on top.

Alcoholless (NTT Labs) runs programs as a separate macOS user, syncing
changed files back on exit.

Both add runtime sandbox logic on top of the OS user. Our design
intentionally doesn't — the OS user is the entire sandbox layer.

Open questions

One walrus user or one per human user? A single system-wide walrus
user is simpler. But on a shared machine, Alice's agent and Bob's agent
would share a home directory. Per-human-user accounts (walrus-alice,
walrus-bob) provide isolation but multiply setup complexity. Sandvault
chose per-human-user. For a single-developer machine (the primary walrus
use case), one user seems right.

Should walrus sandbox share wrap ACLs or teach ACLs? A wrapper command is
more convenient. But it hides what's happening, and users may not know
how to debug permissions. An alternative: walrus sandbox share prints the
raw setfacl / chmod +a command and asks the user to run it. Full
transparency, slightly more friction.
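The "teach" variant could be as small as the sketch below, which prints the raw command for the user to run. The function name share_hint is hypothetical. The read-write entries are the ones shown earlier in this post, while the read-only entries are our assumption about what --read-only would map to.

```shell
# Hypothetical "teach" mode: print the command instead of running it.
# Usage: share_hint OS USER MODE PATH   (MODE is rw or ro)
# Paths containing spaces would need extra quoting; this sketch skips that.
share_hint() {
    os=$1 user=$2 mode=$3 path=$4
    case "$os:$mode" in
        Linux:rw)
            printf '%s\n' "setfacl -R -m u:$user:rwx $path" ;;
        Linux:ro)
            # Assumption: read, plus traverse on directories only (capital X).
            printf '%s\n' "setfacl -R -m u:$user:rX $path" ;;
        Darwin:rw)
            printf '%s\n' "chmod +a \"$user allow read,write,execute,delete,add_file,add_subdirectory\" $path" ;;
        Darwin:ro)
            # Assumption: read and execute only.
            printf '%s\n' "chmod +a \"$user allow read,execute\" $path" ;;
        *)
            echo "unsupported: $os $mode" >&2
            return 1 ;;
    esac
}

share_hint "$(uname -s)" walrus rw "$HOME/projects/my-app"
```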

How do skills declare resource needs?
A skill that needs Chrome access could declare needs: [browser] in its
metadata. Before running the skill, the runtime checks whether a Chrome
profile exists in the workspace. If not, it prompts: "This skill needs
a browser profile. Run walrus sandbox share --copy <path> to provide one."
The runtime doesn't enforce — it informs.
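A sketch of what "informs, doesn't enforce" could look like for the browser case. The function name browser_hint is hypothetical, and the chrome-profile directory name follows the workspace layout used in the sharing examples. It prints the hint when the profile is missing and never blocks the skill.

```shell
# Hypothetical pre-flight hint, not part of walrus.
# Usage: browser_hint WORKSPACE_DIR
browser_hint() {
    workspace=$1
    if [ ! -d "$workspace/chrome-profile" ]; then
        echo "This skill needs a browser profile. Run walrus sandbox share --copy <path> to provide one."
    fi
    # Always succeed: the hint is advisory and nothing is blocked.
    return 0
}
```

How the needs: [browser] declaration itself would be parsed is left open here, as it is in the question above.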

Is the no-sandbox fallback good enough? Without walrus sandbox init,
the agent runs as the current user with full access. This matches Aider's
model and what most developers do today. But it means the default is
zero isolation. Should walrus warn on every run without sandbox init?
Or is that the kind of nag that teaches users to ignore warnings?
