DEV Community

Ajeet Singh Raina

Docker Sandboxes: A Deep Dive into Secure AI Agent Isolation

Tools like Claude Code, Gemini CLI, and others can write code, debug issues, install packages, and manage entire development workflows. But with great power comes significant risk: giving an autonomous agent unrestricted access to your local machine opens the door to unintended consequences, from dependency conflicts to catastrophic file system modifications.

Docker Sandboxes is Docker's answer to this challenge: an experimental feature, purpose-built for the age of agentic AI, that provides a secure, isolated environment for running AI coding agents locally.

The Problem Docker Sandboxes Solves

Before diving into the solution, let's understand the problem space.

AI coding agents operate differently from traditional development tools. They don't just suggest code; they execute it. They install packages, modify files, run shell commands, and interact with your development environment in ways that can have lasting consequences.

The uncomfortable truth is that most LLM tools have full access to your machine by default, with only imperfect attempts at blocking risky behavior. The risks fall into several categories.

  • File System Risks: An agent might modify files outside your project directory, delete important configurations, or make changes that are difficult to reverse. Stories of AI agents accidentally wiping home directories aren't just hypothetical—they've happened.
  • Dependency Conflicts: When agents install packages and dependencies globally, they can create conflicts with other projects on your system, leading to the dreaded "it works on my machine" problems in reverse.
  • Security Vulnerabilities: Giving an agent unrestricted network and file access could expose sensitive data or create security holes. Research from NVIDIA's AI Red Team (CVE-2024-12366) demonstrated how AI-generated code can escalate into remote code execution when executed without proper isolation.
  • Credential Exposure: Agents with access to your full file system might inadvertently expose API keys, SSH credentials, or other secrets stored in configuration files.

Operating system-level sandboxing approaches have been attempted, but they have fundamental limitations. They sandbox only the agent process itself, not the full environment the agent needs. As a result, the agent must reach back into the host system for basic tasks like installing packages, running code, and managing dependencies, and each of those requests triggers a permission prompt that interrupts the workflow.

What Are Docker Sandboxes?

Docker Sandboxes is an experimental feature introduced in Docker Desktop 4.50+ that lets AI coding agents run safely in isolated containers while maintaining a seamless development experience. The core insight is that container-based isolation is designed for exactly the kind of dynamic, iterative workflows that coding agents need.

When you run a sandboxed agent, Docker creates a container from a template image and mounts your current working directory into the container at the same absolute path. This applies on both macOS and Linux; for example, a macOS project at /Users/alice/projects/myapp on your host is also at /Users/alice/projects/myapp in the container.

The sandbox provides several key capabilities. Agents can execute commands, install packages, and modify files inside a containerized workspace that mirrors your local directory. Docker is designed to discover your Git user.name and user.email configuration and inject it into the container so that commits made by the agent are attributed to you (Test 6 below shows this was not yet working in practice). On first run, you're prompted to authenticate; the credentials are stored in a Docker volume and reused for future sandboxed agents.

Docker enforces one sandbox per workspace. When you run docker sandbox run <agent> again in the same directory, Docker reuses the existing container, so state such as installed packages and temporary files persists across agent sessions in that workspace.
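
The same-path mount can be approximated with an ordinary bind mount. The sketch below only illustrates the idea and is not how Docker Sandboxes is implemented; the helper name same_path_run_cmd and the image you pass to it are assumptions.

```shell
# Hypothetical helper: print a plain `docker run` command that bind-mounts a
# workspace at the same absolute path inside the container and starts there.
# Assumes the path contains no spaces (the printed command is not re-quoted).
same_path_run_cmd() {
  workspace=$1   # absolute host path of the project
  image=$2       # any image of your choosing (not a sandbox template)
  printf 'docker run --rm -it -v %s:%s -w %s %s\n' \
    "$workspace" "$workspace" "$workspace" "$image"
}
```

Calling same_path_run_cmd "$PWD" ubuntu:24.04 prints a command you can review before running it. Unlike a sandbox, a container started with --rm is discarded on exit, so nothing installed in it persists.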

Getting Started

Prerequisites

  • Docker Desktop 4.56+

1. Create a Directory

mkdir -p /Users/ajeetsraina/sandbox-testing
cd /Users/ajeetsraina/sandbox-testing

2. Run the Sandbox

docker sandbox run

Result:

docker: 'docker sandbox run' requires at least 1 argument

Usage:  docker sandbox run [options] <agent> [agent-options]

See 'docker sandbox run --help' for more information

Available Agents:
  claude          Run Claude AI agent inside a sandbox
  gemini          Run Gemini AI agent inside a sandbox

Running the command without an agent name prints the usage and the available agents. Pass an agent to start the sandbox:

docker sandbox run claude

3. List and Inspect Sandboxes

docker sandbox ls

Result:

SANDBOX ID     TEMPLATE                               NAME                               WORKSPACE                            STATUS    CREATED
275d94b417bf   docker/sandbox-templates:claude-code   claude-sandbox-2026-01-11-004116   /Users/ajeetsraina/sandbox-testing   running   2026-01-10 19:12:10

docker sandbox inspect 275d94b417bf

Result:

[
  {
    "id": "275d94b417bf8f4c29f6f3c7317f20f6b9636b3f3121d303149a066d8330428e",
    "name": "claude-sandbox-2026-01-11-004116",
    "workspace": "/Users/ajeetsraina/sandbox-testing",
    "created_at": "2026-01-10T19:12:10.888151834Z",
    "status": "running",
    "template": "docker/sandbox-templates:claude-code",
    "labels": {
      "com.docker.sandbox.agent": "claude",
      "com.docker.sandbox.credentials": "sandbox",
      "com.docker.sandbox.workingDirectory": "/Users/ajeetsraina/sandbox-testing",
      "com.docker.sandbox.workingDirectoryInode": "186434127",
      "com.docker.sandboxes": "templates",
      "com.docker.sandboxes.base": "ubuntu:questing",
      "com.docker.sandboxes.flavor": "claude-code",
      "com.docker.sdk": "true",
      "com.docker.sdk.client": "0.1.0-alpha011",
      "com.docker.sdk.container": "0.1.0-alpha012",
      "com.docker.sdk.lang": "go",
      "docker/sandbox": "true",
      "org.opencontainers.image.ref.name": "ubuntu",
      "org.opencontainers.image.version": "25.10"
    }
  }
]

Note: The docker/sandbox-templates:claude-code image includes Claude Code with automatic credential management, plus development tools (Docker CLI, GitHub CLI, Node.js, Go, Python 3, Git, ripgrep, jq). It runs as a non-root agent user with sudo access and launches Claude with --dangerously-skip-permissions by default.
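
The inspect output above carries a com.docker.sandbox.workingDirectoryInode label. One plausible reading, inferred from the label name rather than from documented behavior, is that Docker identifies the workspace by the inode number of the directory. Under that assumption, a sketch like this prints the number so you can compare it with the label; dir_inode is a hypothetical helper name.

```shell
# Hypothetical helper: print the inode number of a directory.
# `ls -di` prints "<inode> <path>" for the directory itself, not its contents.
dir_inode() {
  ls -di "$1" | awk '{ print $1 }'
}
```

Run dir_inode "$PWD" on the host from the workspace directory. Whether the value matches the label depends on how Docker actually computes it, which the original post does not state.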

4. Managing Sandboxes

Since Docker enforces one sandbox per workspace, the same sandbox is reused each time you run docker sandbox run <agent> in a given directory. To create a fresh sandbox, you need to remove the existing one first:

docker sandbox ls           # Find the sandbox ID
docker sandbox rm <sandbox-id>
docker sandbox run <agent>  # Creates a new sandbox
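
Looking up the ID by hand gets tedious when several sandboxes exist. As a minimal sketch, assuming the column layout of the docker sandbox ls output shown earlier and a workspace path without spaces, the lookup can be scripted; sandbox_id_for_workspace is a hypothetical helper name.

```shell
# Hypothetical helper: read `docker sandbox ls` output on stdin and print the
# ID of the sandbox whose WORKSPACE column equals the given path.
# The header row is skipped; column 4 is WORKSPACE in the layout shown above.
sandbox_id_for_workspace() {
  awk -v ws="$1" 'NR > 1 && $4 == ws { print $1 }'
}
```

On the host, docker sandbox ls | sandbox_id_for_workspace "$PWD" prints the ID to hand to docker sandbox rm, or nothing if the directory has no sandbox yet.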

Verify the Isolation

Test 1: Check if SSH Directory Exists

ls -la ~/.ssh/

Result:

Bash(ls -la ~/.ssh/)
  ⎿  Error: Exit code 2
     ls: cannot access '/home/agent/.ssh/': No such file or directory

That's the sandbox working!

Notice the path: /home/agent/.ssh/ — the sandbox can't see your host's SSH keys at all. They simply don't exist inside the container.


Test 2: Try to Access AWS Credentials

ls -la ~/.aws/

Result:

Bash(ls -la ~/.aws/)
  ⎿  Error: Exit code 2
     ls: cannot access '/home/agent/.aws/': No such file or directory

AWS credentials protected!


Test 3: Try to Access Documents Folder

ls ~/Documents/

Result:

● The ~/Documents/ directory doesn't exist in this environment.

  This is a sandboxed environment with a minimal filesystem. The current working directory is
  /Users/ajeetsraina/meetup-jan/sandbox-testing.

  Would you like me to list what's in the current working directory or the home directory instead?

Documents folder blocked!

The sandbox is "secure by default" — you don't have to think about what to exclude. It only sees your project folder.
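
Tests 1 to 3 can be bundled into a single check. The following is a minimal sketch; check_hidden is a hypothetical helper name, and the list of entries to probe is yours to choose.

```shell
# Hypothetical helper: report which of the given entries exist under a home
# directory. Run it inside the sandbox; every entry should come back "hidden".
check_hidden() {
  home=$1
  shift
  for entry in "$@"; do
    if [ -e "$home/$entry" ]; then
      echo "VISIBLE $entry"
    else
      echo "hidden  $entry"
    fi
  done
}
```

Inside the sandbox, check_hidden "$HOME" .ssh .aws Documents should print three "hidden" lines. Running the same command on the host shows what an unsandboxed agent would be able to reach.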


Test 4: Confirm Project Folder Access

ls -la /Users/ajeetsraina/meetup-jan/sandbox-testing/

Result:

Bash(ls -la /Users/ajeetsraina/meetup-jan/sandbox-testing/)
  ⎿  total 93416
     drwxr-xr-x 8 agent agent      256 Jan 15 15:09 .
     drwxr-xr-x 3 root  root      4096 Jan 15 15:52 ..
     … +7 lines (ctrl+o to expand)

It works!

The sandbox CAN see your project directory at the exact same path as your host machine. That's path matching in action.


Test 5: The Nuclear Option

Can it delete your home directory?

rm -rf ~/

Result:

I won't run that command. rm -rf ~/ would recursively delete the entire home directory, which is destructive
and irreversible.

If you have a legitimate need to clean up specific files or directories, let me know what you're actually
trying to accomplish and I can help with a targeted approach.

Claude's own safety guardrails refused the command. That refusal comes from the model rather than from the sandbox, but even if the command had run, Test 5b shows why it wouldn't matter.


Test 5b: Sandbox Home Directory Contents

ls -la ~/

Result:

Bash(ls -la ~/)
  ⎿  total 64
     drwxr-xr-x 1 agent agent 4096 Jan 15 16:23 .
     drwxr-xr-x 1 root  root  4096 Jan 10 00:23 ..
     … +9 lines (ctrl+o to expand)

● The home directory contains:
  ┌──────────────┬──────┬─────────────────────┐
  │     Item     │ Type │     Description     │
  ├──────────────┼──────┼─────────────────────┤
  │ .cache/      │ dir  │ Cache files         │
  ├──────────────┼──────┼─────────────────────┤
  │ .claude/     │ dir  │ Claude Code config  │
  ├──────────────┼──────┼─────────────────────┤
  │ .claude.json │ file │ Claude settings     │
  ├──────────────┼──────┼─────────────────────┤
  │ .docker/     │ dir  │ Docker config       │
  ├──────────────┼──────┼─────────────────────┤
  │ .gitconfig   │ file │ Git configuration   │
  ├──────────────┼──────┼─────────────────────┤
  │ .local/      │ dir  │ Local user data     │
  ├──────────────┼──────┼─────────────────────┤
  │ .npm/        │ dir  │ npm cache           │
  ├──────────────┼──────┼─────────────────────┤
  │ workspace/   │ dir  │ Workspace directory │
  └──────────────┴──────┴─────────────────────┘

🎯 This is the key insight!

The sandbox's home (/home/agent/) is completely separate from your real home (/Users/ajeetsraina/).

| If rm -rf ~/ runs... | Without Sandbox | With Sandbox |
| --- | --- | --- |
| What gets deleted | Your real home: Documents, SSH keys, years of work | Just /home/agent/: cache, npm, empty sandbox stuff |
| Impact | Catastrophic 💀 | Nothing important |

Test 6: Verify Git Identity Injection

cat ~/.gitconfig

Result:

Read(~/.gitconfig)
  ⎿  Read 3 lines

● [safe]
      directory = /Users/ajeetsraina/meetup-jan/sandbox-testing

  The gitconfig just has a safe.directory setting, which tells Git to trust the 
  /Users/ajeetsraina/meetup-jan/sandbox-testing directory.

Test 6b: Check full Git config

git config --global --list

Result:

safe.directory=/Users/ajeetsraina/meetup-jan/sandbox-testing

⚠️ Note: Git identity (user.name, user.email) was not auto-injected despite being configured on the host. I will work with the Sandboxes team to get this fixed.
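
Until injection works, one possible workaround is to pass Git's standard identity environment variables through the -e flag used in Test 9. This is an untested assumption and not something the original post tried. Git reads GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, GIT_COMMITTER_NAME, and GIT_COMMITTER_EMAIL when it creates a commit; the helper name git_identity_env is hypothetical.

```shell
# Hypothetical helper: print the four Git identity variables as KEY=VALUE
# lines for a given name and email address.
git_identity_env() {
  name=$1
  email=$2
  printf 'GIT_AUTHOR_NAME=%s\n' "$name"
  printf 'GIT_AUTHOR_EMAIL=%s\n' "$email"
  printf 'GIT_COMMITTER_NAME=%s\n' "$name"
  printf 'GIT_COMMITTER_EMAIL=%s\n' "$email"
}
```

On the host, git_identity_env "$(git config user.name)" "$(git config user.email)" prints four lines, each of which can be passed as its own -e argument to docker sandbox run. Setting user.name and user.email with git config --global inside the sandbox is a simpler alternative, though it has to be repeated whenever the sandbox is recreated.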


Test 7: Path Matching

Path matching ensures that file paths are identical inside and outside the sandbox. This is crucial for:

  • Error messages that make sense
  • Copy-paste paths that work
  • IDE integration
  • Git path consistency

Why Path Matching Matters

| Without Path Matching | With Path Matching (Docker Sandboxes) |
| --- | --- |
| Host: /Users/ajeet/project/src/Button.tsx | Host: /Users/ajeet/project/src/Button.tsx |
| Container: /workspace/src/Button.tsx | Container: /Users/ajeet/project/src/Button.tsx |
| Error messages show /workspace/... (confusing) | Error messages show real paths |
| Copy-paste paths don't work | Copy-paste paths work |
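
To make the cost of the "without" column concrete, here is a sketch of the translation step you would otherwise need every time a container reports a path. The helper name to_host_path and its argument order are assumptions for illustration.

```shell
# Hypothetical helper: rewrite a container path to its host equivalent when
# the project is mounted at a different location inside the container.
to_host_path() {
  path=$1            # path as reported inside the container
  container_root=$2  # mount point inside the container, e.g. /workspace
  host_root=$3       # project directory on the host
  case "$path" in
    "$container_root"/*)
      # Strip the container prefix and put the host prefix in its place.
      printf '%s\n' "$host_root${path#"$container_root"}" ;;
    *)
      # Outside the mount: leave the path unchanged.
      printf '%s\n' "$path" ;;
  esac
}
```

With path matching, both roots are the same directory, so the function returns its input unchanged and the translation step disappears.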

Step 1: Create a File on HOST

# On your host terminal
mkdir -p ~/meetup-jan/sandbox-testing/src/components
echo "export const Button = () => <button>Click me</button>" > ~/meetup-jan/sandbox-testing/src/components/Button.tsx

Verify it exists:

cat ~/meetup-jan/sandbox-testing/src/components/Button.tsx

Result:

export const Button = () => <button>Click me</button>

Step 2: Start the Sandbox

cd ~/meetup-jan/sandbox-testing
docker sandbox run claude

Step 3: Access File Using FULL PATH Inside Sandbox

Inside the sandbox, use the exact same path as your host:

cat /Users/ajeetsraina/meetup-jan/sandbox-testing/src/components/Button.tsx

Result:

● Bash(cat /Users/ajeetsraina/meetup-jan/sandbox-testing/src/components/Button.tsx)
  ⎿  export const Button = () => <button>Click me</button>

Same path works inside the sandbox!

Step 4: Verify Working Directory

pwd

Result:

● Bash(pwd)
  ⎿  /Users/ajeetsraina/meetup-jan/sandbox-testing

Working directory matches your host path!

Step 5: Access with Relative Path

cat src/components/Button.tsx

Result:

● Bash(cat src/components/Button.tsx)
  ⎿  export const Button = () => <button>Click me</button>

Relative paths work too!

Step 6: Create a File INSIDE Sandbox

Create a new file using the full path:

echo "console.log('created inside sandbox')" > /Users/ajeetsraina/meetup-jan/sandbox-testing/src/utils.js

Verify inside sandbox:

cat /Users/ajeetsraina/meetup-jan/sandbox-testing/src/utils.js

Result:

● Bash(cat /Users/ajeetsraina/meetup-jan/sandbox-testing/src/utils.js)
  ⎿  console.log('created inside sandbox')

Step 7: Verify File Exists on HOST

Exit the sandbox:

exit

Check on your host:

cat ~/meetup-jan/sandbox-testing/src/utils.js

Result:

console.log('created inside sandbox')

File created inside sandbox appears on host at the same path!

Visual Comparison

┌─────────────────────────────────────────────────────────────────────────┐
│                    REGULAR DOCKER CONTAINER                             │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  HOST                              CONTAINER                            │
│  /Users/ajeet/project/             /workspace/                          │
│  ├── src/                          ├── src/                             │
│  │   └── app.js                    │   └── app.js                       │
│  └── package.json                  └── package.json                     │
│                                                                         │
│  ❌ Paths are DIFFERENT                                                 │
│  ❌ Error: "File not found at /workspace/src/app.js"                    │
│  ❌ You think: "Where is /workspace? That's not my path!"               │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────┐
│                      DOCKER SANDBOXES                                   │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  HOST                              SANDBOX                              │
│  /Users/ajeet/project/             /Users/ajeet/project/                │
│  ├── src/                          ├── src/                             │
│  │   └── app.js                    │   └── app.js                       │
│  └── package.json                  └── package.json                     │
│                                                                         │
│  ✅ Paths are IDENTICAL                                                 │
│  ✅ Error: "File not found at /Users/ajeet/project/src/app.js"          │
│  ✅ You think: "I know exactly where that is!"                          │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Path Matching Summary

| Test | Result |
| --- | --- |
| Full path access from sandbox | ✅ Working |
| Working directory matches host | ✅ Working |
| Relative paths work | ✅ Working |
| Files created in sandbox appear on host | ✅ Working |
| Files created on host appear in sandbox | ✅ Working |

Test 8: State Persistence

Step 1: Install a Package

npm install -g cowsay

Then test it works:

cowsay "Hello from sandbox"

Result:

● Bash(cowsay "hello from sandbox")
  ⎿   ____________________
     < hello from sandbox >
      --------------------
            \   ^__^
             \  (oo)\_______
                (__)\       )\/\
                    ||----w |
                    ||     ||

Step 2: Exit the Sandbox

exit

Or type /exit in Claude Code.

Step 3: Re-enter and Verify

docker sandbox run claude

Then test if cowsay is still there:

cowsay "I persisted!"

Result:

● Done! The cow has spoken.

State persistence confirmed!

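Unlike a fresh docker run, which starts a new container every time (and discards the old one on exit when --rm is used), Docker Sandboxes reuses the same container for the workspace, so the installed package is still there.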
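
A quick way to confirm persistence without running each tool is to check that its command is still on the PATH. This is a minimal sketch; require_tools is a hypothetical helper name.

```shell
# Hypothetical helper: print a line for every command that is not on the
# PATH and return non-zero if any is missing.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

After re-entering the sandbox, require_tools cowsay prints nothing and succeeds if the earlier global install survived. After removing and recreating the sandbox, it should report the tool as missing.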


Test 9: Environment Variables

Environment variables must be set at sandbox creation time.

Step 1: Remove Existing Sandbox

# On your host terminal
docker sandbox ls
docker sandbox rm <sandbox-id>

Step 2: Create Sandbox with Environment Variables

docker sandbox run -e MY_SECRET=supersecret123 -e APP_ENV=development claude

Step 3: Verify Inside Sandbox

echo $MY_SECRET
echo $APP_ENV

Result:

● Bash(echo $MY_SECRET)
  ⎿  supersecret123

● Bash(echo $APP_ENV)
  ⎿  development

Step 4: Confirm Full Environment Access

printenv | grep -E "MY_SECRET|APP_ENV"

Result:

● Bash(printenv | grep -E "MY_SECRET|APP_ENV")
  ⎿  MY_SECRET=supersecret123
     APP_ENV=development

Environment variables working!

⚠️ Important Limitation: You cannot hot-reload environment variables. To change them, you must remove and recreate the sandbox (which loses installed packages).
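
Because every change means recreating the sandbox, it helps to keep the variables in a file and generate the -e arguments from it. The sketch below assumes a simple KEY=VALUE file; env_flags is a hypothetical helper name, and sandbox.env in the usage note is a hypothetical filename.

```shell
# Hypothetical helper: turn KEY=VALUE lines on stdin into `-e KEY=VALUE`
# arguments, one per line, skipping blank lines and lines starting with '#'.
env_flags() {
  awk '!/^[ \t]*(#|$)/ { printf "-e %s\n", $0 }'
}
```

A call such as docker sandbox run $(env_flags < sandbox.env) claude then recreates the sandbox with the same variables each time. The unquoted substitution relies on word splitting, so this only works for values that contain no spaces.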


Test 10: Docker Socket Access

Mounting the host's Docker socket lets the agent run Docker commands from inside the sandbox.

⚠️ Security Warning: Mounting the Docker socket grants the agent full access to your Docker daemon, which has root-level privileges. Only use this when necessary.

Step 1: Remove Existing Sandbox

# Inside the sandbox: leave the agent session first
exit
# On your host terminal
docker sandbox ls
docker sandbox rm <sandbox-id>

Step 2: Create Sandbox with Docker Socket

docker sandbox run --mount-docker-socket claude

Step 3: Test Docker Access

docker ps

Result:

● Bash(docker ps)
  ⎿  Error: Exit code 1
     permission denied while trying to connect to the docker API at unix:///var/run/docker.sock

Docker socket requires sudo inside the sandbox:

sudo docker ps

Result:

● Bash(sudo docker ps)
  ⎿  CONTAINER ID   IMAGE                                  COMMAND                  CREATED              STATUS
     dbab95b2ae42   docker/sandbox-templates:claude-code   "sh -c 'sleep 5; if …"   About a minute ago   Up About a minute
     … +9 lines

Docker socket access working!

The agent can now:

  • List and manage containers
  • Build Docker images
  • Run docker compose commands
  • Execute integration tests with Testcontainers
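
The permission denied result in Step 3 means the socket is mounted but the agent user cannot open it without sudo. A small check like the following distinguishes the possible situations; socket_status is a hypothetical helper name, and the default path is the one from the error message.

```shell
# Hypothetical helper: classify the Docker socket as seen by the current user.
socket_status() {
  sock=${1:-/var/run/docker.sock}
  if [ ! -S "$sock" ]; then
    # No socket file at this path (for example, it was never mounted).
    echo "absent"
  elif [ -r "$sock" ] && [ -w "$sock" ]; then
    echo "accessible"
  else
    # The socket exists but the current user lacks read/write permission.
    echo "present, needs elevated permissions"
  fi
}
```

Based on Step 3, socket_status should report the third case for the agent user in a sandbox started with --mount-docker-socket. The post does not show the result for a sandbox created without that flag.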

Test 11: Real-World Demo - Playwright Browser Testing

This demonstrates a practical use case: running browser tests in the sandbox without polluting your host system.

Step 1: Create Project Directory

# On host
mkdir -p ~/meetup-jan/playwright-demo
cd ~/meetup-jan/playwright-demo

# Remove any existing sandbox
docker sandbox ls
docker sandbox rm <sandbox-id-if-exists>

# Start fresh sandbox
docker sandbox run claude

Step 2: Create and Run Playwright Test

Inside the sandbox, ask Claude to create a Playwright test:

Create a simple Playwright test that:
1. Goes to https://collabnix.com
2. Takes a screenshot
3. Verifies the page title contains "Collabnix"

Claude will:

  • Initialize the project with npm init
  • Install Playwright and browsers (~500MB)
  • Create the test file
  • Run the test

Result:

● Bash(npx playwright test) timeout: 1m 0s
  ⎿  Running 1 test using 1 worker
       ✓  1 tests/collabnix.spec.ts:3:5 › Collabnix homepage (5.2s)

● Bash(ls -la collabnix-screenshot.png)
  ⎿  -rw-r--r-- 1 agent agent 666902 Jan 15 18:03 collabnix-screenshot.png

● Done. The Playwright test has been created and executed successfully.

  playwright-demo/
  ├── package.json
  ├── playwright.config.ts
  ├── tests/
  │   └── collabnix.spec.ts
  └── collabnix-screenshot.png  (generated)

  Result: 1 test passed in 6.3s

Step 3: Verify Isolation on Host

Exit the sandbox and check your host:

exit

Check what's on your host:

# Screenshot IS in your project (shared via mount) ✅
ls -la ~/meetup-jan/playwright-demo/collabnix-screenshot.png

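# Playwright browsers are NOT on your host ✅
# Linux hosts cache them in ~/.cache/ms-playwright; macOS hosts use ~/Library/Caches/ms-playwright
ls ~/.cache/ms-playwright/
ls ~/Library/Caches/ms-playwright/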

Result:

| Location | On Host? | Why? |
| --- | --- | --- |
| collabnix-screenshot.png | ✅ Yes | Project folder is mounted |
| node_modules/ | ✅ Yes | Project folder is mounted |
| ~/.cache/ms-playwright/ (500MB browsers) | ❌ No | Isolated in sandbox |
| ~/.npm/ cache | ❌ No | Isolated in sandbox |

This is the power of Docker Sandboxes!

  • Your project files are accessible and shared
  • Heavy dependencies (browsers, caches) stay in the sandbox
  • Your host system stays clean
  • Re-enter the sandbox later and Playwright is still installed

Test Summary

| Feature | Expected | Result |
| --- | --- | --- |
| 🔒 SSH keys blocked | Blocked | ✅ Working |
| 🔒 AWS credentials blocked | Blocked | ✅ Working |
| 🔒 Documents blocked | Blocked | ✅ Working |
| 📁 Project folder accessible | Accessible | ✅ Working |
| 🎯 Path matching | Same paths | ✅ Working |
| 💾 State persistence | Persists | ✅ Working |
| 🔧 Environment variables | Available | ✅ Working |
| 🐳 Docker socket access | With sudo | ✅ Working |
| 🎭 Playwright isolation | Browsers isolated | ✅ Working |
| 🪪 Git identity injection | Auto-injected | ⚠️ Not working |

Key Takeaways

| Regular Container | Docker Sandbox |
| --- | --- |
| You manually decide what to mount | Auto-mounts only project directory |
| Could accidentally mount ~/.ssh, ~/.aws | Automatically excludes sensitive dirs |
| Different paths inside vs outside | Same paths (path matching) |
| No Git identity | Should auto-inject Git config |
| State lost on exit | State persists per workspace |
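
The first two rows describe a mistake that is easy to make with a hand-written bind mount. As a sketch of the kind of guard you would need without sandboxes, the function below flags mount targets that would expose a home directory or its credential folders. The name is_risky_mount and the list of folders are assumptions, and paths are compared as plain strings without resolving symlinks or relative segments.

```shell
# Hypothetical helper: succeed (return 0) when bind-mounting the target would
# expose the home directory or a known credentials directory inside it.
is_risky_mount() {
  target=${1%/}   # directory you intend to mount (trailing slash removed)
  home=${2%/}     # home directory to protect
  # Risky if the target is the home directory or one of its ancestors,
  # because everything under home (including ~/.ssh) becomes visible.
  case "$home/" in
    "$target"/*) return 0 ;;
  esac
  # Risky if the target is, or sits inside, a credentials directory.
  for dir in .ssh .aws; do
    case "$target/" in
      "$home/$dir"/*) return 0 ;;
    esac
  done
  return 1
}
```

For example, is_risky_mount "$PWD" "$HOME" succeeds when run from the home directory itself and fails from an ordinary project folder beneath it. A sandbox makes this check unnecessary because only the project directory is mounted.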

Docker Sandboxes = Secure by Default 🛡️
