Yuji Suzuki

Posted on Feb 15

My AI Broke Out of Its Container — And I Let It

#ai #docker #security #showdev

Previously, on AI Sandbox...

If you've been following along, you know the story:

Part 1: I discovered my AI assistant was reading my API keys. I built a Docker sandbox that hides secrets via volume mounts — files appear empty to AI, but application containers still have full access.

Part 2: I gave the sandboxed AI a toolbox (SandboxMCP). It surprised me by autonomously discovering a forgotten tool and repurposing it to solve a problem I hadn't anticipated.

Now for the final chapter.

The Last Wall

After Parts 1 and 2, my AI assistant could:

✅ Read and edit all source code
✅ Check container logs via DockMCP
✅ Run tests inside containers
✅ Discover and use tools autonomously

But there was one thing it still couldn't do:

Anything that required the host OS.

"Start the demo app" → Sorry, I can't run docker compose.
"Build the containers" → I don't have access to Docker.
"Commit this change" → I can commit, but the message style...

Every time I needed to build, deploy, or manage containers, I had to switch to my terminal and do it myself. The AI was powerful inside its box, but the box was still a box.

The Uncomfortable Question

Here's the thing about containers: they're designed to isolate. That's their entire purpose. Giving a sandboxed AI access to the host OS sounds like... undoing the sandbox.

I sat with this tension for a while. The whole point of AI Sandbox was security — hiding secrets, controlling access, preventing accidents. How do you extend reach without breaking trust?

The answer turned out to be the same pattern I'd used twice before: don't give access, give a controlled interface.

Part 1: Don't give AI your secrets → Give it DockMCP to read logs and run tests
Part 2: Don't install tools for AI → Let it discover what's available
Part 3: Don't give AI the host OS → Give it approved scripts it can run through a gateway

How Host Access Works

The architecture is straightforward:

AI Sandbox (container)
  │
  │  MCP / HTTP
  ▼
DockMCP Server (host OS)
  ├── Container access     ← existing (logs, exec, stats)
  ├── Host Tools           ← NEW: run approved scripts
  ├── Container Lifecycle  ← NEW: start/stop/restart
  └── Host Commands        ← NEW: whitelisted CLI commands
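To make the "whitelisted CLI commands" branch concrete, here is a minimal sketch of how such a check could work. It is an illustration under assumptions, not DockMCP's actual code: the `ALLOWED_COMMANDS` list, its entries, and the `is_allowed` function are all hypothetical, and the real server may match commands differently.

```shell
#!/bin/sh
# Hypothetical allowlist: one permitted command prefix per line.
# These entries are illustrative, not DockMCP's real configuration.
ALLOWED_COMMANDS='git status
git diff
docker ps'

# Succeeds (exit 0) only if the requested command equals an allowed entry
# or extends it with further arguments.
is_allowed() {
  requested=$1
  # Refuse shell metacharacters so an allowed prefix cannot carry a second command.
  case $requested in
    *';'* | *'&'* | *'|'* | *'$'* | *'`'* | *'<'* | *'>'*) return 1 ;;
  esac
  while IFS= read -r entry; do
    [ -n "$entry" ] || continue
    case $requested in
      "$entry" | "$entry "*) return 0 ;;
    esac
  done <<EOF
$ALLOWED_COMMANDS
EOF
  return 1
}
```

With this list, `is_allowed "git diff --stat"` succeeds while `is_allowed "git push"` fails. A production gateway would likely go further and avoid handing the string to a shell at all.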

But the interesting part isn't the architecture — it's the approval workflow.

The Two-Stage Approval

AI (or developers) can propose a new host tool by placing a script in .sandbox/host-tools/:

#!/bin/bash
# demo-up.sh
# Start demo application containers
docker compose -f demo-apps/docker-compose.yml up -d

But this script cannot run yet. It's in staging — inside the workspace where AI can write files.

To activate it, the human runs dkmcp tools sync on the host:

$ dkmcp tools sync

📋 Reviewing tools from .sandbox/host-tools/:

  demo-up.sh (NEW)
    Start demo application containers
    ─────────────────────────────
    #!/bin/bash
    docker compose -f demo-apps/docker-compose.yml up -d
    ─────────────────────────────
    Approve? [y/N]: y
    ✅ Approved → ~/.dkmcp/host-tools/myproject/demo-up.sh

Only the approved copy runs. If AI modifies the staging version later, dkmcp tools sync detects the SHA256 mismatch and prompts for re-approval.

This means:

AI can propose tools (creative freedom)
Humans approve tools (security control)
Only approved versions execute (integrity guarantee)
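The change detection behind that guarantee can be sketched in a few lines of shell. This is a rough illustration of the idea, not how dkmcp is implemented: `file_sha256`, `tool_status`, and `approve_tool` are hypothetical helper names, and the real sync command may store hashes rather than compare files directly.

```shell
#!/bin/sh
# Print the SHA-256 hex digest of a file (sha256sum on Linux, shasum on macOS).
file_sha256() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum < "$1" | cut -d ' ' -f 1
  else
    shasum -a 256 < "$1" | cut -d ' ' -f 1
  fi
}

# Classify a staged tool relative to its approved copy.
# Prints NEW (never approved), CHANGED (hashes differ), or APPROVED (identical).
tool_status() {
  staged=$1
  approved=$2
  if [ ! -f "$approved" ]; then
    echo NEW
  elif [ "$(file_sha256 "$staged")" != "$(file_sha256 "$approved")" ]; then
    echo CHANGED
  else
    echo APPROVED
  fi
}

# Copy the staged script into the approved directory.
# In this sketch, a human is assumed to have answered "y" before this is called.
approve_tool() {
  staged=$1
  approved_dir=$2
  mkdir -p "$approved_dir"
  cp "$staged" "$approved_dir/"
  chmod 755 "$approved_dir/$(basename "$staged")"
}
```

A sync loop built on these helpers would prompt for anything reported as NEW or CHANGED and skip anything reported as APPROVED. The key property is that the approved directory lives outside the workspace, so the AI can edit the staged file but not the copy that runs.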

What Changed in Practice

Before v0.5.x, a typical conversation:

Me: Start the demo app.
AI: I can't run docker compose from inside the container. Please run this in your terminal:
docker compose -f demo-apps/docker-compose.yml up -d

After v0.5.x:

Me: Start the demo app.
AI: [calls run_host_tool with demo-up.sh]
Demo containers started. API is at :3000, Web at :3001.

The difference seems small. But multiply it across every build, every restart, every deployment — and the friction adds up. The AI went from being a helpful advisor that tells you what to type, to a capable agent that does the thing.
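What happens on the host side of a call like run_host_tool isn't shown here, but the core rule ("only the approved copy runs") could look something like the sketch below. Everything in it is an assumption for illustration: `run_approved_tool` is a hypothetical function, and the approved directory is passed in as an argument rather than taken from dkmcp's real configuration.

```shell
#!/bin/sh
# Run a tool by bare file name from the approved directory only.
# Usage: run_approved_tool <approved_dir> <tool_name> [args...]
run_approved_tool() {
  approved_dir=$1
  name=$2
  shift 2
  # Reject anything that is not a plain file name, so "../" cannot escape the directory.
  case $name in
    '' | */* | .*)
      echo "rejected: $name" >&2
      return 2
      ;;
  esac
  tool="$approved_dir/$name"
  if [ ! -f "$tool" ]; then
    echo "not approved: $name" >&2
    return 3
  fi
  # The staging copy in the workspace is never consulted.
  # This sketch runs the script with sh for simplicity; a real gateway
  # would more likely execute the file directly and honor its shebang.
  sh "$tool" "$@"
}
```

Calling `run_approved_tool "$HOME/.dkmcp/host-tools/myproject" demo-up.sh` would run the approved script, while a name such as `../demo-up.sh` or a script that was never synced is refused with a non-zero exit code.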

The Commit Workflow (A Small Thing That Matters)

One more thing that came together in this release. I built a commit message script (commit-msg.sh) that creates a collaborative workflow:

Generate draft from staged changes → Check previous commit style → Refine together → Commit

Here's what it looks like in practice:

$ .sandbox/scripts/commit-msg.sh              # Generate draft from git diff
$ .sandbox/scripts/commit-msg.sh --log        # Check how recent commits read
# ... AI and human refine CommitMsg-draft.md together ...
$ .sandbox/scripts/commit-msg.sh --msg-file CommitMsg-draft.md  # Commit

The interesting discovery: the script had always had a --log option for checking previous commit style, but the AI wasn't using it. The fix wasn't changing the script; it was writing out the full workflow, step by step, in the AI instruction file (CLAUDE.md). Once the AI could see the steps laid out explicitly, it followed them perfectly.

If you want AI to follow a workflow, don't just give it a tool; spell out the steps. The tool can be perfectly designed, but AI won't reliably discover and use optional flags on its own. This applies to any AI coding assistant, not just this project.

The Trilogy Arc

Looking back, there's a clear progression:

| Phase | What AI Could Do | What It Couldn't |
|---|---|---|
| v0.1: Protect | Read code | See secrets |
| v0.3: Equip | Discover and use tools | Touch the host OS |
| v0.5: Unleash | Run host scripts, manage containers | (nothing that matters for daily dev) |

The sandbox started as a cage. Then it became a workshop. Now it's a full development environment — with the security model still intact.

Secrets? Still hidden (volume mounts haven't changed).
Container access? Still controlled (whitelist, output masking).
Host access? Controlled too (approval workflow, SHA256 verification).

Every layer of capability was added on top of the security foundation, never at the expense of it.

Is This the End?

For the core functionality — yes. My personal development workflow is now complete:

Code → Test → Build → Deploy → Commit → Code → ...

The full loop. All within the sandbox.

AI reads and writes code (sandbox)
AI checks logs and runs tests (DockMCP container access)
AI discovers and uses tools (SandboxMCP)
AI builds, deploys, and manages containers (DockMCP host access)
AI drafts commit messages collaboratively (commit-msg.sh)

There's nothing left in my daily workflow that requires me to switch to a terminal and do things manually.

Well, except dkmcp tools sync. That one stays manual — by design.

Try It

The template is open source. You can set up the entire environment in about 10 minutes:

GitHub: ai-sandbox-dkmcp

It works with Claude Code, Gemini CLI, and any MCP-compatible AI tool. If you're running AI coding assistants and haven't thought about where your secrets go — now's a good time.

If you find it useful, a star on GitHub would mean a lot.

This is Part 3 of the AI Sandbox series. Part 1: Secrets | Part 2: Tools
