Kenneth Phang

Posted on Mar 5

Should You Run Your AI Assistant Inside Docker? I Researched It So You Do Not Have To

#docker #devops #ai #security

I run OpenClaw — an open-source AI assistant that connects to WhatsApp, Telegram, and Discord with full system access (shell, files, browser, voice). It currently runs directly on the host with Docker sandbox containers for tool isolation.

I spent a week researching whether to move the entire gateway into Docker. Here is what I found — and why I decided not to.

The Three Options

Option A: Host-Based (Current)

Gateway runs on the host as a systemd service. Sandbox containers handle tool execution.

┌──────────────────────────────────────┐
│              HOST                     │
│  ┌────────────────────────────────┐  │
│  │  OpenClaw Gateway (systemd)   │  │
│  │  - WhatsApp connection        │  │
│  │  - Telegram bot               │  │
│  │  - API keys in ~/.openclaw    │  │
│  └──────────┬─────────────────────┘  │
│             │ Docker socket          │
│             ▼                        │
│  ┌────────────────────────────────┐  │
│  │  Sandbox Containers           │  │
│  │  - Tool execution only        │  │
│  │  - Filesystem isolation       │  │
│  └────────────────────────────────┘  │
└──────────────────────────────────────┘

Option B: Full Docker

Everything in Docker — gateway, tools, sandbox.

Option C: Hybrid

Gateway in Docker, sandbox as sibling containers via Docker socket.

Option C sounded like the sweet spot. It is not. Here is why.

The Docker Socket Paradox

This is the core problem that killed Option C for me.

OpenClaw needs to spawn Docker containers for sandboxed tool execution. If the gateway is inside Docker, it needs access to the Docker daemon. The standard approach: mount /var/run/docker.sock.

But mounting the Docker socket gives the container root-equivalent access to the entire host.

An attacker (or a rogue AI prompt injection) could:

Create a privileged container mounting / (host root filesystem)
Read /etc/shadow, SSH keys, every API key on the server
Spawn cryptomining containers
Delete all containers, volumes, and images

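Even a read-only (:ro) mount does not help. The flag only stops the container from deleting or replacing the socket file; any process that can connect to the socket can still send write requests to the Docker API.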

Goal:     Isolate gateway from host for security
Requires: Docker socket to spawn sandbox containers
Result:   Docker socket gives root access to host
Net:      Security improvement = 0

You containerize for security, then hand back all the access via the socket. It is like locking your front door but leaving the key under the mat.
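To make the risk concrete, here is a minimal sketch of the request body a process holding the socket could submit to the Docker Engine API's POST /containers/create endpoint. HostConfig.Binds and HostConfig.Privileged are real Engine API fields; the helper name, image, and command are illustrative assumptions. The sketch only builds the JSON and never contacts a daemon.

```python
import json


def build_host_mount_request(image: str = "alpine") -> dict:
    """Build (but do not send) a container-create body that exposes the host.

    The image name and command are illustrative assumptions.
    """
    return {
        "Image": image,
        "Cmd": ["ls", "/host/etc"],
        "HostConfig": {
            # Bind-mount the host root filesystem at /host in the container.
            "Binds": ["/:/host"],
            # Privileged mode removes most of the container's isolation.
            "Privileged": True,
        },
    }


body = build_host_mount_request()
print(json.dumps(body, indent=2))
```

Nothing in this body depends on how the socket was mounted, which is why the read-only flag makes no difference.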

Mitigations Exist But Add Complexity

Docker socket proxy restricts which API endpoints are accessible. But it adds another container to manage, must be carefully configured, and still has to allow container creation so the gateway can spawn sandboxes (which is the attack vector).

Sysbox runtime allows nested containers without socket mount or privileged mode. But it requires host-level installation and is not widely supported.

Podman is daemonless and can run rootless. But the OpenClaw sandbox currently assumes Docker.
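The socket proxy limitation can be sketched as a method-and-path allowlist. This is a hypothetical filter written for illustration, not the configuration of any particular proxy; the endpoint paths follow the Docker Engine API, while the ALLOWED list and is_allowed helper are assumptions.

```python
import re

# Hypothetical allowlist: (HTTP method, path pattern) pairs the proxy forwards.
# The optional prefix matches versioned paths such as /v1.43/...
ALLOWED = [
    ("GET", re.compile(r"^/(v[\d.]+/)?containers/json$")),
    ("POST", re.compile(r"^/(v[\d.]+/)?containers/create$")),
    ("POST", re.compile(r"^/(v[\d.]+/)?containers/[^/]+/start$")),
]


def is_allowed(method: str, path: str) -> bool:
    """Return True if the request matches an allowlisted endpoint."""
    path = path.split("?", 1)[0]  # ignore the query string
    return any(m == method.upper() and rx.match(path) for m, rx in ALLOWED)


print(is_allowed("POST", "/v1.43/containers/create"))
print(is_allowed("DELETE", "/v1.43/images/alpine"))
```

A filter like this blocks image deletion but has to let containers/create through. Because it inspects only the method and path, it cannot tell a harmless sandbox from a privileged container that mounts the host root.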

WhatsApp Pairing Breaks in Docker

OpenClaw uses the Baileys library (WhatsApp Web protocol). Pairing requires a QR code scan from your phone. Inside Docker:

Session data must be volume-mounted. If the volume mapping breaks or the container restarts without persisted state, the WhatsApp session is lost and you have to re-pair manually.
Docker's NAT layer adds latency to WebSocket connections.
I already see frequent gateway disconnects (HTTP 499, 428, 503) on the host. Docker networking would likely make this worse.

Workarounds exist (noVNC for QR display, phone number OTP pairing) but they add fragility to something that currently just works.
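One way to fail fast instead of silently dropping into a re-pair is a preflight check on the session directory. This is a minimal sketch, assuming the session is stored as a creds.json file (the file Baileys' multi-file auth state writes); whether OpenClaw uses that layout is an assumption, and the helper name is hypothetical.

```python
import tempfile
from pathlib import Path


def session_is_persisted(session_dir: Path, creds_name: str = "creds.json") -> bool:
    """Return True if a non-empty credentials file exists in session_dir."""
    creds = session_dir / creds_name
    return creds.is_file() and creds.stat().st_size > 0


# Demo against a throwaway directory standing in for the mounted volume.
with tempfile.TemporaryDirectory() as tmp:
    session_dir = Path(tmp)
    before = session_is_persisted(session_dir)
    (session_dir / "creds.json").write_text('{"placeholder": true}')
    after = session_is_persisted(session_dir)

print(before, after)
```

A container entrypoint could run a check like this and refuse to start when the volume is empty, rather than generating a new QR code nobody is watching for.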

Networking Gets Complicated Fast

The gateway needs outbound access to:

WhatsApp servers (WebSocket)
Telegram API
Anthropic, OpenAI, Google APIs
npm registry

Plus:

Health check endpoint on port 18789
Inter-container communication with sandbox containers
DNS resolution (flaky inside Docker)
Tailscale VPN needs --cap-add NET_ADMIN and /dev/net/tun

Every layer of Docker networking adds roughly 1-2 ms of latency and another potential failure point.
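A DNS preflight check is one cheap way to catch resolution problems before the gateway starts. The sketch below is an illustration only: the hostnames are assumptions about which endpoints matter, and the resolver is injectable so the demo runs without network access.

```python
import socket
from typing import Callable, Dict, Iterable


def check_dns(hosts: Iterable[str], resolve: Callable = socket.getaddrinfo) -> Dict[str, bool]:
    """Map each hostname to True if it resolves, False otherwise."""
    results = {}
    for host in hosts:
        try:
            resolve(host, 443)
            results[host] = True
        except OSError:  # socket.gaierror is a subclass of OSError
            results[host] = False
    return results


def fake_resolve(host, port):
    """Stand-in resolver for the demo: names ending in .invalid fail."""
    if host.endswith(".invalid"):
        raise socket.gaierror("name not known")
    return []


demo = check_dns(["api.telegram.org", "broken.invalid"], resolve=fake_resolve)
print(demo)
```

Inside a container, calling check_dns with the default resolver exercises whatever DNS path the container actually uses.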

Persistent State is a Minefield

Everything in ~/.openclaw/ needs volume mounts:

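volumes:
  # Config and API keys
  - ./config:/home/openclaw/.openclaw
  # Nested mounts that sit inside the config mount
  - ./workspace:/home/openclaw/.openclaw/workspace
  - ./media:/home/openclaw/.openclaw/media
  # Grants root-equivalent access to the host (see the socket paradox above)
  - /var/run/docker.sock:/var/run/docker.sock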

Risks:

UID/GID mismatches between host user and container user
Volume corruption if the container crashes mid-write
Backup complexity increases (Docker volumes vs simple tar)
File ownership conflicts when accessing workspace from both host and container
Media files (voice messages, images) need careful permission handling
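UID/GID mismatches are easy to detect before they cause permission errors. This is a minimal sketch of such a check; the ownership_mismatches helper is hypothetical, and the demo uses a temporary directory in place of the real mounted paths.

```python
import tempfile
from pathlib import Path


def ownership_mismatches(root: Path, uid: int, gid: int) -> list:
    """List paths under root (including root) not owned by uid:gid."""
    bad = []
    for path in [root, *root.rglob("*")]:
        st = path.lstat()  # lstat so symlinks are checked, not their targets
        if st.st_uid != uid or st.st_gid != gid:
            bad.append(path)
    return bad


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "openclaw.json").write_text("{}")
    owner = root.stat()
    # Matches when checked against the directory's real owner...
    clean = ownership_mismatches(root, owner.st_uid, owner.st_gid)
    # ...and flags every path when the expected UID differs.
    dirty = ownership_mismatches(root, owner.st_uid + 1, owner.st_gid)

print(len(clean), len(dirty))
```

In a real deployment the expected uid and gid would be those of the user the container process runs as.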

Debugging Gets Painful

Task	Host	Docker
View logs	`journalctl --user -u openclaw-gateway`	`docker logs openclaw-gateway`
Restart	`systemctl --user restart openclaw-gateway`	`docker restart openclaw-gateway`
Shell	Direct SSH	`docker exec -it openclaw-gateway bash`
Edit config	`vim ~/.openclaw/openclaw.json`	Edit volume or exec in
Install plugin	`openclaw plugins install ...`	Exec in + restart container

You lose systemd watchdog, journald integration, and the ability to just SSH in and fix things.

The Sandbox-in-Sandbox Problem

OpenClaw spawns Docker containers for tool isolation. If the gateway is also in Docker:

Docker-in-Docker (DinD): Needs the --privileged flag, which defeats the security benefits. It can also hit storage driver conflicts (overlay2 inside overlay2) and has known stability issues.
Sibling containers via socket: Returns to the Docker socket paradox (root access).
Sysbox: True nested containers without privilege escalation. But limited platform support and not tested with OpenClaw.

Performance Overhead (Minor But Real)

Container startup: ~1-3s overhead per sandbox spawn
Network NAT: ~1-2ms per request
Memory: ~50-100MB for container runtime
Disk I/O through overlay2: 5-15% slower for writes
TTS audio processing and browser screenshots slightly slower
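To put those figures in perspective, here is a small worked calculation using the startup and NAT estimates above. The workload numbers (200 sandbox spawns and 10,000 requests) are hypothetical.

```python
def overhead_seconds(spawns: int, requests: int) -> tuple:
    """Return (low, high) added seconds from the estimates quoted above.

    Uses 1-3 s per sandbox spawn and 1-2 ms per request, computed in
    integer milliseconds to avoid floating-point rounding.
    """
    low_ms = spawns * 1000 + requests * 1
    high_ms = spawns * 3000 + requests * 2
    return low_ms / 1000, high_ms / 1000


# Hypothetical workload: 200 sandbox spawns and 10,000 requests.
low, high = overhead_seconds(spawns=200, requests=10_000)
print(f"{low:.0f} to {high:.0f} seconds of added latency")
```

For this workload the total is 210 to 620 seconds, almost all of it from container startup rather than NAT, which supports rating the overhead as minor.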

It Would Require a Major Rewrite

I built clawmacdo — a Rust CLI that deploys OpenClaw to DigitalOcean and Tencent Cloud via 16 SSH-based provisioning steps. Moving to Docker means:

Replace Steps 9-15 with docker compose up
Build and publish a gateway Docker image
Rewrite backup/restore for Docker volumes
Change Web UI progress streaming from SSH to Docker log tailing
Re-test the Tencent Cloud integration I just finished

Estimated effort: 2-3 weeks. For questionable security improvement.

The Scorecard

Issue	Severity	Mitigatable?
Docker socket = root access	CRITICAL	Partial (proxy)
WhatsApp pairing breaks	HIGH	Yes (fragile)
Sandbox-in-sandbox	HIGH	Partial
clawmacdo rewrite	HIGH (effort)	Yes but expensive
Networking complexity	MEDIUM-HIGH	Yes
State management	MEDIUM	Yes (volumes)
Debugging harder	MEDIUM	Yes (tooling)
Performance overhead	LOW	Negligible

My Decision: Stay on Host

Short-term: Keep the host-based architecture. Harden with allowlists, workspace-only file access, and group policies. This provides better security ROI than containerization.

Medium-term: Watch for OpenClaw Podman support (rootless sandboxing) and Sysbox maturity.

Long-term: Revisit when rootless container runtimes are natively supported and WhatsApp Business API replaces QR-based pairing.
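The workspace-only file access from the short-term plan can be sketched as a path containment check. This is an illustration rather than OpenClaw's actual implementation; the function name and the workspace path are assumptions.

```python
from pathlib import Path


def is_inside_workspace(candidate: str, workspace: str) -> bool:
    """Return True if candidate resolves to a path inside workspace.

    resolve() normalizes ".." segments and follows existing symlinks, so
    traversal tricks are evaluated against the real target path.
    """
    root = Path(workspace).resolve()
    target = (root / candidate).resolve()  # an absolute candidate replaces root
    return target == root or root in target.parents


WORKSPACE = "/srv/openclaw/workspace"  # hypothetical location

print(is_inside_workspace("notes/todo.txt", WORKSPACE))
print(is_inside_workspace("../../etc/passwd", WORKSPACE))
print(is_inside_workspace("/etc/shadow", WORKSPACE))
```

Only the first call passes. Resolving before comparing is the important design choice, because a plain string prefix check would accept paths like workspace/../../etc/passwd.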

The right answer is not always the most technically sophisticated one. Sometimes keeping it simple is the engineering decision.

Full Research Doc

The complete analysis with architecture diagrams is on GitHub:

DEPLOYMENT_ARCHITECTURE_RESEARCH.md

And the CLI itself:

github.com/kenken64/clawmacdo — star it if you found this useful!

The best deployment architecture is the one where adding Docker does not accidentally give root access to the thing you are trying to isolate. 🦞🔒

Top comments (1)

Zhengqun Koo • Apr 29

Agent37 did it - OpenClaw in Docker. And it seems to work fine github.com/Agent-3-7/openclaw-host...