I run OpenClaw — an open-source AI assistant that connects to WhatsApp, Telegram, and Discord with full system access (shell, files, browser, voice). It currently runs directly on the host with Docker sandbox containers for tool isolation.
I spent a week researching whether to move the entire gateway into Docker. Here is what I found — and why I decided not to.
The Three Options
Option A: Host-Based (Current)
Gateway runs on the host as a systemd service. Sandbox containers handle tool execution.
┌──────────────────────────────────────┐
│ HOST                                 │
│  ┌────────────────────────────────┐  │
│  │ OpenClaw Gateway (systemd)     │  │
│  │ - WhatsApp connection          │  │
│  │ - Telegram bot                 │  │
│  │ - API keys in ~/.openclaw      │  │
│  └──────────┬─────────────────────┘  │
│             │ Docker socket          │
│             ▼                        │
│  ┌────────────────────────────────┐  │
│  │ Sandbox Containers             │  │
│  │ - Tool execution only          │  │
│  │ - Filesystem isolation         │  │
│  └────────────────────────────────┘  │
└──────────────────────────────────────┘
Option B: Full Docker
Everything in Docker — gateway, tools, sandbox.
Option C: Hybrid
Gateway in Docker, sandbox as sibling containers via Docker socket.
Option C sounded like the sweet spot. It is not. Here is why.
The Docker Socket Paradox
This is the core problem that killed Option C for me.
OpenClaw needs to spawn Docker containers for sandboxed tool execution. If the gateway is inside Docker, it needs access to the Docker daemon. The standard approach: mount /var/run/docker.sock.
But mounting the Docker socket gives the container root-equivalent access to the entire host.
An attacker (or a rogue AI prompt injection) could:
- Create a privileged container mounting / (host root filesystem)
- Read /etc/shadow, SSH keys, every API key on the server
- Spawn cryptomining containers
- Delete all containers, volumes, and images
Even a read-only mount (:ro) does not help: it only protects the socket file itself, and the Docker API behind it still accepts write requests such as container creation.
Goal: Isolate gateway from host for security
Requires: Docker socket to spawn sandbox containers
Result: Docker socket gives root access to host
Net: Security improvement = 0
You containerize for security, then hand back all the access via the socket. It is like locking your front door but leaving the key under the mat.
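To make the risk concrete, here is a minimal Python sketch of how one might audit a container-create request body before it reaches the daemon. HostConfig.Privileged and HostConfig.Binds are real Docker Engine API fields. The function name risky_settings and the labels it returns are hypothetical, and the check is illustrative rather than a complete policy.

```python
def risky_settings(create_body: dict) -> list[str]:
    """Return labels for host-escape settings found in a
    POST /containers/create request body (illustrative, not exhaustive)."""
    host_config = create_body.get("HostConfig") or {}
    findings = []

    # A privileged container has near-unrestricted access to the host.
    if host_config.get("Privileged"):
        findings.append("privileged")

    # Binds entries have the form "source:destination[:options]".
    for bind in host_config.get("Binds") or []:
        source = bind.split(":", 1)[0]
        if source == "/":
            findings.append("binds host root")
        elif source == "/var/run/docker.sock":
            findings.append("binds docker socket")

    return findings
```

Anything holding the socket can send a body that trips every one of these checks, which is why the socket is treated as root-equivalent.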
Mitigations Exist But Add Complexity
Docker socket proxy restricts which API endpoints are accessible. But it adds another container to manage, must be carefully configured, and still allows container creation (which is the attack vector).
Sysbox runtime allows nested containers without socket mount or privileged mode. But it requires host-level installation and is not widely supported.
Podman is daemonless and rootless. But OpenClaw sandbox currently assumes Docker.
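The socket proxy trade-off can be sketched as an endpoint allowlist. This is a minimal Python illustration, not the configuration of any particular proxy project. The rule set ALLOWED is my assumption about what a sandbox-spawning gateway would need, and is_allowed is a hypothetical helper. The paths themselves are standard Docker Engine API endpoints.

```python
import re

# Optional API version prefix such as /v1.43/
_V = r"^/(?:v[\d.]+/)?"

# Assumed minimum set of endpoints for spawning and cleaning up sandboxes.
ALLOWED = [
    ("GET", re.compile(_V + r"_ping$")),
    ("GET", re.compile(_V + r"containers/json$")),
    ("POST", re.compile(_V + r"containers/create$")),
    ("POST", re.compile(_V + r"containers/[^/]+/(?:start|stop|wait)$")),
    ("DELETE", re.compile(_V + r"containers/[^/]+$")),
]


def is_allowed(method: str, path: str) -> bool:
    """Decide whether a request may pass, based on method and path only."""
    path = path.split("?", 1)[0]  # ignore the query string
    method = method.upper()
    return any(m == method and p.match(path) is not None for m, p in ALLOWED)
```

Image and volume endpoints are rejected, but POST /containers/create has to stay open for sandboxing to work, and a filter that looks only at the method and path cannot tell a harmless sandbox from a privileged container.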
WhatsApp Pairing Breaks in Docker
OpenClaw uses the Baileys library (WhatsApp Web protocol). Pairing requires a QR code scan from your phone. Inside Docker:
- Session data must be volume-mounted. If the volume mapping breaks on restart, you re-pair.
- Docker NAT layer adds latency to WebSocket connections.
- I already see frequent gateway disconnects (HTTP 499, 428, 503) on the host. Docker networking would make this worse.
- Container restart without proper volume persistence = lost WhatsApp session = manual re-pair.
Workarounds exist (noVNC for QR display, phone number OTP pairing) but they add fragility to something that currently just works.
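One way to catch a broken volume mapping before it forces a re-pair is a preflight check on the session directory. The sketch below assumes the WhatsApp session is stored as files in a directory. The function name session_status and its return values are hypothetical, and the actual session location depends on your OpenClaw configuration.

```python
import os
from pathlib import Path


def session_status(session_dir: Path) -> str:
    """Classify the state of a session directory before the gateway starts."""
    if not session_dir.is_dir():
        return "missing"        # volume not mounted or wrong path
    if not os.access(session_dir, os.R_OK | os.W_OK):
        return "not-writable"   # likely a UID/GID mismatch
    if not any(session_dir.iterdir()):
        return "empty"          # mounted, but no stored session: expect a re-pair
    return "ok"
```

A container entrypoint could refuse to start on anything other than "ok", so a bad mount shows up as a failed start rather than a silently lost session.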
Networking Gets Complicated Fast
The gateway needs outbound access to:
- WhatsApp servers (WebSocket)
- Telegram API
- Anthropic, OpenAI, Google APIs
- npm registry
Plus:
- Health check endpoint on port 18789
- Inter-container communication with sandbox containers
- DNS resolution (flaky inside Docker)
- Tailscale VPN needs --cap-add NET_ADMIN and /dev/net/tun
Every layer of Docker networking adds 1-2ms latency and a potential failure point.
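Since DNS is one of the likelier failure points, a small resolution check run inside the container can help separate DNS problems from everything else. This is a hedged sketch: unresolved_hosts is a hypothetical helper, and the hostnames in the usage note are my assumptions about which endpoints these services use.

```python
import socket
from typing import Callable, Iterable


def unresolved_hosts(
    hosts: Iterable[str],
    resolve: Callable[[str], object] = socket.gethostbyname,
) -> list[str]:
    """Return the hosts that fail DNS resolution.

    The resolver is injectable so the function can be exercised
    without network access.
    """
    failed = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(host)
    return failed
```

For example, calling unresolved_hosts(["api.telegram.org", "api.anthropic.com", "registry.npmjs.org"]) from inside the container and from the host, then comparing the two results, shows whether a failure is specific to Docker's DNS.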
Persistent State is a Minefield
Everything in ~/.openclaw/ needs volume mounts:
volumes:
- ./config:/home/openclaw/.openclaw
- ./workspace:/home/openclaw/.openclaw/workspace
- ./media:/home/openclaw/.openclaw/media
- /var/run/docker.sock:/var/run/docker.sock
Risks:
- UID/GID mismatches between host user and container user
- Volume corruption if the container crashes mid-write
- Backup complexity increases (Docker volumes vs simple tar)
- File ownership conflicts when accessing workspace from both host and container
- Media files (voice messages, images) need careful permission handling
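The UID/GID risk in particular is easy to check for. Here is a minimal sketch that lists everything under a mounted directory not owned by the expected user. The helper name foreign_owned is hypothetical, and the expected UID is whatever user the container process runs as in your setup.

```python
import os
from pathlib import Path


def foreign_owned(root: Path, expected_uid: int) -> list[Path]:
    """List files and directories under root whose owner is not expected_uid."""
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            entry = Path(dirpath) / name
            # lstat inspects the entry itself instead of following symlinks
            if entry.lstat().st_uid != expected_uid:
                mismatched.append(entry)
    return sorted(mismatched)
```

Run on the host against ./config, ./workspace, and ./media with the container user's UID, an empty result means the bind mounts will not hit ownership conflicts at startup.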
Debugging Gets Painful
| Task | Host | Docker |
|---|---|---|
| View logs | journalctl --user -u openclaw-gateway | docker logs openclaw-gateway |
| Restart | systemctl --user restart openclaw-gateway | docker restart openclaw-gateway |
| Shell | Direct SSH | docker exec -it openclaw-gateway bash |
| Edit config | vim ~/.openclaw/openclaw.json | Edit volume or exec in |
| Install plugin | openclaw plugins install ... | Exec in + restart container |
You lose systemd watchdog, journald integration, and the ability to just SSH in and fix things.
The Sandbox-in-Sandbox Problem
OpenClaw spawns Docker containers for tool isolation. If the gateway is also in Docker:
- Docker-in-Docker (DinD): Needs --privileged flag, which defeats all security benefits. Storage driver conflicts (overlay2 inside overlay2). Known stability issues.
- Sibling containers via socket: Returns to the Docker socket paradox (root access).
- Sysbox: True nested containers without privilege escalation. But limited platform support and not tested with OpenClaw.
Performance Overhead (Minor But Real)
- Container startup: ~1-3s overhead per sandbox spawn
- Network NAT: ~1-2ms per request
- Memory: ~50-100MB for container runtime
- Disk I/O through overlay2: 5-15% slower for writes
- TTS audio processing and browser screenshots slightly slower
It Would Require a Major Rewrite
I built clawmacdo — a Rust CLI that deploys OpenClaw to DigitalOcean and Tencent Cloud via 16 SSH-based provisioning steps. Moving to Docker means:
- Replace Steps 9-15 with docker compose up
- Build and publish a gateway Docker image
- Rewrite backup/restore for Docker volumes
- Change Web UI progress streaming from SSH to Docker log tailing
- Re-test the Tencent Cloud integration I just finished
Estimated effort: 2-3 weeks. For questionable security improvement.
The Scorecard
| Issue | Severity | Mitigatable? |
|---|---|---|
| Docker socket = root access | CRITICAL | Partial (proxy) |
| WhatsApp pairing breaks | HIGH | Yes (fragile) |
| Sandbox-in-sandbox | HIGH | Partial |
| clawmacdo rewrite | HIGH (effort) | Yes but expensive |
| Networking complexity | MEDIUM-HIGH | Yes |
| State management | MEDIUM | Yes (volumes) |
| Debugging harder | MEDIUM | Yes (tooling) |
| Performance overhead | LOW | Negligible |
My Decision: Stay on Host
Short-term: Keep the host-based architecture. Harden with allowlists, workspace-only file access, and group policies. This provides better security ROI than containerization.
Medium-term: Watch for OpenClaw Podman support (rootless sandboxing) and Sysbox maturity.
Long-term: Revisit when rootless container runtimes are natively supported and WhatsApp Business API replaces QR-based pairing.
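The workspace-only file access in the short-term plan comes down to a path confinement check. The sketch below is a generic illustration of that idea, not OpenClaw's actual implementation, and is_inside_workspace is a hypothetical name.

```python
from pathlib import Path


def is_inside_workspace(candidate: str, workspace: str) -> bool:
    """Return True if candidate resolves to a path inside workspace.

    Resolving first collapses ".." segments and follows existing
    symlinks, so traversal tricks are judged by where they end up.
    """
    root = Path(workspace).resolve()
    # Joining with an absolute candidate discards root, which is intended:
    # absolute paths outside the workspace must fail the check.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

A tool wrapper would run this before every read or write and reject the request when it returns False.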
The right answer is not always the most technically sophisticated one. Sometimes keeping it simple is the engineering decision.
Full Research Doc
The complete analysis with architecture diagrams is on GitHub:
DEPLOYMENT_ARCHITECTURE_RESEARCH.md
And the CLI itself:
github.com/kenken64/clawmacdo — star it if you found this useful!
The best deployment architecture is the one where adding Docker does not accidentally give root access to the thing you are trying to isolate. 🦞🔒