Jailbreaking Claude Cowork: Escaping the “Sandbox”

Check out the repo: Cowork-Bridge


TL;DR: The "Bridge" Protocol

  • Claude’s Cowork mode is a powerful orchestrator, but its security sandbox (running under bwrap) blocks real-world developer tasks like hitting private APIs, running Docker, or using local git credentials.
  • I built a bidirectional filesystem bridge that keeps the Cowork VM as the frontend UX while delegating restricted tasks to a host-side watcher.
  • The Hack: an "RPC over a mounted folder" protocol using JSON requests and responses.
  • The Capability: host-side execution of curl, git, docker, and even "Claude-to-Claude" delegation using the unrestricted host CLI.
  • The Secret Sauce: a script that rewrites Cowork's internal systemPrompt to remove "guardrail rituals" and treat the agent like a high-speed power tool.
  • The Setup: fully automated via a background daemon that detects and "bridges" new Cowork sessions as soon as they are created.


A Love/Hate Relationship with Claude Cowork

Cowork mode is awesome… until you try to do real power-user things.
I wanted Claude in Cowork to:

  • hit arbitrary APIs (not just allowlisted domains)
  • push to git remotes with real credentials
  • run Docker on my machine
  • use my custom agents + MCP servers
  • behave like a dev tool, not a guarded assistant

Instead, I kept slamming into the reality of Cowork’s sandbox: a VM launched with isolation tooling (in my case I confirmed it’s running under bwrap / bubblewrap), with restricted host access and outbound network limitations that show up the moment your workflow touches something outside the box.

So I built a bridge: a bidirectional filesystem protocol between the sandboxed Cowork VM and an unrestricted Claude CLI running on my Mac (optionally in Docker). Cowork writes requests into a shared folder; a host-side watcher executes them with full capabilities and writes responses back.

It’s basically “RPC over a mounted folder.” Here is the idea in a diagram:

┌─────────────────────────────────────────────────────────────────┐
│ COWORK VM (sandboxed)                                           │
│                                                                 │
│ "Fetch from api.example.com, then summarize"                    │
│                            │                                    │
│                            ▼                                    │
│   /mnt/outputs/.bridge/requests/job-2026...-001.json            │
└───────────────────────────┬─────────────────────────────────────┘
                            │  (mounted output folder)
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│ HOST MAC (unrestricted)                                         │
│                                                                 │
│ watcher.sh sees new request → executes it (curl/git/docker/CLI) │
│ → writes response JSON (or stream log)                          │
│                            │                                    │
│                            ▼                                    │
│   .../outputs/.bridge/responses/job-2026...-001.json            │
└───────────────────────────┬─────────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│ COWORK VM                                                       │
│                                                                 │
│ Reads response JSON (or tails stream file) and continues        │
└─────────────────────────────────────────────────────────────────┘


Cowork stays your chat UX + orchestration brain.
Your Mac (and optionally host Claude CLI) becomes the executor with real access.


Why Cowork feels powerful… and then suddenly doesn’t

Cowork is designed to be safe and broadly usable. That’s great. But if you’re doing developer workflows, the safety posture often translates to friction:

  • outbound HTTP is restricted (in my environment it was allowlisted)
  • anything touching your real machine—Docker, credentialed git, custom toolchains—becomes awkward or blocked
  • the agent behaves like it must follow guardrail rituals (more on that later)

My goal became simple:

Keep Cowork for the UX, but delegate anything restricted to a host-side executor I control.

That’s the bridge.


Architecture: two “sides” + one shared directory

Conceptually, the repo is two components:

  • VM-side (cowork-bridge): a small “skill” that writes requests into the shared folder and polls for responses.

  • Host-side (cli-bridge watcher): a script (watcher.sh) that watches for new requests and executes them on the host with full permissions (optionally inside Docker).

The transport layer is just files.


Part 1 — The protocol (deep dive)

The protocol is intentionally boring. That’s a feature:

  • you can inspect it with ls and cat
  • you can debug it by reading the request/response JSON
  • it doesn’t depend on network access
  • it works even when the sandbox is locked down

The bridge lives under a .bridge/ directory inside the session’s shared outputs folder:

.bridge/
├── requests/ # Cowork writes request JSON files here
├── responses/ # Host watcher writes response JSON files here
├── streams/ # Optional: streaming output log files
└── logs/ # Audit/debug logs

There’s also a small status.json file written during initialization so the VM side can confirm it’s wired up.
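
From inside the VM, confirming that wiring is a one-liner (sketch; the exact status.json schema is whatever the init script writes):

# Confirm the bridge is wired up before writing any requests.
test -f /mnt/outputs/.bridge/status.json && cat /mnt/outputs/.bridge/status.json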

File locations (VM vs host)

This is the key trick: the same files appear in two places.

Inside the Cowork VM, you’ll typically see it under something like:

/sessions/<session-name>/mnt/outputs/.bridge/

On the host Mac, the same folder is inside Claude’s session directory:

~/Library/Application Support/Claude/local-agent-mode-sessions/
<account-id>/<workspace-id>/local_<session-id>/outputs/.bridge/

Once you accept that “Cowork sessions are just directories,” this stops feeling like a hack and starts feeling like infrastructure.

Request format

Each request is a single JSON file at:

requests/<job-id>.json

A request has:

  • id — unique job ID (also the filename)
  • timestamp
  • type — what kind of operation to run
  • type-specific fields (command, args, url, prompt, etc.)
  • timeout — seconds
  • optional env — values or references (e.g., $HOST_API_KEY)
  • optional cwd — where to run on host
  • optional stream: true — for streaming output

Here’s a representative exec request:

{
  "id": "job-20260201-001",
  "timestamp": "2026-02-01T03:14:15Z",
  "type": "exec",
  "command": "bash",
  "args": ["-lc", "curl -s https://api.example.com/data | jq ."],
  "timeout": 30,
  "cwd": "~/projects/my-app"
}

The watcher script is careful to avoid the classic “shell injection by string concatenation” problem: for sensitive handlers it builds argument arrays and sends prompts via stdin rather than interpolating them into shell strings.
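
Here’s a minimal sketch of that pattern (my illustration, not the repo’s exact code; $req, $out, $err are placeholders, and it assumes jq plus a timeout binary on the host):

# Extract the command and args from the request JSON into a real argv array,
# so nothing from the request is ever interpolated into a shell string.
cmd=$(jq -r '.command' "$req")
args=()
while IFS= read -r a; do args+=("$a"); done < <(jq -r '.args[]' "$req")

# Run with a timeout; stdout/stderr are captured for the response JSON.
timeout "$(jq -r '.timeout // 30' "$req")" "$cmd" "${args[@]}" >"$out" 2>"$err"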

Request types (and why they exist)

In principle, you could make everything exec. But first-class request types make the system safer and easier to audit.

Supported types:

  • exec — run a shell command
  • http — make an HTTP request directly (safer than crafting curl strings; example after this list)
  • git — git operations with credentials
  • node — node/npm/npx/yarn/pnpm style commands
  • docker — run docker commands on host
  • prompt — delegate to host Claude CLI (Claude-to-Claude)
  • env — inject env vars into the VM session settings
  • file — read/write/list/check existence on the host filesystem
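
For example, an http request written from the VM side might look like this (only the fields from the request-format section are grounded; the file name is made up):

# Write an "http" request; the host watcher performs the fetch with full network access.
cat > /mnt/outputs/.bridge/requests/job-20260201-002.json <<'EOF'
{
  "id": "job-20260201-002",
  "timestamp": "2026-02-01T03:20:00Z",
  "type": "http",
  "url": "https://api.example.com/data",
  "timeout": 30
}
EOF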

Response format

Each response is a JSON file at:

responses/<job-id>.json

A response includes:

  • id
  • timestamp
  • status
  • exit_code (where relevant)
  • stdout / stderr (unless streaming)
  • duration_ms
  • error (structured when possible)

Example:

{
  "id": "job-20260201-001",
  "timestamp": "2026-02-01T03:14:16Z",
  "status": "completed",
  "exit_code": 0,
  "stdout": "{ \"ok\": true }",
  "stderr": "",
  "duration_ms": 842,
  "error": null
}

Status values

  • pending — request received, not started
  • running — executing
  • completed — success
  • failed — finished with error
  • timeout — exceeded timeout
  • streaming — output is in a stream file (see below)

Streaming protocol (for logs + huge outputs)

Some outputs don’t belong inside JSON:

  • long-running commands (docker logs -f, watchers)
  • huge outputs (large diffs, large JSON responses)
  • long Claude CLI responses
  • anything you’d naturally “tail”

Streaming can happen in three ways:

  • Explicit: request includes "stream": true
  • Auto-stream: output exceeds a threshold (default is ~50KB in the watcher)
  • Naturally continuous: logs/watch commands

What happens during streaming

  1. Host creates streams/<job-id>.log
  2. Host writes output incrementally to the stream file
  3. Host writes a sentinel line **STREAM_END** when complete
  4. Host writes the final responses/<job-id>.json with bytes_written, exit_code, etc.

On the Cowork side, it just does:

tail -f /mnt/outputs/.bridge/streams/job-20260201-logs.log

Or you can read incrementally with offset tracking if you want to be fancy.
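
A minimal sketch of that incremental read, reusing the stream file from above and the **STREAM_END** sentinel:

# Poll the stream file and print only bytes we haven't seen yet.
log=/mnt/outputs/.bridge/streams/job-20260201-logs.log
offset=0
until [ -f "$log" ]; do sleep 1; done   # wait for the host to create the stream
while :; do
  size=$(wc -c < "$log")
  if [ "$size" -gt "$offset" ]; then
    tail -c +"$((offset + 1))" "$log"   # emit only the new bytes
    offset=$size
  fi
  grep -q 'STREAM_END' "$log" && break  # sentinel written by the host when done
  sleep 1
done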


Part 2 — The skills: VM-side ergonomics + host-side execution

The bridge feels “native” only if Cowork can use it naturally. That’s why there are two skills:

1. cowork-bridge (inside the VM)

This is the “client” side. It gives Cowork a repeatable workflow:

  • confirm bridge is initialized (status.json)
  • write a request JSON file into requests/
  • poll responses/ until the matching response exists
  • return the result (or stream pointer)

The skill doc includes a simple step-by-step:

“Write request” -> “Poll for response” -> “Tail stream”

It also includes helper shell functions that make this nice from inside the VM:

  • job ID generator
  • “write request + wait”
  • “stream request + tail until sentinel”

This matters more than it sounds: once you have one-liners, Cowork can be guided to reliably use them and you stop re-inventing glue code.
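
To make it concrete, here’s the shape of a “write request + wait” helper (function names are mine; the repo’s helpers differ in detail):

BRIDGE=/mnt/outputs/.bridge

# Generate a unique job ID (also used as the request/response filename).
bridge_job_id() { printf 'job-%s-%04d' "$(date +%Y%m%d%H%M%S)" "$RANDOM"; }

# Write an exec request, then poll responses/ until the host answers.
bridge_exec() {
  local id; id=$(bridge_job_id)
  jq -n --arg id "$id" --arg cmd "$1" \
    '{id: $id, timestamp: (now | todate), type: "exec",
      command: "bash", args: ["-lc", $cmd], timeout: 30}' \
    > "$BRIDGE/requests/$id.json"
  while [ ! -f "$BRIDGE/responses/$id.json" ]; do sleep 1; done
  jq -r '.stdout' "$BRIDGE/responses/$id.json"
}

# Usage: bridge_exec 'curl -s https://api.example.com/data | jq .'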

2. cli-bridge (host watcher)

The host side is a loop that:

  • finds the latest active session (or uses an explicit --session)
  • finds that session’s .bridge/requests/
  • processes each JSON request file
  • routes by type
  • writes response JSON
  • logs everything to .bridge/logs/bridge.log
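
Boiled down, that loop looks roughly like this (exec-only and heavily simplified; the real watcher.sh routes every type, streams large output, and enforces the blocked-command list):

bridge="$SESSION_DIR/outputs/.bridge"   # $SESSION_DIR as found by session-finder.sh
while :; do
  for req in "$bridge"/requests/*.json; do
    [ -e "$req" ] || continue
    id=$(jq -r '.id' "$req")
    [ -f "$bridge/responses/$id.json" ] && continue   # already answered
    # exec requests carry {command: "bash", args: ["-lc", <cmd>]}, per the format above.
    out=$(bash -lc "$(jq -r '.args[1]' "$req")" 2>&1); code=$?
    jq -n --arg id "$id" --arg out "$out" --argjson code "$code" \
      '{id: $id, timestamp: (now | todate),
        status: (if $code == 0 then "completed" else "failed" end),
        exit_code: $code, stdout: $out, stderr: "", error: null}' \
      > "$bridge/responses/$id.json"
  done
  sleep 1   # default poll interval
done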

Watcher defaults (useful details)

The watcher is configurable via env vars, but defaults include:

  • poll interval: 1 second
  • stream threshold: ~50KB (51200 bytes)
  • allowed types: exec/http/git/node/docker/prompt/env/file
  • a small blocked-command list for obviously dangerous footguns (rm -rf /, etc.)

Streaming support on host

Streaming is currently supported for:

  • exec
  • prompt (Claude-to-Claude), including streamed Claude output

Other types typically return normal JSON responses.

Part 3 — Claude-to-Claude delegation (prompt)

This is the feature that turns the bridge from “job runner” into “distributed agent.”

A prompt request looks like:

{
  "id": "job-20260201-claude-001",
  "type": "prompt",
  "prompt": "Fetch latest issues in my repo and summarize",
  "options": {
    "agent": "my-github-agent",
    "model": "sonnet",
    "tools": ["Bash", "Read", "Write"],
    "system_prompt": "You are helping a sandboxed Cowork session. Be concise."
  },
  "timeout": 120
}

On the host, the watcher calls Claude CLI roughly like:

  • sends prompt via stdin (safer)
  • passes through --agent, --model, --tools, --system-prompt
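
In shell terms, something like this (hedged sketch; flag spellings are the ones listed above as pass-throughs, not verified against any particular CLI version):

# Prompt goes in on stdin rather than the command line; options become flags.
jq -r '.prompt' "$req" | claude \
  --agent "$(jq -r '.options.agent' "$req")" \
  --model "$(jq -r '.options.model' "$req")" \
  --system-prompt "$(jq -r '.options.system_prompt' "$req")" \
  > "$bridge/streams/$id.log" 2>&1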

So Cowork becomes:

  • frontend (chat + context + orchestration)
  • router (decide what to delegate)
  • consumer (read the response and continue)

And host Claude becomes:

  • full network
  • real filesystem
  • your real tools + credentials
  • your agent library

Once you have this, “sandbox limitations” stop being blockers—they become a routing decision.

Part 4 — System prompt rewrites: making Cowork behave like a power tool

This is the part that feels slightly illegal the first time you do it. It turns out that the Cowork session configuration lives in a plain directory on the host, and the system prompt is stored right there in the JSON.

Why prompt injection matters

The default Cowork prompt is optimized for broad safety and “non-developer” workflows, which can translate to:

  • mandatory TodoWrite
  • mandatory ask-a-question-before-work patterns
  • refusal patterns for curl/wget/requests even via bash
  • verbose confirmations
  • “read skill docs even for trivial actions”

That was simply too much overhead. So I created prompt presets that:

  • assume developer competence
  • allow terse, CLI-like responses
  • make TodoWrite/AskUserQuestion optional (not ritual)
  • build in “use the bridge automatically when sandbox limits appear”
  • make Claude-to-Claude delegation first-class

The presets

  • power-user — dev mode + bridge awareness
  • cli-mode — terse, Claude Code-esque behavior
  • minimal — barebones prompt, maximum freedom
  • unrestricted — “no limitations” posture (use carefully)

Injecting prompts

In the repo, there’s a script that can:

  • list presets
  • backup the original config (cowork_settings.json.original)
  • inject a preset
  • show the current prompt (truncated)
  • restore the original

Example:

# list presets
./scripts/inject-prompt.sh --list

# backup original
./scripts/inject-prompt.sh --backup power-user

# inject power-user
./scripts/inject-prompt.sh power-user

# restore original
./scripts/inject-prompt.sh --restore

The best part: changes apply immediately (next message), no restart needed.

A note on template variables

The prompt files are JSON and support template vars (things like current working directory, selected folders, etc.), which means you can write prompts that adapt to the session context.

Part 5 — The scripts: making it reliable across sessions

If this stayed a manual hack, it would die in a week. Sessions come and go. Paths change. New sessions appear. So I built scripts that turn it into “set and forget.”

install.sh — main installer

What it does:

  • installs skills into ~/.claude/skills/
  • installs CLI wrappers into ~/.local/bin/
  • optionally installs a daemon (launchd) to auto-setup new sessions

Flags:

  • --auto — install/start daemon
  • --setup-existing — retrofit existing sessions
  • --full — do both

session-finder.sh (cowork-session) — finding active sessions

It scans:

~/Library/Application Support/Claude/local-agent-mode-sessions

and can list sessions or print the latest one (based on mtime / recency). This is used by basically every other script.
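
Finding “the latest session” is mostly an mtime sort over that tree. A one-liner sketch of the idea:

# Newest session directory wins (ls -t sorts by modification time, -d keeps dirs).
ls -td ~/Library/Application\ Support/Claude/local-agent-mode-sessions/*/*/local_* | head -1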

bridge-init.sh (cowork-bridge-init) — wire up one session

This is the actual “plumbing” script. It:

  1. creates .bridge/requests|responses|streams|logs
  2. writes .bridge/status.json
  3. finds the session’s account/workspace IDs
  4. injects the VM-side skill into Claude’s skills-plugin registry path
  5. updates manifest.json to register the skill
  6. injects an env var toggle into .claude/settings.json:
{
  "env": {
    "BRIDGE_ENABLED": "true"
  }
}

That env var becomes a “capability bit” Cowork can check.
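
Inside the VM, gating on it is trivial (sketch):

# Only route through the bridge if this session was actually initialized.
[ "${BRIDGE_ENABLED:-false}" = "true" ] && echo "bridge available" || echo "sandbox-only"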

setup-all-sessions.sh — retrofit everything

This loops over existing sessions and runs the init steps on any that aren’t already configured.

Useful flags:

  • --force — reconfigure everything
  • --dry-run — show what would change

auto-setup-daemon.sh (cowork-bridge-daemon) — set-and-forget

This is the “always on” piece. It runs continuously and detects new Cowork sessions as they appear. It:

  • polls every 2 seconds by default
  • uses fswatch if installed (faster)
  • tracks known sessions in ~/.claude/.bridge-known-sessions
  • initializes any new session automatically
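
The core of the daemon is a compare-against-known-sessions loop, roughly like this (simplified; the exact init invocation is the repo’s, not verified here):

known=~/.claude/.bridge-known-sessions
while :; do
  for s in ~/Library/Application\ Support/Claude/local-agent-mode-sessions/*/*/local_*; do
    grep -qxF "$s" "$known" 2>/dev/null && continue    # already bridged
    cowork-bridge-init "$s" && echo "$s" >> "$known"   # invocation sketched; see bridge-init.sh
  done
  sleep 2   # default poll interval (fswatch replaces this when available)
done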

This is the difference between “I built a bridge” and “my environment is always bridged.”

bridge-uninstall.sh — clean removal

It supports:

  • uninstall one session
  • uninstall all sessions
  • remove global installs
  • full cleanup

…and importantly it can run in --dry-run mode so you can see what it would remove.

inject-session.sh — power tools for session config

This script is basically: “treat session config as editable infrastructure.”

It supports commands like:

  • show — print session config
  • model opus|sonnet|haiku — switch models
  • prompt — inject prompt
  • approve-path — pre-approve file access path
  • mount — pre-mount a folder
  • enable-tool / disable-tool — toggle MCP tools
  • list-tools
  • backup / restore
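
Hypothetical invocations (the subcommand syntax here is my shorthand for the commands listed above):

./scripts/inject-session.sh show
./scripts/inject-session.sh model sonnet
./scripts/inject-session.sh approve-path ~/projects
./scripts/inject-session.sh backup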

This is the moment you stop thinking of Cowork as “an app” and start thinking of it as “a runtime with configuration.”


Part 6 — Security: you’re the sandbox now

A bridge like this gives Cowork real power. So the host watcher includes:

  • allowed-type allowlist
  • blocked command substrings for obvious disasters
  • timeouts
  • logging for audit/debug

This isn’t “secure” in the formal sense. It’s “controlled by you.”
Treat it like SSH keys: useful, powerful, and worth respecting.


Closing: Cowork becomes a UI, not a cage

This project started with “why can’t Cowork just curl this endpoint?” and ended with a pretty clean architecture:

  • Cowork = orchestration + context + UX
  • Bridge = transport
  • Host watcher = executor
  • Host Claude CLI = unrestricted agent brain

Once you have this, you stop fighting the sandbox and start routing around it.

And once you add prompt rewrites + session automation, it stops being a hack and starts being a workflow.
