
kaz

Posted on • Originally published at 0xkaz.com

Offline Claude Code on Mac mini M4: llamafile + Safehouse Sandbox

I run Claude Code on a Mac mini M4 against a local Qwen3-1.7B model via llamafile. No requests leave the machine. No API fees. The model is weaker than Claude, but for exploratory work — trying things, iterating on drafts, quick edits — it is enough.

The part that required the most thought was not the model. It was making --dangerously-skip-permissions actually safe.

Why offline

Cost. Every Claude Code session hits the Anthropic API. For early-stage work where the direction isn't clear yet, the cost adds up. A local model costs nothing per request.

Privacy. Some repos I work in contain drafts and notes I don't want leaving the machine. With a local model, nothing leaves.

The setup

Default: two terminals.

```
Terminal 1: llamafile (Qwen3-1.7B on :8080)
Terminal 2: run-claude.sh (Claude Code inside Safehouse sandbox)
```

llamafile has supported the Anthropic Messages API natively since v0.10.0 — you point ANTHROPIC_BASE_URL directly at it, no proxy needed. Claude Code thinks it's talking to Anthropic. The request never leaves localhost.
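In practice the wiring is just two environment variables. A minimal sketch — the model filename, server flags, and API key placeholder are assumptions, not taken from the repo:

```shell
# Terminal 1 — serve the model locally (commented out here because it
# needs the llamafile binary and model weights on disk; flags are assumptions):
# ./llamafile --server --port 8080 -m qwen3-1.7b.gguf

# Terminal 2 — point Claude Code at localhost instead of api.anthropic.com:
export ANTHROPIC_BASE_URL="http://localhost:8080"
export ANTHROPIC_API_KEY="sk-local-placeholder"   # value is a dummy; the local server isn't Anthropic
# claude   # commented; needs Claude Code installed
```

Because `ANTHROPIC_BASE_URL` resolves to localhost, every request terminates on the machine.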

The repo also includes an optional proxy.py (130 lines, FastAPI + httpx) for debugging — useful if you want to inspect exactly what Claude Code is sending. Skip it if you don't need that visibility.
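For a sense of what such a debugging proxy does, here is a minimal sketch in the spirit of proxy.py — but stdlib-only rather than FastAPI + httpx, and with an assumed port and behavior, not the repo's actual code:

```python
# Request-logging proxy sketch: print what Claude Code sends, then forward
# it to llamafile. Ports and paths are assumptions for illustration.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

UPSTREAM = "http://localhost:8080"   # llamafile's local endpoint

def log_request(body: bytes) -> str:
    """Pretty-print the JSON body Claude Code sent, for inspection."""
    pretty = json.dumps(json.loads(body), indent=2)
    print(pretty)
    return pretty

class LoggingProxy(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        log_request(body)                         # inspect the outgoing request
        upstream = urlopen(Request(UPSTREAM + self.path, data=body,
                                   headers={"Content-Type": "application/json"}))
        self.send_response(upstream.status)
        self.end_headers()
        self.wfile.write(upstream.read())

# To run: HTTPServer(("127.0.0.1", 8081), LoggingProxy).serve_forever()
# then set ANTHROPIC_BASE_URL to http://localhost:8081 instead of :8080.
```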

The --dangerously-skip-permissions problem

Claude Code asks for confirmation before every file write and command by default. That's the right call for interactive use.

For async workflows — sending a task from my phone and coming back to a completed result — those prompts break everything. Nothing finishes while you're away.

--dangerously-skip-permissions disables those prompts. Without something else enforcing limits, that means unrestricted filesystem access.

The something else is Safehouse.

What Safehouse does

Safehouse is a macOS sandbox wrapper. It enforces access restrictions at the OS (kernel) level, not the application layer.

run-claude.sh launches Claude Code inside a Safehouse sandbox scoped to a single directory. Claude Code can freely read and write within that directory. It cannot reach anything outside — other repos, ~/.ssh, system files.
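Safehouse's actual profile isn't reproduced here, but conceptually a deny-by-default macOS Seatbelt profile scoped to one directory looks like this — the profile contents, file path, and project path are all hypothetical:

```shell
# Conceptual sketch of an OS-level sandbox profile (Seatbelt), not
# Safehouse's real profile: deny everything, allow one writable subtree.
cat > /tmp/one-dir.sb <<'EOF'
(version 1)
(deny default)
(allow process-exec*)
(allow file-read* file-write*
  (subpath "/Users/me/projects/sandboxed-repo"))
EOF
# On macOS you would then launch the agent inside the profile:
# sandbox-exec -f /tmp/one-dir.sb claude --dangerously-skip-permissions
```

The key property is the `(deny default)` line: anything not explicitly allowed is refused by the kernel, regardless of what the application tries.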

With --dangerously-skip-permissions inside Safehouse:

  • Claude Code acts without asking ✓
  • It can only act inside the sandbox ✓
  • The OS enforces the boundary, not the app ✓

This is the combination that makes async use practical without handing over the whole machine.

The LiteLLM supply chain attack (March 2026)

On March 24, 2026, LiteLLM versions 1.82.7 and 1.82.8 on PyPI were found to contain a credential stealer. The malicious code ran automatically on Python startup (via a .pth file) and exfiltrated AWS credentials, GCP auth, GitHub tokens, SSH keys, and crypto wallet files. The compromised packages were available for roughly three hours before PyPI quarantined them. The issue was rated CVSS 9.4 and fixed in 1.82.9.

Attack vector: threat actor TeamPCP compromised Trivy — the security scanner in LiteLLM's CI/CD pipeline — and used it to obtain PyPI publishing credentials.

I wasn't using LiteLLM. I mention this not as "I predicted it" but as a data point: a large, active OSS project has a large, active CI/CD pipeline. That's another way in. Keeping dependencies small reduces that exposure.

Verdict

The offline setup works well for what it's designed for. Qwen3-1.7B is not Claude Sonnet — for complex reasoning or unfamiliar codebases, the quality gap is real. For exploratory work on familiar repos, it's fine.

The broader recommendation, regardless of the local LLM question: if you use --dangerously-skip-permissions for any reason, use an OS-level sandbox to contain it. Application-layer restrictions can be worked around. OS-level ones are much harder to escape.

Repo: github.com/0xkaz/claude-llamafile-sandbox — 3 shell scripts, 1 Python file, 1 README.
