Dawid Nitka

Posted on Jun 14

Claude Code Safety: Hooks, Sandboxes, and Running Autonomously Without the Paranoia

#ai #claude #automation #productivity

TL;DR: Claude Code runs Bash, which means it can run dangerous Bash. There are many ways to put guardrails around it. Four well-known ones: write your own PreToolUse hooks (lightweight, portable, yours), use the built-in /sandbox (filesystem and network isolation), run inside a Docker microVM (strongest isolation), or grab a community hook collection. This article walks through those four and ends with a copy-paste hook you can drop into any plugin.

Claude was halfway through cleaning up a build directory when it generated rm -rf with a path that resolved to my home folder. A glob expanded wrong. The command was syntactically fine and about to run.

It didn't, because a hook caught it and returned exit code 2. Claude saw the rejection and retried with a scoped path instead.

Safety tooling earns its place by catching the handful of commands that would ruin your day, without getting in the way of the hundred that wouldn't. There are many methods for this. Here are four well-known ones, from "ten lines of Bash" to "full microVM", and when each earns its place.

You can pass this article URL directly to Claude Code and follow along.

The permission model: how Claude Code decides what to run

Before the guardrails, the thing they hook into.

Every time Claude wants to use a tool, Claude Code runs it through a permission check. Bash commands get matched against your allow and deny rules. In normal mode you approve risky commands manually, one at a time.

Everyone who has run a real session knows where this goes. Approve, approve, approve, command after command, until the prompting wears you down and you flip on bypass mode just to get work done. And there's the gap. The moment you let Claude run autonomously, the manual approval step disappears, and a wrong command runs with no human in the loop.

That tradeoff is exactly why the more automated approaches exist. Instead of approving everything or nothing, they let Claude run freely and stop it only in the cases that actually matter. Hooks intercept the command and decide programmatically. Sandboxes constrain what any command can touch. They stack, so you don't have to pick just one.

Approach A: Custom hooks - the lightweight, portable option you own

A hook is a script Claude Code runs at a specific point in its lifecycle. The one you want for safety is PreToolUse: it fires before a tool executes, and its exit code decides what happens next.

Three outcomes:

Exit 0 - allow the command. Optionally print JSON with a systemMessage to warn without blocking.
Exit 2 - block the command. Whatever you write to stderr gets shown to Claude, so it understands why and can adjust.
JSON decision - print a permissionDecision object to deny with a custom reason, even in bypass mode.
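The third outcome can be sketched as a small shell helper. This is a minimal sketch, not the script from this article: deny_with_reason is a hypothetical name, and the field names (hookSpecificOutput, permissionDecision, permissionDecisionReason) are what I understand the Claude Code hooks reference to describe, so check them against the version you run.

```shell
#!/bin/bash
# Hypothetical helper: print a PreToolUse "deny" decision as JSON on stdout.
# Field names are an assumption based on the hooks reference; verify before relying on them.
deny_with_reason() {
  local reason
  # Escape backslashes and double quotes so the reason stays valid JSON.
  reason=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"deny","permissionDecisionReason":"%s"}}\n' "$reason"
}
```

Inside a hook you would call deny_with_reason "rm -rf targeting home" and then exit 0. As far as I can tell, the JSON is only read when the hook exits successfully, so pairing it with exit 2 would defeat the point.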

You wire it up in a hooks.json file inside your plugin:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/safety-guards.sh"
          }
        ]
      }
    ]
  }
}

The matcher is Bash, so this only runs for Bash commands. ${CLAUDE_PLUGIN_ROOT} resolves to your plugin's directory, so the path works on any machine.

The script reads the hook input from stdin as JSON, extracts the command field, and checks it against patterns. The skeleton is just the read, a fail-open guard, and one block rule:

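#!/bin/bash
set -euo pipefail

# Claude Code passes the hook input as JSON on stdin
INPUT=$(cat)
# Extract the "command" field. [[:space:]] is used instead of \s (a GNU
# extension) so the pattern also works with BSD/macOS sed and grep.
# Known limit: the match stops at the first double quote in the command.
COMMAND=$(printf '%s' "$INPUT" | sed -n 's/.*"command"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -1)

# Fail open: if we can't read a command, don't block
[ -z "$COMMAND" ] && exit 0

# BLOCK: rm -rf targeting home or root.
# (/|[[:space:]]|$) also matches a target at the very end of the command,
# so a bare "rm -rf ~" or "rm -rf /" is caught.
if printf '%s\n' "$COMMAND" | grep -qE 'rm[[:space:]]+-[a-zA-Z]*rf[a-zA-Z]*[[:space:]]'; then
  if printf '%s\n' "$COMMAND" | grep -qE '(~(/|[[:space:]]|$)|\$HOME|/home/[^/[:space:]]+(/|[[:space:]]|$)|[[:space:]]/([[:space:]]|$))'; then
    echo "BLOCKED: rm -rf targeting home or root directory." >&2
    exit 2
  fi
fi

# ... more block and warn rules go here

exit 0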

The full script, with secret detection, the force-push gate and warnings, is at the end of this article.
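One limit of the sed extraction is worth knowing: it stops at the first double quote, so a command that itself contains quotes (escaped as \" in the JSON) gets truncated before the patterns ever see it. If jq happens to be installed, a real JSON parser avoids that. A minimal sketch, assuming the command lives at .tool_input.command in the hook input; extract_command is a hypothetical helper name, and it falls back to sed when jq is missing so the hook keeps working with nothing installed:

```shell
#!/bin/bash
# Hypothetical helper: read hook JSON on stdin, print the Bash command.
# Assumes the command is at .tool_input.command; inspect a real payload to confirm.
extract_command() {
  local input
  input=$(cat)
  if command -v jq >/dev/null 2>&1; then
    # jq decodes JSON escapes, so quotes inside the command survive intact.
    printf '%s' "$input" | jq -r '.tool_input.command // empty' 2>/dev/null || true
  else
    # Fallback: portable sed; truncates at the first double quote.
    printf '%s' "$input" | sed -n 's/.*"command"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -1
  fi
}
```

In the guard script this would replace the two read lines with COMMAND=$(extract_command). An empty result still hits the fail-open guard, so a broken or missing jq never blocks a call.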

Two design choices worth calling out.

Fail open. If the script can't parse a command, it exits 0 and lets it through. A safety hook that breaks every Bash call the moment its regex hits an edge case is worse than no hook, because you'll rip it out within a day. So block the clearly dangerous stuff and wave the rest through.

Block what's always destructive, warn on what's only sometimes wrong. rm -rf ~ is never what you want, so block it. git reset --hard is sometimes exactly the thing you need, so warn and let Claude decide with the warning sitting right there in context.

A real guard script grows more patterns over time. Common additions:

Block .env writes through the shell (echo ... > .env)
Block git push --force unless it's --force-with-lease
Warn on git clean -f, DROP TABLE, TRUNCATE, production keywords
Gate git push behind an explicit confirmation token
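Two of those additions can be sketched as follows. This is an illustration rather than the script from the reference repo: classify_command is a hypothetical helper, the return codes 2/1/0 are my own convention for block/warn/allow, and the patterns are deliberately simple.

```shell
#!/bin/bash
# Hypothetical helper: classify one command string.
# Returns 2 (block), 1 (warn) or 0 (allow) and prints a short reason.
classify_command() {
  local cmd="$1"

  # Block: shell redirection into a .env file (> .env, >> dir/.env.local)
  if printf '%s\n' "$cmd" | grep -qE '>>?[[:space:]]*([^[:space:]]*/)?\.env([[:space:]]|$|\.)'; then
    echo "block: writes to .env through the shell"
    return 2
  fi

  # Warn: sometimes legitimate, often destructive (case-insensitive for SQL)
  if printf '%s\n' "$cmd" | grep -qiE 'git[[:space:]]+clean[[:space:]]+-[a-zA-Z]*f|DROP[[:space:]]+TABLE|TRUNCATE[[:space:]]'; then
    echo "warn: potentially destructive command"
    return 1
  fi

  echo "allow"
  return 0
}
```

Reading .env (cat .env) is left alone on purpose; only redirection into the file is blocked. In a real hook the 2 maps to exit 2 and the 1 to a systemMessage followed by exit 0.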

The whole thing is portable: a shell script and a JSON file, with no daemon and nothing to install. It rides along in your plugin and works on Linux, macOS, WSL and Git Bash on Windows. There's a tradeoff, of course. It only catches patterns you thought to write ahead of time - a smart filter, with all the holes any filter has.

A fuller version of this script, with the force-push gate and the full pattern list, lives in the reference repo: github.com/Nagell/claude-marketplace.

Approach B: Native sandbox - built-in filesystem and network isolation

Claude Code ships a sandbox you turn on with /sandbox. Instead of inspecting commands, it constrains what any command is allowed to do.

Two layers of isolation:

Filesystem - writes are locked to the current working directory by default. A command can read widely but can't write outside the project unless you allow it. You tune this with allowWrite and denyWrite rules.
Network - outbound traffic goes through a local proxy with an allowlist. Approved domains pass, everything else is blocked. No surprise calls to a random host.

The payoff is fewer interruptions. Because sandboxed commands can't escape the box, Claude Code stops asking you to approve them one by one. In practice that's a large cut in permission prompts during an autonomous session, which is the whole reason you'd run one.

Platform support is the catch:

macOS - uses the OS Seatbelt sandbox, works out of the box.
Linux / WSL2 - uses bubblewrap, which you install yourself (plus socat; Ubuntu 24.04+ needs an AppArmor tweak).
WSL1 - not supported.
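Before turning the sandbox on under Linux or WSL2, it's worth confirming both dependencies are actually on your PATH. A minimal sketch: missing_tools is a hypothetical helper, bwrap is the binary the bubblewrap package installs, and the apt-get line in the comment assumes a Debian or Ubuntu system.

```shell
#!/bin/bash
# Hypothetical helper: print the name of each given tool that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Usage (not run automatically):
#   missing=$(missing_tools bwrap socat)
#   [ -z "$missing" ] || echo "Install first: $missing"
# On Debian/Ubuntu (assumption) the packages are bubblewrap and socat:
#   sudo apt-get install bubblewrap socat
```

An empty result means both tools were found. It says nothing about the AppArmor tweak on Ubuntu 24.04+, which still has to be handled separately.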

Other caveats: Docker and watchman don't play well inside the sandbox, and there's an escape hatch. When a command fails because of sandbox restrictions, Claude can retry it with dangerouslyDisableSandbox. That retry runs outside the sandbox but goes back through the normal permission flow, so it still asks you. It's not a silent bypass. To remove the hatch entirely, set allowUnsandboxedCommands: false (Strict sandbox mode in the /sandbox panel) and the parameter is ignored.

Source: code.claude.com/docs/en/sandboxing

Approach C: Docker microVM - strongest isolation, team-friendly

When hooks and the native sandbox aren't enough, you go up a level: run Claude inside a Docker microVM.

This isn't a container sharing your host kernel. Each sandbox gets a full microVM with its own kernel and a private Docker daemon. Your dev environment is cloned in, Claude works inside it, and nothing it does touches the host. Worst case, you throw the VM away.

This is the right call for two situations:

Teams - everyone gets the same isolated environment, and a mistake in one is contained to one.
Long-running autonomous agents - if Claude is going to run for hours unattended, the blast radius of any single command should be a disposable VM, not your laptop.

The cost is setup and overhead. You're running a VM, which is heavier than a hook or the native sandbox. For a quick session it's overkill. But if you're leaving an agent running unattended overnight, it's the only one of the four that lets you actually sleep.

Official guide: Docker Sandboxes for Claude Code

Approach D: Community collections - standing on shoulders

You don't have to write every pattern yourself. Several maintained hook collections exist, and they've already hit the edge cases you haven't:

karanb192/claude-code-hooks - blocks the classics: rm -rf ~, fork bombs, curl | sh.
CodyLunders/claude-code-hooks-library - 55 hooks, 12 security-specific, including AWS credential scanning (AKIA[0-9A-Z]{16}) and an interactive installer.
disler/claude-code-hooks-mastery - educational, walks through multiple security patterns so you learn the mechanism, not just the config.
Boucle - a standalone safety hook framework for Claude Code: framework.boucle.sh.

There's also a DEV Community writeup of 10 hooks pulled from 108 hours of autonomous operation, which is exactly the kind of battle-tested list worth reading before you invent your own.

So skim one of these, lift the patterns that fit your setup, and drop them into your own guard script. You keep the portability of owning the file, and you skip relearning the lessons someone already paid for.
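Lifting a pattern can be as small as this. The sketch below takes the AWS access key ID pattern mentioned above (AKIA[0-9A-Z]{16}) and wraps it in a check. contains_aws_key_id is a hypothetical helper name, not something taken from any of the listed collections, and the boundary characters around the pattern are my own addition to cut down on matches inside longer strings.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the text contains something shaped like
# an AWS access key ID (AKIA followed by 16 uppercase letters or digits).
contains_aws_key_id() {
  printf '%s\n' "$1" | grep -qE '(^|[^A-Z0-9])AKIA[0-9A-Z]{16}([^A-Z0-9]|$)'
}
```

In the guard script it would sit next to the other block rules, for example contains_aws_key_id "$COMMAND" && block "possible AWS access key ID in command". It only checks the shape of the string, so treat a hit as a prompt to look rather than proof of a leak.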

Comparison

	Custom hooks	Native sandbox	Docker microVM	Community hooks
Isolation level	Pattern-based filter	Filesystem + network	Full VM, own kernel	Pattern-based filter
Setup effort	Low (a script + JSON)	Low-medium (install on Linux)	High (VM tooling)	Low (install a collection)
Permission friction	Low	Very low (auto-allow)	Very low	Low
Portability	Excellent (just files)	OS-dependent	Needs Docker	Excellent
Platforms	Linux, macOS, Windows (Git Bash / WSL)	macOS and Linux native, Windows via WSL2 (no WSL1)	Anywhere Docker runs	Linux, macOS, Windows
Catches the unknown	No, only known patterns	Yes, by construction	Yes, by construction	No, only known patterns

The split that matters is the last row. A hook only catches what you already wrote a pattern for, whereas a sandbox constrains what any command can touch in the first place, so it stops the things you never saw coming too. That's why the strong setups stack them - a hook on top for fast, project-specific rules you can read at a glance, and a sandbox underneath as the backstop for everything else.

What to actually use

Most developers, most of the time: custom hooks. Lightweight, portable and you understand exactly what they do. Start here.
Frequent autonomous sessions: add the native sandbox on top. The interruption cut alone pays for the Linux setup.
Teams or overnight agents: Docker microVM. When the blast radius on your host has to be as close to zero as you can get, nothing else on this list gets you there.

Start with the hook. Here's one to copy into a plugin's hooks/ directory and wire up through the hooks.json shown earlier:

#!/bin/bash
# safety-guards.sh - PreToolUse hook for Bash
# Exit 0 = allow, exit 2 = block (stderr shown to Claude)
set -euo pipefail

INPUT=$(cat)
# POSIX [[:space:]] rather than \s, so this also works with BSD/macOS sed and grep.
# Known limit: extraction stops at the first double quote in the command.
COMMAND=$(printf '%s' "$INPUT" | sed -n 's/.*"command"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -1)
[ -z "$COMMAND" ] && exit 0

block() { echo "BLOCKED: $1" >&2; exit 2; }
# True if the command matches the given extended regex
matches() { printf '%s\n' "$COMMAND" | grep -qE "$1"; }

# Destructive filesystem; (...|$) also catches a target at the end of the command
matches 'rm[[:space:]]+-[a-zA-Z]*rf[a-zA-Z]*[[:space:]]' && \
  matches '(~(/|[[:space:]]|$)|\$HOME|/home/[^/[:space:]]+(/|[[:space:]]|$)|[[:space:]]/([[:space:]]|$))' && \
  block "rm -rf targeting home or root"

# Hardcoded secrets
matches '(API_KEY|SECRET|TOKEN|PASSWORD)=['\''"]?[a-zA-Z0-9_-]{16,}' && \
  block "possible hardcoded secret - use env vars"

# Unsafe force push; ([^-]|$) also catches --force as the last argument.
# The short form -f is not covered here.
matches 'git[[:space:]]+push[[:space:]]+.*--force([^-]|$)' && \
  ! matches 'force-with-lease' && \
  block "git push --force without --force-with-lease"

# Warn (allow)
WARNINGS=""
matches 'git[[:space:]]+reset[[:space:]]+--hard' && WARNINGS="git reset --hard destroys uncommitted changes"
[ -n "$WARNINGS" ] && echo "{\"systemMessage\": \"SAFETY WARNING: $WARNINGS\"}"

exit 0

Ship it in your plugin and Claude Code runs it before every Bash call. (The file needs its executable bit, which you set once and commit; git preserves it, so nobody runs chmod after installing.) Add patterns as you find the ones that bite.

You can have Claude write these rules for you. If you do, ask it to test them too: have it feed the hook fake commands that should trip each pattern (a dummy rm -rf ~, a planted secret, a git push --force) and confirm they actually block. While it's there, ask it to check both directions: that ordinary commands still pass, and that near-variants of the dangerous ones (a different flag order, a trailing slash) don't slip past. It's the fastest way to catch a pattern that's too greedy or too narrow before it catches you.
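That testing loop can be sketched as a tiny harness. This is a minimal sketch under stated assumptions: run_case is a hypothetical helper, the payload is a cut-down stand-in for the real hook input (real payloads carry more fields), and the command string must not contain double quotes or backslashes because nothing here escapes them.

```shell
#!/bin/bash
# Hypothetical test helper for a PreToolUse guard script.
# Usage: run_case <hook-path> <expected-exit-code> <command-string>
run_case() {
  local hook="$1" expected="$2" cmd="$3" payload actual
  # Minimal stand-in for the hook input; no JSON escaping is done on cmd.
  payload=$(printf '{"tool_name":"Bash","tool_input":{"command":"%s"}}' "$cmd")
  actual=0
  # Run the hook with the payload on stdin and capture its exit code.
  printf '%s' "$payload" | bash "$hook" >/dev/null 2>&1 || actual=$?
  if [ "$actual" -eq "$expected" ]; then
    echo "PASS ($actual): $cmd"
  else
    echo "FAIL (got $actual, want $expected): $cmd"
    return 1
  fi
}

# Example cases (paths are placeholders for your own plugin layout):
#   run_case ./hooks/safety-guards.sh 2 'rm -rf ~'
#   run_case ./hooks/safety-guards.sh 2 'git push origin main --force'
#   run_case ./hooks/safety-guards.sh 0 'ls -la'
```

Keep one case per pattern plus a few ordinary commands that must pass. The allow cases matter as much as the block cases, since a hook that blocks ls gets ripped out just as fast as one that misses rm -rf.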

That's a safety net you own, in a file you can read in one sitting.

Get started

Use the starter template to get a clean slate with everything pre-wired: github.com/Nagell/claude-marketplace-template.

The template gives you one empty plugin, a CLAUDE.md with sensible defaults, and a GitHub Actions workflow that handles versioning and releases automatically. Hit "Use this template" on GitHub and you're ready to add your first plugin.

If you want to see a fuller working example with multiple plugins and real hook scripts, github.com/Nagell/claude-marketplace is the reference repo used throughout this series.