brian austin

Posted on Apr 1

What the Claude Code source leak reveals about how it actually works (and what to do with that)

#claudecode #ai #security #programming

Yesterday, a source map file accidentally left in the Claude Code NPM package exposed what appears to be Anthropic's internal implementation. The HN thread hit 900+ points in hours. Developers are understandably fascinated.

Here's what the leak actually reveals — and more importantly, what it means for how you use Claude Code today.

What was in the leak

Researcher Alex Kim's breakdown identified several surprising internals:

1. Fake tool responses
Claude Code uses synthetic/stub tool call responses in some contexts. This isn't as sinister as it sounds — it's a common technique for keeping the model grounded when real tool execution would be circular or undefined. But it explains some of the "why did it pretend to run that command" behavior you may have seen.

2. Frustration regexes
The codebase apparently contains regex patterns that detect user frustration signals — things like repeated failed attempts, certain phrases, escalating tone. Claude Code is literally watching for signs you're getting annoyed.

3. Undercover mode
There are references to a mode where Claude Code operates without surfacing its reasoning, essentially silent execution. This resembles what you see when verbose output is off, but it suggests the behavior is more deeply baked in than a simple flag.

4. System prompt injection protection
Evidence of internal checks to prevent prompt injection through tool outputs — a known attack vector for coding agents.
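
To make the last item concrete, here is a minimal Python sketch of one way a coding agent could screen tool output before handing it back to the model. Everything in it is an assumption for illustration: the patterns, the function names, and the `<untrusted_tool_output>` wrapper are hypothetical and are not taken from the leaked source.

```python
import re

# Hypothetical heuristics for illustration only; NOT from the leaked source.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"</?system>", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
]


def flag_tool_output(output: str) -> list[str]:
    """Return the source of every pattern that matches, so a caller can log or block."""
    return [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(output)]


def wrap_untrusted(output: str) -> str:
    """Mark tool output as data rather than instructions before it reaches the model."""
    return "<untrusted_tool_output>\n" + output + "\n</untrusted_tool_output>"
```

Pattern matching like this is easy to evade, so treat it as a tripwire rather than a defense; the deny rules later in this post limit what an injected instruction can actually do.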

What this means practically

None of this changes how you should use Claude Code. But it does explain some behaviors:

Why it sometimes "pretends" to have run something:
The fake tool responses are a scaffold. If you're seeing Claude claim it ran a command it didn't, add explicit verification to your CLAUDE.md:

# Verification rules
- Always show actual command output, never summarize
- If a command fails, show the exact error — don't paraphrase
- Confirm file writes by showing the written content

Why it softens after you push back:
If the frustration detection works the way the leak suggests, Claude Code changes its behavior when you seem stuck. That is useful to know: if you're getting soft, hedging responses, it may have detected frustration and switched to a more cautious mode. Try starting a fresh session.
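
For a sense of what regex-based frustration detection could look like, here is a minimal Python sketch. The patterns, function names, and threshold are all invented for illustration; none of them come from the leaked source.

```python
import re

# Hypothetical patterns for illustration only; NOT from the leaked source.
FRUSTRATION_PATTERNS = [
    # "still broken", "failing again", and similar
    re.compile(r"\b(still|again)\b.*\b(broken|failing|wrong|not working)\b", re.IGNORECASE),
    # "why won't ...", "why doesn't ..."
    re.compile(r"\bwhy (won't|doesn't|isn't|can't)\b", re.IGNORECASE),
    # short exasperated interjections
    re.compile(r"\b(ugh|wtf|come on)\b", re.IGNORECASE),
    # repeated exclamation marks
    re.compile(r"!{2,}"),
]


def frustration_score(message: str) -> int:
    """Count how many of the hypothetical patterns match the message."""
    return sum(1 for pattern in FRUSTRATION_PATTERNS if pattern.search(message))


def looks_frustrated(message: str, threshold: int = 1) -> bool:
    """Treat the message as frustrated once the score reaches the threshold."""
    return frustration_score(message) >= threshold
```

The point is that signals this crude can trip on ordinary phrasing, which is one more reason a fresh session is a cheap thing to try.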

Why verbose mode matters:
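If undercover mode works as described, verbose output isn't optional if you want to understand what's actually happening. Pass --verbose on the command line, or set it in .claude/settings.json. The exact settings key may differ between versions, so check it against the settings reference for yours:

{
  "output": {
    "verbose": true
  }
}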

The deeper issue: you're running code you can't read

The leak happened because of a source map in an NPM package. This is a reminder that Claude Code is a closed-source tool running on your filesystem with broad permissions.

This doesn't mean you shouldn't use it. But it does mean you should control what it can touch:

In .claude/settings.json (plain JSON, so no comments inside the file):
{
  "permissions": {
    "deny": [
      "Bash(git push*)",
      "Bash(git reset*)",
      "Bash(rm -rf*)",
      "Bash(curl*)",
      "Bash(wget*)",
      "Bash(npm publish*)"
    ]
  }
}
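
Deny rules only help if they are actually present in every project. As a rough sketch, a small Python check like the following could run in CI or a pre-commit hook. It assumes the settings file is plain JSON with the `permissions.deny` layout shown above; the function name and the required-rule list are my own choices, not part of Claude Code.

```python
import json
from pathlib import Path

# The same six rules as the snippet above; adjust to taste.
REQUIRED_DENY = [
    "Bash(git push*)",
    "Bash(git reset*)",
    "Bash(rm -rf*)",
    "Bash(curl*)",
    "Bash(wget*)",
    "Bash(npm publish*)",
]


def missing_deny_rules(settings_text: str) -> list[str]:
    """Return the required deny rules that the settings JSON does not list."""
    settings = json.loads(settings_text)
    deny = settings.get("permissions", {}).get("deny", [])
    return [rule for rule in REQUIRED_DENY if rule not in deny]


if __name__ == "__main__":
    path = Path(".claude/settings.json")
    if path.exists():
        missing = missing_deny_rules(path.read_text(encoding="utf-8"))
        print("missing deny rules:", missing or "none")
    else:
        print(f"{path} not found")
```

The check is an exact string comparison, so it confirms the rules are listed but says nothing about how Claude Code interprets each pattern.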

Also: use a dedicated API key for Claude Code with spend limits set, separate from any production keys.

The API vs. the product

Here's something the leak makes clearer: the Claude Code client (the tool you're running) is complex, opaque, and evolving. The underlying API is comparatively simple and transparent.

If the complexity concerns you — and for production use cases it probably should — there's a straightforward alternative: use the Claude API directly with a flat-rate proxy, so you control exactly what you send and get back.

I've been using SimplyLouie for this: it's a $2/month Claude API endpoint you can swap in via ANTHROPIC_BASE_URL. No usage tracking beyond billing, no frustration detection, no undercover mode.

export ANTHROPIC_BASE_URL=https://api.simplylouie.com
export ANTHROPIC_API_KEY=your_key_here

# Call the Messages API directly: same request shape, flat rate
curl "$ANTHROPIC_BASE_URL/v1/messages" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-5",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "explain the claude code source leak"}]
  }'
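
The same request can be built from Python with only the standard library, which keeps every byte you send visible. This is a sketch that mirrors the curl call above; the function names are mine, and the URL path, headers, and model name are copied from that command rather than checked against any particular provider.

```python
import json
import urllib.request


def build_messages_request(prompt: str, base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST to <base_url>/v1/messages."""
    body = {
        "model": "claude-opus-4-5",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To use it, read ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY from os.environ, pass them to build_messages_request, and hand the result to send. Keeping the build step separate lets you inspect or log the exact payload before anything leaves your machine.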

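For Claude Code specifically, you can also set the base URL in your environment and Claude Code will use it automatically. Note that this only changes where requests are sent; the client-side behavior described above stays the same: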

export ANTHROPIC_BASE_URL=https://api.simplylouie.com
claude  # uses SimplyLouie endpoint, flat $2/month

The CLAUDE.md hardening checklist

Given what the leak reveals, here's what I'd add to every project's CLAUDE.md:

# Trust and verification rules

## You must always:
- Show actual terminal output verbatim (no paraphrasing)
- Confirm file changes by diffing before and after
- Ask before any destructive operation (delete, reset, publish)
- Report uncertainty explicitly: "I'm not sure if X worked"

## You must never:
- Claim a command ran if you didn't show its output
- Summarize an error — show the full stack trace
- Make assumptions about state — verify with ls, cat, git status
- Proceed after a failure without explicit confirmation

Final thought

The Claude Code leak is interesting because it suggests the tool is more complex than it appears. That's not unusual for developer tools. But for something running on your codebase with filesystem access, complexity deserves scrutiny.

The right response isn't to stop using Claude Code — it's to use it with eyes open: verbose mode on, deny rules set, a fresh session when things feel off, and explicit verification in your CLAUDE.md.

And if you want to go deeper and work directly with the Claude API without the client-layer complexity: simplylouie.com/developers

Have you looked at the source leak? What surprised you most? Share below.

DEV Community