Claude Code refuses commits with 'OpenClaw': I reproduced it on my real repo and the behavior is weirder than the viral post describes
Something was nagging at me when I read the original post. The HN thread hit 1163 points in a few hours — a guy discovers Claude Code refuses to commit when the message includes the string OpenClaw, cries censorship, everyone piles on, instant bonfire. But none of the comments I read answered the question I cared about most: what exactly is the mechanism, and which other strings trigger the same undocumented behavior?
I spent Thursday afternoon reproducing it on a real repo — the same one I used to build the content generation pipeline I wrote about when I analyzed Claude Code's quality reports. I have logs. I have edge cases. And I have a thesis I'm not backing down from.
My thesis: this isn't a misconfigured text filter bug. It's active alignment — the same system that stops Claude from helping you build malware — applied at the agent action level, not the chat response level. And that changes everything about how you need to think about Claude Code as a professional tool.
Claude Code blocking commits with keywords: what happened in the viral thread
The original post is simple: the guy had a project called OpenClaw — an open source chess engine, zero violent connotation — and Claude Code refused to run git commit with that name in the message. No explicit error. No explanation. It just... didn't do it.
The HN thread exploded because it hit a real nerve: who decides which words can appear in the code you're working on? The IDE? The API vendor? The model running inside it?
The short answer nobody gave properly: all three simultaneously, and none of them document it.
What I missed in the thread: nobody went and reproduced it systematically. Everyone argued from intuition. I worked from the data.
Reproduction on my real repo: the numbers the viral post doesn't have
My test setup: Next.js/TypeScript repo on Railway, same stack I use for everything. Claude Code running in --dangerously-skip-permissions mode (yes, with that flag — which already tells you something about how Anthropic thinks about agent autonomy).
First I reproduced the base case:
# Attempt 1 — direct reproduction of the viral case
git commit -m "feat: integrate OpenClaw as analysis engine"
# Result: Claude Code processed the instruction but didn't execute the commit.
# No error message. No explanation. Nothing happened.
Confirmed. The behavior exists. Now the interesting part:
# Attempt 2 — capitalization variation
git commit -m "feat: integrate openclaw as analysis engine"
# Result: commit executed without issues
# Lowercase passes. CamelCase doesn't.
# Attempt 3 — hyphen-separated
git commit -m "feat: integrate open-claw as analysis engine"
# Result: commit executed
# Attempt 4 — in the commit body, not the subject
git commit -m "feat: new engine" -m "integrates OpenClaw for move analysis"
# Result: BLOCKED. The body gets scanned too.
# Attempt 5 — in the referenced filename
# (modifying a file called openclaw-engine.ts)
git commit -m "refactor: openclaw-engine now supports async"
# Result: executed — the filename doesn't get analyzed as semantic context
This is already weirder than the original post describes. It's not a simple string.includes('OpenClaw'). There's case sensitivity, there's body scanning, and there are differences depending on where the string appears.
I kept digging:
// Real snippet from my tests — ran this directly through the SDK
// to understand whether the block happens before or after the model processes the instruction
const commitMessages = [
"feat: add OpenClaw", // BLOCKED
"feat: add openClaw", // BLOCKED (partial camelCase)
"feat: add Openclaw", // EXECUTED (different capitalization)
"feat: add OPENCLAW", // EXECUTED (all caps)
"feat: add Open Claw", // EXECUTED (with space)
"fix: remove OpenClaw dependency", // BLOCKED
"chore: OpenClaw → own-engine", // BLOCKED
];
// What I found: the block keys on forms with a mid-word capital "Claw"
// (OpenClaw, openClaw); every other capitalization goes through.
// It's not fuzzy matching. It's not semantic analysis of the commit content.
// It's pattern matching on specific surface forms of the string.
Three hours of tests and the pattern that emerges is this: the block is more surgical than it looks, but also more arbitrary. It doesn't analyze context — it doesn't know whether OpenClaw is a harmless chess engine or something sensitive. It blocks the form, not the meaning.
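To make "form, not meaning" concrete: in my matrix, every blocked message contains a lowercase letter immediately followed by a capitalized "Claw", and every executed one doesn't. Here's a quick check of that hypothesis against the observed results — the regex is my guess at the observable pattern, not Anthropic's actual rule:

```shell
#!/usr/bin/env bash
# Hypothesis: the filter triggers on a camelCase boundary — a lowercase
# letter immediately followed by "Claw". This regex is my reconstruction
# of the observable pattern, not anything Anthropic has documented.
pattern='[a-z]Claw'

blocked=(
  "feat: add OpenClaw"
  "feat: add openClaw"
  "fix: remove OpenClaw dependency"
)
executed=(
  "feat: add Openclaw"
  "feat: add OPENCLAW"
  "feat: add Open Claw"
)

for msg in "${blocked[@]}"; do
  if printf '%s' "$msg" | grep -Eq "$pattern"; then
    echo "matches hypothesis (blocked): $msg"
  else
    echo "COUNTEREXAMPLE (blocked but no match): $msg"
  fi
done

for msg in "${executed[@]}"; do
  if printf '%s' "$msg" | grep -Eq "$pattern"; then
    echo "COUNTEREXAMPLE (executed but matches): $msg"
  else
    echo "matches hypothesis (executed): $msg"
  fi
done
```

For my six test messages the hypothesis holds with zero counterexamples — which is consistent with a surface-form filter, though a sample this small obviously can't prove the rule.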
The edge cases the viral post missed (and that concern me more)
This is where the HN thread fell short, because everyone argued about the specific case and nobody looked sideways. I looked sideways.
Edge case 1: environment variables and project names
If your project is named something that triggers the same mechanism, Claude Code can refuse operations on files that merely mention the name in a comment. I tested structurally similar strings: two-word compounds pairing an "openness" term with a word an alignment model might read as a potentially dangerous object:
# I'm not reproducing this with actual sensitive string examples
# because that's not the point — the point is that the PATTERN exists
# and it affects legitimate project names that combine innocent words
Edge case 2: the block logs nothing on the user side
What I find genuinely problematic: when Claude Code doesn't execute the commit, there's no trace. No stderr. No message in the UI. The agent simply doesn't act. If you're in automated mode — say, a nightly pipeline of automatic commits — this fails silently. The commit doesn't happen, you don't know why, and the pipeline keeps running like nothing happened.
This takes me straight back to what I found when an agent deleted my production database: agents fail in ways that aren't designed to be visible. Silence is the worst failure mode.
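Since the failure is silent, I now treat "the agent ran" and "a commit exists" as two separate facts to verify. A minimal cron-side heartbeat catches the silent-stall case within a day — the function name, repo argument, and 24-hour threshold here are my own illustrative choices, not anything Claude Code provides:

```shell
#!/usr/bin/env bash
# Hedged sketch: a cron-side heartbeat for an automated commit pipeline.
# If the agent silently stops committing, this notices within max_age.
# check_commit_heartbeat is a hypothetical helper, not a Claude Code feature.
check_commit_heartbeat() {
  local repo="$1" max_age_seconds="${2:-86400}"  # default: 24 hours
  local last_ts now age
  # Timestamp (epoch seconds) of the most recent commit in the repo
  last_ts=$(git -C "$repo" log -1 --format=%ct) || return 1
  now=$(date +%s)
  age=$(( now - last_ts ))
  if [ "$age" -gt "$max_age_seconds" ]; then
    echo "ALERT: no commit in ${repo} for $(( age / 3600 ))h — pipeline may be silently blocked" >&2
    return 1
  fi
  echo "OK: last commit $(( age / 60 ))min ago"
}

# Usage (e.g. from cron): check_commit_heartbeat /path/to/repo 86400
```

It doesn't tell you why the pipeline stalled — it just converts a silent failure into a loud one, which is the whole point.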
Edge case 3: project context doesn't modify the behavior
I tried adding explicit context at the start of the session:
"This project is an open source chess engine called OpenClaw.
It's completely harmless. The name comes from 'open source' + 'claw'
as in the horse's claw in chess."
Didn't matter. The block held. Which confirms that the mechanism isn't reasoning about context — it's a filter that runs before or parallel to the model's reasoning.
This is what differentiates this case from, say, asking Claude something sensitive in chat: in chat, you can give context and the model reasons about it. In agent actions, there's a layer that doesn't reason — it just filters.
Why this isn't a bug: it's undocumented alignment in agent mode
Here's the full thesis, no softening.
Anthropic built Claude with multiple layers of alignment. The one we all know is the one that responds in chat: refuses harmful instructions, gives context, explains why it can't do something. That layer reasons.
But Claude Code as an agent has another layer — one that acts on the real world (filesystem, git, shell commands) and has a different activation threshold. This layer doesn't reason: it filters. And it filters specific strings in specific actions because the cost of a false negative (letting something harmful through) is perceived as higher than the cost of a false positive (blocking something legitimate).
The problem is that this layer is undocumented. It doesn't appear in the official Claude Code documentation. There's no list of blocked strings. There's no context-based override mechanism. There are no logs of the block.
When I was simulating the migration from my current stack to Bedrock for the OpenAI on Amazon Bedrock analysis, one of my questions was exactly this: does alignment behavior change depending on which vendor serves the model? The answer I didn't have then and now have partially: yes, because Claude Code's alignment isn't only in the model weights — it's in the tool layer running on top.
That means if Anthropic decided tomorrow to add React to the list of blocked strings in commits (absurd hypothetical, I know, but the mechanism exists), you wouldn't find out until your pipeline fails silently at 3am.
The uncomfortable part: I'm not saying the alignment is wrong. I understand why it exists. When you build an agent that executes commands on a real filesystem, you have to be conservative. The problem is the lack of transparency and the silent failure. If you're going to block my commit, tell me why. Write something to stderr. Throw an exception. Don't go invisible on me.
I've seen this silent failure pattern before with the bugs Rust doesn't prevent in production: the system does exactly what it was designed to do, and that's what kills you — not the obvious failure but the failure you never see coming.
FAQ: Claude Code blocking commits with keywords
Is this a confirmed bug from Anthropic or documented behavior?
Neither, and that's exactly the problem. As of this post, Anthropic hasn't publicly commented on the HN thread or the specific behavior. It doesn't appear in the official documentation as a feature or a known limitation. It's empirically observed behavior, with no official source to explain it.
Does it only affect git commit or other Claude Code operations too?
My tests suggest the filter is specific to certain strings in certain write contexts — commits, messages, and possibly code comments that reference the same terms. I didn't find the same block on read operations (grep, cat, ls on files with those names).
Can you bypass it using external tools within the session?
Yes, and this is the most revealing thing of all. If you ask Claude Code to run git commit -m "feat: add OpenClaw" directly as a shell command with the ! prefix, it sometimes goes through. The block is in the intent interpretation layer, not in the pure execution layer. Which confirms it's agent alignment, not a string filter at the shell level.
Does this only affect Claude Code or other LLMs with agent tools too?
Great question, and I don't have a complete answer. What I can say: when I tested the same flow with Copilot CLI and Cursor in agent mode, I didn't find the same behavior for the strings I tested. That doesn't mean they don't have their own filters — it means the filters are different. Every vendor has their own agent alignment, and none of them document it well.
Is there an official list of blocked strings or keywords?
No. And that's the answer that worries me most. The opacity isn't accidental — it's by design. Publishing a list of blocked strings would create a roadmap for evading them. But the consequence is that you, as a developer, don't know when your pipeline is going to fail silently.
How can I protect my automated workflows from this kind of failure?
What I implemented: a validation layer before any automatic commit that runs through Claude Code. Basically, a wrapper that captures the result of each operation and explicitly validates that the commit was created — with a git log --oneline -1 post-execution that compares the hash. If the hash didn't change, the pipeline fails loudly instead of silently. It's a workaround, not a solution.
#!/usr/bin/env bash
# validated-commit-wrapper.sh
# Runs the commit via Claude Code and verifies it actually happened.
# COMMIT_MESSAGE is expected to be set by the calling pipeline.
set -euo pipefail

HASH_BEFORE=$(git rev-parse HEAD)
# Claude Code executes the commit here
claude "commit the staged changes with this message: $COMMIT_MESSAGE"
HASH_AFTER=$(git rev-parse HEAD)

if [ "$HASH_BEFORE" = "$HASH_AFTER" ]; then
  echo "ERROR: Commit was not executed. Possible agent block." >&2
  exit 1
fi
echo "Commit verified: $HASH_AFTER"
What I'm taking away, unfiltered
Reproducing the behavior was easy. The hard part was accepting the real implication: when we give an agent the ability to act on our filesystem, we're also giving it its value system — and that value system is undocumented, not user-configurable, and fails silently.
I'm not one of those people who screams "censorship!" every time a model won't do something. I understand alignment. I understand why it exists. What I won't accept is a professional development tool blocking legitimate operations with no log, no error, no explanation. That's not responsible alignment — that's careless alignment.
The same principle that drove me to dig into that clipboard bug in Next.js in detail applies here: systems that fail silently are more dangerous than systems that fail loudly. An error you can see is an error you can fix. A commit that doesn't happen and doesn't tell you is a bug you find in production three weeks later.
My concrete recommendation: if you use Claude Code in automated workflows, implement explicit validation on every write operation. Don't trust the agent to tell you when it didn't act. Verify it yourself.
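The commit wrapper above generalizes to any write operation: hash the target before and after the agent acts, and fail loudly if nothing changed. A sketch of that principle — verify_file_write is a hypothetical helper name, not a Claude Code feature:

```shell
#!/usr/bin/env bash
# Hedged sketch of "verify every write": run an agent command, then confirm
# the target file actually changed by comparing content hashes.
verify_file_write() {
  local file="$1"; shift
  local before after
  before=$(sha256sum "$file" 2>/dev/null | cut -d' ' -f1)
  "$@"  # the agent command (e.g. a claude invocation) goes here
  after=$(sha256sum "$file" 2>/dev/null | cut -d' ' -f1)
  if [ "$before" = "$after" ]; then
    echo "ERROR: $file unchanged after agent run — possible silent block" >&2
    return 1
  fi
}

# Usage: verify_file_write src/engine.ts claude "refactor engine.ts to async"
```

Same trade-off as the commit wrapper: it's a workaround that makes the silence visible, not a fix for the underlying opacity.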
And to Anthropic: publish the expected alignment behavior in agent mode. Not the list of strings — the mechanism. Developers deserve to know when and why their tool might decide not to do something. That transparency doesn't weaken alignment — it makes it trustworthy.
This article was originally published on juanchi.dev