The problem nobody talks about
AI coding agents lie. Not on purpose - they hallucinate.
Claude Code tells you "All tests pass!" when tests were never executed. It says "I updated the file" when the content is byte-for-byte identical. It sneaks in git commit --no-verify to skip the hooks that would catch its mistakes.
This isn't rare. It's a documented bug and it hits every serious Claude Code user. System prompts don't help - the agent just ignores them when it "decides" something is done.
I spent a couple of weeks building a fix.
Don't ask the agent to be honest - verify it
That's the whole idea. Claude Code has a hooks API - before and after every tool call, it can run your scripts. Those scripts inspect what actually happened and block the agent if the results don't match the claims.
Agent claims: "I updated utils.ts"
|
[PostToolUse hook]
|
Compare SHA256 before/after -> IDENTICAL
|
BLOCKED: "File was not actually modified. Checksum unchanged."
Can't argue with a checksum. This isn't a prompt the agent can ignore. It's a gate.
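To make the checksum gate concrete, here is a minimal Python sketch of the before/after comparison. It is not TruthGuard's actual source: the function names `sha256_of` and `check_edit` are hypothetical, and it assumes the "before" digest was recorded by an earlier hook and handed to the check as a string.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check_edit(before: str, path: Path) -> Tuple[bool, str]:
    """Compare a digest recorded before the edit with the file's current digest.

    Returns (ok, message). ok is False when the content is byte-for-byte
    identical, i.e. the claimed edit never happened.
    """
    after = sha256_of(path)
    if before == after:
        return False, "File was not actually modified. Checksum unchanged."
    return True, "File content changed."
```

The point of comparing digests rather than timestamps is that a file can be rewritten with identical content: the modification time moves, the checksum doesn't.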
Six hooks, zero fluff
| Hook | What it catches |
|---|---|
| Dangerous command blocker | --no-verify, --force push; warns on reset --hard, clean -f |
| Pre-commit test runner | Auto-detects your framework, runs tests before every commit |
| File checksum recorder | Saves the file's SHA256 before each edit |
| Exit code verifier | Command failed (exit 1) but agent claims success |
| Phantom edit detector | File unchanged after a claimed "edit" |
| Commit verification reminder | Makes the agent prove the fix works before claiming "done" |
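As an illustration of the first row, a dangerous command blocker could look roughly like the sketch below. The rules mirror the table (block --no-verify and force pushes, warn on reset --hard and clean -f), but the function name `classify_command` and the verdict strings are assumptions, not TruthGuard's real interface. Treating `-f` as the short form of `--force` is my addition.

```python
import shlex
from typing import Tuple


def classify_command(command: str) -> Tuple[str, str]:
    """Return (verdict, reason), where verdict is "block", "warn" or "allow"."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fall back to a plain whitespace split.
        tokens = command.split()
    if "git" not in tokens:
        return "allow", ""

    # Only the first git invocation is inspected, and global options that
    # take a value (such as -C <path>) are not handled in this sketch.
    args = tokens[tokens.index("git") + 1:]
    subcommand = next((a for a in args if not a.startswith("-")), "")
    flags = {a for a in args if a.startswith("-")}

    if subcommand == "commit" and "--no-verify" in flags:
        return "block", "git commit --no-verify skips the commit hooks"
    if subcommand == "push" and flags & {"--force", "-f"}:
        return "block", "git push --force can overwrite remote history"
    if subcommand == "reset" and "--hard" in flags:
        return "warn", "git reset --hard discards uncommitted changes"
    if subcommand == "clean" and any(
        f == "--force" or (not f.startswith("--") and "f" in f) for f in flags
    ):
        return "warn", "git clean -f deletes untracked files"
    return "allow", ""
```

For example, `classify_command("git push --force origin main")` yields a "block" verdict, while a plain `git commit -m 'fix'` is allowed through.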
Two days of real use
I ran TruthGuard on a production Flutter project:
- 5 commits blocked - agent kept trying to commit with failing tests
- 3 dangerous commands caught - 2x git push --force, 1x git commit --no-verify
- 0 false positives
The pre-commit test hook alone stopped me from shipping broken code five times in two days. Five times.
Pre-commit testing is the killer feature
When Claude runs git commit, TruthGuard intercepts it. It detects your project type, runs the matching test command, and blocks the commit if tests fail. Simple as that.
# Auto-detection:
# pubspec.yaml -> flutter test
# package.json -> npm test
# Cargo.toml -> cargo test
# go.mod -> go test ./...
# pyproject.toml -> python -m pytest
Override if you want:
# .truthguard.yml
test_command: "npm run test:unit"
skip_on_no_tests: false
The subtler problem: wrong fixes
After building the basic hooks, I ran into something trickier. Claude makes real changes, tests pass, but the fix doesn't actually solve the original problem. It genuinely thinks it's done.
So I added a post-commit reminder. After every successful commit:
"You just committed code. STOP and verify: did you actually confirm the fix works?"
A nudge, basically. But it makes the agent pause instead of rushing to "Done."
Install
npx truthguard install
cd your-project
npx truthguard init
Copies scripts to ~/.truthguard/, adds hooks to .claude/settings.json. Restart Claude Code and that's it.
Homebrew works too:
brew tap spyrae/truthguard && brew install truthguard
Agent-agnostic
Scripts read JSON from stdin, write JSON to stdout. Same scripts power both Claude Code and Gemini CLI. Supporting another agent means writing a config file, not rewriting hooks.
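To show the JSON-in, JSON-out shape, here is a rough Python sketch of a hook that does the exit code check. Every field name in it (`exit_code`, `decision`, `reason`) is hypothetical and chosen for illustration; the real payloads depend on the agent and on TruthGuard's adapter config, so treat this as the pattern rather than the schema.

```python
import json
from typing import Any, Dict, TextIO


def decide(event: Dict[str, Any]) -> Dict[str, Any]:
    """Block when the reported command exit code is non-zero.

    "exit_code", "decision" and "reason" are assumed field names.
    """
    exit_code = event.get("exit_code")
    if isinstance(exit_code, int) and exit_code != 0:
        return {
            "decision": "block",
            "reason": f"Command failed with exit code {exit_code}.",
        }
    return {"decision": "allow"}


def run_hook(stdin: TextIO, stdout: TextIO) -> None:
    """Read one JSON event from stdin and write one JSON verdict to stdout."""
    event = json.load(stdin)
    json.dump(decide(event), stdout)
    stdout.write("\n")
```

Used as a script, this would end with a call to `run_hook(sys.stdin, sys.stdout)`. Keeping the decision logic in a pure function (`decide`) is what makes the agent-agnostic part cheap: only the thin layer that maps each agent's payload onto the event dictionary has to change.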
What's next
This is the free local-only version. No backend, no telemetry, everything on your machine.
Some ideas I'm thinking about:
- A second LLM that checks whether the diff actually solves the described problem
- Team dashboard with honesty stats
- VS Code extension for Cursor and Copilot users
Links
GitHub: github.com/spyrae/truthguard
npm: npmjs.com/package/truthguard
If your agent lies in ways I haven't covered - open an issue and I'll write a hook for it.