I Built a Lie Detector for AI Coding Agents

#claudecode #ai #productivity #opensource

The problem nobody talks about

AI coding agents lie. Not on purpose - they hallucinate.

Claude Code tells you "All tests pass!" when tests were never executed. It says "I updated the file" when the content is byte-for-byte identical. It sneaks in git commit --no-verify to skip the hooks that would catch its mistakes.

This isn't rare. It's a documented bug and it hits every serious Claude Code user. System prompts don't help - the agent just ignores them when it "decides" something is done.

I spent a couple weeks building a fix.

Don't ask the agent to be honest - verify it

That's the whole idea. Claude Code has a hooks API - before and after every tool call, it can run your scripts. Those scripts inspect what actually happened and block the agent if the results don't match the claims.

Agent claims: "I updated utils.ts"
    |
[PostToolUse hook]
    |
Compare SHA256 before/after -> IDENTICAL
    |
BLOCKED: "File was not actually modified. Checksum unchanged."

Can't argue with a checksum. This isn't a prompt the agent can ignore. It's a gate.

Six hooks, zero fluff

Hook	What it catches
Dangerous command blocker	`--no-verify`, `--force push`; warns on `reset --hard`, `clean -f`
Pre-commit test runner	Auto-detects your framework, runs tests before every commit
File checksum recorder	Saves SHA256 before file edit
Exit code verifier	Command failed (exit 1) but agent claims success
Phantom edit detector	File unchanged after a claimed "edit"
Commit verification reminder	Makes the agent prove the fix works before claiming "done"

Two days of real use

I ran TruthGuard on a production Flutter project:

5 commits blocked - agent kept trying to commit with failing tests
3 dangerous commands caught - 2x git push --force, 1x git commit --no-verify
0 false positives

The pre-commit test hook alone stopped me from shipping broken code five times in two days. Five times.

Pre-commit testing is the killer feature

When Claude runs git commit, TruthGuard intercepts it. Detects your project type, runs the right test command, and blocks the commit if tests fail. Simple as that.

# Auto-detection:
# pubspec.yaml     -> flutter test
# package.json     -> npm test
# Cargo.toml       -> cargo test
# go.mod           -> go test ./...
# pyproject.toml   -> python -m pytest

Override if you want:

# .truthguard.yml
test_command: "npm run test:unit"
skip_on_no_tests: false

The subtler problem: wrong fixes

After building the basic hooks, I ran into something trickier. Claude makes real changes, tests pass, but the fix doesn't actually solve the original problem. It genuinely thinks it's done.

So I added a post-commit reminder. After every successful commit:

"You just committed code. STOP and verify: did you actually confirm the fix works?"

A nudge, basically. But it makes the agent pause instead of rushing to "Done."

Install

npx truthguard install
cd your-project
npx truthguard init

Copies scripts to ~/.truthguard/, adds hooks to .claude/settings.json. Restart Claude Code and that's it.

Homebrew works too:

brew tap spyrae/truthguard && brew install truthguard

Agent-agnostic

Scripts read JSON from stdin, write JSON to stdout. Same scripts power both Claude Code and Gemini CLI. Supporting another agent means writing a config file, not rewriting hooks.