
Richard Ketelsen

How to catch AI hallucinations before they reach production

LLMs hallucinate. That's not news. What's underdiscussed is how that failure mode behaves in long working sessions: confident reconstruction that looks fluent, cites specifics, and feels right, until three sessions later, when something you had accepted as true turns out not to be.

This is week 5 of an 8-week deep dive on CRAFT for Cowork, a structured working environment for Claude. Its QA framework treats AI reliability as a measurable engineering problem.

The four gates

CRAFT's verification core is a reusable sub-routine — RCP-CWK-024 — that any recipe can call before reporting a result:

Gate 1: File-pointability
        Can the claim be traced to a specific file?
Gate 2: Read-vs-reconstructed
        Was the data read this session, or recalled from memory?
Gate 3: Lessons-Learned conflict
        Does the claim contradict a documented LL entry?
Gate 4: Untested assumption
        Is this verified or assumed?

A claim that fails any gate gets flagged — visibly to the user, not buried in the answer.
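At its core, the gate check is just four predicates over a claim record. Here's a minimal sketch of what that could look like in Python; the `Claim` fields, function name, and flag messages are my illustration, not CRAFT's actual schema:

```python
from dataclasses import dataclass, field

# Hypothetical claim record; field names are illustrative, not CRAFT's internals.
@dataclass
class Claim:
    text: str
    source_file: str | None = None    # Gate 1: file-pointability
    read_this_session: bool = False   # Gate 2: read vs. reconstructed
    conflicts_with_ll: bool = False   # Gate 3: Lessons-Learned conflict
    verified: bool = False            # Gate 4: untested assumption
    flags: list[str] = field(default_factory=list)

def run_gates(claim: Claim) -> Claim:
    """Flag a claim that fails any of the four gates."""
    if claim.source_file is None:
        claim.flags.append("GATE-1: no file to point to")
    if not claim.read_this_session:
        claim.flags.append("GATE-2: reconstructed from memory, not read this session")
    if claim.conflicts_with_ll:
        claim.flags.append("GATE-3: contradicts a Lessons-Learned entry")
    if not claim.verified:
        claim.flags.append("GATE-4: untested assumption")
    return claim

claim = run_gates(Claim(text="Config sets retries=3", source_file="config.yaml",
                        read_this_session=False, verified=True))
print(claim.flags)  # ['GATE-2: reconstructed from memory, not read this session']
```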

Confidence scoring with decay

Every claim also gets a 0-100 confidence score, graded against a source hierarchy:

  • 80-100 — Evidence read directly from files
  • 50-79 — Behavioral observation of tool output
  • 30-49 — Design intent inferred from documentation
  • 0-29 — Pure reasoning, no source

A 10-point penalty applies once the session passes 70% token usage: the late-session decay correction. The scoring isn't there to make you suspicious of the AI. It's there to give you a calibrated read on how much weight a specific claim deserves.
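To make that concrete, here's a rough sketch of the scoring scheme, using approximate midpoints of the bands listed above. The tier names and function shape are my assumptions, not CRAFT's implementation:

```python
# Source tiers mapped to rough midpoints of the score bands above (illustrative).
BASE_SCORES = {
    "file_evidence": 90,     # 80-100: evidence read directly from files
    "tool_observation": 65,  # 50-79: behavioral observation of tool output
    "doc_inference": 40,     # 30-49: design intent inferred from documentation
    "pure_reasoning": 15,    # 0-29: pure reasoning, no source
}

def confidence(source_tier: str, token_usage: float) -> int:
    """Score a claim, applying the late-session decay correction."""
    score = BASE_SCORES[source_tier]
    if token_usage > 0.70:   # past 70% of the session's token budget
        score -= 10          # flat 10-point late-session penalty
    return max(score, 0)

print(confidence("file_evidence", token_usage=0.82))   # 80
print(confidence("pure_reasoning", token_usage=0.82))  # 5
```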

Real receipts

The framework caught nine Week-1 content files in this campaign that described CRAFT as "open source". That's incorrect: it's dual-licensed, BSL 1.1 for the spec and proprietary for the content. The factual claim validator flagged the mismatch against the documented license language, and all nine were corrected before publication.

The cross-file audit recipe (RCP-CWK-036) runs every 5-10 sessions and has caught drift in tracking-file state tables roughly 40% of the time, drift that would otherwise propagate as silent ground truth into every subsequent session.
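A cross-file drift audit can be as simple as diffing the same state table as recorded in two places. This sketch is mine, not RCP-CWK-036's actual logic, and it assumes JSON tracking files at invented paths:

```python
import json

def load_state(path: str) -> dict:
    """Load a tracking-file state table (assumed JSON for this sketch)."""
    with open(path) as f:
        return json.load(f)

def audit_drift(primary: dict, mirror: dict) -> list[str]:
    """Report keys whose values disagree between two copies of the same table."""
    drift = []
    for key in primary.keys() | mirror.keys():
        if primary.get(key) != mirror.get(key):
            drift.append(f"{key}: {primary.get(key)!r} != {mirror.get(key)!r}")
    return drift

# Hypothetical paths; in practice these would be the project's tracking files.
issues = audit_drift(load_state("status/master.json"), load_state("status/weekly.json"))
for line in issues:
    print("DRIFT:", line)
```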

Try it

CRAFT for Cowork: free public beta on GitHub.

🔗 https://github.com/CRAFTFramework/craft-framework
🔗 License: https://craftframework.ai/craft-license/ (BSL 1.1 → Apache 2.0 on Jan 1 2029 for the spec; proprietary for content)

Last week: device switching across desktop and laptop. Next week: the project structure that makes verification possible.

Top comments (1)

Richard Ketelsen

I wrote this because I was tired of LLMs "confidently lying" to me during long working sessions, especially when those errors started propagating into my ground-truth files.

The four-gate verification pattern has been a game-changer for my workflow, but I'm curious: how are you all handling AI reliability in your own projects? Do you use any specific sanity-check prompts or external validation layers, or are you mostly relying on manual code reviews?