thestack_ai

Posted on Mar 16

I built 6 cognitive firewalls because my AI kept confidently giving wrong answers

#ai #claudecode #productivity #opensource

I use Claude Code every day. It writes code fast. But it thinks poorly.

Not always. Not obviously. That's what makes it dangerous.

The moment I realized something was wrong

I asked Claude: "Is SQLite viable for an app with 1000 concurrent users?"

It said: "No, SQLite is not suitable for high-concurrency applications. Use PostgreSQL or MySQL instead for production workloads."

Confident. Clear. Completely wrong.

1000 concurrent users does not equal 1000 concurrent writes. In most web apps the bulk of requests are reads, and in WAL mode readers and the writer do not block each other. A typical web app at this scale generates about 30 concurrent write transactions. SQLite in WAL mode handles around 120 writes/sec. Expensify serves 10M+ users on SQLite.
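
The write-rate figure is easy to sanity-check on your own machine. Here is a minimal sketch using Python's standard sqlite3 module: it turns on WAL mode for a throwaway database file and times small single-row write transactions. The function name, table name, and row count are made up for illustration, and the result depends heavily on your disk and sync settings, so treat it as a way to measure rather than a benchmark.

```python
import os
import sqlite3
import tempfile
import time


def measure_write_rate(n_writes: int = 200) -> tuple[str, float]:
    """Return (journal_mode, writes_per_second) for a throwaway database."""
    # WAL needs a real file; an in-memory database cannot use it.
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "demo.db"))
        try:
            # The PRAGMA returns the journal mode that is now in effect.
            mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
            conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, body TEXT)")
            conn.commit()

            start = time.perf_counter()
            for i in range(n_writes):
                # One commit per insert, to mimic small independent write requests.
                conn.execute("INSERT INTO events (body) VALUES (?)", (f"event {i}",))
                conn.commit()
            elapsed = time.perf_counter() - start
        finally:
            conn.close()
    return mode, n_writes / elapsed


if __name__ == "__main__":
    mode, rate = measure_write_rate()
    print(f"journal_mode={mode}, about {rate:.0f} writes/sec on this machine")
```

Committing after every insert is deliberately pessimistic: batching several writes into one transaction would raise the rate, so this gives a rough lower bound for the workload being modeled.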

Claude didn't check any sources. It just gave the "safe" answer.

Six failures, not one

I started paying attention. I noticed six distinct patterns:

Premature closure: Rushes to execute ambiguous requests instead of asking questions
Hallucination: States claims without verification
Anchoring bias: Locks onto the first "obvious" answer
Confirmation bias: Agrees with you instead of challenging
Black-box reasoning: Gives conclusions without showing assumptions
Optimism bias: Assumes the plan will work

The fix: structured skills

I tried prompt engineering. "Be more careful." "Check your sources."

It doesn't work. The AI nods, then does the same thing.

What works is structure:

swing-research: Every claim traced to a source or labeled "Unverified." Source tier grading (S/A/B/C). 2+ independent sources for key claims.
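
To make the rule concrete, here is a minimal Python sketch of the labeling logic. It is not the skill's implementation, only an illustration of the rule as stated above. The class names, the `origin` field used to judge independence, and the "Needs second source" label are assumptions of mine; only the S/A/B/C tiers, the "Unverified" label, and the 2+ sources rule come from the description.

```python
from dataclasses import dataclass, field

# Tier letters as listed in the post; what each tier means is not defined here.
TIERS = ("S", "A", "B", "C")


@dataclass(frozen=True)
class Source:
    origin: str  # hypothetical: the publisher, used to decide independence
    tier: str

    def __post_init__(self) -> None:
        if self.tier not in TIERS:
            raise ValueError(f"unknown tier: {self.tier!r}")


@dataclass
class Claim:
    text: str
    key: bool = False  # key claims need 2+ independent sources
    sources: list[Source] = field(default_factory=list)


def label(claim: Claim) -> str:
    """Label a claim by how well it is backed."""
    # Two sources from the same origin count once.
    independent = {s.origin for s in claim.sources}
    if not independent:
        return "Unverified"
    if claim.key and len(independent) < 2:
        return "Needs second source"  # hypothetical label, not from the post
    return "Sourced"


if __name__ == "__main__":
    claim = Claim("SQLite handles this write load", key=True)
    print(label(claim))  # no sources yet, so "Unverified"
    claim.sources.append(Source(origin="sqlite.org", tier="S"))
    print(label(claim))  # one origin for a key claim, so "Needs second source"
```

The point of the structure is that an unsourced claim cannot pass silently: it has to carry a label.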

swing-review: Steel-man first, then 3-vector attack. "Looks good" is structurally banned.

swing-clarify: 5W1H decomposition. Ambiguity score 0-6. Up to 3 clarifying questions before execution.
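
One way to picture the score, as a rough Python sketch. I am assuming the 0-6 score is simply the number of 5W1H dimensions the request leaves unanswered, which fits the range but is not confirmed by the description; the function names and question wording are hypothetical as well.

```python
DIMENSIONS = ("who", "what", "when", "where", "why", "how")  # 5W1H
MAX_QUESTIONS = 3  # the cap on clarifying questions before execution


def missing_dimensions(known: dict[str, str]) -> list[str]:
    """Dimensions with no non-empty answer, in 5W1H order."""
    return [d for d in DIMENSIONS if not known.get(d, "").strip()]


def ambiguity_score(known: dict[str, str]) -> int:
    """Assumed scoring: one point per unanswered dimension (0 = clear, 6 = fully ambiguous)."""
    return len(missing_dimensions(known))


def clarifying_questions(known: dict[str, str]) -> list[str]:
    """Ask about at most MAX_QUESTIONS of the missing dimensions."""
    return [
        f"{d.capitalize()}: can you specify this before I start?"
        for d in missing_dimensions(known)[:MAX_QUESTIONS]
    ]


if __name__ == "__main__":
    request = {"what": "add caching to the API", "why": "p95 latency is too high"}
    print(ambiguity_score(request))
    for question in clarifying_questions(request):
        print(question)
```

Capping the questions matters as much as asking them: the goal is to resolve the ambiguity that changes the work, not to interrogate the user.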

swing-options: 5 options across probability zones. At least 1 unconventional.

swing-trace: Every assumption rated. Every decision fork documented. Weakest link identified.
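
A minimal sketch of the "weakest link" idea in Python. The 0.0 to 1.0 confidence scale and all names here are assumptions for illustration; the description does not say how assumptions are rated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assumption:
    statement: str
    confidence: float  # hypothetical scale: 0.0 (a guess) to 1.0 (verified)


def weakest_link(assumptions: list[Assumption]) -> Assumption:
    """Return the lowest-confidence assumption, the one to verify first."""
    if not assumptions:
        raise ValueError("no assumptions recorded; the trace is incomplete")
    return min(assumptions, key=lambda a: a.confidence)


if __name__ == "__main__":
    # Illustrative ratings for the SQLite question, not measured values.
    trace = [
        Assumption("1000 concurrent users means 1000 concurrent writes", 0.2),
        Assumption("The app is read-heavy", 0.7),
        Assumption("The database runs in WAL mode", 0.9),
    ]
    print(weakest_link(trace).statement)
```

Applied to the SQLite question, the lowest-rated assumption is exactly the one the original answer took for granted.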

swing-mortem: "It's 6 months from now. This failed completely. What went wrong?"

What changed

The SQLite question through swing-research: conclusion flipped from "No" to "Yes, with caveats" backed by actual sources.

JWT review through swing-review: found a Critical security vulnerability (no refresh token rotation) the baseline missed entirely.

Not better answers. Structurally different reasoning.

Try it

npx skills add whynowlab/swing-skills --all

Six skills. Each targets one cognitive failure. MIT licensed.

GitHub: https://github.com/whynowlab/swing-skills

DEV Community