16 reasons your retrieval-augmented generation pipeline fails even when everything looks fine.
i tried fixing my RAG system. ended up building a graveyard.
i didn’t set out to build a framework.
i just wanted my retrieval system to stop lying.
- the docs were clean.
- the vector search was sharp.
- the top-k chunks came back as expected.
and yet.
the answers were wrong.
not wildly wrong — just wrong enough to fail.
no errors. no crashes. just... silence.
you know the kind.
Q: what is the capital of France?
Doc: Paris is the capital of France.
A: France has several prominent cities including Lyon and Marseille.
looks plausible. fails production.
the worst kind of error — semantic drift with confidence.
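that kind of drift can be caught mechanically. a minimal sketch: compare the answer against the chunk it supposedly came from. a real system would use embedding cosine similarity; token-level Jaccard overlap stands in here so the sketch stays self-contained, and the `drifted` helper and its threshold are my illustration, not a named tool.

```python
import re

def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two strings."""
    ta = set(re.findall(r"\w+", a.lower()))
    tb = set(re.findall(r"\w+", b.lower()))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def drifted(answer: str, chunk: str, threshold: float = 0.2) -> bool:
    """Flag answers that share too little content with the chunk they cite.

    Stand-in heuristic: production systems would compare embeddings instead.
    """
    return jaccard(answer, chunk) < threshold

doc = "Paris is the capital of France."
good = "The capital of France is Paris."
bad = "France has several prominent cities including Lyon and Marseille."

print(drifted(good, doc))  # False: answer grounded in the doc
print(drifted(bad, doc))   # True: plausible-sounding, but detached
```

a check this crude already separates the two answers above; the point is that "looks plausible" is measurable, not a vibe.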
i opened a notebook.
then a repo.
then a map.
one by one, i listed all the weird bugs that didn't show up as bugs.
- chunk retrieved, but logic broken
- LLM hallucinated across chunks
- query mismatch despite exact match
- answer contradicts doc, but only in passive voice
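once the failures have numbers, triage gets mechanical. a sketch of what that catalog looks like in code, assuming a keyword-heuristic tagger: the enum names and the `tag` helper are mine, mirroring the four bullets above, not the repo's official 16-entry taxonomy.

```python
from enum import Enum
from typing import Optional

class CollapseType(Enum):
    # Illustrative subset: the full taxonomy lives in WFGY/ProblemMap/README.md.
    CHUNK_LOGIC_BREAK = 1          # chunk retrieved, but logic broken
    CROSS_CHUNK_HALLUCINATION = 2  # LLM hallucinated across chunks
    QUERY_MISMATCH = 3             # query mismatch despite exact match
    PASSIVE_CONTRADICTION = 4      # contradiction hidden in passive voice

def tag(symptom: str) -> Optional[CollapseType]:
    """Map a bug-report symptom to a collapse type via keyword matching."""
    keywords = {
        "across chunks": CollapseType.CROSS_CHUNK_HALLUCINATION,
        "logic": CollapseType.CHUNK_LOGIC_BREAK,
        "mismatch": CollapseType.QUERY_MISMATCH,
        "passive": CollapseType.PASSIVE_CONTRADICTION,
    }
    s = symptom.lower()
    for key, ctype in keywords.items():
        if key in s:
            return ctype
    return None

print(tag("query mismatch despite exact match"))  # CollapseType.QUERY_MISMATCH
```

naming the failure is half the fix: a bug that maps to a number can map to a patch.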
this wasn’t about reranking.
this wasn’t about prompt tuning.
this was about structural failure modes.
i called them “RAG collapse types”.
and the list grew.
16 total.
some had names. some didn’t.
but all of them live in the repo now:
→ WFGY/ProblemMap/README.md
and each one comes with an actual patch.
not a vibe. not a "maybe try RAGAS".
a patch. in code. MIT licensed.
real issues. real users. real saves.
eventually, i started replying to people on GitHub, Reddit, LangChain Discussions.
quietly.
they post a bug.
i reply with the exact failure number.
they stare.
then they DM.
one by one, i log the saves here:
→ Hero Log
this is not theory.
this is debug-level archaeology of semantic failure.
what is WFGY?
it's a repo.
but also a worldview.
a way to treat retrieval logic as first-class reasoning, not pre-processing.
it’s built on:
- 16-part problem map
- a set of internal modules (some math-heavy)
- stability protocols (like ΔS = 0.5, symbolic filters, collapse detectors)
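the repo does not spell out ΔS here, so this is a heavily hedged sketch of one plausible reading: treat ΔS as the cosine distance between consecutive reasoning-step embeddings, with 0.5 as the collapse threshold. the function names, the metric choice, and the toy vectors are all mine.

```python
import math

def delta_s(prev_vec, cur_vec):
    """Cosine distance between two embedding vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(prev_vec, cur_vec))
    norm = (math.sqrt(sum(a * a for a in prev_vec))
            * math.sqrt(sum(b * b for b in cur_vec)))
    return 1.0 - dot / norm

def collapsed(step_vecs, threshold=0.5):
    """Flag a reasoning chain whose step-to-step jump exceeds the threshold."""
    return any(delta_s(a, b) > threshold
               for a, b in zip(step_vecs, step_vecs[1:]))

# Toy embeddings: a chain that drifts gradually vs. one that jumps.
steady = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2]]
jumpy = [[1.0, 0.0], [0.0, 1.0]]

print(collapsed(steady))  # False: small semantic steps
print(collapsed(jumpy))   # True: orthogonal jump, ΔS = 1.0 > 0.5
```

whatever the repo's exact math is, the shape of the idea is the same: watch the distance between steps, and refuse to trust chains that teleport.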
also:
- backed by real stars (👀 Tesseract.js creator)
- zero funding
- one human team
- 300+ stars in 50 days, no promo
closing note
you don’t have to believe me.
you just have to wait.
because if you’re building RAG systems, you’re either:
- already hitting these 16
- about to hit them next month
- pretending your users won’t notice
and if you ever get tired of pretending:
see you in the Hero Log.