Most RAG pilot problems do not start as model problems.
They are source problems.
The demo looks promising because the happy-path question is easy. Then the pilot meets real internal documents:
- duplicated policies;
- stale PDFs;
- contradictory SOPs;
- screenshots with important text;
- docs with no owner;
- files that say what the rule is but not when it last changed;
- permissions that do not match how the assistant will be used.
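A few of the items above can be checked mechanically before any retrieval work starts. Here is a minimal sketch, assuming each document has already been loaded into a hypothetical `SourceDoc` record with an owner and a last-reviewed date. The field names and the 365-day staleness threshold are my assumptions for illustration, not a standard. It only catches near-verbatim duplicates, so contradictory SOPs and text inside screenshots still need a human pass.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from hashlib import sha256
from typing import Optional


@dataclass
class SourceDoc:
    # Hypothetical record shape; adapt to whatever your document store exposes.
    path: str
    text: str
    owner: Optional[str]
    last_reviewed: date


def audit_sources(docs, today, max_age_days=365):
    """Return a dict mapping an issue name to the affected paths."""
    by_hash = defaultdict(list)
    for doc in docs:
        # Normalize case and whitespace so trivially different copies collide.
        normalized = " ".join(doc.text.lower().split())
        digest = sha256(normalized.encode("utf-8")).hexdigest()
        by_hash[digest].append(doc.path)

    return {
        # Groups of paths whose normalized text is identical.
        "duplicates": [sorted(paths) for paths in by_hash.values() if len(paths) > 1],
        # Documents not reviewed within the allowed window.
        "stale": [d.path for d in docs if (today - d.last_reviewed).days > max_age_days],
        # Documents with a missing or empty owner.
        "no_owner": [d.path for d in docs if not d.owner],
    }
```

Running this over the pilot corpus gives a first cleanup list: which copies to merge, which files need a review date, and which files need an owner before the assistant is allowed to cite them.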
Before choosing embeddings, chunking strategy, rerankers, or agent tooling, I like to check whether the source layer can support a useful answer.
The quick filter:
- Which source is authoritative when two documents disagree?
- Who owns each source?
- How often does it change?
- Can the system cite the exact source passage?
- What questions should the assistant refuse?
- What questions require escalation to a human?
- What is the cost of a confident but unsupported answer?
That last one matters. A RAG system that says "I do not know" is often safer than one that confidently blends two outdated documents.
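One way to make the quick filter concrete is to write the answers down as data and a small decision rule. The sketch below is an assumption-heavy illustration: `SourceRecord`, `authority_rank`, the topic sets, and the `min_support` threshold are all hypothetical names and values, and `support_score` stands in for whatever retrieval confidence signal your stack provides.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ANSWER = "answer"
    REFUSE = "refuse"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class SourceRecord:
    # Hypothetical registry entry answering "who owns it" and "how often it changes".
    source_id: str
    owner: str
    authority_rank: int  # assumed convention: 1 is the most authoritative
    review_interval_days: int


def pick_authoritative(candidates):
    """When documents disagree, prefer the one with the lowest authority_rank."""
    return min(candidates, key=lambda s: s.authority_rank)


def decide(topic, support_score, refuse_topics, escalate_topics, min_support=0.7):
    """Choose between answering, refusing, and escalating to a human."""
    if topic in refuse_topics:
        return Action.REFUSE
    if topic in escalate_topics:
        return Action.ESCALATE
    # Weak retrieval support: saying "I do not know" beats a confident blend.
    if support_score < min_support:
        return Action.REFUSE
    return Action.ANSWER
```

The ordering is a design choice: topic rules run before the support check, so a well-supported answer on a refuse-listed topic is still refused. The threshold should reflect the cost of a confident but unsupported answer in your setting.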
A small pilot should usually prove four things:
- retrieval finds the right source;
- the answer stays inside the source;
- citations are inspectable;
- unsupported questions are refused or escalated.
If any of those fails, improving the prompt is usually the wrong first fix. The next milestone should be source cleanup, evaluation questions, and refusal criteria.
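The four checks can be turned into a small scoring function that runs over a list of evaluation questions. This is a rough sketch under stated assumptions: `EvalCase`, `PilotOutput`, and `score_case` are hypothetical names, the corpus is a plain dict of source id to text, and the grounding check is a crude word-overlap proxy that I chose for illustration. It will miss paraphrases and cannot detect subtle contradictions, so treat it as a tripwire and not a measure of faithfulness.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EvalCase:
    question: str
    # None means no source supports the question, so the assistant should refuse.
    expected_source: Optional[str]


@dataclass
class PilotOutput:
    retrieved_ids: list
    answer: str
    quotes: dict = field(default_factory=dict)  # source_id -> quoted passage
    refused: bool = False


def score_case(case, out, corpus, min_overlap=0.8):
    """Return pass/fail flags for the pilot checks that apply to this case."""
    if case.expected_source is None:
        # Unsupported question: the only correct behavior is refusal or escalation.
        return {"refused_when_unsupported": out.refused}

    source_text = corpus.get(case.expected_source, "").lower()
    words = [w.strip(".,;:!?") for w in out.answer.lower().split()]
    words = [w for w in words if w]
    # Crude proxy: share of answer words that occur (as substrings) in the source.
    in_source = sum(1 for w in words if w in source_text)
    overlap = in_source / len(words) if words else 0.0

    return {
        "retrieval_hit": case.expected_source in out.retrieved_ids,
        "answer_grounded": (not out.refused) and overlap >= min_overlap,
        # A citation is inspectable if its quote appears verbatim in the cited source.
        "citations_inspectable": bool(out.quotes) and all(
            quote.lower() in corpus.get(sid, "").lower()
            for sid, quote in out.quotes.items()
        ),
    }
```

Even 20 to 30 cases scored this way, including a handful that should be refused, show whether the failures cluster in retrieval, grounding, citation, or refusal, which tells you where the next milestone belongs.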
I packaged a small RAG evaluation kit and a fixed-scope async readiness review for teams that want to check this before funding a bigger internal assistant:
RAG Pilot Readiness Review:
https://mindtrovertlabs-sketch.github.io/scopegrade-storefront/rag-pilot-readiness-review.html
RAG Pilot Evaluation Kit:
https://mindtrovertlabs-sketch.github.io/scopegrade-storefront/
Free preview:
https://mindtrovertlabs-sketch.github.io/scopegrade-storefront/preview.html
This is a planning/evaluation resource, not a guarantee of accuracy, compliance, or production readiness. The goal is to make the first pilot milestone honest before more engineering time gets spent.