Originally published on AI Tech Connect.
What you need to know One pipeline for every query is a mistake. A single retrieval pattern either over-serves simple lookups, burning money and seconds you did not need to spend, or under-serves hard questions and returns a thin, wrong answer. Adaptive RAG matches the query to the right amount of machinery. There are three tiers worth building. Naive RAG (embed, top-k vector search, stuff, generate) for simple facts; hybrid retrieval plus a cross-encoder reranker for harder questions; and an agentic loop that plans, retrieves, judges sufficiency and re-retrieves for synthesise-and-cite tasks. Reranking is the highest-ROI single change you can make. Retrieve a wide candidate set, then re-score the top results against the query with a cross-encoder. It needs no re-indexing and commonly…
Top comments (0)