DEV Community

马国锦
马国锦

Posted on

Agentic RAG: When Your Retrieval System Learns to Think for Itself

Traditional RAG retrieves once and generates. Agentic RAG retrieves, evaluates, and decides whether to retrieve again.


Traditional RAG vs Agentic RAG

Traditional: User asks → Vector search once → Generate answer
Agentic: User asks → Agent analyzes → Decides search strategy → Verifies results → Re-searches if needed → Generates

The agent thinks between steps.


The Core Loop

class AgenticRAG:
    def __init__(self, retriever, llm, max_rounds=3):
        self.retriever = retriever
        self.llm = llm
        self.max_rounds = max_rounds

    def _decide(self, query, context):
        prompt = f"Query: {query}. Existing info: {context or 'none'}. Decide: RETRIEVE | GENERATE | REFINE"
        return self.llm.generate(prompt)

    def _verify(self, answer, sources):
        prompt = f"Answer: {answer}. Sources: {sources}. Is every claim backed by a source? yes/no"
        return "yes" in self.llm.generate(prompt).lower()

    def run(self, query):
        context = []
        for _ in range(self.max_rounds):
            decision = self._decide(query, "\n".join(context))
            if "GENERATE" in decision:
                answer = self.llm.generate(f"Based on: {context}\nQuery: {query}")
                if self._verify(answer, context):
                    return answer
            else:
                results = self.retriever.search(query, top_k=5)
                context.extend(results)
        return answer
Enter fullscreen mode Exit fullscreen mode

Three Critical Decisions

1. When to Stop Searching

Combine confidence threshold + round limit (max 3) + information gain check.

2. Which Tools to Give the Agent

Start: vector + BM25 + RRF. Add SQL and web search later.

3. How Deep to Verify

Light check (1 LLM call) for every query. Deep check (re-search + compare) for high-stakes answers.


When to Use Each

Scenario Use
Simple fact lookup Traditional RAG
Multi-hop reasoning Agentic RAG
Numerical aggregation Agentic RAG + SQL
Latency-sensitive (<500ms) Traditional RAG
High accuracy (medical/legal) Agentic RAG + deep verify

Traditional RAG hopes. Agentic RAG verifies. Each extra round pushes recall from 60% to 90%+.


☕ Support This Content

If my articles saved you debugging time, scan the QR code below to buy me a coffee.

Buy me a coffee

Follow @mgj for weekly AI engineering deep dives.

Top comments (0)