Traditional RAG retrieves once and generates. Agentic RAG retrieves, evaluates, and decides whether to retrieve again.
Traditional RAG vs Agentic RAG
Traditional: User asks → Vector search once → Generate answer
Agentic: User asks → Agent analyzes → Decides search strategy → Verifies results → Re-searches if needed → Generates
The agent thinks between steps.
The Core Loop
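```python
class AgenticRAG:
    def __init__(self, retriever, llm, max_rounds=3):
        self.retriever = retriever
        self.llm = llm
        self.max_rounds = max_rounds

    def _decide(self, query, context):
        # Ask the LLM which action to take next.
        prompt = f"Query: {query}. Existing info: {context or 'none'}. Decide: RETRIEVE | GENERATE | REFINE"
        return self.llm.generate(prompt)

    def _verify(self, answer, sources):
        # Light check: one LLM call asking whether the sources back the answer.
        prompt = f"Answer: {answer}. Sources: {sources}. Is every claim backed by a source? yes/no"
        return "yes" in self.llm.generate(prompt).lower()

    def run(self, query):
        context = []   # retrieved passages, assumed to be strings
        answer = None  # last draft; avoids a NameError if no round generates
        for _ in range(self.max_rounds):
            joined = "\n".join(context)
            decision = self._decide(query, joined)
            if "GENERATE" in decision:
                answer = self.llm.generate(f"Based on: {joined}\nQuery: {query}")
                if self._verify(answer, joined):
                    return answer
            else:
                # RETRIEVE and REFINE are both handled as another search here.
                results = self.retriever.search(query, top_k=5)
                context.extend(results)
        if answer is None:
            # The agent never chose GENERATE: answer from what was gathered.
            joined = "\n".join(context)
            answer = self.llm.generate(f"Based on: {joined}\nQuery: {query}")
        # Rounds exhausted: this answer did not pass verification.
        return answer
```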
Three Critical Decisions
1. When to Stop Searching
Combine three signals: a confidence threshold, a hard round limit (max 3), and an information gain check that stops the loop once a new round returns nothing new.
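One way to combine the three signals is a small policy object. This is a minimal sketch, not a definitive rule: `StopPolicy`, its field names, the 0.8 threshold, and the idea of measuring information gain as "new unique document ids" are all assumptions made for illustration. Only the round limit of 3 comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class StopPolicy:
    """Hypothetical stopping rule; thresholds are illustrative defaults."""
    min_confidence: float = 0.8  # assumed scale: 0.0 to 1.0
    max_rounds: int = 3          # round limit from the text
    min_new_docs: int = 1        # assumed proxy for information gain

    def should_stop(self, round_idx, confidence, seen_ids, new_ids):
        """Return (stop, reason) for the round that just finished.

        round_idx is 0-based, seen_ids holds document ids from earlier
        rounds, new_ids holds the ids returned by the latest search.
        """
        if confidence >= self.min_confidence:
            return True, "confident"
        if round_idx + 1 >= self.max_rounds:
            return True, "round_limit"
        gained = len(set(new_ids) - set(seen_ids))
        if gained < self.min_new_docs:
            return True, "no_new_information"
        return False, "continue"
```

Returning the reason alongside the boolean makes it easy to log why a query stopped, which helps when tuning the thresholds.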
2. Which Tools to Give the Agent
Start with vector search plus BM25, fused with RRF (reciprocal rank fusion). Add SQL and web search later.
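RRF merges ranked lists by summing `1 / (k + rank)` for each document across the lists, so it needs only ranks, not comparable scores. A minimal sketch, assuming each retriever returns an ordered list of document ids; the function name and the tie-breaking by id are illustrative choices, and `k=60` is the commonly used default constant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["d1", "d2", "d3"]  # hypothetical vector search results
bm25_hits = ["d3", "d1", "d4"]    # hypothetical BM25 results
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Documents that appear near the top of both lists (`d1` and `d3` here) end up ahead of documents that only one retriever found.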
3. How Deep to Verify
Light check (1 LLM call) for every query. Deep check (re-search + compare) for high-stakes answers.
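The two verification tiers could be wired up roughly as follows. This is a sketch under assumptions: it reuses the `llm.generate(prompt)` and `retriever.search(query, top_k=...)` interfaces from the core loop, the `high_stakes` flag is hypothetical, and "re-search + compare" is interpreted here as checking the answer against a fresh retrieval, which is only one possible reading.

```python
def light_check(llm, answer, sources):
    """One LLM call: is the answer supported by the given sources?"""
    prompt = (
        f"Answer: {answer}\nSources: {sources}\n"
        "Is every claim backed by a source? Reply yes or no."
    )
    return llm.generate(prompt).strip().lower().startswith("yes")


def verify(llm, retriever, query, answer, sources, high_stakes=False, top_k=5):
    """Run the light check always, the deep check only when high_stakes."""
    if not light_check(llm, answer, sources):
        return False
    if not high_stakes:
        return True
    # Deep check (assumed interpretation): search again and require the
    # answer to hold up against the freshly retrieved passages too.
    fresh = retriever.search(query, top_k=top_k)
    return light_check(llm, answer, "\n".join(fresh))
```

The deep path costs one extra search and one extra LLM call, so it only runs after the light check has already passed.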
When to Use Each
| Scenario | Use |
|---|---|
| Simple fact lookup | Traditional RAG |
| Multi-hop reasoning | Agentic RAG |
| Numerical aggregation | Agentic RAG + SQL |
| Latency-sensitive (<500ms) | Traditional RAG |
| High accuracy (medical/legal) | Agentic RAG + deep verify |
Traditional RAG hopes. Agentic RAG verifies. Extra retrieval rounds can push recall from around 60% to 90%+, though the gain depends on the corpus and every round adds latency.
Follow @mgj for weekly AI engineering deep dives.
