This article was originally published on AI Study Room.
RAG Agent Patterns: Self-Query, Corrective, Adaptive Retrieval
Introduction
Basic RAG retrieves documents once and generates an answer. RAG agents take this further: they decide when to retrieve, formulate their own queries, verify retrieved information, and adapt their strategy to the complexity of the question. This article covers three agentic RAG patterns that dramatically improve retrieval quality.
Self-Query RAG
Instead of using the raw user question as the search query, the agent generates an optimized query:
```python
def self_query_rag(question: str) -> str:
    # Step 1: Generate an optimized search query
    search_query = call_llm(f"""
    Generate an optimal search query for a vector database.
    Extract key terms, rephrase questions as search statements.
    Output ONLY the search query, nothing else.

    User question: {question}
    """)

    # Step 2: Retrieve using the optimized query
    chunks = vector_search(search_query, k=5)

    # Step 3: Generate an answer from the retrieved chunks
    context = "\n\n".join(chunks)
    answer = call_llm(f"""
    Answer the question based on the context below.
    If the context does not contain enough information, say so.

    Context: {context}

    Question: {question}
    """)
    return answer
```
The self-query pattern resolves the fundamental mismatch between natural language questions and keyword-optimized search indices. A question like "How do I handle rate limiting?" becomes the search query "rate limiting strategies implementation patterns error handling."
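To see why rewriting helps, here is a minimal, LLM-free sketch of the same idea: a hypothetical `keyword_query` helper that strips question scaffolding so only content-bearing terms reach the index. The real pattern uses the LLM rewrite above; the stopword list here is purely illustrative.

```python
import re

# Illustrative stopword list (an assumption, not from the article):
# question scaffolding that rarely helps a keyword-style index.
STOPWORDS = {
    "how", "do", "i", "what", "is", "the", "a", "an", "to", "can",
    "you", "me", "my", "in", "of", "for", "does", "are", "why",
}

def keyword_query(question: str) -> str:
    """Approximate the LLM query rewrite with simple term filtering."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return " ".join(t for t in tokens if t not in STOPWORDS)
```

For example, `keyword_query("How do I handle rate limiting?")` yields `"handle rate limiting"`, which is closer to what a vector or keyword index expects than the raw question.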
Corrective RAG (CRAG)
Corrective RAG adds a verification step between retrieval and generation. If retrieved documents are irrelevant, the agent takes corrective action:
```python
def corrective_rag(question: str, max_attempts: int = 3) -> str:
    for attempt in range(max_attempts):
        # Retrieve candidate chunks
        chunks = vector_search(question, k=5)

        # Score the relevance of each chunk
        relevance_scores = []
        for chunk in chunks:
            score = call_llm(f"""
            On a scale of 0-10, how relevant is this document to:
            '{question}'

            Document: {chunk}

            Respond with only a number.
            """)
            relevance_scores.append(float(score.strip()))
        avg_relevance = sum(relevance_scores) / len(relevance_scores)

        if avg_relevance >= 7:
            # High confidence: generate answer
            context = "\n\n".join(chunks[:3])
            return generate_answer(question, context)
        elif avg_relevance >= 4:
            # Medium confidence: try query decomposition
            sub_questions = decompose_question(question)
            sub_answers = [corrective_rag(sq) for sq in sub_questions]
            return synthesize_answers(question, sub_answers)
        else:
            # Low confidence: reformulate the query and retry
            question = reformulate_query(question, chunks)

    return "Unable to find sufficient information to answer this question."
CRAG prevents the "hallucinate confidently from irrelevant context" failure mode common in naive RAG. Each attempt either improves the query or escalates to a more sophisticated strategy.
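One practical wrinkle in the scorer above: `float(score.strip())` assumes the model returns a bare number, but LLMs often reply with "7/10" or "Relevance: 8". A hedged sketch of a more tolerant parser (the `parse_relevance` name and the clamping behavior are my own, not from the article):

```python
import re

def parse_relevance(raw: str, default: float = 0.0) -> float:
    """Extract the first number from an LLM relevance reply.

    Handles replies like '7', '7/10', or 'Relevance: 8.5' that
    would crash a bare float() call; clamps to the 0-10 scale.
    """
    match = re.search(r"\d+(?:\.\d+)?", raw)
    if match is None:
        return default
    return min(max(float(match.group()), 0.0), 10.0)
```

Swapping this in for `float(score.strip())` makes the scoring loop robust to the formatting quirks that otherwise abort an entire CRAG attempt.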
Adaptive Retrieval
Adaptive retrieval dynamically selects the retrieval strategy based on question characteristics:
```python
class AdaptiveRetriever:
    def __init__(self):
        self.strategies = {
            "factoid": self.factoid_retrieval,
            "comparison": self.comparison_retrieval,
            "procedural": self.procedural_retrieval,
            "analytical": self.analytical_retrieval,
        }

    def retrieve(self, question: str) -> list[str]:
        # Classify the question type
        q_type = call_llm(f"""
        Classify this question as one of: factoid, comparison, procedural, analytical
        Respond with only the type name.

        Question: {question}
        """)
        strategy = self.strategies.get(q_type.strip().lower(), self.factoid_retrieval)
        return strategy(question)

    def factoid_retrieval(self, question: str) -> list[str]:
        # Simple direct retrieval
        return vector_search(question, k=3)

    def comparison_retrieval(self, question: str) -> list[str]:
        # Retrieve documents for each side of the comparison
        entities = extract_comparison_entities(question)
        docs = []
        for entity in entities:
            docs.extend(vector_search(entity, k=3))
        return docs[:6]

    def procedural_retrieval(self, question: str) -> list[str]:
        # Step-by-step retrieval: search for each step separately
        steps = decompose_steps(question)
        docs = []
        for step in steps:
            docs.extend(vector_search(step, k=2))
        return docs

    def analytical_retrieval(self, question: str) -> list[str]:
        # Broad retrieval for open-ended analysis: cast a wider net
        return vector_search(question, k=8)
```
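The routing logic can be exercised without an LLM or a vector store. The sketch below stubs the classifier with crude keyword rules (purely illustrative; in the pattern above the classifier is an LLM call) to show how questions dispatch to different strategies:

```python
def classify(question: str) -> str:
    # Stand-in for the LLM classifier: rough keyword heuristics.
    q = question.lower()
    if " vs " in q or "compare" in q or "difference" in q:
        return "comparison"
    if q.startswith("how do") or q.startswith("how to"):
        return "procedural"
    if q.startswith("why") or "explain" in q:
        return "analytical"
    return "factoid"

# Map each question type to a description of its retrieval strategy.
ROUTES = {
    "factoid": "direct top-k search",
    "comparison": "per-entity search",
    "procedural": "per-step search",
    "analytical": "multi-angle search",
}

def route(question: str) -> str:
    return ROUTES.get(classify(question), ROUTES["factoid"])
```

For instance, `route("Postgres vs MySQL for analytics?")` dispatches to per-entity search, while `route("How do I rotate API keys?")` dispatches to per-step search, mirroring the class's strategy table.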