Rushank Savant

Day 11: Conversational RAG - How to Chat with Your Documents 💬

Yesterday, we built a RAG chain that could answer a single question. But if you followed up with "Can you explain that further?", the AI would get confused. Why? Because it had no memory of the conversation.

Today, we solve the hardest part of RAG: Conversational Memory. We'll teach the AI to understand that "it" or "that" refers to things mentioned earlier in the chat.


πŸ—οΈ The Problem: The "Query Re-writing" Challenge

If you ask:

  1. "How does LangChain work?"
  2. "Can you give me an example of it?"

The retriever doesn't know what "it" is. It will literally search your database for the word "it," which is useless.

To fix this, we add a step called History-Aware Retrieval. The AI takes your follow-up question and the chat history, then "re-writes" it into a standalone question that the retriever can understand.
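
For example, the re-write step turns the conversation above into something like this (the exact wording will vary by model):

  History:     "How does LangChain work?"  ->  (the AI's answer)
  Follow-up:   "Can you give me an example of it?"
  Re-written:  "Can you give me an example of how LangChain works?"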


πŸ› οΈ Step 1: Contextualizing the Question

We create a sub-chain that looks at the history and the new question to produce a "search-friendly" query.

from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# The prompt that tells the AI to re-write the question if history exists
contextualize_q_system_prompt = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question."
)

contextualize_q_prompt = ChatPromptTemplate.from_messages([
    ("system", contextualize_q_system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])

# Wrap your existing retriever (from Day 9) so it re-writes queries with the llm
history_aware_retriever = create_history_aware_retriever(
    llm, retriever, contextualize_q_prompt
)
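
If you want to sanity-check this sub-chain on its own before wiring up the rest, here's a minimal sketch (assuming the llm and retriever objects from Day 9 are already defined). It re-writes the follow-up question, then returns whatever documents the underlying retriever finds:

from langchain_core.messages import HumanMessage, AIMessage

# Quick sanity check: run the history-aware retriever by itself
docs = history_aware_retriever.invoke({
    "input": "Can you give me an example of it?",
    "chat_history": [
        HumanMessage(content="How does LangChain work?"),
        AIMessage(content="LangChain is a framework for building LLM applications..."),
    ],
})
print(f"Retrieved {len(docs)} documents")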

πŸ› οΈ Step 2: The Full Conversational Chain

Now, we plug this into our document chain to create the final "Conversational RAG" flow.

from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

# Standard Q&A prompt
qa_system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question."
    "\n\n"
    "{context}"
)

qa_prompt = ChatPromptTemplate.from_messages([
    ("system", qa_system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])

# This chain "stuffs" the retrieved documents into the {context} slot of the prompt
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

# The final chain: history-aware retrieval -> stuff documents -> answer
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

🚀 Testing it Out

from langchain_core.messages import HumanMessage, AIMessage

chat_history = []

# First Interaction
question = "What is LangSmith?"
result = rag_chain.invoke({"input": question, "chat_history": chat_history})
print(result["answer"])

# Update History
chat_history.extend([
    HumanMessage(content=question),
    AIMessage(content=result["answer"]),
])

# Follow-up (The AI now knows 'it' refers to LangSmith!)
second_question = "How do I get started with it?"
result = rag_chain.invoke({"input": second_question, "chat_history": chat_history})
print(result["answer"])

🎯 Day 11 Summary

Today, you bridged the final gap in RAG. You learned:

- Contextualization: Why "it" and "this" break standard retrievers.

- Query Re-writing: Using an LLM to make search queries smarter.

- create_history_aware_retriever: The specific LangChain tool for this job.

Your Homework: Try running the chain without updating the chat_history list. Notice how the second answer becomes generic or fails; this proves how vital history is!
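
To see that failure for yourself, here's a minimal sketch of the experiment: same chain, same follow-up question, but with an empty history.

# Ask the follow-up WITHOUT any history: the re-writer has nothing to work
# with, so the retriever searches for the vague question as-is.
result = rag_chain.invoke({
    "input": "How do I get started with it?",
    "chat_history": [],
})
print(result["answer"])  # typically generic or off-topic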

See you tomorrow! ☕
