In my last project, I used RAG (Retrieval-Augmented Generation) to retrieve relevant context for the user's question. The problem I faced was that retrieval from the raw user query was not always accurate.
For example, in my budget AI project, I embedded the user query directly and matched it against the vector store to fetch relevant documents. But for questions like:
"What did I buy last month?"
The retrieval sometimes returned relevant documents and sometimes failed, because the question is too vague: it does not carry enough context or specific details (like a category or an amount) to match well against the embedded vectors. Without that extra context, semantic search struggles to identify which records are relevant.
To solve this problem, I came across a technique called multi-query. Before matching against the vector store, I break the user question down into multiple prompts for the semantic search (see the sketch after this list). For example, the question above can be broken into several related prompts:
- "List all expenses from last month."
- "Show purchases and expenses made in the last month."
- "What items did I spend money on in the previous month?"
- "Provide details of all transactions from last month."
- "What were my expenses for each category last month?"
As the sketch shows, the only real work is the prompt: ask the LLM to break the question into multiple related questions, then retrieve documents for each one. The following diagram, taken from the LangChain GitHub repository, visualizes the process:
This process significantly increased the relevance of my retrieved documents. LangChain has a built-in retriever for the same task; see their official documentation for the MultiQueryRetriever.
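Here is a sketch of the built-in retriever, assuming a FAISS vector store built with OpenAI embeddings; the sample expense records are purely illustrative:

```python
# A sketch of LangChain's MultiQueryRetriever over a toy expense store.
# Assumes `langchain`, `langchain-community`, `langchain-openai`, and
# `faiss-cpu` are installed and OPENAI_API_KEY is set.
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

records = [
    "2024-05-03 Groceries $82.40 at FreshMart",
    "2024-05-11 Electronics $199.99 USB microphone",
    "2024-05-27 Dining $34.10 dinner with friends",
]
vectorstore = FAISS.from_texts(records, OpenAIEmbeddings())

retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),
    llm=ChatOpenAI(temperature=0),
)

# The retriever generates several variants of the question behind the scenes,
# runs each against the vector store, and returns the deduplicated union.
docs = retriever.invoke("What did I buy last month?")
for doc in docs:
    print(doc.page_content)
```

The nice part of the built-in version is that query generation, per-query retrieval, and deduplication all happen inside one retriever, so it drops into any existing chain that expects a retriever.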
I found this technique in Rag-From-Scratch by LangChain. There are more sophisticated techniques for improving retrieval, and I will try to write more articles as I find anything interesting.