RAG vs Self-RAG

#rag #llm #huggingface #selfrag

Standard RAG is rigid about when it retrieves: it always fetches passages from an external corpus, ranked by some document-similarity measure, and adds them to the prompt as context. In other words, RAG retrieves even when it doesn't have to. It assumes up front that the LLM lacks the knowledge needed to make sense of the user's prompt, so it pulls context from a domain-specific corpus without considering that the LLM might already hold the most relevant knowledge. (This is an implementation detail, and it is the one Self-RAG attempts to address.)

So how do we ask the LLM that question before grounding its response in an external source? RAG is a good way to make LLM output more relevant, especially with enterprise data, but giving it the intelligence to know when to retrieve and when not to should cut down on needlessly verbose output. Unlike the widely adopted Retrieval-Augmented Generation approach, Self-RAG retrieves on demand and critiques its own generation.
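To make the contrast concrete, here is a minimal sketch of "always retrieve" versus "retrieve on demand". Everything in it is an assumption made for illustration: the word-overlap retriever, the `CORPUS` documents, and especially `toy_needs_retrieval`, a hypothetical keyword check standing in for the decision that Self-RAG leaves to the model itself. It is not the Self-RAG implementation, and it does not cover the self-critique step.

```python
import re
from typing import Callable, List

# Hypothetical domain-specific corpus (illustrative data only).
CORPUS = [
    "Our refund policy allows returns within 30 days.",
    "The office is closed on public holidays.",
]

def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, corpus: List[str], k: int = 1) -> List[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: List[str]) -> str:
    """Prepend retrieved passages as context; pass the query through if there are none."""
    if not passages:
        return query
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {query}"

def plain_rag_prompt(query: str, corpus: List[str]) -> str:
    """Standard RAG: retrieve unconditionally, whether or not it helps."""
    return build_prompt(query, retrieve(query, corpus))

def on_demand_prompt(
    query: str, corpus: List[str], needs_retrieval: Callable[[str], bool]
) -> str:
    """Retrieve only when the decision function asks for it."""
    passages = retrieve(query, corpus) if needs_retrieval(query) else []
    return build_prompt(query, passages)

def toy_needs_retrieval(query: str) -> bool:
    """Hypothetical stand-in for the model's own 'should I retrieve?' decision."""
    return bool(tokenize(query) & {"our", "internal", "policy"})

if __name__ == "__main__":
    for q in ("What is 2 + 2?", "What is our refund policy?"):
        print("--- plain RAG ---")
        print(plain_rag_prompt(q, CORPUS))
        print("--- on demand ---")
        print(on_demand_prompt(q, CORPUS, toy_needs_retrieval))
```

With the plain pipeline, even the arithmetic question gets an unrelated passage stuffed into its prompt. The on-demand version passes that question through untouched and only adds context for the question about the refund policy.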