Retrieval Augmented Generation (RAG) for Dummies

#nlp #ai #rag #gpt3

Retrieval-Augmented Generation (RAG) is a technique that optimises an AI model’s performance by connecting it to external knowledge bases. This helps large language models (LLMs) provide more accurate, relevant, and high-quality responses. Let’s put things into perspective.

Imagine someone asked you which planet has the most moons. Thinking back to what you learned when you were younger, you might say that Jupiter does, with 88 moons (for example). However, this response is flawed: the information is out of date, and the source (your memory) is not a credible one. LLMs face similar challenges.

Now, let’s assume that you had first looked up the question in a reputable source, such as NASA or the European Space Agency. You would most likely have found that Saturn has the most moons, and that the count keeps changing as scientists discover new ones.

Where exactly does retrieval augmentation come in?

Let’s say you were using ChatGPT. Rather than relying solely on what the model already knows, it would first consult a ‘content store’ that holds reliable data, and retrieve the content relevant to your query. So, rather than answering Jupiter from memory, the model would find in the store that the correct planet is Saturn.

Rather than having to retrain the model, all you have to do is update the store, so the model always has current information to draw on. This approach lowers the risk of hallucination, allows access to up-to-date information, and improves trust in the model.
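
To make the "update the store, not the model" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the store is a plain list of sentences, and the hypothetical `retrieve` function picks the passage that shares the most words with the question. Real systems typically use a search index or vector database instead.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, store):
    """Return the stored passage that shares the most words with the query."""
    query_words = tokenize(query)
    return max(store, key=lambda passage: len(query_words & tokenize(passage)))

# Hypothetical content store that still holds an out-of-date fact.
store = [
    "Jupiter has the most moons of any planet.",
    "Mars has two moons, Phobos and Deimos.",
]

question = "Which planet has the most moons?"
stale_answer = retrieve(question, store)

# Correcting the store is all it takes; nothing is retrained.
store[0] = "Saturn has the most moons of any planet."
fresh_answer = retrieve(question, store)

print(stale_answer)
print(fresh_answer)
```

The retrieval code never changes between the two calls. Only the data does, which is why keeping the store current is so much cheaper than retraining.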

Summary of How RAG Works
User submits a prompt ↠ Retriever searches a knowledge base for relevant documents or data (retrieval) ↠ The system combines the original prompt with the retrieved content (augmentation) ↠ Generator (LLM) produces a response based on this enriched input (generation) ↠ Output is returned to the user.
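
The flow above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a production design: `KNOWLEDGE_BASE`, `retrieve`, `augment`, `generate`, and `rag` are hypothetical names, retrieval is simple word overlap, and `generate` is a stand-in that echoes the top retrieved passage where a real system would call an LLM.

```python
import re

# Hypothetical knowledge base; a real one would usually be a search index
# or a vector database.
KNOWLEDGE_BASE = [
    "Saturn has the most moons of any planet in the Solar System.",
    "Mars has two moons, Phobos and Deimos.",
    "Mercury and Venus have no moons.",
]

def words(text):
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(prompt, documents, top_k=1):
    """Retrieval: rank documents by how many words they share with the prompt."""
    prompt_words = words(prompt)
    ranked = sorted(documents, key=lambda doc: len(prompt_words & words(doc)), reverse=True)
    return ranked[:top_k]

def augment(prompt, retrieved):
    """Augmentation: combine the original prompt with the retrieved content."""
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {prompt}"

def generate(enriched_prompt):
    """Generation: stand-in for an LLM call; it echoes the first context line."""
    for line in enriched_prompt.splitlines():
        if line.startswith("- "):
            return line[2:]
    return "I don't know."

def rag(prompt):
    """Run the full pipeline: retrieve, augment, generate."""
    retrieved = retrieve(prompt, KNOWLEDGE_BASE)
    enriched_prompt = augment(prompt, retrieved)
    return generate(enriched_prompt)

print(rag("Which planet has the most moons?"))
```

Keeping the three stages as separate functions mirrors the summary: you could swap the retriever for a better search method, or the stub generator for a real model, without touching the other steps.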
