Saurabh Rai for Swirl


Retrieval Augmented Generation (RAG): How To Get AI Models To Learn Your Data & Give You Answers

With the growth of AI and Large Language Models, more and more people want to get answers from their own data sources. Integrating methods to search and retrieve data from different sources, and passing that data to these AI models, has become critical.
Retrieval Augmented Generation, or RAG, is a potential solution for this.

If you're eager to discover how RAG can transform the way you access and utilize your data, read on.

AI and Generating Human-Like Responses 🙋‍♂️

Back in 2013, the AI landscape was vastly different. The buzz was about simple neural networks, the foundational building blocks that paved the way for future advancements. People were happy if they could quickly train a model and get a good score on MNIST.

Later, we got deep neural networks capable of performing tasks in areas like image and speech recognition, outperforming traditional algorithms.

Human-Like Response from AI

But the real change came in November 2022, when OpenAI launched ChatGPT. It took the world by storm, amassing 100 million+ users in just two months, one of the most significant moments in internet and AI history.

Not just because it can perform many tasks, but because its capability surprised so many people. Since then, large language models have proliferated, and we're seeing AI startups and enterprises developing their own.

And now comes the question: can anyone take their own data, pass it to an LLM, and get it to generate insights on the fly?

But before that, let's discuss the current challenges with large language models.

Limitations of Large Language Models 🔚

While large language models do a great job, relying on them is harder than it looks. Consider this headline:

ChatGPT's lack of citing the article

ChatGPT really took this person's job.

If you are a regular user of LLMs, or have tried to test their limits, you must have noticed a severe problem: hallucination. It's when a large language model generates answers and "facts" out of its own creativity, whether they exist or not. In the above example, ChatGPT provided reference cases that didn't exist.

This stops enterprises, hospitals, and other businesses from immediately introducing AI models into their day-to-day work. However, hallucination isn't the only problem faced by state-of-the-art large language models. There's more, and here's a list 👇

  • Context Window and Knowledge Cutoff: Models can only read a limited amount of text at once, and their training stops at a cutoff date, so they know nothing about events after that date.
  • Outdated Training Data: The world's data is dynamic, but a trained model's knowledge is static. Keeping up with ongoing changes means retraining the model, which brings us to the third problem.
  • Cost: AI models don't just sit on hard drives. They consume RAM, CPU, and GPU to run and train, and the hardware expense of retraining a large language model can quickly skyrocket.
  • Accuracy and Bias: LLMs are prone to errors due to biases in the training data, a lack of common-sense reasoning, and the way the model was trained, which also depends on the engineering team.
  • LLM Hacking and Hijacking: Recent research shows that LLMs can be attacked and made to reveal potentially dangerous information.

Limitations of AI Models

Like any other software, these AI models aren't 100% perfect and continue to strive to be better. Now let's understand what RAG is and why it's important.

Retrieval Augmented Generation aka RAG ✨

Retrieval Augmented Generation (RAG) is a recent advancement in Artificial Intelligence. It's a form of Open Domain Question Answering with a Retriever and a Generative AI Model. It combines a search system with AI models like ChatGPT, Llama2, etc. With RAG, the system searches a vast knowledge base for up-to-date data and articles. This data is then used by the AI to give precise answers. This method helps reduce errors in AI responses and offers more customized solutions.

So, with RAG, the retriever (or searcher) can access the latest data, sources, and other important articles from a very large knowledge base, and then provide them as input to the generative AI model. Hence the name Retrieval Augmented Generation. This approach allows the large language model to tap into a vast knowledge base and provide relevant, to-the-point information.
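
To make this concrete, here's a minimal sketch of the retrieve-then-generate loop in Python. The in-memory document list and toy keyword retriever are hypothetical stand-ins for a real search system, and the generation call uses the pre-1.0 `openai` package interface; swap in your own retriever and client library.

```python
# Minimal RAG sketch: retrieve relevant text, then ask the LLM to answer
# from it. The knowledge base and retriever here are toy stand-ins.
import openai  # pre-1.0 interface; reads OPENAI_API_KEY from the environment

DOCUMENTS = [
    "Swirl is an open-source metasearch engine written in Python.",
    "RAG pipelines pass retrieved documents to a generative model as context.",
    "Fine-tuning retrains a model; RAG only changes what it reads at query time.",
]

def retrieve(query: str, top_n: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    words = set(query.lower().split())
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_n]

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = (
        "Answer using ONLY the context below; say so if the answer isn't there.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(answer("What does a RAG pipeline do?"))
```

The retriever can be anything that returns relevant text: keyword search, a vector database, or a metasearch engine. The generator only ever sees what the retriever hands it, which is what keeps the answers grounded.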

This significantly reduces the hallucination problem faced by large language models, and it can provide answers tailored to you.

Why Use Retrieval Augmented Generation? Aren't Current AI Models Enough?

Current-generation AI models have a cutoff date after which they stop training, so I asked about events that happened after that date. Retrieving recent information is a real challenge for them.

Take a look at this example of asking ChatGPT about BUN:
ChatGPT on BUN

ChatGPT Recommends:

I would recommend checking the official documentation or repositories, tech news websites, or relevant community forums.

Can't we send the text of these documents, repositories, websites, and community forums directly to ChatGPT and get relevant answers? That's exactly where RAG helps.

And it's not just about BUN. What if, in the same manner, you wanted to query your company's data and get insights and answers relevant to your own data, without making it public?

This is where Retrieval Augmented Generation shines: it provides answers with sources. And on the question of connecting multiple data sources, Swirl can help you solve the problem quickly. Swirl can:

  • Connect to various data sources.
  • Perform query processing via ChatGPT.
  • Re-rank and return the top-N answers via spaCy's large English model (the Cosine Relevancy Processor); see the sketch below.
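
Here's a hedged sketch of that re-ranking step using spaCy's large English pipeline. It illustrates cosine-similarity ranking in general; it is not Swirl's actual Cosine Relevancy Processor code.

```python
# Illustrative cosine-similarity re-ranking with spaCy (not Swirl's code).
# Setup: pip install spacy && python -m spacy download en_core_web_lg
import spacy

nlp = spacy.load("en_core_web_lg")  # large English model with word vectors

def rerank(query: str, results: list[str], top_n: int = 3) -> list[str]:
    query_doc = nlp(query)
    # Doc.similarity returns the cosine similarity of averaged word vectors.
    scored = sorted(results, key=lambda r: query_doc.similarity(nlp(r)), reverse=True)
    return scored[:top_n]

hits = [
    "Swirl federates queries across many search providers.",
    "Bananas are rich in potassium.",
    "Re-ranking orders search results by semantic relevance.",
]
print(rerank("How are search results ranked by relevance?", hits, top_n=2))
```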

We understand that for enterprise customers, time is of the essence.
Metapipe (Enterprise) allows for a fast RAG pipeline setup, providing quick answers. If you want to learn more about how Metapipe (Enterprise) can benefit your organization, don't hesitate to contact the Swirl team.

RAG vs. Fine-Tuning

What is Fine-Tuning?

Fine-tuning a large language model means further training it on a dataset to make it really good at a subtask. People have fine-tuned the Llama2 LLM for various tasks like writing SQL, Python code, etc. (ref)

While this works well for tasks with massive, static data (Python syntax, SQL, etc.), the problem comes when you want to train it on something new or when no large dataset is available.
If the dataset keeps changing, you must retrain the model to keep up with the changes, and that is expensive.

Consider training it on the documentation of BUN, Astro, Swirl, or your company's documents. Also note that fine-tuning makes a model good at a specific task; you may not be able to trace an answer back to its source or get relevant citations for it.

Can you do fine-tuning?

Answer these questions:

  1. Do you have the engineers and hardware required for training a Large Language Model?
  2. Do you have the data necessary to get good answers from the Large Language Model?
  3. Do you have time?

If the answer to any of these three questions is "no," you should reconsider fine-tuning and opt for a better, more accessible alternative.

RAG fits the scenarios where fine-tuning doesn't:

  • Small documentation sets.
  • Articles, research papers, blogs.
  • Newly created code bases, etc.

Generating answers from them is easier than you think. There are many options for building a RAG pipeline, but Swirl makes both parts, retrieval and generation, easier.
Swirl can search your sources and provide the top-N best answers for a query. Check our GitHub.
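
As a rough illustration, here's how you might query a locally running Swirl instance from Python. The default port and the `/swirl/search/` endpoint are assumptions based on Swirl's docs, so treat them as placeholders and confirm against the official documentation.

```python
# Hypothetical sketch: querying a local Swirl deployment over HTTP.
# The endpoint path and response shape are assumptions; see Swirl's docs.
import requests

SWIRL_URL = "http://localhost:8000/swirl/search/"  # assumed default local URL

response = requests.get(SWIRL_URL, params={"q": "retrieval augmented generation"})
response.raise_for_status()

# Field names vary by version, so inspect the raw JSON to start.
print(response.json())
```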

But if you are an enterprise customer, Swirl's Metapipe provides a much faster way to generate answers. Metapipe is for enterprise customers, while Swirl itself remains open source.

Contribute to Swirl 🌌

Swirl is an open-source library in Python 🐍, and we're looking for people to help build this software. We're looking for fantastic people who can:

  • Create excellent articles, enhance our readme, UI, etc.
  • Contribute by adding a connector or search provider.
  • Join our community on Slack.

It would mean a lot if you could give us a 🌟 on GitHub. Keeps the team motivated. 🔥


swirlai / swirl-search

Swirl is open-source software that uses AI to simultaneously search multiple content and data sources, finds the best results using a reader LLM, then prompts Generative AI, enabling you to get answers from your own data.


Swirl can connect to:

  • Databases (SQL, NoSQL, Google BigQuery)
  • Public data services (Google Programmable Search Engines, ArXiv.org, etc.)
  • Enterprise sources (Microsoft 365, Jira, Miro, etc.)

And generate insights with AI and LLMs like ChatGPT. Start discovering and generating the answers you need based on your data.

Swirl is as simple as ABC: (a) Download YML, (b) Start in Docker, (c) Search with Swirl. From there, add credentials to preloaded SearchProviders to get results from more sources.

🚀 Try Swirl with ChatGPT

Swirl with ChatGPT as a configured AI Model


Top comments (10)

Nathan Tarbert: Nice article @srbhr!

Nathan Tarbert: No problem and I followed Swirl. Looks like a really cool project!

Nevo David: Great stuff!

Saurabh Rai: Thanks @nevodavid

Pizofreude: Great stack for AI and API lover!

Saurabh Rai: Thanks @pizofreude

Cherry Ramatis: really nice content, it's great to see more friendly content around the whole LLMs field

Felix: It's one of the trending topics these days. RAG, Vector Databases and LLMs.