Large Language Models (LLMs) are great at executing generic tasks, but they often lack deep contextual knowledge.
RAG solves this problem by grounding the model in your own data, making it far more useful for your specific tasks!
In this article, I’ll show you how to build a RAG system with LlamaIndex and DeepSeek models from Nebius AI Studio, all in under 5 minutes.
Sounds Interesting?
Let’s Start!
What is RAG?
Before moving forward, let’s understand what RAG is.
RAG stands for Retrieval-Augmented Generation. It helps LLMs generate better answers by providing relevant context alongside the question.
It pulls relevant information from external sources, such as databases or documents, and hands it to the LLM so it can give more accurate answers, especially on specific topics (see the short sketch after the list below).
Key benefits of RAG:
Improves answer relevance by fetching context-specific data.
Reduces hallucinations by grounding responses in factual information.
Enhances usability in real-world applications like document analysis, chatbots, and search engines.
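Conceptually, the retrieve-then-generate flow looks roughly like this. This is only a minimal sketch with hypothetical retriever and llm objects to illustrate the idea; the real implementation with LlamaIndex follows below:

def answer_with_rag(question, retriever, llm):
    # 1. Retrieve the document chunks most relevant to the question (hypothetical retriever interface)
    chunks = retriever.retrieve(question)
    # 2. Pack the retrieved text into the prompt as context
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
    # 3. Let the LLM generate an answer grounded in that context
    return llm.complete(prompt)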
For this tutorial, we will build a simple RAG system that retrieves relevant responses based on provided documents.
Building Your First RAG System
Enough theory about RAG; it’s time to build our RAG system.
Prerequisites
For this Project, we’ll be using:
LlamaIndex (for retrieval & indexing)
Nebius AI LLMs (for embeddings & generation)
Install the dependencies
First, let’s install the dependencies. Run the following commands:
pip install llama-index-llms-nebius llama-index-embeddings-nebius
pip install -U llama-index
This installs the LlamaIndex core package along with the Nebius LLM and embedding integrations.
Setup Environment
Next, we need to configure our environment with the Nebius API key. Create a .env file and add your key:
## .env
NEBIUS_API_KEY="Your Nebius API Key"
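If you run the code as a standalone script, you can load this key into the environment with python-dotenv. This is an extra assumption on top of the article's setup (install it with pip install python-dotenv):

import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file and populates os.environ
assert os.getenv("NEBIUS_API_KEY"), "NEBIUS_API_KEY is missing from the environment"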
Importing Required Modules
We will now import the necessary modules: the LlamaIndex components for the Nebius LLM and embeddings, plus os for reading the API key from the environment.
import os

from llama_index.core import SimpleDirectoryReader, Settings, VectorStoreIndex
from llama_index.embeddings.nebius import NebiusEmbedding
from llama_index.llms.nebius import NebiusLLM
Defining a Function for RAG Completion
Now, let’s define a function to run our RAG process. This function will:
Initialize the LLM & Embedding models from Nebius.
Load documents from a specified directory.
Index the documents for efficient retrieval.
Retrieve relevant responses based on the query.
Here’s the implementation:
def run_rag_completion(
    document_dir: str,
    query_text: str,
    embedding_model: str = "BAAI/bge-en-icl",
    generative_model: str = "deepseek-ai/DeepSeek-V3",
) -> str:
    # Initialize the generative LLM and the embedding model from Nebius AI
    llm = NebiusLLM(
        model=generative_model,
        api_key=os.getenv("NEBIUS_API_KEY")
    )
    embed_model = NebiusEmbedding(
        model_name=embedding_model,
        api_key=os.getenv("NEBIUS_API_KEY")
    )

    # Register both models as the global defaults for LlamaIndex
    Settings.llm = llm
    Settings.embed_model = embed_model

    # Load every document in the directory and build a vector index over it
    documents = SimpleDirectoryReader(document_dir).load_data()
    index = VectorStoreIndex.from_documents(documents)

    # Retrieve the 5 most similar chunks and let the LLM answer from them
    response = index.as_query_engine(similarity_top_k=5).query(query_text)
    return str(response)
Running the RAG Completion Process
Finally, we will test our RAG system by providing a document directory and a query:
query_text = "Who Issued this invoice and mention the Date of it"
document_dir = "./data"
response = run_rag_completion(document_dir, query_text)
print(response)
This will retrieve and generate a contextually relevant response based on the provided documents.
And that’s it! 🎉
In just a few minutes, we built a functional RAG system using LlamaIndex and Nebius AI.
Now, you can build more complex projects on top of it.
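As one possible next step, you can persist the vector index to disk so your documents are not re-embedded on every run. Here is a minimal sketch using LlamaIndex’s built-in storage APIs; the ./storage path is just an example, and it assumes the Nebius LLM and embedding models are already registered on Settings as shown above:

import os
from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

PERSIST_DIR = "./storage"  # example directory for the saved index

if os.path.exists(PERSIST_DIR):
    # Reload the previously built index instead of re-embedding everything
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
else:
    # Build the index once and save it for future runs
    documents = SimpleDirectoryReader("./data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=PERSIST_DIR)

response = index.as_query_engine(similarity_top_k=5).query(query_text)
print(response)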
Complete Source Code: Source Code
Thank you so much for reading!