Large Language Models (LLMs) are great at executing generic tasks, but they often lack deep contextual knowledge.
RAG solves this problem by grounding the model in your own data, making it far more useful for your specific tasks!
In this article, I’ll show you how to build a RAG system with LlamaIndex and DeepSeek models from Nebius AI Studio, all in under 5 minutes.
Sounds Interesting?
Let’s Start!
What is RAG?
Before moving forward, let’s understand what RAG is.
RAG stands for Retrieval-Augmented Generation. It helps LLMs generate better answers by providing relevant context alongside the question.
It pulls relevant information from external sources, such as databases or documents, and hands it to the LLM so it can give more accurate answers, especially on specific topics (see the short sketch after the list below).
Key benefits of RAG:
Improves answer relevance by fetching context-specific data.
Reduces hallucinations by grounding responses in factual information.
Enhances usability in real-world applications like document analysis, chatbots, and search engines.
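Conceptually, the retrieve-then-generate flow looks roughly like this. This is only a minimal sketch with hypothetical retriever and llm objects to illustrate the idea; the real implementation with LlamaIndex follows below:

def answer_with_rag(question, retriever, llm):
    # 1. Retrieve the document chunks most relevant to the question (hypothetical retriever interface)
    chunks = retriever.retrieve(question)
    # 2. Pack the retrieved text into the prompt as context
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
    # 3. Let the LLM generate an answer grounded in that context
    return llm.complete(prompt)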
For this tutorial, we will build a simple RAG system that retrieves relevant responses based on provided documents.
Building Your First RAG System
Enough theory about RAG; it’s time to build our RAG system.
Prerequisites
For this Project, we’ll be using:
LlamaIndex (for retrieval & indexing)
Nebius AI LLMs (for embeddings & generation)
Install the dependencies
First, let’s install the dependencies. Run the following commands:
pip install llama-index-llms-nebius llama-index-embeddings-nebius
pip install -U llama-index
This installs the LlamaIndex core package along with the Nebius LLM and embedding integrations.
Setup Environment
Next, we need to configure our environment with the Nebius API key. Create a .env file and add your key:
## .env
NEBIUS_API_KEY="Your Nebius API Key"
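If you run the code as a standalone script, you can load this key into the environment with python-dotenv. This is an extra assumption on top of the article's setup (install it with pip install python-dotenv):

import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file and populates os.environ
assert os.getenv("NEBIUS_API_KEY"), "NEBIUS_API_KEY is missing from the environment"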
Importing Required Modules
We will now import the necessary modules: the LlamaIndex components for the Nebius LLM and embeddings, plus os for reading the API key from the environment.
import os

from llama_index.core import SimpleDirectoryReader, Settings, VectorStoreIndex
from llama_index.embeddings.nebius import NebiusEmbedding
from llama_index.llms.nebius import NebiusLLM
Defining a Function for RAG Completion
Now, let’s define a function to run our RAG process. This function will:
Initialize the LLM & Embedding models from Nebius.
Load documents from a specified directory.
Index the documents for efficient retrieval.
Retrieve relevant responses based on the query.
Here’s the implementation:
def run_rag_completion(
    document_dir: str,
    query_text: str,
    embedding_model: str = "BAAI/bge-en-icl",
    generative_model: str = "deepseek-ai/DeepSeek-V3",
) -> str:
    # Initialize the generative LLM and the embedding model from Nebius AI
    llm = NebiusLLM(
        model=generative_model,
        api_key=os.getenv("NEBIUS_API_KEY")
    )
    embed_model = NebiusEmbedding(
        model_name=embedding_model,
        api_key=os.getenv("NEBIUS_API_KEY")
    )

    # Register both models as the global defaults for LlamaIndex
    Settings.llm = llm
    Settings.embed_model = embed_model

    # Load every document in the directory and build a vector index over it
    documents = SimpleDirectoryReader(document_dir).load_data()
    index = VectorStoreIndex.from_documents(documents)

    # Retrieve the 5 most similar chunks and let the LLM answer from them
    response = index.as_query_engine(similarity_top_k=5).query(query_text)
    return str(response)
Running the RAG Completion Process
Finally, we will test our RAG system by providing a document directory and a query:
query_text = "Who Issued this invoice and mention the Date of it"
document_dir = "./data"
response = run_rag_completion(document_dir, query_text)
print(response)
This will retrieve and generate a contextually relevant response based on the provided documents.
And that’s it! 🎉
In just a few minutes, we built a functional RAG system using LlamaIndex and Nebius AI.
Now, you can build more complex projects on top of it.
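As one possible next step, you can persist the vector index to disk so your documents are not re-embedded on every run. Here is a minimal sketch using LlamaIndex’s built-in storage APIs; the ./storage path is just an example, and it assumes the Nebius LLM and embedding models are already registered on Settings as shown above:

import os
from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

PERSIST_DIR = "./storage"  # example directory for the saved index

if os.path.exists(PERSIST_DIR):
    # Reload the previously built index instead of re-embedding everything
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
else:
    # Build the index once and save it for future runs
    documents = SimpleDirectoryReader("./data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=PERSIST_DIR)

response = index.as_query_engine(similarity_top_k=5).query(query_text)
print(response)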
Complete Source Code: Source Code
Thank you so much for reading!