RAG (Retrieval-Augmented Generation) Workflow
Import Required Libraries
This step imports all the libraries needed for loading datasets, splitting text, creating embeddings, building a retrieval system, and evaluating metrics.
- dotenv: For loading environment variables (e.g., API keys).
- load_diabetes: Provides the diabetes dataset from scikit-learn.
- LangChain libraries: Tools for text splitting, embeddings, and setting up a question-answering (QA) pipeline.
- FAISS: A library for efficient similarity search and clustering of dense vectors.
- Ragas: For evaluating retrieval-based question-answering systems.
- Dataset: From the datasets library, useful for organizing data for evaluation.
- userdata: Used for securely retrieving sensitive data in Google Colab.
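A minimal import block matching this list might look like the sketch below; the module paths assume a recent LangChain release with the langchain-openai and langchain-community packages installed, so adjust them for your version:
import os

from dotenv import load_dotenv
from sklearn.datasets import load_diabetes
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
from datasets import Dataset
from ragas import evaluate

load_dotenv()  # reads OPENAI_API_KEY from a local .env file

# In Google Colab, pull the key from the Secrets panel instead:
# from google.colab import userdata
# os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")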
1. Ground Truth - Source of Truth
The foundation of the system lies in the source data. In this example:
diabetes = load_diabetes()
raw_text = diabetes.DESCR
- load_diabetes(): Fetches the diabetes dataset from sklearn.datasets.
- diabetes.DESCR: Contains a detailed description of the dataset (variables, data characteristics).
- raw_text: Represents the "Ground Truth" that the RAG system will reference for its operations.
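To sanity-check the ground truth before building on it, you can print a slice of the description (output truncated):
# Quick look at the ground-truth text
print(raw_text[:300])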
2. Retrieval - Finding Relevant Information
The raw text (input) is split into smaller chunks, and a vector store is created using FAISS. Queries retrieve the most relevant chunks.
# Split into chunks (simulate document retrieval)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
texts = text_splitter.split_text(raw_text)
# Create Embeddings & Build FAISS Index
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002") # Recommended model
docsearch = FAISS.from_texts(texts, embeddings)
- RecursiveCharacterTextSplitter: Splits the large raw_text into smaller, overlapping segments (texts).
- OpenAIEmbeddings: Converts each text chunk into a numerical vector representation.
- FAISS.from_texts(texts, embeddings): Builds an index of these embeddings for efficient retrieval.
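Before wiring the index into a chain, you can query it directly to see what retrieval returns. This sketch uses an illustrative query string; k sets how many chunks come back:
# Direct similarity search against the FAISS index (illustrative query)
docs = docsearch.similarity_search("What attributes does the diabetes dataset contain?", k=3)
for doc in docs:
    print(doc.page_content[:100], "...")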
3. Augmentation - Adding Context
The RetrievalQA chain in LangChain manages augmentation:
- It takes the user's query and uses the docsearch index to find relevant documents.
- These documents are passed to the LLM along with the original query, providing contextual information.
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=docsearch.as_retriever(),
    return_source_documents=True,  # helpful for debugging
)
- RetrievalQA.from_chain_type: Sets up the RAG pipeline.
- retriever=docsearch.as_retriever(): Connects the FAISS index to the QA chain so it can fetch the documents relevant to each query.
- chain_type="stuff": "Stuffs" the retrieved documents together with the query into a single prompt for the LLM.
- return_source_documents=True: Returns the retrieved documents alongside the answer, for traceability.
4. Generation - Producing the Response
The LLM generates a response by combining the query and augmented context retrieved in the previous step.
queries = ["What is the target variable in the diabetes dataset?"]  # illustrative query
answers, contexts = [], []

for query in queries:
    result = qa_chain.invoke({"query": query})
    answers.append(result["result"])
    # Extract retrieved docs for Ragas evaluation
    retrieved_docs = result.get("source_documents", [])
    contexts.append([doc.page_content for doc in retrieved_docs])
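To inspect what came back, pair each query with its generated answer (a simple illustrative print loop):
# Print each query with its generated answer
for q, a in zip(queries, answers):
    print(f"Q: {q}\nA: {a}\n")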
5. Traceability - Explaining the Source
The system ensures traceability by showing the origin of the retrieved information, helping users understand the response's basis.
retrieved_docs = result.get("source_documents", [])
contexts.append([doc.page_content for doc in retrieved_docs])
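With questions, answers, and contexts collected, the Ragas evaluation mentioned at the start could look like the sketch below. The metric names assume a recent ragas release; faithfulness and answer_relevancy need no hand-written ground truths:
# Sketch: score the pipeline with Ragas
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

eval_dataset = Dataset.from_dict({
    "question": queries,
    "answer": answers,
    "contexts": contexts,
})
scores = evaluate(eval_dataset, metrics=[faithfulness, answer_relevancy])
print(scores)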