Documentation: Frontend (frontend.py)
This file defines the Streamlit-based frontend for the RAG (Retrieval-Augmented Generation) chatbot. It provides the user interface, handles queries, retrieves relevant document chunks from the FAISS index, and generates answers using the Groq LLM API.
Key Responsibilities
- Load the FAISS index and pre-processed chunks.
- Take user input (questions).
- Retrieve the most relevant chunks using semantic search.
- Pass the retrieved context into a Groq-powered LLM to generate an answer.
- Display chat history in a custom chat UI styled with CSS.
- Maintain transparency by showing the retrieved chunks.
Step-by-Step Breakdown
1. Imports and Configuration
import streamlit as st
import pickle
import faiss
from sentence_transformers import SentenceTransformer
from groq import Groq
import os
from dotenv import load_dotenv
load_dotenv()
- streamlit → UI framework.
- pickle → loads the pre-saved chunks.
- faiss → fast similarity search on embeddings.
- SentenceTransformer → the same embedding model as the backend.
- Groq → client for the LLM.
- .env loader → loads the GROQ_API_KEY securely.
The API key is stored in environment variables (or Streamlit secrets in deployment); a sketch of the client initialization follows below.
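The generate_answer() function later in the file calls a client object, which is the Groq client built from this key. A minimal sketch of that initialization, assuming it happens right after load_dotenv() (the exact placement and naming in frontend.py may differ):
import os
from groq import Groq

# Hypothetical initialization of the Groq client from the environment variable.
client = Groq(api_key=os.getenv("GROQ_API_KEY"))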
2. Load Index and Chunks
INDEX_FILE = "faiss_index.bin"
CHUNKS_FILE = "chunks.pkl"
embedder = SentenceTransformer("all-MiniLM-L6-v2")
index = faiss.read_index(INDEX_FILE)
with open(CHUNKS_FILE, "rb") as f:
    chunks = pickle.load(f)
- Reads the FAISS index built earlier.
- Loads the corresponding chunks (document segments).
- Uses the same embedding model as the backend, so query vectors match the indexed vectors (see the caching sketch below).
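Because Streamlit reruns the whole script on every interaction, a common pattern (not necessarily what frontend.py does) is to wrap these loads in st.cache_resource so the model, index, and chunks are loaded only once per process:
import pickle
import faiss
import streamlit as st
from sentence_transformers import SentenceTransformer

INDEX_FILE = "faiss_index.bin"
CHUNKS_FILE = "chunks.pkl"

@st.cache_resource
def load_resources():
    # Load the embedding model, FAISS index, and chunk list a single time.
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    index = faiss.read_index(INDEX_FILE)
    with open(CHUNKS_FILE, "rb") as f:
        chunks = pickle.load(f)
    return embedder, index, chunks

embedder, index, chunks = load_resources()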
3. Semantic Search Function
def search_index(query, k=10):
    q_vec = embedder.encode([query])
    D, I = index.search(q_vec, k)
    return [chunks[i] for i in I[0]]
- Encodes the query into a vector.
- Searches FAISS for the top k most relevant chunks.
- Returns those chunks for answer generation (usage example below).
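A hypothetical call (the query string is illustrative):
# Retrieve the five chunks closest to the question in embedding space.
retrieved = search_index("What does the document say about pricing?", k=5)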
4. LLM Answer Generation
def generate_answer(question, context_chunks):
    context = "\n\n".join(context_chunks)
    prompt = (
        f"Answer the question based on the context provided. "
        "If the question is not related to the context in any way, do NOT attempt to answer. "
        "Instead, strictly reply: 'My knowledge base does not have information about this.'\n\n"
        f"Context: {context}\n\nQuestion: {question}\nAnswer:"
    )
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model="llama-3.3-70b-versatile",
    )
    return response.choices[0].message.content.strip()
- Builds a RAG prompt with:
  - the retrieved context, and
  - the user's question.
- Sends it to Groq's LLaMA-3.3-70B model.
- Returns a clean answer.
- Enforces that if the question is unrelated to the context, the chatbot replies:
  "My knowledge base does not have information about this."
5. Custom Chat UI (CSS)
st.markdown(
    """
    <style>
    ...
    </style>
    """,
    unsafe_allow_html=True,
)
- Defines two styles:
  - User messages (blue, aligned right).
  - Bot messages (gray, aligned left).
- Creates a chat-bubble effect inside a scrollable chat container (a hypothetical CSS sketch follows).
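The actual rules are elided above; the following only illustrates the kind of CSS the description implies (class names are hypothetical, not necessarily those in frontend.py):
st.markdown(
    """
    <style>
    /* Hypothetical class names, used for illustration only. */
    .user-msg { background: #1f6feb; color: #fff; text-align: right;
                border-radius: 12px; padding: 8px 12px; margin: 4px 0; }
    .bot-msg  { background: #e6e6e6; color: #000; text-align: left;
                border-radius: 12px; padding: 8px 12px; margin: 4px 0; }
    .chat-box { max-height: 450px; overflow-y: auto; }
    </style>
    """,
    unsafe_allow_html=True,
)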
6. Streamlit App UI
st.title("📚 RAG Chatbot")
st.write("Ask questions based on the indexed documents.")
- Adds title and short description.
- Initializes chat history (st.session_state.messages).
- Displays all past chat messages in styled bubbles (see the sketch below).
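A minimal sketch of that initialization and rendering loop, reusing the hypothetical CSS class names from above (the real markup in frontend.py may differ):
# Create the chat history once per session.
if "messages" not in st.session_state:
    st.session_state.messages = []

# Render each past message as a styled bubble.
for msg in st.session_state.messages:
    css_class = "user-msg" if msg["role"] == "user" else "bot-msg"
    st.markdown(
        f'<div class="{css_class}">{msg["content"]}</div>',
        unsafe_allow_html=True,
    )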
7. Question Input & Processing
with st.form(key="chat_form", clear_on_submit=True):
    question = st.text_input("Your question:", key="question_input")
    submit_button = st.form_submit_button("Send")
- Form input for the user's question.
- Once submitted (see the sketch below):
  - Adds the user message to history.
  - Retrieves chunks via search_index().
  - Calls generate_answer() to get the LLM response.
  - Adds the bot's response to chat history.
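A sketch of that submit handling, mirroring the steps above (variable names are assumptions):
# Hypothetical submit handling; mirrors the retrieve-then-generate flow.
if submit_button and question.strip():
    st.session_state.messages.append({"role": "user", "content": question})
    retrieved = search_index(question)             # top-k relevant chunks
    answer = generate_answer(question, retrieved)  # Groq LLM call
    st.session_state.messages.append({"role": "assistant", "content": answer})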
8. Transparency: Retrieved Chunks
st.markdown("### 🔍 Retrieved Chunks")
for i, chunk in enumerate(retrieved, 1):
    st.write(f"**Chunk {i}:** {chunk[:300]}...")
- Displays the actual retrieved chunks.
- Helps debug if wrong documents are being pulled.
- Also available in an expander for the last query.
9. Clear Chat Button
if st.button("Clear Chat"):
    st.session_state.messages = []
    st.rerun()
- Resets chat history.
- Allows for a fresh session.
Workflow (Frontend)
- User asks a question in Streamlit UI.
- System encodes it → searches FAISS → retrieves top chunks.
- Chunks + question passed into Groq LLM.
- Answer generated → shown in chat UI.
- Retrieved chunks displayed for transparency.
Key Notes
- The frontend doesn't rebuild the index; it relies on the backend (index_docs.py) having run beforehand.
- Chat memory is session-based only (it clears on refresh).
- The environment variable GROQ_API_KEY must be set in:
  - a local .env file, or
  - Streamlit Secrets (st.secrets["GROQ_API_KEY"]) during deployment (see the sketch below).
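A common way to support both sources (a sketch, not necessarily verbatim from frontend.py):
# Hypothetical key lookup: prefer the local .env value, otherwise fall back
# to Streamlit secrets when deployed.
api_key = os.getenv("GROQ_API_KEY") or st.secrets["GROQ_API_KEY"]
client = Groq(api_key=api_key)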