Chatbot with Python (Frontend)

Documentation: Frontend (frontend.py)

This file defines the Streamlit-based frontend for the RAG (Retrieval-Augmented Generation) chatbot. It provides the user interface, handles queries, retrieves relevant document chunks from the FAISS index, and generates answers using the Groq LLM API.

Key Responsibilities

  • Load the FAISS index and pre-processed chunks.
  • Take user input (questions).
  • Retrieve the most relevant chunks using semantic search.
  • Pass the retrieved context into a Groq-powered LLM to generate an answer.
  • Display chat history in a custom chat UI styled with CSS.
  • Maintain transparency by showing the retrieved chunks.

Step-by-Step Breakdown

1. Imports and Configuration

import streamlit as st
import pickle
import faiss
from sentence_transformers import SentenceTransformer
from groq import Groq
import os
from dotenv import load_dotenv

load_dotenv()

# Create the Groq client using the API key loaded from the environment
client = Groq(api_key=os.getenv("GROQ_API_KEY"))
  • streamlit → UI framework.
  • pickle → load pre-saved chunks.
  • faiss → fast similarity search on embeddings.
  • SentenceTransformer → same embedding model as backend.
  • Groq → client for LLM.
  • load_dotenv → loads the GROQ_API_KEY from a local .env file.

The API key is stored in environment variables (or Streamlit secrets in deployment).


2. Load Index and Chunks

INDEX_FILE = "faiss_index.bin"
CHUNKS_FILE = "chunks.pkl"
embedder = SentenceTransformer("all-MiniLM-L6-v2")

index = faiss.read_index(INDEX_FILE)
with open(CHUNKS_FILE, "rb") as f:
    chunks = pickle.load(f)
  • Reads the FAISS index built earlier.
  • Loads the corresponding chunks (document segments).
  • Uses the same embedding model as the backend, so query vectors and indexed vectors are directly comparable.

3. Semantic Search Function

def search_index(query, k=10):
    q_vec = embedder.encode([query])
    D, I = index.search(q_vec, k)
    return [chunks[i] for i in I[0]]
  • Encodes the query into a vector.
  • Searches FAISS for the top k most relevant chunks.
  • Returns those chunks for answer generation.
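For instance, the retriever can be exercised on its own (the question here is just an example):

results = search_index("How is the FAISS index built?", k=5)
for i, chunk in enumerate(results, 1):
    print(f"Chunk {i}: {chunk[:80]}")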

4. LLM Answer Generation

def generate_answer(question, context_chunks):
    context = "\n\n".join(context_chunks)
    prompt = (
        f"Answer the question based on the context provided. "
        "If the question is not related to the context in any way, do NOT attempt to answer. "
        "Instead, strictly reply: 'My knowledge base does not have information about this.'\n\n"
        f"Context: {context}\n\nQuestion: {question}\nAnswer:"
    )
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model="llama-3.3-70b-versatile",
    )
    return response.choices[0].message.content.strip()
  • Builds a RAG prompt with:

    • The retrieved context.
    • The user’s question.
  • Sends the prompt to Groq’s llama-3.3-70b-versatile model.
  • Returns a clean, stripped answer.
  • Enforces that if the question is unrelated to the context, the chatbot replies:

    “My knowledge base does not have information about this.”
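End to end, retrieval and generation chain together in a couple of lines. The question below is only an example:

question = "What topics do the indexed documents cover?"  # example question
retrieved = search_index(question, k=10)       # top-10 relevant chunks from FAISS
answer = generate_answer(question, retrieved)  # grounded answer from the Groq model
print(answer)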


5. Custom Chat UI (CSS)

st.markdown(
    """
<style>
...
</style>
""",
    unsafe_allow_html=True,
)
  • Defines two styles:

    • User messages (blue, aligned right).
    • Bot messages (gray, aligned left).
  • Creates a chat bubble effect inside a scrollable chat container.
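The actual CSS is elided above; a minimal sketch of the kind of styles described (the class names here are illustrative, not necessarily those in frontend.py):

st.markdown(
    """
<style>
.chat-container { max-height: 450px; overflow-y: auto; }                 /* scrollable history */
.user-message { background: #0b93f6; color: #fff; text-align: right;
                border-radius: 12px; padding: 8px 12px; margin: 4px; }   /* user bubble, right */
.bot-message  { background: #e5e5ea; color: #000; text-align: left;
                border-radius: 12px; padding: 8px 12px; margin: 4px; }   /* bot bubble, left */
</style>
""",
    unsafe_allow_html=True,
)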

6. Streamlit App UI

st.title("📚 RAG Chatbot")
st.write("Ask questions based on the indexed documents.")
  • Adds title and short description.
  • Initializes chat history (st.session_state.messages).
  • Displays all past chat messages in styled bubbles.
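A minimal sketch of how the history can be initialized and rendered (the class names match the illustrative CSS above; the post’s exact markup may differ):

# Create the chat history once per session
if "messages" not in st.session_state:
    st.session_state.messages = []

# Render past messages as styled bubbles inside a scrollable container
chat_html = '<div class="chat-container">'
for msg in st.session_state.messages:
    css_class = "user-message" if msg["role"] == "user" else "bot-message"
    chat_html += f'<div class="{css_class}">{msg["content"]}</div>'
chat_html += "</div>"
st.markdown(chat_html, unsafe_allow_html=True)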

7. Question Input & Processing

with st.form(key="chat_form", clear_on_submit=True):
    question = st.text_input("Your question:", key="question_input")
    submit_button = st.form_submit_button("Send")
  • Form input for user question.
  • Once submitted:

    • Adds user message to history.
    • Retrieves chunks via search_index().
    • Calls generate_answer() to get LLM response.
    • Adds bot’s response to chat history.
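Put together, the submit branch looks roughly like this sketch, reusing the helpers defined earlier:

if submit_button and question:
    # 1. Record the user's message
    st.session_state.messages.append({"role": "user", "content": question})

    # 2. Retrieve relevant chunks and generate an answer
    retrieved = search_index(question)
    answer = generate_answer(question, retrieved)

    # 3. Record the bot's response; it is rendered with the rest of the history
    st.session_state.messages.append({"role": "bot", "content": answer})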

8. Transparency: Retrieved Chunks

st.markdown("### 🔍 Retrieved Chunks")
for i, chunk in enumerate(retrieved, 1):
    st.write(f"**Chunk {i}:** {chunk[:300]}...")
  • Displays the actual retrieved chunks.
  • Helps debug if wrong documents are being pulled.
  • Also available in an expander for the last query.
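The expander variant might look like this:

# Optional: tuck the chunks behind the latest answer into an expander
with st.expander("🔍 Chunks retrieved for the last question"):
    for i, chunk in enumerate(retrieved, 1):
        st.write(f"**Chunk {i}:** {chunk[:300]}...")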

9. Clear Chat Button

if st.button("Clear Chat"):
    st.session_state.messages = []
    st.rerun()
  • Resets chat history.
  • Allows for a fresh session.

Workflow (Frontend)

  1. User asks a question in Streamlit UI.
  2. System encodes it → searches FAISS → retrieves top chunks.
  3. Chunks + question passed into Groq LLM.
  4. Answer generated → shown in chat UI.
  5. Retrieved chunks displayed for transparency.

Key Notes

  • Frontend doesn’t rebuild the index. It relies on the backend (index_docs.py) having run beforehand.
  • Chat memory is session-based only (clears on refresh).
  • Environment variable GROQ_API_KEY must be set in:

    • Local .env file.
    • Streamlit Secrets (st.secrets["GROQ_API_KEY"]) during deployment.
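One simple pattern that covers both (a sketch, not necessarily the exact code in frontend.py) is to fall back from the environment variable to Streamlit secrets:

# Use the locally set variable (.env) if present, otherwise Streamlit secrets in deployment
api_key = os.getenv("GROQ_API_KEY") or st.secrets["GROQ_API_KEY"]
client = Groq(api_key=api_key)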
