DEV Community

shashank ms

Using LLM for Question Answering in FAQs

We are going to build a lightweight FAQ answering agent that matches user questions to a curated knowledge base and generates concise responses with an LLM. It is ideal for support teams that want accurate answers without maintaining a complex retrieval stack. Oxlo.ai fits this workflow well because its flat per-request pricing keeps costs predictable even when you pass large FAQ context blocks on every call.

What you'll need

Step 1: Set up the Oxlo.ai client

First, import the OpenAI SDK (installed with pip install openai) and point it at Oxlo.ai. Passing base_url and api_key when constructing the client is the only change needed to use Oxlo.ai as a drop-in replacement.

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")
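
Hard-coding the key is fine for a quick test, but a script that gets committed or shared is safer reading it from the environment. A minimal sketch, assuming an environment variable named OXLO_API_KEY (a hypothetical name chosen for this example, not one the SDK or Oxlo.ai prescribes):

```python
import os


def load_api_key(env_var: str = "OXLO_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key
```

The result can then be passed as api_key=load_api_key() when constructing the client.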

Step 2: Create the FAQ knowledge base

Next, define a small in-memory FAQ database. In production you might load this from a JSON file or CMS, but a dictionary is enough to demonstrate the pipeline.

FAQ_DB = {
    "What are your support hours?": "Our team is available Monday through Friday, 9 AM to 6 PM UTC.",
    "How do I reset my password?": "Click 'Forgot password' on the login screen and follow the email instructions.",
    "Do you offer an API?": "Yes. We provide a REST API with OpenAI-compatible endpoints. Documentation is at /docs.",
    "What payment methods are accepted?": "We accept credit cards, ACH transfers, and annual invoicing for Enterprise plans.",
    "Is there a free trial?": "Yes. New accounts get a 14-day free trial with full feature access.",
}
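
If the FAQ lives in a JSON file, as suggested above, loading it could look roughly like this. This is a sketch under assumptions: the helper name load_faq and the file layout (a single JSON object mapping each question to its answer) are made up for illustration.

```python
import json
from pathlib import Path


def load_faq(path: str) -> dict[str, str]:
    """Load a {question: answer} mapping from a JSON file and validate its shape."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("FAQ file must contain a JSON object of question/answer pairs")
    for question, answer in data.items():
        # Reject empty or non-string answers early instead of sending them to the model.
        if not isinstance(answer, str) or not answer.strip():
            raise ValueError(f"Empty or non-string answer for question: {question!r}")
    return data
```

With a file such as faq.json (a hypothetical name), FAQ_DB = load_faq("faq.json") would replace the inline dictionary.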

Step 3: Build a simple retriever

We need a simple way to find the most relevant FAQ entry. A keyword overlap scorer works surprisingly well for small corpora and avoids adding vector dependencies.

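import re

def retrieve_faq(query: str, faq_db: dict, top_k: int = 1):
    """Return the top_k (overlap, question, answer) tuples, best match first."""
    # Tokenize on letters and digits so trailing punctuation ("password?") does not block a match.
    query_words = set(re.findall(r"[a-z0-9]+", query.lower()))
    scored = []
    for question, answer in faq_db.items():
        question_words = set(re.findall(r"[a-z0-9]+", question.lower()))
        overlap = len(query_words & question_words)
        scored.append((overlap, question, answer))
    # Sort on the score only; ties keep the FAQ's insertion order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]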
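
To see what the overlap score actually measures, it helps to print it for one query against a couple of FAQ questions. The sketch below is standalone and uses a tokenizer that splits on non-alphanumeric characters, which is a choice made for this example; the helper names tokenize and overlap are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and keep runs of letters and digits as tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query: str, question: str) -> int:
    """Count the distinct words shared by the query and an FAQ question."""
    return len(tokenize(query) & tokenize(question))


query = "Do you support wire transfers?"
for question in ["What are your support hours?", "Do you offer an API?"]:
    print(overlap(query, question), question)
# 1 What are your support hours?
# 2 Do you offer an API?
```

Neither FAQ entry answers the query, yet both score above zero because of shared words like "do", "you", and "support". A plain overlap count cannot tell a relevant match from a coincidental one, which is why the model still needs an explicit fallback instruction.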

Step 4: Write the system prompt

The system prompt constrains the model to the retrieved context and sets a fallback message for unknown questions. Keep it explicit.

SYSTEM_PROMPT = """You are a concise FAQ assistant. Answer the user's question using ONLY the provided FAQ context.
If the context does not contain the answer, say: "I don't have an answer for that. Contact support@example.com."
Do not make up information. Keep responses under two sentences."""

Step 5: Assemble the answer pipeline

Finally, wire the retriever and the LLM together. The function looks up the best match, injects it into the system message, and returns the model's answer.

def answer_question(user_message: str) -> str:
    matches = retrieve_faq(user_message, FAQ_DB)
    # No FAQ entry shares a single word with the query: skip the API call entirely.
    if not matches or matches[0][0] == 0:
        return "I don't have an answer for that. Contact support@example.com."

    # Unpack the best match: (overlap, question, answer).
    _, context_q, context_a = matches[0]
    context_block = f"Q: {context_q}\nA: {context_a}"

    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            # The retrieved FAQ entry is appended to the system prompt as context.
            {"role": "system", "content": SYSTEM_PROMPT + "\n\nContext:\n" + context_block},
            {"role": "user", "content": user_message},
        ],
    )
    # message.content can be None, so fall back to an empty string before stripping.
    return (response.choices[0].message.content or "").strip()
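
retrieve_faq accepts a top_k argument, but the pipeline above only uses the single best match. If you retrieve several entries, they have to be merged into one context block. One possible way to do that is sketched below; format_context and its min_overlap parameter are hypothetical names introduced for this example.

```python
def format_context(matches: list[tuple[int, str, str]], min_overlap: int = 1) -> str:
    """Join retrieved (overlap, question, answer) tuples into one context block.

    Entries scoring below min_overlap are dropped so unrelated FAQs are not sent.
    """
    blocks = [
        f"Q: {question}\nA: {answer}"
        for overlap, question, answer in matches
        if overlap >= min_overlap
    ]
    return "\n\n".join(blocks)


matches = [
    (3, "Is there a free trial?", "Yes. New accounts get a 14-day free trial with full feature access."),
    (0, "Do you offer an API?", "Yes. We provide a REST API with OpenAI-compatible endpoints. Documentation is at /docs."),
]
print(format_context(matches))
# Q: Is there a free trial?
# A: Yes. New accounts get a 14-day free trial with full feature access.
```

The returned string can take the place of context_block in the system message. An empty result means nothing relevant was retrieved, which is a natural point to return the fallback message without calling the model.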

Run it

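Save the full script as faq_agent.py, replace YOUR_OXLO_API_KEY, and run python faq_agent.py. The agent should answer known questions and decline unknown ones. The second test query shares the words "do" and "you" with an FAQ entry, so it passes the retriever's zero-overlap check and is declined by the model following the system prompt.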

if __name__ == "__main__":
    test_queries = [
        "How do I reset my password?",
        "Do you support wire transfers?",
        "What are the support hours?",
    ]
    for q in test_queries:
        print(f"Q: {q}")
        print(f"A: {answer_question(q)}")
        print()

Expected output (the answers are generated by the model, so the exact wording may vary):

Q: How do I reset my password?
A: Click 'Forgot password' on the login screen and follow the email instructions.

Q: Do you support wire transfers?
A: I don't have an answer for that. Contact support@example.com.

Q: What are the support hours?
A: Our team is available Monday through Friday, 9 AM to 6 PM UTC.

Next steps

Replace the keyword retriever with Oxlo.ai embeddings, such as BGE-Large, to support semantic matching across larger documents. You can also add multi-turn memory by appending previous user and assistant messages to the messages list before each new request.
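
The multi-turn idea can be sketched as a small helper that assembles the messages list from stored history. This is one possible shape, not a prescribed one: build_messages and max_turns are hypothetical names, and trimming to the most recent exchanges is an assumption added here to keep request size bounded.

```python
def build_messages(system_content: str, history: list[dict], user_message: str,
                   max_turns: int = 5) -> list[dict]:
    """Assemble a chat request: system message, recent history, then the new question.

    history holds alternating {"role": "user"/"assistant", "content": ...} dicts.
    Only the last max_turns exchanges (2 messages each) are kept.
    """
    recent = history[-2 * max_turns:] if max_turns > 0 else []
    return [
        {"role": "system", "content": system_content},
        *recent,
        {"role": "user", "content": user_message},
    ]


history = [
    {"role": "user", "content": "Is there a free trial?"},
    {"role": "assistant", "content": "Yes. New accounts get a 14-day free trial with full feature access."},
]
messages = build_messages("You are a concise FAQ assistant.", history, "How long does it last?")
print([m["role"] for m in messages])  # ['system', 'user', 'assistant', 'user']
```

After each request, append the user message and the model's reply to history. Keep in mind that the retriever still sees only the latest message, so a follow-up like "How long does it last?" may retrieve the wrong FAQ entry unless earlier turns are also considered during retrieval.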
