LLM for Conversational AI: Best Practices and Challenges


We are going to build a customer support agent for a small electronics store. The agent will handle order lookups, answer policy questions, and hand off to a human when needed, all while keeping the full conversation history. Because Oxlo.ai uses flat per-request pricing, a request that carries a long conversation history costs the same as a short one, which keeps this architecture affordable to run in production.

What you'll need

Python 3.10 or newer
The OpenAI SDK: pip install openai
An Oxlo.ai API key from https://portal.oxlo.ai
A virtual environment (optional but recommended)

Step 1: Configure the client

First, verify that the Oxlo.ai endpoint responds. I always run a single ping before adding logic.

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello and confirm you are ready."},
    ],
)

print(response.choices[0].message.content)
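
Hardcoding the key is fine for a first ping, but a small helper that reads it from the environment keeps it out of source control. This is only a minimal sketch: the variable name OXLO_API_KEY and the helper load_api_key are assumptions made for illustration, not something Oxlo.ai prescribes.

```python
import os


def load_api_key(env=None, var_name="OXLO_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    OXLO_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

With this in place, the client line would become OpenAI(base_url="https://api.oxlo.ai/v1", api_key=load_api_key()).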

Step 2: Design the system prompt

At this point, the system prompt is the only place where behavior is enforced, so I keep it strict but short to limit drift. Step 5 adds client-side checks on top.

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

SYSTEM_PROMPT = """You are Zed, a customer support agent for TechNova, an electronics store.
Your job is to help customers with order status, returns, and store policies.
Rules:
- Be concise and friendly.
- Never make up order details. Ask for an order ID if needed.
- If the user asks for a human representative, reply with exactly: [HANDOFF]
- Do not process refunds yourself. Offer the returns portal link instead.
- Do not provide legal, medical, or financial advice.
- Current date: 2025-01-15.
"""

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "What is your return policy?"},
    ],
)

print(response.choices[0].message.content)
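
The prompt above pins the date to 2025-01-15, which goes stale as soon as the agent runs on another day. One possible way around that is to fill the date in at startup. The names PROMPT_TEMPLATE and build_system_prompt below are hypothetical helpers, not part of the original script.

```python
from datetime import date

# Same rules as the tutorial prompt, with the date turned into a placeholder
PROMPT_TEMPLATE = """You are Zed, a customer support agent for TechNova, an electronics store.
Your job is to help customers with order status, returns, and store policies.
Rules:
- Be concise and friendly.
- Never make up order details. Ask for an order ID if needed.
- If the user asks for a human representative, reply with exactly: [HANDOFF]
- Do not process refunds yourself. Offer the returns portal link instead.
- Do not provide legal, medical, or financial advice.
- Current date: {today}.
"""


def build_system_prompt(today=None):
    """Fill the template with an ISO date; defaults to the current local date."""
    today = today or date.today()
    return PROMPT_TEMPLATE.format(today=today.isoformat())


print(build_system_prompt(date(2025, 1, 15)).splitlines()[-1])
# prints: - Current date: 2025-01-15.
```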

Step 3: Build the conversation loop

Conversational AI needs memory. I store every turn in a messages list and send the full context each time.

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

SYSTEM_PROMPT = """You are Zed, a customer support agent for TechNova, an electronics store.
Your job is to help customers with order status, returns, and store policies.
Rules:
- Be concise and friendly.
- Never make up order details. Ask for an order ID if needed.
- If the user asks for a human representative, reply with exactly: [HANDOFF]
- Do not process refunds yourself. Offer the returns portal link instead.
- Do not provide legal, medical, or financial advice.
- Current date: 2025-01-15.
"""

messages = [{"role": "system", "content": SYSTEM_PROMPT}]

while True:
    user_input = input("User: ")
    if user_input.lower() in ["exit", "quit"]:
        break

    messages.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=messages,
    )

    assistant_msg = response.choices[0].message.content
    print(f"Zed: {assistant_msg}")
    messages.append({"role": "assistant", "content": assistant_msg})
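
Sending the full context each time means the messages list grows without bound, and every model has a finite context window. A rough sketch of one mitigation is to keep the system message plus only the most recent turns. The helper trim_history and the limit max_messages are assumptions for illustration, and the right limit depends on the model you use.

```python
def trim_history(messages, max_messages=20):
    """Keep the system message plus the most recent max_messages entries.

    max_messages is an arbitrary illustrative limit, not an Oxlo.ai setting.
    """
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    system = [m for m in messages if m["role"] == "system"][:1]
    rest = [m for m in messages if m["role"] != "system"]
    recent = rest[-max_messages:]
    # Start on a user turn so the model never sees a dangling assistant reply
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return system + recent


history = [{"role": "system", "content": "rules"}]
for i in range(3):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, max_messages=3)
print([m["content"] for m in trimmed])
# prints: ['rules', 'question 2', 'answer 2']
```

In the loop, this would mean passing messages=trim_history(messages) to the API while still appending every turn to the full list.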

Step 4: Add tool use for order lookups

Letting the model invent order details is dangerous. I give it a lookup_order function and let it decide when to call it.

import json
from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

SYSTEM_PROMPT = """You are Zed, a customer support agent for TechNova, an electronics store.
Your job is to help customers with order status, returns, and store policies.
Rules:
- Be concise and friendly.
- Never make up order details. Use the lookup_order tool when a user provides an order ID.
- If the user asks for a human representative, reply with exactly: [HANDOFF]
- Do not process refunds yourself. Offer the returns portal link instead.
- Do not provide legal, medical, or financial advice.
- Current date: 2025-01-15.
"""

def lookup_order(order_id: str):
    db = {
        "ORD-001": {"status": "shipped", "item": "Wireless Headphones", "eta": "Jan 18"},
        "ORD-002": {"status": "processing", "item": "USB-C Cable", "eta": "Jan 20"},
    }
    return db.get(order_id, {"error": "Order not found"})

tools = [
    {
        "type": "function",
        "function": {
            "name": "lookup_order",
            "description": "Get order status by order ID",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The order ID, e.g. ORD-001"
                    }
                },
                "required": ["order_id"]
            }
        }
    }
]

messages = [{"role": "system", "content": SYSTEM_PROMPT}]

while True:
    user_input = input("User: ")
    if user_input.lower() in ["exit", "quit"]:
        break

    messages.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )

    msg = response.choices[0].message

    if msg.tool_calls:
        # Append the assistant message that requested the tool
        messages.append({
            "role": "assistant",
            "content": msg.content or "",
            "tool_calls": [
                {
                    "id": tc.id,
                    "type": tc.type,
                    "function": {
                        "name": tc.function.name,
                        "arguments": tc.function.arguments,
                    }
                }
                for tc in msg.tool_calls
            ]
        })

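        # Answer every requested call: each tool_call_id needs a matching tool message
        for tool_call in msg.tool_calls:
            fn_name = tool_call.function.name
            try:
                args = json.loads(tool_call.function.arguments)
            except json.JSONDecodeError:
                args = {}

            if fn_name == "lookup_order" and isinstance(args, dict) and "order_id" in args:
                result = lookup_order(args["order_id"])
            else:
                result = {"error": f"Cannot run tool: {fn_name}"}

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": fn_name,
                "content": json.dumps(result),
            })

        # Second request lets the model turn the tool output into a reply
        response = client.chat.completions.create(
            model="llama-3.3-70b",
            messages=messages,
            tools=tools,
        )
        msg = response.choices[0].message

    # content can be None, for example if the model asks for another tool call
    assistant_msg = msg.content or ""
    print(f"Zed: {assistant_msg}")
    messages.append({"role": "assistant", "content": assistant_msg})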
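
As more tools are added, the if fn_name == ... chain gets long. One way to keep it manageable, sketched here under the assumption that every tool takes keyword arguments and returns something JSON-serializable, is a small registry that maps tool names to functions. TOOL_REGISTRY and run_tool are hypothetical names, not part of the script above.

```python
import json


def lookup_order(order_id: str):
    # Same stand-in data as the tutorial's lookup_order
    db = {
        "ORD-001": {"status": "shipped", "item": "Wireless Headphones", "eta": "Jan 18"},
        "ORD-002": {"status": "processing", "item": "USB-C Cable", "eta": "Jan 20"},
    }
    return db.get(order_id, {"error": "Order not found"})


# Hypothetical registry: tool name -> Python callable
TOOL_REGISTRY = {"lookup_order": lookup_order}


def run_tool(name: str, arguments_json: str) -> str:
    """Run one tool call and always return a JSON string for the tool message."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return json.dumps({"error": f"Unknown tool: {name}"})
    try:
        args = json.loads(arguments_json)
        if not isinstance(args, dict):
            raise TypeError("arguments must be a JSON object")
        result = func(**args)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or wrong argument names end up here
        return json.dumps({"error": f"Bad arguments: {exc}"})
    return json.dumps(result)


print(run_tool("lookup_order", '{"order_id": "ORD-002"}'))
print(run_tool("cancel_order", "{}"))
```

Because run_tool always returns a string, its result can go straight into the content field of the tool message, and an unknown tool or malformed arguments becomes an error the model can read instead of an exception that kills the loop.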

Step 5: Add guardrails and handoff

Production agents need hard stops. I block sensitive topics client side and watch for the handoff tag the model was instructed to emit.

import json
from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

SYSTEM_PROMPT = """You are Zed, a customer support agent for TechNova, an electronics store.
Your job is to help customers with order status, returns, and store policies.
Rules:
- Be concise and friendly.
- Never make up order details. Use the lookup_order tool when a user provides an order ID.
- If the user asks for a human representative, reply with exactly: [HANDOFF]
- Do not process refunds yourself. Offer the returns portal link instead.
- Do not provide legal, medical, or financial advice.
- Current date: 2025-01-15.
"""

def lookup_order(order_id: str):
    db = {
        "ORD-001": {"status": "shipped", "item": "Wireless Headphones", "eta": "Jan 18"},
        "ORD-002": {"status": "processing", "item": "USB-C Cable", "eta": "Jan 20"},
    }
    return db.get(order_id, {"error": "Order not found"})

tools = [
    {
        "type": "function",
        "function": {
            "name": "lookup_order",
            "description": "Get order status by order ID",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The order ID, e.g. ORD-001"
                    }
                },
                "required": ["order_id"]
            }
        }
    }
]

BLOCKED_TOPICS = ["legal advice", "medical advice", "financial advice"]

messages = [{"role": "system", "content": SYSTEM_PROMPT}]

while True:
    user_input = input("User: ")
    if user_input.lower() in ["exit", "quit"]:
        break

    if any(topic in user_input.lower() for topic in BLOCKED_TOPICS):
        print("Zed: I cannot help with that. I can assist with orders, returns, and store policies.")
        continue

    messages.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )

    msg = response.choices[0].message

    if msg.tool_calls:
        messages.append({
            "role": "assistant",
            "content": msg.content or "",
            "tool_calls": [
                {
                    "id": tc.id,
                    "type": tc.type,
                    "function": {
                        "name": tc.function.name,
                        "arguments": tc.function.arguments,
                    }
                }
                for tc in msg.tool_calls
            ]
        })

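        # Answer every requested call: each tool_call_id needs a matching tool message
        for tool_call in msg.tool_calls:
            fn_name = tool_call.function.name
            try:
                args = json.loads(tool_call.function.arguments)
            except json.JSONDecodeError:
                args = {}

            if fn_name == "lookup_order" and isinstance(args, dict) and "order_id" in args:
                result = lookup_order(args["order_id"])
            else:
                result = {"error": f"Cannot run tool: {fn_name}"}

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": fn_name,
                "content": json.dumps(result),
            })

        # Second request lets the model turn the tool output into a reply
        response = client.chat.completions.create(
            model="llama-3.3-70b",
            messages=messages,
            tools=tools,
        )
        msg = response.choices[0].message

    # content can be None, for example if the model asks for another tool call
    assistant_msg = msg.content or ""

    if "[HANDOFF]" in assistant_msg:
        print("Zed: I am transferring you to a human representative now. Please hold.")
        break

    print(f"Zed: {assistant_msg}")
    messages.append({"role": "assistant", "content": assistant_msg})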
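
The substring check on BLOCKED_TOPICS only catches the exact phrases, so extra spaces or slightly different wording slip through. A slightly more tolerant variant could use regular expressions, as in the sketch below. The patterns are illustrative assumptions rather than a complete policy, and is_blocked and wants_handoff are hypothetical helper names.

```python
import re

# Illustrative patterns only; a real deployment would need a broader list
BLOCKED_PATTERNS = [
    re.compile(r"\b(legal|medical|financial)\s+(advice|opinion|guidance)\b", re.IGNORECASE),
    re.compile(r"\bshould\s+i\s+(sue|invest)\b", re.IGNORECASE),
]

HANDOFF_TAG = "[HANDOFF]"


def is_blocked(user_input: str) -> bool:
    """Return True if the input matches any blocked-topic pattern."""
    return any(p.search(user_input) for p in BLOCKED_PATTERNS)


def wants_handoff(reply) -> bool:
    """Return True if the model reply contains the handoff tag; tolerates None."""
    return bool(reply) and HANDOFF_TAG in reply


print(is_blocked("Can I get some LEGAL   advice?"))  # True
print(is_blocked("Where is order ORD-001?"))         # False
print(wants_handoff(None))                           # False
```

Keyword and pattern filters remain a coarse first line of defense. The system prompt rule still has to cover whatever the patterns miss.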

Run it

Save the final script as support_agent.py, replace YOUR_OXLO_API_KEY, and run python support_agent.py. Here is a sample session.

User: Hi, where is my order?
Zed: Hello! I can help with that. Could you provide your order ID?
User: ORD-001
Zed: Your order for Wireless Headphones has shipped and is expected to arrive by Jan 18.
User: I need a human
Zed: I am transferring you to a human representative now. Please hold.

Next steps

Replace the hardcoded lookup_order dictionary with a real call to your CRM or e-commerce backend. Then add stream=True to chat.completions.create and iterate over chunks so the user sees words appear immediately instead of waiting for the full response.
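
As a starting point for the streaming step, here is a minimal sketch of the chunk loop. It assumes the endpoint streams chunks in the usual OpenAI SDK shape, where text arrives in chunk.choices[0].delta.content, and it handles plain text only, not streamed tool calls. print_stream and fake_chunk are hypothetical helpers, and fake_chunk exists only so the sketch runs without a network call.

```python
from types import SimpleNamespace


def print_stream(chunks):
    """Print text deltas as they arrive and return the full reply."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks carry no choices (e.g. usage-only chunks)
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)


def fake_chunk(text):
    # Stand-in for a streamed chunk so the sketch runs offline
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


reply = print_stream([fake_chunk("Your order "), fake_chunk(None), fake_chunk("has shipped.")])
```

In the real loop, the argument would be the object returned by client.chat.completions.create(model="llama-3.3-70b", messages=messages, stream=True), and the returned string is what gets appended to messages as the assistant turn.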