Building Conversational AI Systems for Customer Service with LLMs


We are building a customer service agent that looks up orders, processes refunds, and offers to hand off-topic questions to a human. It is meant for e-commerce teams that need to automate tier-1 support without managing complex dialogue state machines. We will run it on Oxlo.ai because its flat per-request pricing keeps costs predictable even as conversation history grows, and its OpenAI SDK compatibility means we can ship with a single client import.

What you'll need

Python 3.10 or newer
The OpenAI SDK: pip install openai
An Oxlo.ai API key from https://portal.oxlo.ai

Step 1: Configure the Oxlo.ai client

I start every project by verifying the endpoint. We point the OpenAI SDK at Oxlo.ai and make a quick call to confirm the key works. I use llama-3.3-70b as the workhorse because it handles tool use and multi-turn context reliably, though qwen-3-32b is worth testing if you need multilingual responses.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key="YOUR_OXLO_API_KEY"
)

# verify connectivity
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "say ok"}],
    max_tokens=10
)
print(response.choices[0].message.content)
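
Hard-coding the key is fine for a first smoke test, but a small helper that reads it from the environment keeps it out of source control. The sketch below assumes a variable named OXLO_API_KEY; that name is chosen for illustration and is not something Oxlo.ai prescribes.

```python
import os


def load_api_key(env_var: str = "OXLO_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message.

    The variable name OXLO_API_KEY is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the agent."
        )
    return key
```

You would then pass api_key=load_api_key() to the OpenAI(...) constructor in place of the literal string.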

Step 2: Lock down the system prompt

In this tutorial the system prompt is the main guardrail at runtime. It defines the agent's scope, tone, and the exact tools it may call. Keep it strict and short so the model does not drift.

SYSTEM_PROMPT = """You are a customer service agent for an online electronics store named TechMart.
Your job is to help customers with order status and refund requests.

Policies:
- You can only discuss orders and return policies.
- If a customer asks about anything else, politely decline and offer to transfer them to a human.
- Always verify the order_id with a tool before giving a refund.
- Be concise. Ask for the order_id if it is missing.

Tools:
- get_order_status(order_id: str)
- initiate_refund(order_id: str, reason: str)
"""

Step 3: Define the tool schemas

We need to tell the model what it can do. Oxlo.ai supports the standard OpenAI tools format, so we declare two functions with JSON schemas for order lookups and refunds.

import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Retrieve the current status of a customer order.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The order ID, for example ORD-1234"
                    }
                },
                "required": ["order_id"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "initiate_refund",
            "description": "Start a refund for an order.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string"},
                    "reason": {"type": "string"}
                },
                "required": ["order_id", "reason"]
            }
        }
    }
]
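
The schemas tell the model what to send, but the model can still return malformed or incomplete arguments. As a rough sketch, the same schema can be reused to check arguments before a function runs. The helper name validate_tool_args is hypothetical, and it only checks required keys, unknown keys, and string types; it is not a full JSON Schema validator.

```python
import json


def validate_tool_args(tool_schema: dict, raw_arguments: str) -> tuple[dict, list[str]]:
    """Parse model-supplied arguments and check them against a tool schema."""
    params = tool_schema["function"]["parameters"]
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError:
        return {}, ["arguments are not valid JSON"]
    if not isinstance(args, dict):
        return {}, ["arguments must be a JSON object"]

    errors = []
    # Every key listed under "required" must be present.
    for key in params.get("required", []):
        if key not in args:
            errors.append(f"missing required argument: {key}")
    # Every supplied key must be declared and, if declared a string, be one.
    for key, value in args.items():
        spec = params["properties"].get(key)
        if spec is None:
            errors.append(f"unexpected argument: {key}")
        elif spec.get("type") == "string" and not isinstance(value, str):
            errors.append(f"argument {key} must be a string")
    return args, errors


# Example with a schema shaped like the initiate_refund entry above.
refund_schema = {
    "type": "function",
    "function": {
        "name": "initiate_refund",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}, "reason": {"type": "string"}},
            "required": ["order_id", "reason"],
        },
    },
}
args, errors = validate_tool_args(refund_schema, '{"order_id": "ORD-1234"}')
print(errors)  # ['missing required argument: reason']
```

Returning the error list to the model as the tool result gives it a chance to ask the customer for the missing detail.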

Step 4: Implement the backend functions

These functions would normally query your ERP or database. For this tutorial we mock them with a dictionary so the agent is runnable immediately.

ORDERS_DB = {
    "ORD-1234": {"status": "shipped", "item": "USB-C Cable", "delivered": True},
    "ORD-5678": {"status": "processing", "item": "Mechanical Keyboard", "delivered": False},
}

def get_order_status(order_id: str) -> str:
    order = ORDERS_DB.get(order_id)
    if not order:
        return json.dumps({"error": "Order not found."})
    return json.dumps(order)

def initiate_refund(order_id: str, reason: str) -> str:
    if order_id not in ORDERS_DB:
        return json.dumps({"error": "Order not found."})
    return json.dumps({"status": "refund_initiated", "order_id": order_id, "reason": reason})
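
The system prompt asks the model to verify the order before refunding, but nothing in these backend functions enforces that. One way to back the policy with code is to remember which order IDs were successfully looked up in the current conversation and refuse refunds for any others. The class below is a hypothetical sketch, and its small ORDERS dictionary stands in for ORDERS_DB.

```python
import json

# Stand-in for ORDERS_DB so the sketch runs on its own.
ORDERS = {"ORD-1234": {"status": "shipped"}}


class RefundGuard:
    """Tracks verified orders for one conversation (hypothetical helper)."""

    def __init__(self, orders: dict):
        self.orders = orders
        self.verified: set[str] = set()

    def get_order_status(self, order_id: str) -> str:
        order = self.orders.get(order_id)
        if order is None:
            return json.dumps({"error": "Order not found."})
        self.verified.add(order_id)  # lookup succeeded, so refunds are allowed
        return json.dumps(order)

    def initiate_refund(self, order_id: str, reason: str) -> str:
        if order_id not in self.verified:
            # Tell the model what to do next instead of failing silently.
            return json.dumps({"error": "Call get_order_status for this order first."})
        return json.dumps(
            {"status": "refund_initiated", "order_id": order_id, "reason": reason}
        )


guard = RefundGuard(ORDERS)
blocked = guard.initiate_refund("ORD-1234", "damaged")  # refused: not verified yet
guard.get_order_status("ORD-1234")
allowed = guard.initiate_refund("ORD-1234", "damaged")  # accepted after the lookup
```

The design choice here is to keep the rule in code rather than relying on the prompt alone, so a model that skips the lookup gets an error it can recover from.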

Step 5: Build the conversation loop

This is the core orchestrator. It sends the user message to Oxlo.ai, checks for tool calls, executes any local functions, and feeds the results back to the model before returning the final text. A turn that triggers a tool therefore costs two requests: one for the model to choose the tool and one to write the answer. Because Oxlo.ai bills per request rather than per token, injecting long tool responses and conversation history does not inflate the cost the way it would on token-based providers.

def run_agent(user_message: str, conversation: list | None = None) -> str:
    if conversation is None:
        conversation = [{"role": "system", "content": SYSTEM_PROMPT}]
    
    conversation.append({"role": "user", "content": user_message})
    
    # first pass: model decides if it needs tools
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=conversation,
        tools=tools,
        tool_choice="auto",
    )
    
    message = response.choices[0].message
    
    # convert assistant message to a plain dict for the next request
    assistant_msg = {
        "role": message.role,
        "content": message.content or "",
    }
    if message.tool_calls:
        assistant_msg["tool_calls"] = [
            {
                "id": tc.id,
                "type": tc.type,
                "function": {
                    "name": tc.function.name,
                    "arguments": tc.function.arguments,
                }
            } for tc in message.tool_calls
        ]
    conversation.append(assistant_msg)
    
    # handle any tool calls
    if message.tool_calls:
        for tool_call in message.tool_calls:
            fn_name = tool_call.function.name
            args = json.loads(tool_call.function.arguments)
            
            if fn_name == "get_order_status":
                result = get_order_status(**args)
            elif fn_name == "initiate_refund":
                result = initiate_refund(**args)
            else:
                result = json.dumps({"error": "Unknown tool"})
            
            conversation.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": fn_name,
                "content": result,
            })
        
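        # second pass: model writes the final answer from the tool results.
        # tool_choice="none" asks for plain text rather than another tool call.
        final_response = client.chat.completions.create(
            model="llama-3.3-70b",
            messages=conversation,
            tools=tools,
            tool_choice="none",
        )
        final_text = final_response.choices[0].message.content or ""
        # record the answer so a reused conversation list keeps the full history
        conversation.append({"role": "assistant", "content": final_text})
        return final_text

    return message.content or ""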
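
The if/elif chain works for two tools, but json.loads and the **args call will raise if the model sends malformed or unexpected arguments. A dispatch table with a try/except wrapper is one way to turn those failures into an error payload the model can react to. The TOOL_REGISTRY name and the stub lookup function below are illustrative assumptions.

```python
import json
from typing import Callable


def lookup_stub(order_id: str) -> str:
    """Stand-in for get_order_status so the sketch runs on its own."""
    return json.dumps({"order_id": order_id, "status": "shipped"})


# Hypothetical registry mapping tool names to local callables.
TOOL_REGISTRY: dict[str, Callable[..., str]] = {"get_order_status": lookup_stub}


def execute_tool_call(name: str, raw_arguments: str) -> str:
    """Run one tool call and always return a JSON string, even on failure."""
    fn = TOOL_REGISTRY.get(name)
    if fn is None:
        return json.dumps({"error": f"Unknown tool: {name}"})
    try:
        args = json.loads(raw_arguments or "{}")
        return fn(**args)
    except json.JSONDecodeError:
        return json.dumps({"error": "Arguments were not valid JSON."})
    except TypeError as exc:
        # Raised for missing or unexpected keyword arguments.
        return json.dumps({"error": f"Bad arguments: {exc}"})
```

Inside run_agent, the body of the for loop could then reduce to result = execute_tool_call(tool_call.function.name, tool_call.function.arguments).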

Run it

With the loop in place we can test three realistic scenarios: a status check, a refund, and an off-topic question.

# Test 1: order status lookup
print(run_agent("Where is my order ORD-1234?"))

# Test 2: refund request
print(run_agent("I want a refund for ORD-1234. It arrived damaged."))

# Test 3: off-topic guardrail
print(run_agent("What is the weather today?"))
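
Each call above starts a fresh conversation, so the agent forgets the order ID between messages. To carry context across turns, one option is to keep a single history list and pass it to every call. The wrapper below is a sketch: SupportSession is a hypothetical name, and it takes the agent function as a parameter so the example can run with a stand-in instead of a live model.

```python
from typing import Callable


class SupportSession:
    """Holds one customer's history and reuses it on every turn (hypothetical)."""

    def __init__(self, agent_fn: Callable[[str, list], str], system_prompt: str):
        self.agent_fn = agent_fn
        self.history = [{"role": "system", "content": system_prompt}]

    def send(self, user_message: str) -> str:
        # agent_fn is expected to append to self.history in place,
        # the way run_agent does with its conversation argument.
        return self.agent_fn(user_message, self.history)


def echo_agent(user_message: str, conversation: list) -> str:
    """Stand-in for run_agent that needs no network access."""
    conversation.append({"role": "user", "content": user_message})
    turn = sum(m["role"] == "user" for m in conversation)
    reply = f"turn {turn}: {user_message}"
    conversation.append({"role": "assistant", "content": reply})
    return reply


session = SupportSession(echo_agent, "You are a support agent.")
session.send("I need help with an order.")
second = session.send("It is ORD-1234.")
print(second)  # turn 2: It is ORD-1234.
```

With the real agent you would construct SupportSession(run_agent, SYSTEM_PROMPT) and call send once per customer message.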

When I run this locally I see output like the following:

Your order ORD-1234 (USB-C Cable) has been shipped and is marked as delivered.

I have initiated a refund for order ORD-1234 because it arrived damaged. You should see the credit within 5 business days.

I can only help with orders and returns. I can transfer you to a human agent for weather questions. Would you like me to do that?

Wrap-up and next steps

You now have a working support agent that enforces guardrails, calls business functions, and runs on flat per-request pricing. Two concrete next steps: wire the mock functions to your real database or REST API, and add retrieval from your help center using Oxlo.ai's embeddings endpoint so the agent can answer policy questions without hard-coding them in the prompt. See https://oxlo.ai/pricing for plan details if you are ready to move to production traffic.