I recently built a customer support dialogue agent for a side project. It handles multi-turn refund requests, remembers context across turns, and calls a mock order lookup tool when needed. I will walk through the exact code. It runs on Oxlo.ai, which I chose because its request-based pricing stays predictable even as conversation history grows.
What you'll need
Before starting, make sure you have the following:
- Python 3.10 or newer
- The OpenAI SDK: pip install openai
- An Oxlo.ai API key from https://portal.oxlo.ai
Step 1: Configure the Oxlo.ai client
I started with the standard OpenAI SDK pointed at Oxlo.ai. Since Oxlo.ai is fully OpenAI-compatible, this requires no client forks or custom wrappers. I picked Llama 3.3 70B as the base model because it follows instructions reliably.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key="YOUR_OXLO_API_KEY"
)

MODEL = "llama-3.3-70b"

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "user", "content": "Hello, can you help with a refund?"}
    ]
)
print(response.choices[0].message.content)
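The client setup hard-codes the key for brevity. Here is a minimal sketch of reading it from the environment instead. The variable name OXLO_API_KEY and the helper load_client_settings are assumptions made for this example, not something Oxlo.ai prescribes.

```python
import os

OXLO_BASE_URL = "https://api.oxlo.ai/v1"  # same base URL as in the client setup


def load_client_settings(env=None):
    """Return keyword arguments for OpenAI(...), with the key read from the environment.

    OXLO_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("OXLO_API_KEY", "").strip()
    if not api_key:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError("Set OXLO_API_KEY before starting the agent")
    return {"base_url": OXLO_BASE_URL, "api_key": api_key}
```

With this in place, the client could be built as client = OpenAI(**load_client_settings()), which keeps the key out of source control.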
Step 2: Define the system prompt
A dialogue agent needs a tight system prompt. This one sets boundaries, defines the tool use rule, and prevents hallucination of order details.
SYSTEM_PROMPT = """You are a customer support agent for an electronics store.
Your job is to help users check refund status and answer policy questions.
Rules:
- Always ask for the order_id if the user mentions a refund but does not provide one.
- If you have the order_id, call the lookup_order tool to get the current status.
- Do not make up order details. Only use data returned by the tool.
- Be concise. Ask one clarifying question at a time."""
Step 3: Add conversation memory
Without memory, every turn is stateless. I used a simple Python list to hold the message history. In production you would persist this to Redis or Postgres, but the structure stays identical.
conversation_history = [
    {"role": "system", "content": SYSTEM_PROMPT}
]

def add_user_message(content):
    conversation_history.append({"role": "user", "content": content})

def add_assistant_message(content, tool_calls=None):
    msg = {"role": "assistant", "content": content or ""}
    if tool_calls:
        msg["tool_calls"] = tool_calls
    conversation_history.append(msg)

def add_tool_message(tool_call_id, content):
    conversation_history.append({
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": content
    })

def get_messages():
    return list(conversation_history)
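To illustrate that the message structure survives persistence unchanged, here is a rough sketch that stores a history as JSON in SQLite. SQLite is only a standard-library stand-in for Redis or Postgres, and the table name and the helpers open_store, save_history, and load_history are hypothetical names for this example.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a tiny history store; ':memory:' keeps it in RAM for demos."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS histories ("
        "session_id TEXT PRIMARY KEY, messages TEXT NOT NULL)"
    )
    return conn


def save_history(conn, session_id, messages):
    """Serialize the whole message list as JSON, replacing any earlier copy."""
    conn.execute(
        "INSERT OR REPLACE INTO histories (session_id, messages) VALUES (?, ?)",
        (session_id, json.dumps(messages)),
    )
    conn.commit()


def load_history(conn, session_id):
    """Return the stored message list, or an empty list for an unknown session."""
    row = conn.execute(
        "SELECT messages FROM histories WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []
```

Because the list only contains plain dicts and strings, it round-trips through JSON without any conversion. Saving after each turn and loading at the start of the next would be enough to resume a session.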
Step 4: Define the tool schema
I registered a single function, lookup_order, using OpenAI's tool schema. Oxlo.ai parses this schema and feeds it to the model exactly like the OpenAI API.
import json

# Mock database
ORDERS = {
    "ORD-1001": {"status": "refunded", "item": "USB-C Cable", "amount": 12.99},
    "ORD-1002": {"status": "processing", "item": "Mechanical Keyboard", "amount": 89.50},
}

def lookup_order(order_id: str):
    return ORDERS.get(order_id, {"error": "Order not found"})

TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "lookup_order",
            "description": "Get the current status and details of an order by ID.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The unique order identifier, e.g. ORD-1001"
                    }
                },
                "required": ["order_id"]
            }
        }
    }
]

def call_tool(name, arguments):
    if name == "lookup_order":
        return lookup_order(**json.loads(arguments))
    return {"error": "Unknown tool"}
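Model-generated arguments are not guaranteed to be valid JSON or to match the schema, so the dispatcher can raise on a bad call. Below is one possible defensive variant. The name safe_call_tool and the error messages are assumptions for this sketch, and it repeats a one-entry copy of the mock data so it runs on its own.

```python
import json

# One-entry copy of the mock database so this sketch is self-contained.
ORDERS = {
    "ORD-1002": {"status": "processing", "item": "Mechanical Keyboard", "amount": 89.50},
}


def lookup_order(order_id: str):
    return ORDERS.get(order_id, {"error": "Order not found"})


def safe_call_tool(name, arguments):
    """Dispatch a tool call, returning an error dict instead of raising."""
    if name != "lookup_order":
        return {"error": f"Unknown tool: {name}"}
    try:
        args = json.loads(arguments or "{}")
    except json.JSONDecodeError:
        return {"error": "Arguments were not valid JSON"}
    if not isinstance(args, dict):
        return {"error": "Arguments must be a JSON object"}
    order_id = args.get("order_id")
    if not isinstance(order_id, str) or not order_id.strip():
        return {"error": "Missing or invalid order_id"}
    # Extra keys are ignored rather than passed through as keyword arguments.
    return lookup_order(order_id.strip())
```

Returning the error as data means it can be sent back in the tool message, which gives the model a chance to ask the user for a corrected order ID.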
Step 5: Run the dialogue loop
The loop sends the user message to the model, checks for tool_calls, executes any functions, and sends the results back for a final answer. This two-pass flow is standard for reliable tool use.
def run_turn(user_input):
    add_user_message(user_input)

    response = client.chat.completions.create(
        model=MODEL,
        messages=get_messages(),
        tools=TOOLS,
        tool_choice="auto"
    )
    msg = response.choices[0].message

    # Preserve tool_calls for the model context
    tool_calls = None
    if msg.tool_calls:
        tool_calls = [
            {
                "id": tc.id,
                "type": "function",
                "function": {
                    "name": tc.function.name,
                    "arguments": tc.function.arguments
                }
            } for tc in msg.tool_calls
        ]
    add_assistant_message(msg.content, tool_calls=tool_calls)

    # If the model requested tools, run them and send results back
    if msg.tool_calls:
        for tc in msg.tool_calls:
            result = call_tool(tc.function.name, tc.function.arguments)
            add_tool_message(tc.id, json.dumps(result))

        final = client.chat.completions.create(
            model=MODEL,
            messages=get_messages(),
            tools=TOOLS
        )
        add_assistant_message(final.choices[0].message.content)
        return final.choices[0].message.content

    return msg.content
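The two-pass flow assumes the second response is a plain answer. If the model asks for another tool call on the second pass, its content can be empty. Here is a sketch of a bounded loop that keeps going until the model answers or a round limit is hit. The names run_tool_loop, complete, and max_rounds are assumptions, and fake_complete and fake_call_tool stand in for the API and the dispatcher so the example runs offline.

```python
import json


def run_tool_loop(complete, call_tool, messages, max_rounds=3):
    """Call the model repeatedly until it answers without requesting tools.

    complete(messages) is assumed to return a dict with "content" and an
    optional "tool_calls" list shaped like the dicts stored in the history.
    """
    for _ in range(max_rounds):
        reply = complete(messages)
        tool_calls = reply.get("tool_calls") or []
        assistant = {"role": "assistant", "content": reply.get("content") or ""}
        if tool_calls:
            assistant["tool_calls"] = tool_calls
        messages.append(assistant)
        if not tool_calls:
            return assistant["content"]
        for tc in tool_calls:
            result = call_tool(tc["function"]["name"], tc["function"]["arguments"])
            messages.append(
                {"role": "tool", "tool_call_id": tc["id"], "content": json.dumps(result)}
            )
    # Round limit reached: stop rather than loop forever.
    return "Sorry, I could not complete that request."


def fake_complete(messages):
    """Offline stand-in for the API: request a lookup first, then answer."""
    if messages[-1]["role"] == "tool":
        data = json.loads(messages[-1]["content"])
        return {"content": f"Your refund is {data['status']}."}
    return {
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "lookup_order", "arguments": '{"order_id": "ORD-1002"}'},
        }],
    }


def fake_call_tool(name, arguments):
    """Offline stand-in for the dispatcher."""
    return {"status": "processing"}


history = [{"role": "user", "content": "Where is my refund for ORD-1002?"}]
answer = run_tool_loop(fake_complete, fake_call_tool, history)
print(answer)
```

In the real agent, complete would wrap client.chat.completions.create and convert the response message into this dict shape. The round limit is a guard against a model that keeps requesting tools.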
Run it
The script below simulates three turns. The agent asks for the missing order ID, calls the tool once it has one, then answers a follow-up using the stored context.
if __name__ == "__main__":
    print("Agent: Hello! How can I help you today?")
    user_inputs = [
        "I want a refund for my keyboard",
        "It's ORD-1002",
        "How long will the refund take?",
    ]
    for user_input in user_inputs:
        print(f"\nUser: {user_input}")
        reply = run_turn(user_input)
        print(f"Agent: {reply}")
Example output:
Agent: Hello! How can I help you today?
User: I want a refund for my keyboard
Agent: I can help with that. Could you please provide your order ID?
User: It's ORD-1002
Agent: I found your order for Mechanical Keyboard. The refund is currently processing.
User: How long will the refund take?
Agent: Refunds typically take 3 to 5 business days to complete once they are in processing status.
Wrap-up
If you deploy this, replace the mock ORDERS dict with a real database call. You should also consider swapping to Qwen 3 32B if you need stronger multilingual support, or Kimi K2.6 for longer context windows when handling lengthy escalation threads. Both are available on Oxlo.ai with the same flat per-request pricing, so longer transcripts do not increase your cost per turn.