We are going to build a command-line customer support agent that remembers context across multiple turns and can look up mock order statuses. This tutorial is for developers who want a working conversational AI prototype without getting lost in token math or provider-specific SDKs.
What you'll need
- Python 3.10 or newer
- An Oxlo.ai API key from https://portal.oxlo.ai
- The OpenAI SDK:
pip install openai
I also recommend grabbing a beverage. This should take about fifteen minutes.
Step 1: Set up the client
Create a file named support_agent.py. Import the OpenAI SDK and point it at Oxlo.ai. I am using llama-3.3-70b because it is a reliable general-purpose model that follows instructions well.
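import os

from openai import OpenAI

# Check the key before building the client: with api_key=None the SDK
# falls back to OPENAI_API_KEY or raises its own error at construction.
api_key = os.environ.get("OXLO_API_KEY")
if not api_key:
    raise ValueError("Set your OXLO_API_KEY environment variable.")

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=api_key,
)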
Step 2: Write the system prompt
The system prompt is the agent's job description. I keep it in a constant so I can tweak tone and guardrails without touching the logic.
SYSTEM_PROMPT = """You are a customer support agent for VoltGear, an electronics store.
Your job is to answer questions about return policies, shipping times, and order status.
If the user asks about an order, ask for their order ID, then use the lookup_order tool.
Do not make up order details. Do not provide tech support for products.
Keep responses under three sentences unless the user asks for detail.
"""
Step 3: Build the conversation loop
Conversational AI needs memory, and the chat completions API is stateless: the model sees only what you send in each request. So append every user message and assistant reply to a list, then pass the full list back to the API each time. On Oxlo.ai, this does not inflate your cost, because pricing is flat per request regardless of how long the history grows.
messages = [{"role": "system", "content": SYSTEM_PROMPT}]

def chat(user_input):
    messages.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=messages,
    )
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply
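Flat pricing keeps the cost steady, but the model's context window is still finite, so a very long session will eventually need trimming. Here is a minimal sketch of one way to do that. The helper name trim_history and the default limit of 20 messages are my own illustrations, not anything Oxlo.ai prescribes. It keeps the system prompt and the most recent messages.

```python
def trim_history(messages, max_messages=20):
    """Keep the system prompt (if present) plus the newest max_messages entries.

    Hypothetical helper; the default limit is an arbitrary illustration.
    """
    # Preserve the leading system message so the agent keeps its instructions.
    head = [m for m in messages[:1] if m["role"] == "system"]
    tail = messages[len(head):]
    # A slice of [-0:] would return everything, so handle zero explicitly.
    recent = tail[-max_messages:] if max_messages > 0 else []
    return head + recent


# Build a fake 30-message conversation to show the effect.
history = [{"role": "system", "content": "You are a support agent."}]
for i in range(30):
    role = "user" if i % 2 == 0 else "assistant"
    history.append({"role": role, "content": f"message {i}"})

trimmed = trim_history(history, max_messages=4)
print([m["content"] for m in trimmed])
```

Once tool calls enter the history, a blunt cut like this can separate a tool result from the assistant message that requested it, so treat it as a starting point rather than a finished policy.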
Step 4: Add a tool for order lookups
A support agent that cannot look things up is just a chatbot. Oxlo.ai supports OpenAI-compatible function calling, so we can define a lookup_order tool. The model decides when to call it.
import json

def lookup_order(order_id: str):
    # Mock database; a real agent would query an order service here.
    db = {
        "VG-1001": {"status": "shipped", "eta": "2026-01-15"},
        "VG-1002": {"status": "processing", "eta": "2026-01-20"},
    }
    return db.get(order_id, {"status": "not_found", "eta": None})

tools = [
    {
        "type": "function",
        "function": {
            "name": "lookup_order",
            "description": "Get the shipping status and ETA for a customer order.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The order ID, for example VG-1001."
                    }
                },
                "required": ["order_id"]
            }
        }
    }
]
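The model supplies the tool name and a JSON string of arguments, and either can be wrong: malformed JSON, a missing order_id, or a tool name that does not exist. Here is a sketch of a defensive dispatcher. The names execute_tool and TOOL_HANDLERS are hypothetical, introduced only for this example, and lookup_order is repeated so the block runs on its own.

```python
import json


def lookup_order(order_id: str):
    # Same mock database as in the tutorial.
    db = {
        "VG-1001": {"status": "shipped", "eta": "2026-01-15"},
        "VG-1002": {"status": "processing", "eta": "2026-01-20"},
    }
    return db.get(order_id, {"status": "not_found", "eta": None})


# Hypothetical registry mapping tool names to Python callables.
TOOL_HANDLERS = {"lookup_order": lookup_order}


def execute_tool(name: str, arguments: str) -> str:
    """Run one tool call and always return a JSON string for the tool message."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments or "{}")
        if not isinstance(kwargs, dict):
            raise TypeError("arguments must be a JSON object")
        result = handler(**kwargs)
    except (json.JSONDecodeError, TypeError) as exc:
        # Covers bad JSON as well as missing or unexpected keyword arguments.
        return json.dumps({"error": f"bad arguments: {exc}"})
    return json.dumps(result)


print(execute_tool("lookup_order", '{"order_id": "VG-1001"}'))
print(execute_tool("lookup_order", "{not json"))
print(execute_tool("cancel_order", "{}"))
```

Returning the error as JSON instead of raising means the model still receives a tool result and can apologize or ask the user for a valid order ID.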
Step 5: Wire everything together
Now we combine the loop, message history, and tool handler. If the model returns a tool call, we execute our Python function, append the result to the conversation, and send it back to the model for a natural-language answer.
def run_agent():
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    while True:
        user_input = input("User: ")
        if user_input.strip().lower() in ["exit", "quit"]:
            break
        messages.append({"role": "user", "content": user_input})
        response = client.chat.completions.create(
            model="llama-3.3-70b",
            messages=messages,
            tools=tools,
            tool_choice="auto",
        )
        message = response.choices[0].message
        if message.tool_calls:
            # Echo the assistant's tool request back into the history.
            assistant_msg = {
                "role": "assistant",
                "content": message.content or "",
                "tool_calls": [
                    {
                        "id": tc.id,
                        "type": tc.type,
                        "function": {
                            "name": tc.function.name,
                            "arguments": tc.function.arguments,
                        },
                    }
                    for tc in message.tool_calls
                ],
            }
            messages.append(assistant_msg)
            for tc in message.tool_calls:
                if tc.function.name == "lookup_order":
                    args = json.loads(tc.function.arguments)
                    result = lookup_order(args["order_id"])
                else:
                    # OpenAI-compatible APIs expect a tool message for
                    # every tool call, so answer unknown tools too.
                    result = {"error": f"unknown tool: {tc.function.name}"}
                messages.append({
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "name": tc.function.name,
                    "content": json.dumps(result),
                })
            # Second request: the model turns the tool result into prose.
            second = client.chat.completions.create(
                model="llama-3.3-70b",
                messages=messages,
            )
            reply = second.choices[0].message.content
            messages.append({"role": "assistant", "content": reply})
            print(f"Agent: {reply}")
        else:
            messages.append({"role": "assistant", "content": message.content})
            print(f"Agent: {message.content}")

if __name__ == "__main__":
    run_agent()
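Because the loop talks to a live API, it is awkward to test. One option, sketched here under assumptions, is to pull a single turn into a function that takes the client as a parameter and drive it with a stub. handle_turn, StubClient, and the canned responses are all hypothetical names for this example. The stub mimics only the attributes the loop reads (choices[0].message.content and .tool_calls).

```python
import json
from types import SimpleNamespace


def lookup_order(order_id):
    # Trimmed copy of the tutorial's mock database.
    db = {"VG-1001": {"status": "shipped", "eta": "2026-01-15"}}
    return db.get(order_id, {"status": "not_found", "eta": None})


def handle_turn(client, messages, user_input, model="llama-3.3-70b"):
    """Process one user message; mutates messages and returns the reply.

    A real first call would also pass tools=tools; omitted to keep this short.
    """
    messages.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(model=model, messages=messages)
    message = response.choices[0].message
    if message.tool_calls:
        messages.append({
            "role": "assistant",
            "content": message.content or "",
            "tool_calls": [
                {"id": tc.id, "type": "function",
                 "function": {"name": tc.function.name,
                              "arguments": tc.function.arguments}}
                for tc in message.tool_calls
            ],
        })
        for tc in message.tool_calls:
            result = lookup_order(**json.loads(tc.function.arguments))
            messages.append({"role": "tool", "tool_call_id": tc.id,
                             "content": json.dumps(result)})
        response = client.chat.completions.create(model=model, messages=messages)
        message = response.choices[0].message
    messages.append({"role": "assistant", "content": message.content})
    return message.content


def _response(content=None, tool_calls=None):
    # Build an object shaped like the parts of a completion we read.
    message = SimpleNamespace(content=content, tool_calls=tool_calls)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


class StubClient:
    """Returns canned responses in order instead of calling the API."""

    def __init__(self, responses):
        self._responses = list(responses)
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    def _create(self, **kwargs):
        return self._responses.pop(0)


tool_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="lookup_order",
                             arguments='{"order_id": "VG-1001"}'),
)
stub = StubClient([
    _response(tool_calls=[tool_call]),
    _response(content="Order VG-1001 has shipped."),
])
history = [{"role": "system", "content": "You are a support agent."}]
reply = handle_turn(stub, history, "check order VG-1001")
print(reply)
print([m["role"] for m in history])
```

The role sequence printed at the end is the thing to check: a tool turn should leave system, user, assistant (the tool request), tool, and assistant (the final answer) in the history, in that order.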
Run it
Export your key and start the script.
export OXLO_API_KEY="YOUR_OXLO_API_KEY"
python support_agent.py
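Here is a sample session. Exact wording will vary from run to run. Note that the system prompt does not contain VoltGear's actual return policy, so the policy answer below is the model's own invention; add the real policy text to SYSTEM_PROMPT before relying on it.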
User: what is your return policy
Agent: You can return items within 30 days of delivery for a full refund. Items must be in original packaging.
User: check order VG-1001
Agent: Order VG-1001 has already shipped and is expected to arrive on January 15, 2026.
User: thanks
Agent: You are welcome. Let me know if you need anything else.
Next steps
Turn this script into a FastAPI endpoint so a frontend or another service can POST messages to it. If you want stronger reasoning for more complex multi-step support flows, swap llama-3.3-70b for kimi-k2.6 or deepseek-v3.2. Both are available on Oxlo.ai with the same flat per-request pricing, so a longer conversation history never balloons your bill. See the details at https://oxlo.ai/pricing.