DEV Community

shashank ms

LLMs for Dialogue Systems: Opportunities and Limitations

I recently built a customer support dialogue agent for a side project. It handles multi-turn refund requests, remembers context across turns, and calls a mock order lookup tool when needed. I will walk through the exact code. It runs on Oxlo.ai, which I chose because its request-based pricing stays predictable even as conversation history grows.

What you'll need

Before starting, make sure you have a Python 3 environment with the openai package installed (pip install openai) and an Oxlo.ai API key.

Step 1: Configure the Oxlo.ai client

I started with the standard OpenAI SDK pointed at Oxlo.ai. Since Oxlo.ai is fully OpenAI-compatible, this requires no client forks or custom wrappers. I picked Llama 3.3 70B as the base model because it follows instructions reliably.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key="YOUR_OXLO_API_KEY"
)

MODEL = "llama-3.3-70b"

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "user", "content": "Hello, can you help with a refund?"}
    ]
)
print(response.choices[0].message.content)
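
Hardcoding the key works for a walkthrough, but reading it from the environment keeps it out of source control. Here is a minimal sketch, assuming an environment variable named OXLO_API_KEY. Both that name and the load_api_key helper are hypothetical, chosen for this example rather than taken from Oxlo.ai's documentation.

```python
import os


def load_api_key(env_var: str = "OXLO_API_KEY") -> str:
    """Return the API key from the environment, failing early if it is missing.

    The variable name OXLO_API_KEY is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key
```

With this in place, the client above could be built with api_key=load_api_key() instead of a string literal.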

Step 2: Define the system prompt

A dialogue agent needs a tight system prompt. This one sets boundaries, defines the tool use rule, and instructs the model not to invent order details.

SYSTEM_PROMPT = """You are a customer support agent for an electronics store.
Your job is to help users check refund status and answer policy questions.
Rules:
- Always ask for the order_id if the user mentions a refund but does not provide one.
- If you have the order_id, call the lookup_order tool to get the current status.
- Do not make up order details. Only use data returned by the tool.
- Be concise. Ask one clarifying question at a time."""

Step 3: Add conversation memory

Without memory, every turn is stateless. I used a simple Python list to hold the message history. In production you would persist this to Redis or Postgres, but the structure stays identical.

conversation_history = [
    {"role": "system", "content": SYSTEM_PROMPT}
]

def add_user_message(content):
    conversation_history.append({"role": "user", "content": content})

def add_assistant_message(content, tool_calls=None):
    msg = {"role": "assistant", "content": content or ""}
    if tool_calls:
        msg["tool_calls"] = tool_calls
    conversation_history.append(msg)

def add_tool_message(tool_call_id, content):
    conversation_history.append({
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": content
    })

def get_messages():
    return list(conversation_history)
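
Because the history is a plain list of dicts, it serializes to JSON as-is, which is what makes the move to Redis or Postgres straightforward. As a rough stand-in for a real store, the sketch below writes each session to a JSON file. The save_history and load_history helpers, the session_id argument, and the sessions directory are all hypothetical names for illustration.

```python
import json
from pathlib import Path


def save_history(history: list, session_id: str, directory: str = "sessions") -> Path:
    """Write one session's message list to <directory>/<session_id>.json."""
    path = Path(directory) / f"{session_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(history, indent=2), encoding="utf-8")
    return path


def load_history(session_id: str, directory: str = "sessions") -> list:
    """Read a session's message list back, or return an empty list if none exists."""
    path = Path(directory) / f"{session_id}.json"
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))
```

A real backend would replace the file reads and writes with a key lookup on the session ID, while the message dicts themselves stay unchanged.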

Step 4: Define the tool schema

I registered a single function, lookup_order, using the OpenAI tool schema, which Oxlo.ai accepts in the same format as the OpenAI API.

import json

# Mock database
ORDERS = {
    "ORD-1001": {"status": "refunded", "item": "USB-C Cable", "amount": 12.99},
    "ORD-1002": {"status": "processing", "item": "Mechanical Keyboard", "amount": 89.50},
}

def lookup_order(order_id: str):
    return ORDERS.get(order_id, {"error": "Order not found"})

TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "lookup_order",
            "description": "Get the current status and details of an order by ID.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The unique order identifier, e.g. ORD-1001"
                    }
                },
                "required": ["order_id"]
            }
        }
    }
]

def call_tool(name, arguments):
    if name == "lookup_order":
        return lookup_order(**json.loads(arguments))
    return {"error": "Unknown tool"}
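
The model writes the arguments string, so it is not guaranteed to be valid JSON or to contain only the expected keys. A more defensive dispatcher might look like the sketch below. The safe_call_tool function and its registry parameter are hypothetical additions, not part of the agent above.

```python
import json


def safe_call_tool(name, arguments, registry):
    """Dispatch a tool call, returning an error dict instead of raising.

    registry maps a tool name to (function, set of allowed argument names).
    """
    if name not in registry:
        return {"error": f"Unknown tool: {name}"}
    func, allowed = registry[name]
    try:
        args = json.loads(arguments or "{}")
    except json.JSONDecodeError:
        return {"error": "Arguments were not valid JSON"}
    if not isinstance(args, dict):
        return {"error": "Arguments must be a JSON object"}
    unexpected = set(args) - allowed
    if unexpected:
        return {"error": f"Unexpected arguments: {sorted(unexpected)}"}
    try:
        return func(**args)
    except TypeError as exc:  # typically a missing required argument
        return {"error": str(exc)}
```

For the agent above, the registry would be {"lookup_order": (lookup_order, {"order_id"})}. Returning the error as a dict lets the model see what went wrong and ask the user again instead of crashing the turn.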

Step 5: Run the dialogue loop

The loop sends the user message to the model, checks for tool_calls, executes any functions, and sends the results back for a final answer. This two-pass flow is standard for reliable tool use.

def run_turn(user_input):
    add_user_message(user_input)
    
    response = client.chat.completions.create(
        model=MODEL,
        messages=get_messages(),
        tools=TOOLS,
        tool_choice="auto"
    )
    
    msg = response.choices[0].message
    
    # Preserve tool_calls for the model context
    tool_calls = None
    if msg.tool_calls:
        tool_calls = [
            {
                "id": tc.id,
                "type": "function",
                "function": {
                    "name": tc.function.name,
                    "arguments": tc.function.arguments
                }
            } for tc in msg.tool_calls
        ]
    add_assistant_message(msg.content, tool_calls=tool_calls)
    
    # If the model requested tools, run them and send results back
    if msg.tool_calls:
        for tc in msg.tool_calls:
            result = call_tool(tc.function.name, tc.function.arguments)
            add_tool_message(tc.id, json.dumps(result))
        
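        final = client.chat.completions.create(
            model=MODEL,
            messages=get_messages(),
            tools=TOOLS
        )
        # content can be None if the model requests another tool call here,
        # so fall back to an empty string instead of returning None
        final_content = final.choices[0].message.content or ""
        add_assistant_message(final_content)
        return final_content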
    
    return msg.content
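
The two-pass flow assumes the model answers in text after one round of tool results. If it asks for another tool call on the second pass, one way to handle that is a bounded loop. The sketch below is one possible shape under that assumption. The names run_tool_rounds, create, execute_tool, and max_rounds are hypothetical, and create is injected so the loop can be exercised without a network call.

```python
import json


def run_tool_rounds(create, messages, execute_tool, max_rounds=3):
    """Call the model until it answers in text or max_rounds is reached.

    create:       callable that takes the message list and returns a chat
                  completion (for example a lambda wrapping
                  client.chat.completions.create).
    execute_tool: callable (name, arguments_json) -> JSON-serializable result.
    Assistant and tool messages are appended to `messages` in place.
    """
    for _ in range(max_rounds):
        msg = create(messages).choices[0].message
        if not msg.tool_calls:
            reply = msg.content or ""
            messages.append({"role": "assistant", "content": reply})
            return reply
        # Record the assistant's tool request before the tool results
        messages.append({
            "role": "assistant",
            "content": msg.content or "",
            "tool_calls": [
                {"id": tc.id, "type": "function",
                 "function": {"name": tc.function.name,
                              "arguments": tc.function.arguments}}
                for tc in msg.tool_calls
            ],
        })
        for tc in msg.tool_calls:
            result = execute_tool(tc.function.name, tc.function.arguments)
            messages.append({"role": "tool", "tool_call_id": tc.id,
                             "content": json.dumps(result)})
    # The model kept requesting tools; stop and record a fallback reply
    fallback = "Sorry, I could not complete that request."
    messages.append({"role": "assistant", "content": fallback})
    return fallback
```

The cap on rounds is a design choice: it bounds the number of requests per user turn, which matters when each request is billed.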

Run it

The script below simulates three turns. The agent asks for the missing order ID, calls the tool once it has one, then answers a follow-up using the stored context.

if __name__ == "__main__":
    print("Agent: Hello! How can I help you today?")
    
    user_inputs = [
        "I want a refund for my keyboard",
        "It's ORD-1002",
        "How long will the refund take?",
    ]
    
    for user_input in user_inputs:
        print(f"\nUser: {user_input}")
        reply = run_turn(user_input)
        print(f"Agent: {reply}")

Example output:

Agent: Hello! How can I help you today?

User: I want a refund for my keyboard
Agent: I can help with that. Could you please provide your order ID?

User: It's ORD-1002
Agent: I found your order for Mechanical Keyboard. The refund is currently processing.

User: How long will the refund take?
Agent: Refunds typically take 3 to 5 business days to complete once they are in processing status.

Wrap-up

If you deploy this, replace the mock ORDERS dict with a real database call. You should also consider swapping to Qwen 3 32B if you need stronger multilingual support, or Kimi K2.6 for longer context windows when handling lengthy escalation threads. Both are available on Oxlo.ai with the same flat per-request pricing, so longer transcripts do not increase your cost per turn.
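
Even with a long-context model, a lengthy escalation thread can eventually exceed the context window, so it may be worth capping how much history is sent. Below is a minimal sketch of one approach that keeps the system prompt plus the most recent turns. The trim_history helper and its max_turns parameter are hypothetical and not part of the agent above.

```python
def trim_history(history, max_turns=10):
    """Keep the system prompt plus the most recent max_turns user turns.

    A turn starts at a "user" message and includes every assistant and tool
    message that follows it, so tool results are never separated from the
    assistant message that requested them.
    """
    if max_turns < 1:
        raise ValueError("max_turns must be at least 1")
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    user_indexes = [i for i, m in enumerate(rest) if m["role"] == "user"]
    if len(user_indexes) <= max_turns:
        return system + rest
    # Cut at the start of the oldest turn we want to keep
    start = user_indexes[-max_turns]
    return system + rest[start:]
```

Calling trim_history(get_messages()) before each request would bound the prompt size. Cutting on turn boundaries rather than on a raw message count avoids sending a tool message without its matching tool_calls entry.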
