Building Virtual Language Assistants with LLMs: Best Practices and Examples

#engineering #oxlo #ai

I recently shipped a multilingual virtual support agent that handles tier-1 customer inquiries without token-based cost surprises. In this tutorial, I will walk through building the same system on Oxlo.ai, using request-based pricing so long conversations with tools do not inflate your bill. You will end up with a working assistant that checks order status, answers policy questions, and escalates when it should.

What you'll need

You need Python 3.10 or newer, the OpenAI SDK, and an Oxlo.ai API key. Install the SDK and grab your key from the portal.

Python 3.10+
OpenAI SDK: pip install openai
Oxlo.ai API key from https://portal.oxlo.ai

Step 1: Configure the Oxlo.ai client

Point the OpenAI SDK at Oxlo.ai and select a general-purpose model. I use Llama 3.3 70B because it handles mixed tool use and casual language well.

from openai import OpenAI
import json

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")
MODEL = "llama-3.3-70b"
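Hardcoding the key is fine for a first run, but you will probably want to read it from the environment before committing anything. Here is a minimal sketch, assuming you export the key as OXLO_API_KEY. That variable name and the load_api_key helper are hypothetical choices for this example, not something the portal prescribes.

```python
import os


def load_api_key(env=None, var_name="OXLO_API_KEY"):
    """Return the API key from the environment, or fail with a clear message.

    `var_name` is an assumed convention for this tutorial, not an official one.
    `env` can be any mapping, which makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before running the agent."
        )
    return key
```

With this in place, the client line becomes client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key=load_api_key()).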

Step 2: Design the system prompt

The system prompt is the contract that keeps the agent polite, accurate, and scoped. I keep it explicit about what it can do and when to hand off.

SYSTEM_PROMPT = """You are a customer support agent for an electronics store.
You can check order status and estimate shipping dates.
Policies:
- Returns are accepted within 30 days with a receipt.
- Do not issue refunds yourself. Escalate refund requests to a human.
- If a customer is angry or asks for legal advice, escalate immediately.
Be concise. Ask for the order ID if needed."""

Step 3: Define tool schemas and mock functions

The agent needs something to actually do. I define two tools, get_order_status and get_shipping_estimate, backed by a simple dictionary so the code runs without external services.

MOCK_DB = {
    "ORD-1234": {"status": "shipped", "item": "USB-C Cable", "last_update": "2024-05-10"},
    "ORD-5678": {"status": "processing", "item": "Mechanical Keyboard", "last_update": "2024-05-12"},
}

def get_order_status(order_id: str) -> str:
    record = MOCK_DB.get(order_id.upper())
    if not record:
        return json.dumps({"error": "Order not found."})
    return json.dumps(record)

def get_shipping_estimate(order_id: str) -> str:
    record = MOCK_DB.get(order_id.upper())
    if not record:
        return json.dumps({"error": "Order not found."})
    if record["status"] == "shipped":
        return json.dumps({"estimate": "2 business days", "carrier": "FastPost"})
    return json.dumps({"estimate": "3-5 business days after processing"})

TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Get the current status of a customer order.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string", "description": "The order ID, e.g. ORD-1234"}
                },
                "required": ["order_id"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_shipping_estimate",
            "description": "Estimate shipping time for an order.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string", "description": "The order ID, e.g. ORD-1234"}
                },
                "required": ["order_id"]
            }
        }
    }
]
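The two schemas above differ only in name and description, so once you add a third or fourth order tool the repetition starts to hurt. One way to cut it down is a small builder. This is a sketch, and order_tool_schema and TOOLS_COMPACT are names invented for illustration; the resulting list has the same shape as TOOLS.

```python
def order_tool_schema(name: str, description: str) -> dict:
    """Build a function-tool schema for a tool that takes a single order_id."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The order ID, e.g. ORD-1234",
                    }
                },
                "required": ["order_id"],
            },
        },
    }


# Equivalent to the hand-written TOOLS list, built from (name, description) pairs.
TOOLS_COMPACT = [
    order_tool_schema("get_order_status", "Get the current status of a customer order."),
    order_tool_schema("get_shipping_estimate", "Estimate shipping time for an order."),
]
```

I still write the first couple of schemas by hand, because seeing the full JSON makes it easier to debug what the model receives.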

Step 4: Build the conversation loop

Here is the core handler. It sends the conversation to Oxlo.ai, executes any tool calls, and sends the results back so the model can produce the final answer.

def run_agent_turn(messages):
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        tools=TOOLS,
        tool_choice="auto",
    )
    message = response.choices[0].message
    messages.append(message)

    if message.tool_calls:
        for tool_call in message.tool_calls:
            fn_name = tool_call.function.name
            args = json.loads(tool_call.function.arguments)

            if fn_name == "get_order_status":
                result = get_order_status(**args)
            elif fn_name == "get_shipping_estimate":
                result = get_shipping_estimate(**args)
            else:
                result = json.dumps({"error": "Unknown tool"})

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": fn_name,
                "content": result,
            })

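        # This handler runs a single tool round, so ask for a plain text answer here.
        # Without tool_choice="none" the model could request another tool and leave
        # `content` empty. This assumes the endpoint honors the OpenAI-style parameter.
        second = client.chat.completions.create(
            model=MODEL,
            messages=messages,
            tools=TOOLS,
            tool_choice="none",
        )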
        messages.append(second.choices[0].message)
        return second.choices[0].message.content

    return message.content
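The handler above trusts the model to send valid JSON with the right argument names. In practice models occasionally produce malformed or unexpected arguments, and an unhandled exception ends the whole turn. A possible hardening is to route every call through one dispatcher that turns failures into error payloads the model can read. This is a sketch; dispatch_tool_call and the registry dictionary are hypothetical names, not part of the SDK.

```python
import json


def dispatch_tool_call(name, raw_arguments, registry):
    """Run the named tool and always return a JSON string.

    `registry` maps tool names to Python callables that return JSON strings.
    Unknown tools, invalid JSON, and mismatched arguments become error payloads
    instead of exceptions, so the model can apologize or retry.
    """
    fn = registry.get(name)
    if fn is None:
        return json.dumps({"error": f"Unknown tool: {name}"})
    try:
        args = json.loads(raw_arguments or "{}")
    except json.JSONDecodeError:
        return json.dumps({"error": "Tool arguments were not valid JSON."})
    if not isinstance(args, dict):
        return json.dumps({"error": "Tool arguments must be a JSON object."})
    try:
        return fn(**args)
    except TypeError as exc:
        # Typically a missing or unexpected keyword argument.
        return json.dumps({"error": f"Bad arguments for {name}: {exc}"})
```

Inside the loop, the if/elif chain would then collapse to result = dispatch_tool_call(fn_name, tool_call.function.arguments, registry), with registry = {"get_order_status": get_order_status, "get_shipping_estimate": get_shipping_estimate}.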

Step 5: Add escalation guardrails

I wrap the loop with a simple check for escalation keywords. If the customer mentions legal action or demands an immediate refund, we return an escalation notice and skip the model call entirely.

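import re

# "sued" and "suing" are listed explicitly because matching is on whole words.
ESCALATION_WORDS = {"lawyer", "sue", "sued", "suing", "lawsuit", "attorney", "refund me now"}

# Match whole words only; a plain substring test would flag "issue" and "pursue" via "sue".
ESCALATION_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in ESCALATION_WORDS) + r")\b",
    re.IGNORECASE,
)

def chat(user_input, messages=None):
    if messages is None:
        messages = [{"role": "system", "content": SYSTEM_PROMPT}]

    if ESCALATION_PATTERN.search(user_input):
        return "[ESCALATED TO HUMAN AGENT]", messages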

    messages.append({"role": "user", "content": user_input})
    reply = run_agent_turn(messages)
    return reply, messages

Run it

This script exercises the full flow: an order lookup, a follow-up question that relies on the conversation history, and an escalation test.

if __name__ == "__main__":
    # Conversation 1: order lookup
    msgs = None
    reply, msgs = chat("Where is my order ORD-1234?", msgs)
    print("Bot:", reply)

    reply, msgs = chat("When will it arrive?", msgs)
    print("Bot:", reply)

    # Conversation 2: escalation
    reply2, _ = chat("I want to sue your company!")
    print("Bot:", reply2)

Example output (the first two replies are generated by the model, so exact wording will vary):

Bot: Your order ORD-1234 for a USB-C Cable has been shipped as of 2024-05-10.
Bot: It should arrive within 2 business days via FastPost.
Bot: [ESCALATED TO HUMAN AGENT]

Next steps

Swap the mock dictionary for a real database connector and add an async queue so the agent can handle multiple sessions in parallel. If your workload involves long context threads or agentic loops, consider Oxlo.ai's request-based pricing at https://oxlo.ai/pricing to keep costs flat regardless of conversation length.
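To make the first of those steps concrete, here is a rough sketch of the order lookup backed by SQLite instead of MOCK_DB. Everything in it is an assumption for illustration: the orders table, its columns, and the make_connection and get_order_status_db names. It uses an in-memory database so it runs without any setup; a real deployment would connect to your actual order store.

```python
import json
import sqlite3


def make_connection() -> sqlite3.Connection:
    """Create an in-memory database seeded with the tutorial's two orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id TEXT PRIMARY KEY, status TEXT, item TEXT, last_update TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            ("ORD-1234", "shipped", "USB-C Cable", "2024-05-10"),
            ("ORD-5678", "processing", "Mechanical Keyboard", "2024-05-12"),
        ],
    )
    conn.commit()
    return conn


def get_order_status_db(conn: sqlite3.Connection, order_id: str) -> str:
    """Same contract as get_order_status, but reads from the orders table."""
    row = conn.execute(
        # Parameterized query: the order ID comes from model output, so never
        # interpolate it into the SQL string.
        "SELECT status, item, last_update FROM orders WHERE order_id = ?",
        (order_id.upper(),),
    ).fetchone()
    if row is None:
        return json.dumps({"error": "Order not found."})
    status, item, last_update = row
    return json.dumps({"status": status, "item": item, "last_update": last_update})
```

Because the function returns the same JSON shape as the mock version, the tool schema and the conversation loop do not need to change.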