DEV Community

q409605362

Building a Multilingual AI Chatbot for Southeast Asia: Qwen vs DeepSeek in Practice

The biggest challenge when building AI apps for Southeast Asia isn't the code; it's latency and language support. Most developers default to OpenAI, but when your users are in Bangkok, Ho Chi Minh City, or Kuala Lumpur, the round trip to US servers adds roughly 300ms before the model even starts generating.

I spent time testing Qwen and DeepSeek models through Asiatek AI's Singapore gateway, and the results were eye-opening. Here's what I learned building a multilingual support chatbot.

The Setup

I built a simple customer support chatbot that handles four languages: English, Thai, Vietnamese, and Malay. The requirements:

  • Respond in the user's language automatically
  • Keep latency under 3 seconds end-to-end
  • Cost less than $50/month at moderate volume

Why Singapore Gateway Matters

Before we dive into code, let's talk about latency. I ran curl benchmarks from different Southeast Asian cities:

| From | To OpenAI (US) | To Asiatek AI (SG) |
|---|---|---|
| Bangkok | ~320ms | ~28ms |
| Ho Chi Minh City | ~310ms | ~32ms |
| Kuala Lumpur | ~290ms | ~18ms |
| Singapore | ~260ms | ~8ms |
| Jakarta | ~340ms | ~35ms |

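That's roughly a 10x to 30x improvement in network latency alone, depending on the city (about 10x from Ho Chi Minh City and Jakarta, over 30x from Singapore itself). When you're streaming tokens, this makes the difference between a snappy response and one that feels broken.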
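To make the ratios concrete, here is a small sketch that computes the per-city speedup from the table. It also includes a helper for timing a TCP connect yourself. The `tcp_connect_ms` helper is my own illustration, not the curl benchmark used above, and a TCP connect time is only a rough proxy for full request latency.

```python
import socket
import time

# Figures copied from the table above: (to OpenAI US, to Asiatek AI SG), in ms.
LATENCY_MS = {
    "Bangkok": (320, 28),
    "Ho Chi Minh City": (310, 32),
    "Kuala Lumpur": (290, 18),
    "Singapore": (260, 8),
    "Jakarta": (340, 35),
}


def speedup(us_ms: float, sg_ms: float) -> float:
    """How many times lower the Singapore latency is than the US latency."""
    return us_ms / sg_ms


def tcp_connect_ms(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Time a single TCP connect in milliseconds (hypothetical helper)."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000


# Per-city ratio, rounded to one decimal place.
SPEEDUPS = {city: round(speedup(us, sg), 1) for city, (us, sg) in LATENCY_MS.items()}
```

Running `tcp_connect_ms("api.asiatekai.com")` from your own server is a quick sanity check before committing to a gateway; your numbers will vary with ISP and time of day.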

Model Selection: Qwen vs DeepSeek

This was the most interesting part. Both model families have strengths:

Qwen (Alibaba) — Best for Southeast Asian languages

  • Excellent Thai, Vietnamese, Malay support out of the box
  • Qwen Turbo is insanely cheap at $0.08/M input tokens
  • Qwen Plus hits the sweet spot for quality vs cost

DeepSeek — Best for reasoning and code

  • DeepSeek Chat handles multilingual conversations well
  • DeepSeek Reasoner excels at complex problem-solving
  • 128K context window is great for long conversations

For our customer support bot, Qwen Plus was the winner. Here's why:

  • It correctly identifies and responds in Thai, Vietnamese, and Malay without explicit prompting
  • At $0.84/M input + $2.50/M output, it's still 60% cheaper than GPT-4o
  • Quality is comparable to GPT-4o for customer support scenarios

The Code

1. Basic Setup

from openai import OpenAI

client = OpenAI(
    api_key="your_asiatek_key",
    base_url="https://api.asiatekai.com/v1"
)

Yes, that's it. If you're already using the openai Python package, just change api_key and base_url.
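In anything beyond a quick test you probably don't want the key hardcoded. A minimal sketch, assuming you keep it in an environment variable: the names `ASIATEK_API_KEY` and `ASIATEK_BASE_URL` are ones I made up for this example, not something the gateway defines.

```python
import os

DEFAULT_BASE_URL = "https://api.asiatekai.com/v1"  # base URL from the setup above


def gateway_config(env=None) -> dict:
    """Build the OpenAI client kwargs from environment variables.

    ASIATEK_API_KEY and ASIATEK_BASE_URL are hypothetical variable names
    chosen for this sketch; use whatever your deployment defines.
    """
    env = os.environ if env is None else env
    api_key = env.get("ASIATEK_API_KEY")
    if not api_key:
        raise RuntimeError("ASIATEK_API_KEY is not set")
    return {
        "api_key": api_key,
        "base_url": env.get("ASIATEK_BASE_URL", DEFAULT_BASE_URL),
    }


def make_client():
    """Create the client; openai is imported lazily so the helper above has no dependencies."""
    from openai import OpenAI

    return OpenAI(**gateway_config())
```

`client = make_client()` then replaces the hardcoded setup, and the key stays out of version control.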

2. Auto-Detect Language and Respond

from typing import Optional

SYSTEM_PROMPT = """You are a helpful customer support agent for a Southeast Asian e-commerce platform.

Rules:
1. Detect the user's language automatically
2. Always respond in the same language as the user
3. Be concise and friendly
4. If you're unsure about something, say so honestly

Supported languages: English, Thai, Vietnamese, Malay, Chinese"""

def chat(user_message: str, history: Optional[list] = None) -> str:
    """Send one user message (plus optional earlier turns) and return the reply text."""
    # Order matters: system prompt first, then earlier turns, then the new message.
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    if history:
        messages.extend(history)
    messages.append({"role": "user", "content": user_message})

    response = client.chat.completions.create(
        model="qwen-plus",
        messages=messages,
        temperature=0.7,
        max_tokens=500  # support answers are short; this caps cost per reply
    )
    return response.choices[0].message.content

3. Test It

# English
print(chat("How do I track my order?"))
# → "You can track your order by going to 'My Orders' in the app..."

# Thai
print(chat("สอบถามวิธีติดตามสินค้าหน่อยค่ะ"))
# → "คุณสามารถติดตามสินค้าได้โดยไปที่ 'คำสั่งซื้อของฉัน' ในแอป..."

# Vietnamese
print(chat("Làm sao để theo dõi đơn hàng?"))
# → "Bạn có thể theo dõi đơn hàng bằng cách vào 'Đơn hàng của tôi' trong ứng dụng..."

# Malay
print(chat("Macam mana nak track order saya?"))
# → "Anda boleh menjejak pesanan anda dengan pergi ke 'Pesanan Saya' dalam aplikasi..."

Qwen Plus correctly detects and responds in all four languages without any explicit language parameter.
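The tests above are single-turn, but `chat` also accepts a `history` list. Here is one way to maintain it across turns, as a sketch rather than the only approach. The `Conversation` class and its `max_turns` cap are my own additions; `send` is any function with the same `(user_message, history)` signature as `chat`.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Keeps one user's history and passes it to a chat function on each turn."""

    def __init__(self, send: Callable[[str, List[Message]], str], max_turns: int = 10):
        self.send = send            # e.g. the chat() function defined earlier
        self.max_turns = max_turns  # hypothetical cap; tune it to your token budget
        self.history: List[Message] = []

    def ask(self, user_message: str) -> str:
        # Pass a copy so the chat function cannot mutate our stored history.
        reply = self.send(user_message, list(self.history))
        self.history.append({"role": "user", "content": user_message})
        self.history.append({"role": "assistant", "content": reply})
        # Keep only the most recent turns (one turn = user + assistant message).
        self.history = self.history[-2 * self.max_turns:]
        return reply
```

Usage would look like `convo = Conversation(chat)` followed by repeated `convo.ask(...)` calls. Trimming old turns keeps input tokens, and therefore cost, bounded in long conversations.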

4. Add Streaming for Better UX

def chat_stream(user_message: str, history: list = None):
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    if history:
        messages.extend(history)
    messages.append({"role": "user", "content": user_message})

    stream = client.chat.completions.create(
        model="qwen-plus",
        messages=messages,
        temperature=0.7,
        max_tokens=500,
        stream=True
    )

    for chunk in stream:
        # Some chunks can arrive with no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content

# Usage
for token in chat_stream("ร้านค้าเปิดกี่โมงคะ"):
    print(token, end="", flush=True)

With the Singapore gateway, the first token arrives in under 200ms — users see the response start almost instantly.
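If you want to check the first-token figure on your own setup, a small timing wrapper is enough. This is a sketch of how one might measure it, not the exact method behind the number above. `time_to_first_token` is a hypothetical helper that works on any iterator of strings, including the generator returned by `chat_stream`.

```python
import time
from typing import Iterable, Optional, Tuple


def time_to_first_token(tokens: Iterable[str]) -> Tuple[Optional[float], str]:
    """Consume a token stream; return (seconds until first token, full text).

    The first value is None if the stream produced no tokens.
    """
    start = time.perf_counter()
    first_token_s = None
    parts = []
    for token in tokens:
        if first_token_s is None:
            first_token_s = time.perf_counter() - start
        parts.append(token)
    return first_token_s, "".join(parts)
```

Because `chat_stream` is a generator function, the API request is only sent once iteration begins, so `time_to_first_token(chat_stream("..."))` includes the network round trip in the measurement.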

Cost Analysis

Let's crunch the numbers for a real-world scenario. Assume:

  • 1,000 conversations/day
  • Average 5 messages per conversation
  • Average 200 input tokens + 150 output tokens per message

| Model | Daily Cost | Monthly Cost (30 days) |
|---|---|---|
| GPT-4o | $9.75 | $292.50 |
| Qwen Turbo | $0.17 | $5.10 |
| Qwen Plus | $1.52 | $45.60 |
| DeepSeek Chat | $0.62 | $18.60 |

At these figures, Qwen Plus comes in under our $50/month budget. And if you're building a high-volume, low-complexity bot, Qwen Turbo is practically free at about $5/month.
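The arithmetic behind an estimate like this is simple enough to script. The scenario above works out to 5,000 messages per day, or 1M input tokens and 0.75M output tokens. The sketch below plugs in the Qwen Plus rates quoted earlier ($0.84/M input, $2.50/M output); `daily_cost` is a hypothetical helper of mine. With those rates it gives about $2.72/day, which is higher than the $1.52 in the table, so the table presumably reflects different rates or token assumptions. Plug in current numbers from the pricing page before budgeting.

```python
def daily_cost(
    conversations: int,
    messages_per_conversation: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Estimated daily spend in dollars; prices are per million tokens."""
    messages = conversations * messages_per_conversation
    input_millions = messages * input_tokens / 1_000_000
    output_millions = messages * output_tokens / 1_000_000
    return input_millions * input_price_per_m + output_millions * output_price_per_m


# Scenario from the list above, with the Qwen Plus rates quoted earlier.
qwen_plus_daily = daily_cost(1000, 5, 200, 150, 0.84, 2.50)
qwen_plus_monthly = qwen_plus_daily * 30  # 30-day month, as in the table
```

Note that if you resend conversation history with every message, real input token counts grow with each turn, so treat the 200-token average as something to measure rather than assume.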

When to Use Which Model

| Use Case | Best Model | Why |
|---|---|---|
| Customer support (multilingual) | Qwen Plus | Best SEA language support |
| High-volume FAQ bot | Qwen Turbo | Cheapest, fast enough |
| Code review assistant | DeepSeek Coder | 128K context, code-optimized |
| Complex reasoning tasks | DeepSeek Reasoner | Strong reasoning ability |
| Long document analysis | Qwen Long | Up to 10M token context |
| General chatbot | DeepSeek Chat | 128K context, great value |
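If one service handles several of these use cases, the table can live in code as a lookup. This is only a sketch: apart from `qwen-plus`, which is used in the code above, the model IDs here are my guesses at the naming, so check the gateway's model list for the exact strings.

```python
# Model IDs are assumptions for illustration; only "qwen-plus" appears
# in the article's code. Verify the rest against the gateway's model list.
MODEL_BY_USE_CASE = {
    "multilingual_support": "qwen-plus",
    "faq": "qwen-turbo",
    "code_review": "deepseek-coder",
    "reasoning": "deepseek-reasoner",
    "long_document": "qwen-long",
    "general": "deepseek-chat",
}


def pick_model(use_case: str, default: str = "qwen-plus") -> str:
    """Return the model ID for a use case, falling back to a default."""
    return MODEL_BY_USE_CASE.get(use_case, default)
```

The result can be passed straight to the `model=` argument of `client.chat.completions.create`.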

Tips for Production

  1. Set temperature=0 for FAQ bots: answers become more consistent, and in my experience they ramble less, which trims output tokens
  2. Use max_tokens wisely: customer support doesn't need 4K-token responses
  3. Cache common queries: if 30% of your questions are "where's my order", cache the response
  4. Monitor your usage: the Asiatek AI dashboard shows token usage per API key
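For the caching tip, a minimal in-memory sketch might look like this. It uses exact matching after normalizing case and whitespace, plus a time-to-live. The `CachedResponder` class, the 300-second default, and the normalization rule are all my own choices for illustration; a production setup would more likely use a shared cache such as Redis.

```python
import time
from typing import Callable, Dict, Tuple


class CachedResponder:
    """Wraps a question -> answer function with a small TTL cache."""

    def __init__(
        self,
        ask: Callable[[str], str],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.ask = ask                  # e.g. the chat() function defined earlier
        self.ttl_seconds = ttl_seconds  # hypothetical default; tune per use case
        self.clock = clock              # injectable so expiry can be tested
        self._cache: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def _key(text: str) -> str:
        # Normalize case and whitespace so trivial variations share one entry.
        return " ".join(text.lower().split())

    def __call__(self, text: str) -> str:
        key = self._key(text)
        now = self.clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self.ttl_seconds:
            return hit[1]
        reply = self.ask(text)
        self._cache[key] = (now, reply)
        return reply
```

Only cache answers that are the same for every user. A generic "go to My Orders" reply is safe to reuse; anything that includes a specific customer's order status is not.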

Try It Yourself

  • Sign up at asiatekai.com — free trial credits, no credit card needed
  • Get your API key instantly
  • Change base_url in your existing OpenAI code and you're done

from openai import OpenAI

client = OpenAI(
    api_key="your-key",
    base_url="https://api.asiatekai.com/v1"
)

Full pricing: asiatekai.com/pricing
