Chinese models now account for 61% of global LLM token consumption. DeepSeek, Qwen, GLM, and Doubao consistently rank in the global top 10 on OpenRouter. But for developers outside China, accessing them is painful: no English docs, no international payment, confusing pricing.
I tested all 6 major APIs. Here's what I found.
Price Comparison (June 2026)
| Model | Provider | Input $/1M tokens | Output $/1M tokens | vs OpenAI |
|---|---|---|---|---|
| DeepSeek V3 | DeepSeek | $0.35 | $0.52 | 95% cheaper |
| DeepSeek V4-Flash | DeepSeek | $0.003 | $0.015 | 99.7% cheaper |
| Qwen-Max | Alibaba | $0.58 | $1.74 | 92% cheaper |
| GLM-5 | Zhipu AI | $0.87 | $4.05 | 84% cheaper |
| Doubao Pro | ByteDance | $0.43 | $0.87 | 95% cheaper |
| MiniMax M2.5 | MiniMax | $0.45 | $0.90 | 95% cheaper |
At $0.003 per 1M input tokens, DeepSeek V4-Flash costs roughly 1/300th as much as GPT-4o. For agent chains or batch processing, the per-call cost is small enough that it rarely needs to factor into design decisions.
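To make the price gap concrete, here is a minimal sketch that turns the table above into a per-job cost estimate. The prices are copied from the table; the function name `estimate_cost` and the example workload (10M input tokens, 2M output tokens) are illustrative assumptions, not something any provider ships.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "DeepSeek V3": (0.35, 0.52),
    "DeepSeek V4-Flash": (0.003, 0.015),
    "Qwen-Max": (0.58, 1.74),
    "GLM-5": (0.87, 4.05),
    "Doubao Pro": (0.43, 0.87),
    "MiniMax M2.5": (0.45, 0.90),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost of one workload on the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical batch job: 10M input tokens, 2M output tokens.
    for name in PRICES:
        print(f"{name:18s} ${estimate_cost(name, 10_000_000, 2_000_000):.2f}")
```

Output tokens are priced separately from input tokens, so a generation-heavy workload will shift the ranking more than the input price alone suggests.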
Quick Start
All six models follow the OpenAI API format. Change base_url and model, and the rest of your code stays the same.
# DeepSeek
curl https://api.deepseek.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek-chat","messages":[{"role":"user","content":"Hello"}]}'
# Qwen: same format, different endpoint
curl https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen-max","messages":[{"role":"user","content":"Hi"}]}'
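The same two calls can be sketched in Python using only the standard library, which shows that the endpoint and model name are the only parts that change. The base URLs and model names come from the curl examples above; the helper names `build_chat_request` and `send` are hypothetical, and the response parsing assumes the provider returns the usual OpenAI-style `choices[0].message.content` shape.

```python
import json
import urllib.request

# Base URLs taken from the curl examples above.
BASE_URLS = {
    "deepseek": "https://api.deepseek.com/v1",
    "qwen": "https://dashscope.aliyuncs.com/compatible-mode/v1",
}


def build_chat_request(provider, model, prompt, api_key):
    """Build (but do not send) an OpenAI-style chat completions request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{BASE_URLS[provider]}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the reply text.

    Assumes an OpenAI-compatible response body.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

A call would look like `send(build_chat_request("deepseek", "deepseek-chat", "Hello", api_key))`, with `api_key` read from your environment. Keeping request construction separate from sending makes the provider switch easy to inspect without spending tokens.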
How to Get API Access
| Model | Sign Up | Payment | Free Tier |
|---|---|---|---|
| DeepSeek | platform.deepseek.com | Alipay/WeChat | 5M tokens |
| Qwen | dashscope.aliyun.com | Alipay | 2M tokens/month |
| GLM-5 | open.bigmodel.cn | WeChat/Alipay | 1M tokens |
| Doubao | console.volcengine.com/ark | Alipay | 500K tokens |
| MiniMax | platform.minimaxi.com | Alipay | 1M tokens |
All platforms support English UI. Most don't require a Chinese phone number.
Latency (tested from Singapore)
| Model | TTFT | Tokens/sec | Total (100 tokens) |
|---|---|---|---|
| DeepSeek V3 | 380ms | 85 t/s | 1.5s |
| DeepSeek V4-Flash | 120ms | 240 t/s | 0.5s |
| Qwen-Max | 450ms | 65 t/s | 2.0s |
| GLM-5 | 520ms | 55 t/s | 2.3s |
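The Total column appears to follow a simple model: time to first token plus generation time at the measured throughput. The sketch below applies that formula to the numbers in the table. The formula is my assumption about how the totals relate to the other two columns; it matches the listed values to within about 0.1s. The function name `estimate_total_seconds` is illustrative.

```python
# (TTFT in milliseconds, throughput in tokens/sec), copied from the table above.
MEASUREMENTS = {
    "DeepSeek V3": (380, 85),
    "DeepSeek V4-Flash": (120, 240),
    "Qwen-Max": (450, 65),
    "GLM-5": (520, 55),
}


def estimate_total_seconds(ttft_ms, tokens_per_sec, n_tokens=100):
    """Estimate wall-clock seconds for a reply: TTFT plus generation time."""
    return ttft_ms / 1000 + n_tokens / tokens_per_sec


if __name__ == "__main__":
    for name, (ttft_ms, tps) in MEASUREMENTS.items():
        # Compare a short 100-token reply with a longer 1,000-token reply.
        short = estimate_total_seconds(ttft_ms, tps, 100)
        long = estimate_total_seconds(ttft_ms, tps, 1000)
        print(f"{name:18s} 100 tokens: {short:.2f}s   1000 tokens: {long:.2f}s")
```

For short replies TTFT is a large share of the total, while for long replies throughput dominates, so the better choice depends on typical output length.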
Which Model for What
| Use Case | Model |
|---|---|
| Agent chains (5-10 calls) | DeepSeek V3 |
| Bulk processing (translation/summary) | DeepSeek V4-Flash |
| Chinese long-form content | Qwen-Max |
| Complex reasoning | GLM-5 |
| Chat products | Doubao Pro |
| Creative writing | MiniMax M2.5 |
Bonus: Chinese Video Models
| Model | Maker | Price |
|---|---|---|
| Kling 3.0 | Kuaishou | ¥0.8/sec |
| Seedance 2.0 | ByteDance | ¥1/sec |
| Wan 2.1 | Alibaba | ¥0.5/sec |
All data, code examples, and registration guides are on GitHub: github.com/BX166/china-llm-gateway