How to Build a Multi-Model AI API with One Key and One SDK — DeepSeek V4, Qwen 3, and GLM-4
The Problem
You want to use the best LLM for each task. Maybe DeepSeek V4 for coding, Qwen 3 for Chinese text, and GPT-4o for creative writing. But every model has its own SDK, its own API key, and its own pricing.
What if you could switch between models with a single line change?
Enter ModelHub
ModelHub gives you access to the Chinese LLMs listed below through one API that is compatible with the OpenAI SDK. You change the base_url and api_key, pick a model name, and the rest of your code stays the same.
Models Available
| Model | Cost (input per 1M tokens) | Best For |
|---|---|---|
| DeepSeek V4 | $0.15 | Coding, reasoning, general |
| Qwen 3 | $0.30 | Chinese text, translation |
| GLM-4 | $0.30 | Long context, analysis |
| Doubao | $0.10 | Quick responses, cost-sensitive |
| Kimi K2.6 | $0.30 | Creative, long context |
| DeepSeek R1 | $0.55 | Complex reasoning |
For comparison: GPT-4o costs $2.50 per 1M input tokens, which is roughly 5x to 25x more than the models above.
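The multiplier depends on which model you compare against. Here is a quick sketch of the arithmetic, using only the input prices from the table above and the GPT-4o figure quoted here. Output-token prices are not listed in this post, so they are left out, and `price_ratio` is just an illustrative helper name.

```python
# Input prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "DeepSeek V4": 0.15,
    "Qwen 3": 0.30,
    "GLM-4": 0.30,
    "Doubao": 0.10,
    "Kimi K2.6": 0.30,
    "DeepSeek R1": 0.55,
}
GPT_4O_INPUT = 2.50  # USD per 1M input tokens, as quoted in this post


def price_ratio(model: str) -> float:
    """How many times more GPT-4o costs than `model`, input tokens only."""
    return GPT_4O_INPUT / PRICES[model]


if __name__ == "__main__":
    for name in PRICES:
        print(f"{name}: GPT-4o is {price_ratio(name):.1f}x the input price")
```

The cheapest and most expensive rows set the two ends of the range, so the headline multiplier shifts a lot depending on which model you actually plan to use.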
Quick Start
import openai

client = openai.OpenAI(
    base_url="https://modelhub-api.com/v1",
    api_key="mh-sk-your-key-here"
)

response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
That's it. Existing OpenAI SDK code keeps working once you update the base URL, the API key, and the model name.
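Because only the connection settings change, one option is to keep them out of the code entirely. The sketch below reads them from environment variables. The variable names `LLM_BASE_URL` and `LLM_API_KEY` and the helper `client_settings` are hypothetical, not something ModelHub defines.

```python
import os

# Default endpoint: the URL used in the quick start above.
DEFAULT_BASE_URL = "https://modelhub-api.com/v1"


def client_settings(env=None) -> dict:
    """Build keyword arguments for openai.OpenAI(...) from environment variables.

    LLM_BASE_URL and LLM_API_KEY are made-up names for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("LLM_API_KEY")
    if not api_key:
        raise RuntimeError("Set LLM_API_KEY before creating the client")
    return {
        "base_url": env.get("LLM_BASE_URL", DEFAULT_BASE_URL),
        "api_key": api_key,
    }


# Usage (requires the openai package):
#   import openai
#   client = openai.OpenAI(**client_settings())
```

With this in place, pointing the same program at a different OpenAI-compatible endpoint becomes a deployment setting rather than a code change.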
Switching Models at Runtime
Want to use different models for different tasks? Easy:
import openai

client = openai.OpenAI(
    base_url="https://modelhub-api.com/v1",
    api_key="mh-sk-your-key-here"
)


def code_review(code_snippet):
    """Use DeepSeek V4 for code tasks"""
    return client.chat.completions.create(
        model="deepseek-v4-pro",
        messages=[{"role": "user", "content": f"Review this code:\n{code_snippet}"}]
    )


def translate(text, language="zh"):
    """Use Qwen 3 for translation"""
    return client.chat.completions.create(
        model="qwen3-32b",
        messages=[{"role": "user", "content": f"Translate to {language}: {text}"}]
    )


def summarize(long_text):
    """Use GLM-4 for long-form processing"""
    return client.chat.completions.create(
        model="glm-4-7",
        messages=[{"role": "user", "content": f"Summarize:\n{long_text}"}]
    )
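The three helpers above differ only in the model name and the prompt prefix, so the routing can also live in a lookup table. This is a minimal sketch that reuses the model identifiers from the code above. `ROUTES` and `run_task` are illustrative names, and `client` is any object exposing the OpenAI-style `chat.completions.create` method.

```python
# Task name -> (model identifier, prompt template). The model names are the
# ones used in the examples above; the task names are made up for this sketch.
ROUTES = {
    "code_review": ("deepseek-v4-pro", "Review this code:\n{text}"),
    "translate": ("qwen3-32b", "Translate to zh: {text}"),
    "summarize": ("glm-4-7", "Summarize:\n{text}"),
}


def run_task(client, task: str, text: str) -> str:
    """Send `text` to the model registered for `task` and return the reply text."""
    if task not in ROUTES:
        raise ValueError(f"Unknown task: {task!r}")
    model, template = ROUTES[task]
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": template.format(text=text)}],
    )
    return response.choices[0].message.content
```

Adding a task or swapping a model then means editing one entry in `ROUTES`, which is the "single line change" the introduction describes.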
Cost Comparison Table
This is where it gets interesting. Assuming 1M input tokens:
| Provider | Model | Cost (input per 1M tokens) |
|---|---|---|
| OpenAI | GPT-4o | $2.50 |
| OpenAI | GPT-4o-mini | $0.15 |
| Anthropic | Claude 3.5 Sonnet | $3.00 |
| ModelHub | DeepSeek V4 | $0.15 |
| ModelHub | Qwen 3 | $0.30 |
| ModelHub | GLM-4 | $0.30 |
Running DeepSeek V4 through ModelHub costs the same as GPT-4o-mini but delivers GPT-4o-class quality on reasoning benchmarks.
What About Quality?
Chinese LLMs have improved dramatically in 2025. On key benchmarks:
- DeepSeek V4: Matches GPT-4o on coding (HumanEval 92%) and math (MATH 89%)
- Qwen 3: Strong on multilingual tasks and instruction following
- GLM-4: Excellent on long-context retrieval (128K tokens, 98% accuracy)
- Kimi K2.6: Strong reasoning with 262K context window, supports images and video
Among these models, DeepSeek V4 is your best bet for English-first use cases. For tasks involving Chinese text or code, it is often better than GPT-4o.
Try It Free
ModelHub gives you $5 free credit on signup, with no credit card required. Counting input tokens only at the prices above, that's roughly:
- 33,000+ calls to DeepSeek V4 (at ~1,000 tokens/call)
- 16,000+ calls to Qwen 3 (at the same call size)
- One month of personal use for most developers
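These estimates depend heavily on the assumed call size. Below is a small sketch of the arithmetic, counting input tokens only at the prices listed earlier. Output-token prices are not given in this post, so real usage will stretch less far. `calls_for_credit` is an illustrative helper, not part of any SDK.

```python
def calls_for_credit(credit_usd: float, price_per_million: float, tokens_per_call: int) -> int:
    """Number of calls a credit covers, counting input tokens only."""
    tokens = credit_usd / price_per_million * 1_000_000
    return int(tokens // tokens_per_call)


if __name__ == "__main__":
    # $5 credit on DeepSeek V4 at $0.15 per 1M input tokens,
    # for a short prompt and a longer one.
    for size in (150, 1_000):
        print(f"{size} tokens/call -> {calls_for_credit(5.0, 0.15, size)} calls")
```

Plugging in your own average prompt length gives a more realistic figure than any single headline number.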
When Should You NOT Use ModelHub?
To be transparent, ModelHub is probably not the right choice if you need:
- The absolute best creative writing (GPT-4o is still king there)
- A 99.99% uptime SLA (we're growing and currently target 99.5%)
- Enterprise compliance certifications (coming soon)
But for coding assistants, data processing, chatbots, translation, and content generation, you can save roughly 80-95% on input-token costs compared with GPT-4o, without a noticeable drop in quality for these tasks.
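If the uptime target above is a concern, one mitigation is to try a second endpoint when the first call fails, which is straightforward when every endpoint speaks the same OpenAI-style interface. This is a minimal sketch, not a ModelHub feature. `complete_with_fallback` is a hypothetical helper, each `client` is assumed to expose `chat.completions.create`, and catching every `Exception` is a simplification that real code would likely narrow to network and API errors.

```python
def complete_with_fallback(targets, messages):
    """Try each (client, model) pair in order and return the first reply text.

    `targets` is a list of (client, model_name) tuples, most preferred first.
    """
    errors = []
    for client, model in targets:
        try:
            response = client.chat.completions.create(model=model, messages=messages)
            return response.choices[0].message.content
        except Exception as exc:  # simplification: narrow this in real code
            errors.append(f"{model}: {exc}")
    raise RuntimeError("All targets failed: " + "; ".join(errors))
```

The order of `targets` encodes the preference, so the cheaper option can go first and a pricier backup is only used when the first call raises.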
Built by a developer, for developers. Questions? Comments? Drop them below. Follow @modelhub_dev for updates.