ModelHub Dev

Posted on Jun 1

How to Build a Multi-Model AI API with One Key and One SDK

#api #ai #webdev #programming

How to Build a Multi-Model AI API with One Key and One SDK — DeepSeek V4, Qwen 3, and GLM-4

The Problem

You want to use the best LLM for each task. Maybe DeepSeek V4 for coding, Qwen 3 for Chinese text, and GPT-4o for creative writing. But every model has its own SDK, its own API key, and its own pricing.

What if you could switch between models with a single line change?

Enter ModelHub

ModelHub gives you access to 5+ Chinese LLMs through one API that is compatible with the OpenAI SDK. You change the base_url and api_key, pass a ModelHub model name, and the rest of your code stays the same.

Models Available

Model	Cost (input per 1M tokens)	Best For
DeepSeek V4	$0.15	Coding, reasoning, general
Qwen 3	$0.30	Chinese text, translation
GLM-4	$0.30	Long context, analysis
Doubao	$0.10	Quick responses, cost-sensitive
Kimi K2.6	$0.30	Creative, long context
DeepSeek R1	$0.55	Complex reasoning

For comparison: GPT-4o costs $2.50 per 1M input tokens. That is roughly 4.5x to 25x more than the models above, and about 17x more than DeepSeek V4.
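The price gap can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the input prices from the table above; the names `PRICES`, `GPT_4O_PRICE`, and `price_ratio` are made up for illustration and are not part of any ModelHub API.

```python
# Input prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "DeepSeek V4": 0.15,
    "Qwen 3": 0.30,
    "GLM-4": 0.30,
    "Doubao": 0.10,
    "Kimi K2.6": 0.30,
    "DeepSeek R1": 0.55,
}
GPT_4O_PRICE = 2.50  # USD per 1M input tokens, as quoted in the article


def price_ratio(model):
    """Return how many times more GPT-4o costs per input token than `model`."""
    return GPT_4O_PRICE / PRICES[model]


for name in PRICES:
    print(f"{name}: GPT-4o is {price_ratio(name):.1f}x the input price")
```

Output prices are not listed in the article, so this comparison covers input tokens only.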

Quick Start

import openai

client = openai.OpenAI(
    base_url="https://modelhub-api.com/v1",
    api_key="mh-sk-your-key-here"
)

response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)

That's it. An existing OpenAI codebase needs only three changes: the base URL, the API key, and the model name.
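In a real project you would probably not hardcode the key. One possible approach, sketched below, reads the settings from environment variables. The variable names `MODELHUB_API_KEY` and `MODELHUB_BASE_URL` and the helper `client_settings` are assumptions made for this example, not something ModelHub defines.

```python
import os


def client_settings(env=os.environ):
    """Build keyword arguments for openai.OpenAI from environment variables.

    MODELHUB_API_KEY and MODELHUB_BASE_URL are hypothetical variable names
    chosen for this sketch.
    """
    api_key = env.get("MODELHUB_API_KEY")
    if not api_key:
        raise RuntimeError("Set MODELHUB_API_KEY before creating the client")
    return {
        # Fall back to the base URL used in the quick start above.
        "base_url": env.get("MODELHUB_BASE_URL", "https://modelhub-api.com/v1"),
        "api_key": api_key,
    }
```

With this in place, the client from the quick start could be created as `openai.OpenAI(**client_settings())`, and the key stays out of source control.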

Switching Models at Runtime

Want to use different models for different tasks? Easy:

import openai

client = openai.OpenAI(
    base_url="https://modelhub-api.com/v1",
    api_key="mh-sk-your-key-here"
)

def code_review(code_snippet):
    """Use DeepSeek V4 for code tasks"""
    return client.chat.completions.create(
        model="deepseek-v4-pro",
        messages=[{"role": "user", "content": f"Review this code:\n{code_snippet}"}]
    )

def translate(text, language="zh"):
    """Use Qwen 3 for translation"""
    return client.chat.completions.create(
        model="qwen3-32b",
        messages=[{"role": "user", "content": f"Translate to {language}: {text}"}]
    )

def summarize(long_text):
    """Use GLM-4 for long-form processing"""
    return client.chat.completions.create(
        model="glm-4-7",
        messages=[{"role": "user", "content": f"Summarize:\n{long_text}"}]
    )
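If the number of tasks grows, the per-task functions above could be collapsed into a lookup table. The sketch below is one way to do that. It reuses the model identifiers from this article, while `MODEL_FOR_TASK`, `DEFAULT_MODEL`, `pick_model`, and `build_request` are hypothetical names introduced here.

```python
# Task-to-model mapping; the identifiers are the ones used earlier in this post.
MODEL_FOR_TASK = {
    "code_review": "deepseek-v4-pro",
    "translate": "qwen3-32b",
    "summarize": "glm-4-7",
}
DEFAULT_MODEL = "deepseek-v4-flash"  # used when a task has no explicit entry


def pick_model(task):
    """Return the model identifier for a task, or the default."""
    return MODEL_FOR_TASK.get(task, DEFAULT_MODEL)


def build_request(task, prompt):
    """Return keyword arguments for client.chat.completions.create."""
    return {
        "model": pick_model(task),
        "messages": [{"role": "user", "content": prompt}],
    }
```

A call then looks like `client.chat.completions.create(**build_request("translate", "Translate to zh: Hello"))`, and switching the model for a task is a one-line change in the table.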

Cost Comparison Table

This is where it gets interesting. Assuming 1M input tokens:

Provider	Model	Cost
OpenAI	GPT-4o	$2.50
OpenAI	GPT-4o-mini	$0.15
Anthropic	Claude 3.5 Sonnet	$3.00
ModelHub	DeepSeek V4	$0.15
ModelHub	Qwen 3	$0.30
ModelHub	GLM-4	$0.30

Running DeepSeek V4 through ModelHub costs the same as GPT-4o-mini but delivers GPT-4o-class quality on reasoning benchmarks.

What About Quality?

Chinese LLMs have improved dramatically in 2025. On key benchmarks:

DeepSeek V4: Matches GPT-4o on coding (HumanEval 92%) and math (MATH 89%)
Qwen 3: Strong on multilingual tasks and instruction following
GLM-4: Excellent on long-context retrieval (128K tokens, 98% accuracy)
Kimi K2.6: Strong reasoning with 262K context window, supports images and video

For English-first use cases, DeepSeek V4 is your best bet. For anything involving Chinese or code, it's often better than GPT-4o.

Try It Free

ModelHub gives you $5 free credit on signup, no credit card required. Counting input tokens only, at the prices listed above and ~150 tokens per call, that's roughly:

220,000+ calls to DeepSeek V4
110,000+ calls to Qwen 3
One month of personal use for most developers
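To estimate how far a credit goes for your own workload, divide it by the per-token price. The sketch below counts input tokens only, because this article lists only input prices; output tokens are billed on top, so real totals will be lower. The function name `calls_for_credit` is made up for this example.

```python
def calls_for_credit(credit_usd, price_per_million, tokens_per_call):
    """Return how many calls a credit covers, counting input tokens only."""
    tokens = credit_usd / price_per_million * 1_000_000
    return int(tokens // tokens_per_call)


# $5 credit, ~150 input tokens per call, input prices from the table above.
print(calls_for_credit(5, 0.15, 150))  # DeepSeek V4 -> 222222
print(calls_for_credit(5, 0.30, 150))  # Qwen 3 -> 111111
```

Longer prompts shrink these numbers proportionally: at 1,500 input tokens per call, the same credit covers about a tenth as many calls.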

Get your free API key

When Should You NOT Use ModelHub?

To be transparent, ModelHub is probably not the right choice if you need:

The absolute best creative writing (GPT-4o is still king there)
99.99% uptime SLA (we're growing, targeting 99.5% now)
Enterprise compliance certifications (coming soon)

But for coding assistants, data processing, chatbots, translation, and content generation, you'll save roughly 80-95% on input token costs compared with GPT-4o, with no quality sacrifice.

Built by a developer, for developers. Questions? Comments? Drop them below. Follow @modelhub_dev for updates.
