The Problem
I was paying for 5+ different AI subscriptions: OpenAI, Anthropic, Google, etc. Each with separate API keys, billing dashboards, and SDK quirks.
When DeepSeek-V3 dropped at ~$0.28 per million output tokens (vs GPT-4o at $10), I wanted to switch, but swapping SDKs across multiple projects was too much friction.
So I built TokenHub — an OpenAI-compatible gateway that routes to 40+ AI models with a single API key.
How It Works
It's a drop-in replacement: keep the OpenAI SDK and just change `base_url` and `api_key`:
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-tokenhub-key",
    base_url="https://jiatoken.com/v1",
)

# Use any of 40+ models — DeepSeek, MiniMax, Claude, GPT, Gemini, Llama, etc.
response = client.chat.completions.create(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Explain async/await in Python"}],
)

print(response.choices[0].message.content)
```
That's it. The same code works with:
- gpt-4o
- claude-sonnet-4-6
- gemini-2.5-pro
- deepseek-v3 / deepseek-r1
- minimax-text-01
- llama-3.3-70b
- ...and more
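Streaming works the same way: pass `stream=True` and iterate the chunks. The actual `client.chat.completions.create(..., stream=True)` call needs a live key, so this sketch only shows the chunk-assembly side (`collect_stream` is my own helper name, not part of any SDK):

```python
def collect_stream(chunks):
    """Assemble the full reply text from streamed chat-completion chunks."""
    parts = []
    for chunk in chunks:
        # Each streamed chunk carries an incremental delta; the content
        # field is None on role/finish chunks, so skip those.
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)

# With a real client:
#   stream = client.chat.completions.create(
#       model="deepseek-v3", messages=[...], stream=True)
#   print(collect_stream(stream))
```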
Real Pricing Comparison
Per million tokens (input / output):
| Model | Provider | Input | Output |
|---|---|---|---|
| GPT-4o | OpenAI | $2.50 | $10.00 |
| GPT-4o mini | OpenAI | $0.15 | $0.60 |
| DeepSeek-V3 | TokenHub | $0.07 | $0.28 |
| DeepSeek-R1 | TokenHub | $0.14 | $0.55 |
| MiniMax-Text-01 | TokenHub | $0.10 | $0.40 |
For high-volume workloads (RAG, agents, batch summarization), DeepSeek-V3 is ~35x cheaper than GPT-4o for output tokens.
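The 35x figure is just the ratio of the output prices in the table; a quick sanity check with a made-up 50M-token monthly workload:

```python
# Output prices per million tokens, from the table above (USD).
GPT4O_OUT = 10.00
DEEPSEEK_V3_OUT = 0.28

# Hypothetical workload: 50M output tokens/month of batch summarization.
tokens_millions = 50
gpt4o_cost = tokens_millions * GPT4O_OUT           # 500.0
deepseek_cost = tokens_millions * DEEPSEEK_V3_OUT  # ~14.0

print(f"ratio: {GPT4O_OUT / DEEPSEEK_V3_OUT:.1f}x")             # ratio: 35.7x
print(f"monthly saving: ${gpt4o_cost - deepseek_cost:.2f}")     # monthly saving: $486.00
```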
When to Use Which Model
A quick mental model from my own usage:
- Cheap & good enough → DeepSeek-V3 (most general tasks)
- Reasoning → DeepSeek-R1 (CoT-style tasks)
- Long context → MiniMax-Text-01 (200K+ tokens)
- Frontier capability → GPT-4o or Claude (still worth it for hard problems)
- Code → Claude Sonnet 4.6 or DeepSeek-V3
The win is being able to A/B test across models without rewriting code.
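Since the request shape never changes, an A/B test is just a loop over model names. A sketch (the helper name and model list are mine; `client` is the gateway-pointed client from earlier):

```python
def compare_models(client, models, prompt):
    """Send the same prompt to each model and collect the replies."""
    results = {}
    for model in models:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        results[model] = response.choices[0].message.content
    return results

# e.g. compare_models(client, ["deepseek-v3", "gpt-4o-mini"], "Summarize: ...")
```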
Why I Open-Sourced the Routing Logic
(Note: TokenHub itself is hosted, but the routing pattern is straightforward.)
The hardest part wasn't the proxy — it was:
- Normalizing function-calling formats across providers
- Handling streaming differences (SSE format quirks)
- Counting tokens before each request for accurate billing
If you're building something similar, the OpenAI spec is the de facto standard. Most providers either match it or have OpenAI-compatible endpoints already.
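To make the streaming point concrete: providers all speak SSE, but the framing differs (keep-alive comments, the `data: [DONE]` sentinel, blank-line separators). A minimal parser for OpenAI-style `data:` lines, as a sketch; a real gateway also has to handle multi-line events and reconnects:

```python
import json

def parse_sse_lines(lines):
    """Yield parsed JSON payloads from OpenAI-style SSE lines.

    Skips keep-alive comments (lines starting with ':'), blank
    separators, and stops at the '[DONE]' sentinel.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith(":"):
            continue
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                return
            yield json.loads(payload)
```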
Try It
If you're tired of juggling AI subscriptions:
- 👉 https://jiatoken.com
- Free credits to start
- Pay-as-you-go, no monthly commitment
- Compatible with OpenAI SDK out of the box
I'd love feedback — especially on which models you'd want added, or pricing pain points.
What's your current setup? Are you using a single provider or juggling multiple?