<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: TokenHub</title>
    <description>The latest articles on DEV Community by TokenHub (@tokenhub_dev).</description>
    <link>https://dev.to/tokenhub_dev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3898502%2F6f13da76-8606-4490-b57e-067e230f3c22.png</url>
      <title>DEV Community: TokenHub</title>
      <link>https://dev.to/tokenhub_dev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tokenhub_dev"/>
    <language>en</language>
    <item>
      <title>I Built an OpenAI-Compatible Gateway to 40+ AI Models (DeepSeek, MiniMax, Claude)</title>
      <dc:creator>TokenHub</dc:creator>
      <pubDate>Sun, 26 Apr 2026 08:31:18 +0000</pubDate>
      <link>https://dev.to/tokenhub_dev/i-built-an-openai-compatible-gateway-to-40-ai-models-deepseek-minimax-claude-2ifk</link>
      <guid>https://dev.to/tokenhub_dev/i-built-an-openai-compatible-gateway-to-40-ai-models-deepseek-minimax-claude-2ifk</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;I was paying for 5+ AI subscriptions (OpenAI, Anthropic, Google, and more), each with its own API key, billing dashboard, and SDK quirks.&lt;/p&gt;

&lt;p&gt;When DeepSeek-V3 dropped at ~$0.28 per million output tokens (vs $10.00 for GPT-4o), I wanted to switch, but swapping SDKs across multiple projects was too much friction.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;TokenHub&lt;/strong&gt; — an OpenAI-compatible gateway that routes to 40+ AI models with a single API key.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;It's a drop-in replacement for the OpenAI SDK. Just change &lt;code&gt;base_url&lt;/code&gt; and &lt;code&gt;api_key&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-tokenhub-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://jiatoken.com/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Use any of 40+ models — DeepSeek, MiniMax, Claude, GPT, Gemini, Llama, etc.
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deepseek-v3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Explain async/await in Python&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. The same code works with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;gpt-4o&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;claude-sonnet-4-6&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;gemini-2.5-pro&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;deepseek-v3&lt;/code&gt; / &lt;code&gt;deepseek-r1&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;minimax-text-01&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;llama-3.3-70b&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;...and more&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Real Pricing Comparison
&lt;/h2&gt;

&lt;p&gt;Per million tokens (input / output):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Input&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GPT-4o&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;$2.50&lt;/td&gt;
&lt;td&gt;$10.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-4o mini&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;$0.15&lt;/td&gt;
&lt;td&gt;$0.60&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DeepSeek-V3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;TokenHub&lt;/td&gt;
&lt;td&gt;$0.07&lt;/td&gt;
&lt;td&gt;$0.28&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DeepSeek-R1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;TokenHub&lt;/td&gt;
&lt;td&gt;$0.14&lt;/td&gt;
&lt;td&gt;$0.55&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MiniMax-Text-01&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;TokenHub&lt;/td&gt;
&lt;td&gt;$0.10&lt;/td&gt;
&lt;td&gt;$0.40&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For high-volume workloads (RAG, agents, batch summarization), DeepSeek-V3 is &lt;strong&gt;~35x cheaper&lt;/strong&gt; than GPT-4o for output tokens.&lt;/p&gt;
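&lt;p&gt;The difference compounds at volume. A rough back-of-the-envelope calculation using the per-million rates from the table above (a sketch for illustration, not billing logic):&lt;/p&gt;

```python
def cost_usd(input_tokens, output_tokens, in_price, out_price):
    # Prices are USD per million tokens, as in the table above.
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example month: 10M input tokens, 2M output tokens.
gpt4o    = cost_usd(10_000_000, 2_000_000, 2.50, 10.00)  # 45.00
deepseek = cost_usd(10_000_000, 2_000_000, 0.07, 0.28)   # 1.26
print(f"GPT-4o: ${gpt4o:.2f}  DeepSeek-V3: ${deepseek:.2f}")
```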

&lt;h2&gt;
  
  
  When to Use Which Model
&lt;/h2&gt;

&lt;p&gt;A quick mental model from my own usage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cheap &amp;amp; good enough&lt;/strong&gt; → DeepSeek-V3 (most general tasks)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reasoning&lt;/strong&gt; → DeepSeek-R1 (CoT-style tasks)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long context&lt;/strong&gt; → MiniMax-Text-01 (200K+ tokens)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontier capability&lt;/strong&gt; → GPT-4o or Claude (still worth it for hard problems)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code&lt;/strong&gt; → Claude Sonnet 4.6 or DeepSeek-V3&lt;/li&gt;
&lt;/ul&gt;
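&lt;p&gt;Those heuristics are easy to encode as a tiny router. A minimal sketch (model names and the context threshold are illustrative; check the gateway's model list for the exact identifiers it accepts):&lt;/p&gt;

```python
def pick_model(task, context_tokens=0):
    """Map a task type to a model name, following the heuristics above."""
    if context_tokens > 200_000:
        return "minimax-text-01"    # long-context specialist
    if task == "reasoning":
        return "deepseek-r1"        # CoT-style tasks
    if task == "code":
        return "claude-sonnet-4-6"  # strong at code
    if task == "frontier":
        return "gpt-4o"             # hard problems worth the premium
    return "deepseek-v3"            # cheap and good enough default
```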

&lt;p&gt;The win is being able to A/B test across models without rewriting code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Open-Sourced the Routing Logic
&lt;/h2&gt;

&lt;p&gt;(Note: TokenHub itself is hosted, but the routing pattern is straightforward.)&lt;/p&gt;

&lt;p&gt;The hardest part wasn't the proxy — it was:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Normalizing function-calling formats&lt;/strong&gt; across providers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handling streaming differences&lt;/strong&gt; (SSE format quirks)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token counting&lt;/strong&gt; for accurate billing before each request&lt;/li&gt;
&lt;/ol&gt;
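&lt;p&gt;For a sense of what point 2 involves: OpenAI-style streams are Server-Sent Events where each &lt;code&gt;data:&lt;/code&gt; line carries a JSON chunk and the stream ends with &lt;code&gt;data: [DONE]&lt;/code&gt;. A minimal parser sketch (assumes one event per line; a real client also has to handle multi-line events, empty choice lists, and reconnection):&lt;/p&gt;

```python
import json

def join_sse_deltas(lines):
    """Collect content deltas from OpenAI-style SSE lines into one string."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments and keep-alive blanks
        payload = line[5:].strip()
        if payload == "[DONE]":
            break  # OpenAI's end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            parts.append(delta)
    return "".join(parts)
```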

&lt;p&gt;If you're building something similar, the OpenAI spec is the de facto standard. Most providers either match it or have OpenAI-compatible endpoints already.&lt;/p&gt;
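&lt;p&gt;Point 1 mostly comes down to converging on the OpenAI tool schema. A sketch of the target shape (for comparison, Anthropic's native format puts the same JSON Schema under &lt;code&gt;input_schema&lt;/code&gt; instead):&lt;/p&gt;

```python
def to_openai_tool(name, description, parameters):
    """Wrap a JSON-Schema parameter spec in the OpenAI 'tools' format."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,  # standard JSON Schema object
        },
    }
```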

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;If you're tired of juggling AI subscriptions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;👉 &lt;a href="https://jiatoken.com" rel="noopener noreferrer"&gt;https://jiatoken.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Free credits to start&lt;/li&gt;
&lt;li&gt;Pay-as-you-go, no monthly commitment&lt;/li&gt;
&lt;li&gt;Compatible with OpenAI SDK out of the box&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'd love feedback — especially on which models you'd want added, or pricing pain points.&lt;/p&gt;

&lt;p&gt;What's your current setup? Are you using a single provider or juggling multiple?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>api</category>
      <category>llm</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
