xujfcn

One API Key, 624+ AI Models: How to Use GPT-4o, Claude, Gemini & DeepSeek Without Managing Multiple Accounts

The Problem

If you're building with AI, you probably have this mess:

  • An OpenAI key for GPT-4o
  • An Anthropic key for Claude
  • A Google key for Gemini
  • A DeepSeek key for DeepSeek V3/R1
  • Maybe a few more for Llama, Qwen, Mistral...

Each provider has its own dashboard, billing, SDK quirks, and rate limits. Switching models means changing imports, auth, and sometimes the entire request format.

There's a simpler way.

The Solution: API Gateway

An API gateway sits between you and all the providers. You get one key, one endpoint, one SDK — and access to everything.

I've been using Crazyrouter, which gives you 624+ models through a single OpenAI-compatible API. Here's how it works.

Setup (60 seconds)

pip install openai

That's it. No special SDK.

from openai import OpenAI

client = OpenAI(
    base_url="https://crazyrouter.com/v1",
    api_key="sk-your-key"  # Get one at crazyrouter.com
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Switch Models in One Line

This is the magic. Same code, different model:

# OpenAI GPT-4o
response = client.chat.completions.create(model="gpt-4o", messages=messages)

# Anthropic Claude Sonnet 4
response = client.chat.completions.create(model="claude-sonnet-4-20250514", messages=messages)

# Google Gemini 2.0 Flash
response = client.chat.completions.create(model="gemini-2.0-flash", messages=messages)

# DeepSeek V3 (insanely cheap)
response = client.chat.completions.create(model="deepseek-chat", messages=messages)

# DeepSeek R1 (reasoning model)
response = client.chat.completions.create(model="deepseek-reasoner", messages=messages)

No SDK changes. No import changes. Just the model string.

Real-World Use Case: Model Comparison

Want to know which model is best for your task? Test them all:

from openai import OpenAI
import time

client = OpenAI(
    base_url="https://crazyrouter.com/v1",
    api_key="sk-your-key"
)

models = [
    "gpt-4o-mini",
    "deepseek-chat",
    "claude-sonnet-4-20250514",
    "gemini-2.0-flash",
    "deepseek-reasoner",
]

prompt = "Write a Python function for longest increasing subsequence with O(n log n) time complexity."

for model in models:
    start = time.time()
    try:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            max_tokens=500
        )
        elapsed = time.time() - start
        tokens = resp.usage.total_tokens
        print(f"\n{'='*50}")
        print(f"Model: {model}")
        print(f"Time: {elapsed:.2f}s | Tokens: {tokens}")
        print(f"Answer: {resp.choices[0].message.content[:200]}...")
    except Exception as e:
        print(f"\n{model}: {e}")

Run this once and you'll know exactly which model fits your needs.

Use It in AI Coding Tools

Cursor

Settings → Models → Override OpenAI Base URL:

  • URL: https://crazyrouter.com/v1
  • API Key: sk-your-key
  • Model: deepseek-chat (budget) or claude-sonnet-4-20250514 (best quality)

Pro tip: You don't need Cursor Pro ($20/mo). Set your own API key and use any model.

Cline (VS Code Extension)

  1. Install "Cline" from VS Code marketplace
  2. Sidebar → Settings → API Provider → OpenAI Compatible
  3. Set:
Base URL: https://crazyrouter.com/v1
API Key:  sk-your-key
Model ID: claude-sonnet-4-20250514

Continue (VS Code / JetBrains)

Edit ~/.continue/config.json:

{
  "models": [
    {
      "title": "Claude Sonnet (Crazyrouter)",
      "provider": "openai",
      "model": "claude-sonnet-4-20250514",
      "apiBase": "https://crazyrouter.com/v1",
      "apiKey": "sk-your-key"
    },
    {
      "title": "DeepSeek V3 (Crazyrouter)",
      "provider": "openai",
      "model": "deepseek-chat",
      "apiBase": "https://crazyrouter.com/v1",
      "apiKey": "sk-your-key"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Autocomplete",
    "provider": "openai",
    "model": "gpt-4o-mini",
    "apiBase": "https://crazyrouter.com/v1",
    "apiKey": "sk-your-key"
  }
}

Aider (Terminal)

pip install aider-chat
export OPENAI_API_KEY="sk-your-key"
export OPENAI_API_BASE="https://crazyrouter.com/v1"

aider --model deepseek-chat

Use It with LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://crazyrouter.com/v1",
    api_key="sk-your-crazyrouter-key",
    model="deepseek-chat"
)

# Everything else stays the same
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a senior Python engineer."),
    ("user", "{question}")
])

chain = prompt | llm | StrOutputParser()
result = chain.invoke({"question": "How to build a FastAPI CRUD app?"})
print(result)

Works with LlamaIndex, AutoGen, and CrewAI too.

Image Generation

Not just text — image generation works too:

response = client.images.generate(
    model="dall-e-3",
    prompt="A cyberpunk cityscape at night with neon lights, digital art",
    size="1024x1024",
    quality="hd"
)
print(response.data[0].url)

Pricing: Why This Matters

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Best For |
|---|---|---|---|
| deepseek-chat | $0.14 | $0.28 | Daily coding, best value |
| gpt-4o-mini | $0.15 | $0.60 | Light tasks |
| gemini-2.0-flash | $0.10 | $0.40 | Long docs, speed |
| gpt-4o | $2.50 | $10.00 | Complex reasoning |
| claude-sonnet-4 | $3.00 | $15.00 | Best coding |
| deepseek-reasoner | $0.55 | $2.19 | Math/logic |

DeepSeek V3 is ~1/18th the price of GPT-4o with surprisingly competitive coding ability.

Real cost example

100 AI coding assists per day (avg 500 input + 200 output tokens each):

| Model | Monthly Cost |
|---|---|
| deepseek-chat | ~$0.38 |
| gpt-4o-mini | ~$0.59 |
| gpt-4o | ~$9.75 |
| claude-sonnet-4 | ~$13.50 |

With DeepSeek V3, heavy daily usage costs less than a cup of coffee per year.
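The arithmetic is easy to sanity-check yourself. This small script recomputes the monthly figures from the per-1M-token rates in the pricing table (rates as quoted at the time of writing; they may drift):

```python
# Recompute the monthly-cost table: 100 assists/day, 30 days,
# 500 input + 200 output tokens per assist.
PRICES = {  # model: (input $/1M tokens, output $/1M tokens)
    "deepseek-chat": (0.14, 0.28),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "claude-sonnet-4": (3.00, 15.00),
}

def monthly_cost(model, assists_per_day=100, days=30,
                 in_tokens=500, out_tokens=200):
    """Dollar cost per month for a given usage pattern."""
    in_rate, out_rate = PRICES[model]
    calls = assists_per_day * days
    return (calls * in_tokens * in_rate + calls * out_tokens * out_rate) / 1e6

for model in PRICES:
    print(f"{model}: ${monthly_cost(model):.2f}/month")
```

Swap in your own call volume and token counts to estimate your bill before committing to a model.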

Environment Variables (Set Once, Use Everywhere)

# Add to ~/.bashrc or ~/.zshrc
export OPENAI_API_KEY="sk-your-crazyrouter-key"
export OPENAI_API_BASE="https://crazyrouter.com/v1"
export OPENAI_BASE_URL="https://crazyrouter.com/v1"

Most tools auto-detect these. Set them once and forget.
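For example, with those variables exported, the OpenAI Python SDK needs no explicit configuration at all. The snippet below sets them in-process purely to demonstrate the effect (normally your shell exports handle it), and the key is a placeholder:

```python
# Demonstrate env-var auto-detection: the OpenAI client reads
# OPENAI_API_KEY and OPENAI_BASE_URL from the environment.
import os

os.environ["OPENAI_API_KEY"] = "sk-your-key"  # placeholder key
os.environ["OPENAI_BASE_URL"] = "https://crazyrouter.com/v1"

try:
    from openai import OpenAI
    client = OpenAI()  # no arguments needed; both values come from env
    print(client.base_url)  # points at the gateway
except ImportError:
    pass  # openai not installed; the variables are still set for other tools
```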

FAQ

Q: Is it really OpenAI-compatible?
A: Yes. Any tool that works with the OpenAI API works with this. Just change the base URL.

Q: Does it add latency?
A: A few tens of milliseconds for the gateway hop. Negligible compared to LLM response times (seconds).

Q: Does it support streaming?
A: Yes. Add stream=True like you normally would.

Q: Function calling / tool use?
A: Fully supported, same OpenAI format.
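For reference, a sketch of tool use in the standard OpenAI `tools` format. `get_weather` is a made-up local function, `dispatch` is my own helper, and the actual request only runs if `CRAZYROUTER_API_KEY` (a demo variable name) is set:

```python
# Tool use with the standard OpenAI "tools" schema.
import json
import os

def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stand-in for a real weather lookup

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch(tool_call):
    """Route a tool call returned by the model to the local function."""
    args = json.loads(tool_call.function.arguments)
    return {"get_weather": get_weather}[tool_call.function.name](**args)

if os.environ.get("CRAZYROUTER_API_KEY"):  # demo guard; set a real key to run
    from openai import OpenAI
    client = OpenAI(base_url="https://crazyrouter.com/v1",
                    api_key=os.environ["CRAZYROUTER_API_KEY"])
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=TOOLS,
    )
    for call in resp.choices[0].message.tool_calls or []:
        print(dispatch(call))
```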

TL;DR

  • One API key → 624+ models (GPT-4o, Claude, Gemini, DeepSeek, Llama, Qwen...)
  • OpenAI SDK compatible → change base_url and you're done
  • Switch models by changing one string
  • Works with Cursor, Cline, Continue, Aider, LangChain, and everything else
  • DeepSeek V3 at $0.14/1M tokens is absurdly good value

If this helped, drop a ❤️ and follow for more AI dev content.
