GWEN

Posted on Jun 29

How to switch AI models without rewriting your app

#ai #llm #api #python

Most AI apps start with one model provider.

That is usually the right choice. For a first version, you want one SDK, one API key, one billing page, and one model name. Simple is good when you are trying to ship.

But once the product grows, the model decision gets more complicated.

You may want to test another model because:

one model is better at reasoning
another model is faster for chat
another one is cheaper for background jobs
another model handles long context better
you want a fallback when one provider is slow or unavailable

The annoying part is that switching models is often not just changing a string.

It can mean adding another SDK, another API key, another request format, another dashboard, and another set of provider-specific edge cases.

That gets messy quickly.

Before: direct OpenAI integration

A first version might look like this:

python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_OPENAI_API_KEY",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": "Write a short onboarding message for a developer tool."
        }
    ],
)

print(response.choices[0].message.content)

This is clean and totally fine.

But if you later want to compare Claude, Gemini, DeepSeek, or another model family, you may not want to rewrite your AI integration around each provider.

After: use an OpenAI-compatible gateway

One practical option is to use an OpenAI-compatible API gateway.

Your app keeps using the OpenAI SDK and its request format, but the gateway lets you route requests to different model families through one endpoint.

I work on the TokenBay team, so the example below uses TokenBay. The general idea applies to any OpenAI-compatible gateway.

python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenbay.com/v1",
    api_key="YOUR_TOKENBAY_API_KEY",
)

response = client.chat.completions.create(
    model="gpt-5.4-mini",
    messages=[
        {
            "role": "user",
            "content": "Write a short onboarding message for a developer tool."
        }
    ],
)

print(response.choices[0].message.content)

Apart from the model name, the only change is two arguments to the client constructor:

python
base_url="https://api.tokenbay.com/v1",
api_key="YOUR_TOKENBAY_API_KEY",

That is the useful part.

You keep the familiar OpenAI client shape, but you are no longer wiring every provider separately.

Try another model

Once your app uses an OpenAI-compatible endpoint, testing another supported model can be as simple as changing the model name.

python
response = client.chat.completions.create(
    model="claude-sonnet-4.6",
    messages=[
        {
            "role": "user",
            "content": "Write a short onboarding message for a developer tool."
        }
    ],
)

Or:

python
response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[
        {
            "role": "user",
            "content": "Write a short onboarding message for a developer tool."
        }
    ],
)

The point is not that every model behaves the same.

They do not.

The point is that your business logic should not need to change every time you want to compare models.
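One way to make that comparison concrete is a small helper that sends the same prompt to several models and records the reply and the elapsed time. This is a minimal sketch, not part of any SDK: `compare_models` is a hypothetical name, and it assumes `client` is an OpenAI SDK client (or anything with the same `chat.completions.create` shape).

```python
import time


def compare_models(client, models, prompt):
    """Send one prompt to each model; collect the reply, any error, and latency.

    Hypothetical helper: `client` only needs the OpenAI SDK's
    chat.completions.create(model=..., messages=...) shape.
    """
    results = []
    for model in models:
        started = time.perf_counter()
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            text = response.choices[0].message.content
            error = None
        except Exception as exc:
            # Record the failure and keep going so one model does not stop the run.
            text = None
            error = str(exc)
        elapsed = time.perf_counter() - started
        results.append(
            {"model": model, "text": text, "error": error, "seconds": elapsed}
        )
    return results
```

With the gateway client from the earlier example, you could call it with the same model names used in this post, such as `["gpt-5.4-mini", "claude-sonnet-4.6", "gemini-2.5-pro"]`, and read the outputs side by side. Wall-clock time from a single call is only a rough signal, so treat it as a starting point rather than a benchmark.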

Put the model in config

For a real app, I would keep the base URL, API key, and model name in environment variables:

bash
LLM_BASE_URL=https://api.tokenbay.com/v1
LLM_API_KEY=YOUR_TOKENBAY_API_KEY
LLM_MODEL=gpt-5.4-mini

Then your application code can stay stable while you test different models.

Change config, redeploy, compare results.

Very boring. Very useful.
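On the application side, reading those variables might look like the sketch below. The `LLMSettings` class and the `load_llm_settings` and `make_client` functions are names I made up for illustration; the only real API used is the OpenAI SDK's `OpenAI(base_url=..., api_key=...)` constructor shown earlier.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMSettings:
    base_url: str
    api_key: str
    model: str


def load_llm_settings(env=None):
    """Read the three LLM_* variables and fail early if any is missing."""
    env = os.environ if env is None else env
    names = ("LLM_BASE_URL", "LLM_API_KEY", "LLM_MODEL")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return LLMSettings(
        base_url=env["LLM_BASE_URL"],
        api_key=env["LLM_API_KEY"],
        model=env["LLM_MODEL"],
    )


def make_client(settings):
    """Build an OpenAI SDK client pointed at the configured endpoint."""
    # Imported lazily so settings can be loaded and tested without the SDK.
    from openai import OpenAI

    return OpenAI(base_url=settings.base_url, api_key=settings.api_key)
```

The rest of the app then passes `settings.model` as the `model` argument and never mentions a provider by name. Failing at startup on a missing variable is a deliberate choice here: a misconfigured deploy is easier to notice than a request that fails later.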

When this pattern helps

This setup is useful if you are:

building an AI SaaS product
comparing cost and quality across models
using different models for chat, reasoning, extraction, or fallback
trying to avoid provider-specific code too early
managing multiple projects or API keys

It does not magically solve model selection. You still need to test output quality, latency, pricing, context length, and reliability.

But it does make the integration layer much simpler.
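The fallback case from the list above is a good example of that simplicity. Because every model sits behind the same client, a fallback can be a plain loop over model names. The sketch below is an assumption about how you might write it, not a TokenBay feature: `complete_with_fallback` is a hypothetical helper, and it catches every exception for brevity.

```python
def complete_with_fallback(client, models, messages):
    """Try each model in order; return (model, text) from the first success.

    Hypothetical helper: `client` only needs the OpenAI SDK's
    chat.completions.create(model=..., messages=...) shape.
    """
    errors = []
    for model in models:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=messages,
            )
            return model, response.choices[0].message.content
        except Exception as exc:
            # Broad catch for the sketch; real code should catch only the
            # SDK's timeout and API error types.
            errors.append(f"{model}: {exc}")
    raise RuntimeError("All models failed: " + "; ".join(errors))
```

Returning the model name alongside the text lets you log which model actually answered, which matters once you start comparing quality and cost.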

When direct integration may be better

A gateway is not always the right choice.

Direct provider integration may be better if:

you need provider-specific beta features immediately
you already have enterprise contracts
your compliance process requires direct vendor relationships
your app only uses one model and probably always will

That is a fair tradeoff.

The point is not "always use a gateway."

The point is this:

If you are going to test multiple models anyway, your app should not need a rewrite every time.

TokenBay example

TokenBay is an OpenAI-compatible API gateway for accessing models such as GPT, Claude, Gemini, DeepSeek, and others through one endpoint and one API key.

It includes: