DEV Community

GWEN

How to switch AI models without rewriting your app

Most AI apps start with one model provider.

That is usually the right choice. For a first version, you want one SDK, one API key, one billing page, and one model name. Simple is good when you are trying to ship.

But once the product grows, the model decision gets more complicated.

You may want to test another model because:

  • one model is better at reasoning
  • another model is faster for chat
  • another one is cheaper for background jobs
  • another model handles long context better
  • you want a fallback when one provider is slow or unavailable

The annoying part is that switching models is often not just changing a string.

It can mean adding another SDK, another API key, another request format, another dashboard, and another set of provider-specific edge cases.

That gets messy quickly.

Before: direct OpenAI integration

A first version might look like this:

python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_OPENAI_API_KEY",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": "Write a short onboarding message for a developer tool."
        }
    ],
)

print(response.choices[0].message.content)

This is clean and totally fine.

But if you later want to compare Claude, Gemini, DeepSeek, or another model family, you may not want to rewrite your AI integration around each provider.

After: use an OpenAI-compatible gateway

One practical option is to use an OpenAI-compatible API gateway.

Your app keeps using the OpenAI SDK style, but the gateway lets you route requests to different model families through one endpoint.

I work on the TokenBay team, so the example below uses TokenBay. The general idea applies to any OpenAI-compatible gateway.

python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenbay.com/v1",
    api_key="YOUR_TOKENBAY_API_KEY",
)

response = client.chat.completions.create(
    model="gpt-5.4-mini",
    messages=[
        {
            "role": "user",
            "content": "Write a short onboarding message for a developer tool."
        }
    ],
)

print(response.choices[0].message.content)

The main change is just two client arguments (the model name also has to be one the gateway supports):

python
base_url="https://api.tokenbay.com/v1",
api_key="YOUR_TOKENBAY_API_KEY",

That is the useful part.

You keep the familiar OpenAI client shape, but you are no longer wiring every provider separately.

Try another model

Once your app uses an OpenAI-compatible endpoint, testing another supported model can be as simple as changing the model name.

python
response = client.chat.completions.create(
    model="claude-sonnet-4.6",
    messages=[
        {
            "role": "user",
            "content": "Write a short onboarding message for a developer tool."
        }
    ],
)


Or:

python
response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[
        {
            "role": "user",
            "content": "Write a short onboarding message for a developer tool."
        }
    ],
)


The point is not that every model behaves the same.

They do not.

The point is that your business logic should not need to change every time you want to compare models.
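
To make that concrete, here is a minimal sketch of a comparison helper. It assumes `client` is an OpenAI SDK client pointed at an OpenAI-compatible endpoint, as in the examples above. The function names `ask` and `compare_models` are my own for illustration and are not part of any SDK.

```python
import time


def ask(client, model, prompt):
    """Send one user prompt to one model and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


def compare_models(client, models, prompt):
    """Run the same prompt against each model and record text, error, and latency."""
    results = {}
    for model in models:
        started = time.perf_counter()
        try:
            text, error = ask(client, model, prompt), None
        except Exception as exc:  # e.g. the model is unsupported or rate-limited
            text, error = None, str(exc)
        results[model] = {
            "text": text,
            "error": error,
            "seconds": time.perf_counter() - started,
        }
    return results
```

Calling `compare_models(client, ["gpt-5.4-mini", "claude-sonnet-4.6", "gemini-2.5-pro"], prompt)` returns one entry per model. Only the list of model names changes between experiments. The calling code stays the same.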

Put the model in config

For a real app, I would keep the base URL, API key, and model name in environment variables:

bash
LLM_BASE_URL=https://api.tokenbay.com/v1
LLM_API_KEY=YOUR_TOKENBAY_API_KEY
LLM_MODEL=gpt-5.4-mini

Then your application code can stay stable while you test different models.

Change config, redeploy, compare results.

Very boring. Very useful.
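
Here is a rough sketch of how the application side could read those variables. The variable names match the ones above. The helpers `load_llm_config` and `make_client` are hypothetical names I picked for this example.

```python
import os


def load_llm_config(env=os.environ):
    """Read the three LLM settings, failing loudly if any is missing or empty."""
    names = ("LLM_BASE_URL", "LLM_API_KEY", "LLM_MODEL")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "base_url": env["LLM_BASE_URL"],
        "api_key": env["LLM_API_KEY"],
        "model": env["LLM_MODEL"],
    }


def make_client(config):
    """Build an OpenAI SDK client pointed at the configured endpoint."""
    # Imported here so that config loading can be tested without the SDK installed.
    from openai import OpenAI

    return OpenAI(base_url=config["base_url"], api_key=config["api_key"])
```

At startup you would call `config = load_llm_config()` and `client = make_client(config)`, then pass `model=config["model"]` to `client.chat.completions.create(...)`. Failing early on a missing variable is a design choice: a typo in config shows up at deploy time, not on the first user request.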

When this pattern helps

This setup is useful if you are:

  • building an AI SaaS product
  • comparing cost and quality across models
  • using different models for chat, reasoning, extraction, or fallback
  • trying to avoid provider-specific code too early
  • managing multiple projects or API keys

It does not magically solve model selection. You still need to test output quality, latency, pricing, context length, and reliability.

But it does make the integration layer much simpler.
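
As an illustration of the "different models for different tasks, plus fallback" case, here is a small sketch. The task-to-model table is purely an assumption for the example, not a recommendation. It reuses the model names from earlier in this post, and `complete` is a hypothetical helper name.

```python
# Hypothetical routing table: for each task, models to try in order.
# Which model suits which task is something you have to measure yourself.
MODEL_FOR_TASK = {
    "chat": ["gpt-5.4-mini", "gemini-2.5-pro"],
    "reasoning": ["claude-sonnet-4.6", "gemini-2.5-pro"],
}


def complete(client, task, prompt, routes=MODEL_FOR_TASK):
    """Try each model configured for the task, in order, until one answers.

    Returns a (model_name, reply_text) tuple so callers can log which model served.
    """
    errors = []
    for model in routes[task]:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return model, response.choices[0].message.content
        except Exception as exc:  # broad on purpose in this sketch
            errors.append(f"{model}: {exc}")
    raise RuntimeError(f"All models failed for task {task!r}: " + "; ".join(errors))
```

In production code I would probably catch narrower exceptions, such as the SDK's `openai.APIConnectionError` or `openai.RateLimitError`, so that real bugs are not silently retried on another model.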

When direct integration may be better

A gateway is not always the right choice.

Direct provider integration may be better if:

  • you need provider-specific beta features immediately
  • you already have enterprise contracts
  • your compliance process requires direct vendor relationships
  • your app only uses one model and probably always will

That is a fair tradeoff.

The point is not "always use a gateway."

The point is this:

If you are going to test multiple models anyway, your app should not need a rewrite every time.

TokenBay example

TokenBay is an OpenAI-compatible API gateway for accessing models such as GPT, Claude, Gemini, DeepSeek, and others through one endpoint and one API key.

It includes:

  • pay-as-you-go billing
  • API key management
  • usage logs
  • per-key limits

If you want to test this pattern, you can try TokenBay here:

Try TokenBay: https://www.tokenbay.com/?utm_source=devto&utm_medium=community_content&utm_campaign=week1_free_content

Current launch offer:

  • 15% off most models
  • 500 free credits
  • invite a friend and get 200 credits each

I would love feedback from builders:

  • Do you prefer direct provider APIs or one OpenAI-compatible endpoint?
  • How do you currently compare model cost and quality?
  • What would make you trust or not trust an AI model gateway?
