Claude Code always sends claude-sonnet-4-6 in the request body. That string goes to whatever base URL you've configured.
Here's what most people don't realize: that string doesn't have to end up at Anthropic.
It doesn't even have to end up at a Claude model.
The Model Name Is a Routing Hint, Not a Destination
When Claude Code makes a request, it sends something like this:
```json
{
  "model": "claude-sonnet-4-6",
  "messages": [...],
  "stream": true
}
```
If your ANTHROPIC_BASE_URL points to a local proxy instead of api.anthropic.com, that proxy receives the request first. It can read the model field and decide what to do with it.
That decision is entirely up to you.
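As a sketch of the idea in Python — the routing table below mirrors the mappings described later in this post, but the backend labels and config shape are illustrative, not CliGate's actual implementation:

```python
import json

# Toy routing table: incoming Anthropic model name -> backend label.
# Illustrative only; CliGate configures its real table in its Settings UI.
ROUTES = {
    "claude-sonnet-4-6": "chatgpt-pool",
    "claude-opus-4-6": "anthropic-api",
    "claude-haiku-4-5": "kilo-free",
}

def pick_backend(raw_body: bytes) -> str:
    """Read the model field out of the request body and choose a backend."""
    model = json.loads(raw_body)["model"]
    # Unknown model names fall through to Anthropic unchanged.
    return ROUTES.get(model, "anthropic-api")

body = b'{"model": "claude-sonnet-4-6", "messages": [], "stream": true}'
print(pick_backend(body))  # chatgpt-pool
```

The whole trick is that one dictionary lookup: the client's model string is an index into your routing policy, nothing more.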
What CliGate Does With It
CliGate is a local proxy that sits at localhost:8081. Every AI coding tool I use — Claude Code, Codex CLI, Gemini CLI — routes through it.
When a request for claude-sonnet-4-6 arrives, CliGate checks its routing table:
```text
claude-sonnet-4-6 → ChatGPT account pool → GPT-5.2 Codex
claude-opus-4-6   → ChatGPT account pool → GPT-5.3 Codex
claude-haiku-4-5  → Kilo AI (free)       → DeepSeek R1 / Qwen3
```
Claude Code asked for claude-sonnet-4-6. What actually handles the request is GPT-5.2 Codex, via a rotating pool of ChatGPT accounts. The response comes back in Anthropic's response format. Claude Code never knows the difference.
Why This Works
The magic is in protocol translation. CliGate translates between Anthropic's Messages API format and OpenAI's Chat Completions format at the proxy layer. Claude Code speaks Anthropic protocol. GPT-5.2 Codex speaks OpenAI protocol. The proxy bridges them invisibly.
From Claude Code's perspective, it sent a request and got back a valid streaming Anthropic response. The model name in the response is echoed back correctly. Everything behaves as expected.
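A stripped-down illustration of that bridge in Python. This is a sketch of the concept only — it handles plain text messages and a non-streaming response, and the upstream model name is an assumption; the real translation also has to cover tool calls, content blocks, and streaming event formats:

```python
def anthropic_to_openai(req: dict) -> dict:
    """Translate an Anthropic Messages request into OpenAI Chat
    Completions shape (text-only; tools and streaming omitted)."""
    messages = []
    if "system" in req:
        # Anthropic carries the system prompt in a top-level field;
        # OpenAI expects it as the first message.
        messages.append({"role": "system", "content": req["system"]})
    messages.extend(req["messages"])
    return {
        "model": "gpt-5.2-codex",  # the model that will actually answer
        "messages": messages,
        "max_tokens": req.get("max_tokens", 1024),
    }

def openai_to_anthropic(resp: dict, requested_model: str) -> dict:
    """Wrap an OpenAI completion back into an Anthropic-style message,
    echoing the model name the client originally asked for."""
    return {
        "type": "message",
        "role": "assistant",
        "model": requested_model,  # Claude Code sees the name it sent
        "content": [{"type": "text",
                     "text": resp["choices"][0]["message"]["content"]}],
        "stop_reason": "end_turn",
    }
```

Note the `requested_model` echo in the response path — that is what keeps the client's view of the world consistent.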
The same logic applies to the haiku model. When Claude Code sends a quick completion request using claude-haiku-4-5, that gets routed to DeepSeek R1 or Qwen3 through Kilo AI — completely free, no API key required. Claude Code sees a streaming Anthropic response and moves on.
Setting This Up
The routing table lives in CliGate's Settings tab. Each model can be mapped to:
- A specific ChatGPT account (or the account pool, for automatic rotation)
- A Claude account (direct Anthropic protocol, no translation needed)
- An API key (OpenAI, Anthropic, Azure, Vertex AI, Gemini, etc.)
- The free routing path via Kilo AI
You can also set a Priority Mode for each model: account pool first (free tier), or API key first (more reliable). If the first option fails or is exhausted, the proxy falls back to the next one automatically.
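The fallback behavior amounts to trying a priority-ordered list of backends. A minimal sketch (the backend functions here are hypothetical; a real proxy would also distinguish rate-limit errors from hard failures):

```python
def call_with_fallback(request: dict, backends: list) -> dict:
    """Try each backend in priority order; on failure (exhausted pool,
    rate limit, network error), fall through to the next one."""
    errors = []
    for backend in backends:
        try:
            return backend(request)
        except Exception as exc:
            errors.append(exc)
    raise RuntimeError(f"all {len(backends)} backends failed: {errors}")

# Hypothetical backends for illustration.
def account_pool(req):
    raise RuntimeError("pool exhausted")  # free tier used up

def api_key(req):
    return {"model": req["model"], "content": "ok"}  # paid key answers

result = call_with_fallback({"model": "claude-sonnet-4-6"},
                            [account_pool, api_key])
print(result["content"])  # ok
```

Swapping the list order is exactly the "account pool first" vs. "API key first" choice in Priority Mode.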
One practical configuration I've settled on:
```text
claude-sonnet-4-6: ChatGPT account pool (4 accounts, round-robin)
claude-opus-4-6:   Anthropic API key (reserved for long context work)
claude-haiku-4-5:  Free routing (DeepSeek R1 via Kilo AI)
```
This means the vast majority of my coding requests go through the ChatGPT account pool at no API cost. The Anthropic key only gets touched for heavy reasoning tasks. Haiku requests are free.
The Part That Surprised Me
I expected some quality degradation when routing sonnet requests to GPT-5.2 Codex. For most coding tasks, I didn't notice any.
Code generation, test writing, refactoring, explaining stack traces — these all behaved identically from Claude Code's interface. The model was different. The output quality was comparable. The cost was zero (account pool, no API billing).
The cases where I do notice a difference are long multi-file reasoning tasks, where I've configured the fallback to use the Anthropic API key directly. But those are a small fraction of the total request volume, as the usage stats from yesterday confirmed.
Why This Matters Beyond Cost
The cost savings are real, but that's not the most interesting part.
The more interesting implication is that your AI coding tool no longer locks you into a single provider's ecosystem. You chose Claude Code for its UX and agent loop — not necessarily because Anthropic's API is the only place you want your requests going.
With a proxy routing layer, those are two separate decisions. You can use the tool you like with the backend that makes sense for each request type.
The model name in your config is just a string. Where it goes is up to the routing layer.
CliGate is open source: github.com/codeking-ai/cligate
Curious what routing setups others have tried — are you using a single provider for everything, or have you experimented with mixing backends?