Wiring Claude Code up to OpenAI models is one of the more frequent asks from engineering teams that have committed to Anthropic's terminal coding agent but want freedom at the model layer. Out of the box, Claude Code only speaks to Anthropic's API, which closes the door on cross-provider benchmarking, redundancy planning, and certain cost-saving setups. Bifrost, an open-source AI gateway built by Maxim AI, removes that wall by transparently translating Claude Code's Anthropic-format payloads into OpenAI's Chat Completions schema. What follows is the full setup playbook: installing Bifrost, switching model tiers, and confirming that tool calls behave correctly under load.
Why Teams Want Claude Code on OpenAI Models
Claude Code's core strengths (multi-file refactors, terminal automation, file-level edits) have made it a default choice for many engineering organizations. That said, real production teams often have legitimate reasons to run Claude Code with OpenAI models for some or all of that traffic:
- Token economics: At current rates, GPT-5.2 input pricing of $1.75 per million tokens and output pricing of $14 per million can come in below Claude Sonnet on input-heavy workloads.
- Cross-model evaluation: Comparing how the GPT-5 family handles a specific repository, without rebuilding the surrounding agent setup from scratch.
- Vendor diversification: Removing single-provider risk from a workflow that engineers depend on every day.
- Regulatory routing: Certain organizations are required to send LLM traffic through Azure OpenAI for residency or compliance reasons.
- Workload-specific strengths: Tapping OpenAI models on terminal-heavy benchmarks where they currently hold leads.
None of these are achievable without an intermediary. Claude Code only knows how to speak Anthropic's Messages API, and OpenAI uses an entirely different request format. Bifrost bridges this gap by exposing a fully Anthropic-compatible endpoint that rewrites traffic in flight to whichever upstream you target.
How Bifrost Translates Claude Code Traffic for OpenAI
Bifrost operates as a high-performance proxy positioned between Claude Code and the upstream LLM providers. The full traffic path looks like this:
- Rather than reaching api.anthropic.com, Claude Code sends an Anthropic Messages API call to Bifrost.
- Bifrost reads the request, parses the provider/model identifier (such as openai/gpt-5), and rewrites the payload into OpenAI's Chat Completions shape.
- The reshaped request goes to OpenAI's endpoint with the configured credentials.
- OpenAI's reply is converted back into the Anthropic response schema before returning to Claude Code.
- From Claude Code's perspective, nothing has changed; the response looks like a native Anthropic answer.
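To make the translation concrete, here is a minimal sketch of an Anthropic-format request sent straight to the gateway with curl, targeting an OpenAI model. It assumes the local setup from the steps below (Bifrost on http://localhost:8080, OpenAI already registered as a provider) and that the Anthropic-compatible endpoint lives under the /anthropic prefix used later in this guide; adjust the path, headers, and key handling to match your instance.
# Hypothetical smoke test: Anthropic Messages payload in, OpenAI completion out
curl -s http://localhost:8080/anthropic/v1/messages \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "x-api-key: dummy" \
  -d '{
    "model": "openai/gpt-5",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Say hello from the gateway"}]
  }'
# The body that comes back is shaped like a native Anthropic response,
# even though the completion was generated by OpenAI.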
All of this conversion happens at the gateway layer; in benchmarks at a sustained 5,000 RPS, it adds only 11 microseconds of overhead per request. Bifrost maintains independent performance benchmarks covering throughput and latency across every supported provider.
The whole approach rests on Bifrost's drop-in replacement design. Only one thing changes for Claude Code: the base URL. Everything else stays untouched.
Step 1: Install Bifrost and Bring It Up Locally
Bifrost runs locally as a gateway process that Claude Code talks to over HTTP. The quickest way to get an instance running is via NPX:
npx -y @maximhq/bifrost
That single command brings Bifrost up on http://localhost:8080, complete with a built-in web dashboard for configuring providers, inspecting request logs, and watching live traffic. No YAML, no env scaffolding; the gateway is genuinely zero-config at startup.
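If you want to confirm the gateway is actually listening before going further, a quick probe of the dashboard port is enough; this assumes the default port from the command above.
# Expect an HTTP status code back; the same URL opens the dashboard in a browser
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080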
For deployments past local development, Docker is the cleaner option:
docker run -p 8080:8080 -v $(pwd)/data:/app/data maximhq/bifrost
Teams running Bifrost as shared infrastructure across an organization typically prefer the Kubernetes deployment guide.
Step 2: Add OpenAI as a Configured Provider
With the gateway up, navigate to the web UI on http://localhost:8080 and register OpenAI as a provider. You'll need two things:
- An OpenAI API key authorized for the models you intend to use (GPT-5, GPT-4o, GPT-4o-mini, etc.).
- A friendly name for the key, useful later for tracking and rotation.
Provider setup can happen via the UI, the provider configuration API, or a config.json file. If you prefer the file-based path, the structure looks something like this:
{
  "providers": {
    "openai": {
      "keys": [
        {
          "name": "openai-primary",
          "value": "env.OPENAI_API_KEY",
          "models": [],
          "weight": 1.0
        }
      ]
    }
  }
}
The env.OPENAI_API_KEY reference pulls the actual key from the environment, keeping secrets out of source-controlled config. For enterprise rollouts, HashiCorp Vault, AWS Secrets Manager, and similar integrations replace plain environment variables entirely.
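In practice, the wiring is just two steps in the shell: export the real key so the env.OPENAI_API_KEY reference has something to resolve, then start (or restart) the gateway. The key value below is a placeholder.
# Hypothetical key value; the real secret never lands in source-controlled config
export OPENAI_API_KEY=sk-your-openai-key
# Start the gateway so it can resolve the env reference from config.json
npx -y @maximhq/bifrost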
Step 3: Redirect Claude Code to the Gateway
Once Bifrost is running and OpenAI is wired up, swing Claude Code over to the gateway by exporting two environment variables:
export ANTHROPIC_API_KEY=your-bifrost-virtual-key
export ANTHROPIC_BASE_URL=http://localhost:8080/anthropic
If your Bifrost instance does not have virtual keys turned on, set ANTHROPIC_API_KEY to dummy. When virtual keys are enabled, that value is the anchor point for governance: budgets, rate limits, and per-key tool filtering all attach to it.
After exporting the variables, fire up claude in a fresh terminal. Every Claude Code request now flows through Bifrost.
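A quick way to confirm the rerouting took is a one-shot prompt from the command line; this assumes your Claude Code build supports the non-interactive -p/--print flag, and the request should show up in Bifrost's live request log.
# Hypothetical smoke test: one prompt through the gateway, then exit
claude -p "Reply with the single word: ok"
# Check the request log in the dashboard at http://localhost:8080 to confirm it went through Bifrost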
Step 4: Swap Model Tiers Out for OpenAI Equivalents
Internally, Claude Code organizes requests into three tiers: Sonnet (the workhorse for most tasks), Opus (heavy reasoning), and Haiku (quick, lightweight calls). Each one can be remapped independently with environment variables:
# Remap the Sonnet tier to GPT-5 for everyday coding work
export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5"
# Remap the Opus tier to GPT-4o for deeper reasoning
export ANTHROPIC_DEFAULT_OPUS_MODEL="openai/gpt-4o"
# Leave Haiku on Anthropic for low-latency operations
export ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4-5-20251001"
The provider/model notation tells Bifrost exactly where to send the request. Any provider configured in your Bifrost instance is fair game, and Bifrost supports 20+ providers: Anthropic, AWS Bedrock, Google Vertex AI, Azure OpenAI, Groq, Mistral, Cohere, and Ollama for local inference, among others.
To route everything through OpenAI, override all three tiers:
export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5"
export ANTHROPIC_DEFAULT_OPUS_MODEL="openai/gpt-4o"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="openai/gpt-4o-mini"
Claude Code can also be launched with a one-shot model via the --model flag, or you can change models mid-conversation using /model openai/gpt-5 from inside the agent. The handoff is immediate, and existing context carries forward.
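For example, a one-shot launch pinned to an OpenAI model looks like this (the model name is just the example used throughout this guide); the /model command works the same way from inside a running session.
# Start a session that uses GPT-5 for this run only
claude --model openai/gpt-5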
Step 5: Confirm That Tool Calls Actually Work
This is the step most teams skip, and the one that matters most when running Claude Code with OpenAI models. Claude Code leans heavily on tool calls for file edits, terminal commands, and code modifications. Not every model, and not every provider, streams tool-call arguments correctly. Before you call the configuration done, run a tool-heavy task and check the basics:
- File reads and writes finish cleanly.
- Multi-step shell commands run without empty-argument failures.
- Long-running edits keep their context across multiple turns.
GPT-5 and GPT-4o from OpenAI both support native tool calling and behave reliably with Claude Code. Some aggregator-style services, however, do not stream function call arguments properly, which causes Claude Code to break silently on file operations. If basic chat works but file edits don't, the most likely culprit is broken tool-call streaming on the upstream provider; in that case, point Bifrost at a different provider for that tier.
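To test tool-call plumbing against the gateway directly rather than through the agent, a sketch like the one below can help. It reuses the same Anthropic-compatible endpoint as earlier with a hypothetical tool definition; the thing to check is that the response includes a tool_use block whose input object is populated rather than empty.
# Hypothetical tool definition; the point is whether usable arguments come back
curl -s http://localhost:8080/anthropic/v1/messages \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "x-api-key: dummy" \
  -d '{
    "model": "openai/gpt-5",
    "max_tokens": 256,
    "tools": [{
      "name": "read_file",
      "description": "Read a file from disk",
      "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"]
      }
    }],
    "messages": [{"role": "user", "content": "Read README.md"}]
  }'
# Look for a content block of type "tool_use" with a non-empty "input" object.
# If arguments are empty or truncated here, file operations inside Claude Code will fail the same way.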
Step 6: Layer in Failover, Caching, and Governance
The moment Claude Code is going through Bifrost, the gateway's broader feature set is available without any further work on the Claude Code side:
- Provider failover: Set up fallback chains so that an OpenAI rate limit or outage transparently shifts traffic to Anthropic, Bedrock, or Vertex without dropping the user's session.
- Semantic response caching: Bifrost's semantic caching trims token spend on semantically similar prompts, which adds up fast across a team using Claude Code throughout the day.
- Virtual keys with budgets: Virtual keys carry per-engineer or per-team budgets, rate limits, and access policies, giving managers hierarchical cost visibility across teams and customers.
- Telemetry and traces: Bifrost emits OpenTelemetry spans and Prometheus metrics natively, so every Claude Code interaction shows up in Grafana, New Relic, or Datadog.
- Policy enforcement: For regulated workloads, enterprise guardrails plug in AWS Bedrock Guardrails, Azure Content Safety, and Patronus AI to apply content policy across Claude Code traffic.
If you're operating Claude Code at scale, the Claude Code integration page walks through the full configuration surface, including AWS Bedrock passthrough and Google Vertex AI authentication. Teams looking at broader terminal-agent rollouts should also check the CLI agents resource page.
Making the Configuration Stick
Setting environment variables in a single shell is fine for testing, but for everyday usage you'll want them persisted. Drop the exports into ~/.bashrc, ~/.zshrc, or whichever shell config you maintain:
# Bifrost gateway connection
export ANTHROPIC_API_KEY=your-bifrost-virtual-key
export ANTHROPIC_BASE_URL=http://localhost:8080/anthropic
# Tier-level model overrides
export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5"
export ANTHROPIC_DEFAULT_OPUS_MODEL="openai/gpt-4o"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="openai/gpt-4o-mini"
Running Claude Code inside VS Code instead? Install the Claude Code extension and feed those same variables into its settings. Because Bifrost works at the protocol layer, every Claude Code surface (terminal, VS Code, JetBrains plugins) behaves identically once routed through the gateway.
Start Running Claude Code with OpenAI Models
Pointing Claude Code at Bifrost is a few minutes of work and unlocks the multi-provider model flexibility most engineering teams want in 2026. With it in place, you can run Claude Code with OpenAI models, Anthropic models, Bedrock-hosted Claude, Vertex Gemini, Groq's open-weight models, or Ollama on your laptop, all without leaving the Claude Code interface. A single environment variable governs which model is in play.
To explore enterprise-grade Claude Code deployments (clustering, RBAC, in-VPC rollouts, immutable audit logs), book a demo with the Bifrost team, or head over to the Bifrost GitHub repository and start running Claude Code with OpenAI models today.