While Claude Code stands out as a premier agentic coding assistant that operates directly within the terminal, it is natively tethered to Anthropic’s proprietary model ecosystem. This restriction creates a significant bottleneck for engineering teams seeking versatility. Organizations wishing to use models like GPT-5 or Gemini 2.5 Pro, providers like Groq, or cloud infrastructure such as AWS Bedrock and Azure are often forced to juggle disparate toolsets or manually tweak environment variables for every new session.
Bifrost CLI effectively removes these barriers. Designed as an interactive terminal utility, it bridges Claude Code (along with alternatives like Codex CLI, Gemini CLI, and Opencode) to the Bifrost AI Gateway. This integration empowers developers to operate Claude Code using any model from any provider via a singular command, eliminating the need for environment variable exports, configuration file edits, or provider-specific onboarding.
The Case for an AI Gateway in Claude Code Workflows
Claude Code relies extensively on tool calling for executing file operations, terminal commands, and code edits. By default, all prompts are routed straight to the Anthropic API. For scaling engineering teams, this architecture presents several critical challenges:
- Restricted Model Choice: Teams lack the agility to switch to OpenAI, Google, Mistral, or self-hosted alternatives without manually reconfiguring environment variables for each instance.
- Single Point of Failure: There is no inherent automatic failover. If Anthropic’s API suffers downtime or rate limits, the Claude Code session freezes completely, lacking a mechanism to pivot to an alternative provider.
- Lack of Financial Oversight: Without an intermediary gateway layer, enforcing per-developer budgets, usage caps, or detailed tracking across a team using Claude Code simultaneously becomes effectively impossible.
- Fragmented Visibility: Individual sessions do not generate unified telemetry, leaving engineering leads blind to usage patterns, error frequencies, and token consumption metrics across the organization.
Bifrost addresses these issues by positioning itself as a high-performance proxy between Claude Code and upstream providers, adding a negligible 11 microseconds of overhead per request even at 5,000 requests per second.
Deploying Bifrost CLI: A 2-Minute Guide
Prerequisites for setup include Node.js 18+ and an active Bifrost gateway. The process is fully interactive and user-friendly.
1. Initialize the Gateway
Run the following command to start the gateway locally:
npx -y @maximhq/bifrost
This spins up the gateway at http://localhost:8080, complete with a web interface for provider configuration and live monitoring.
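Once the gateway is running, you can sanity-check it from another terminal before wiring up the CLI. The `/v1/models` endpoint used later in the setup flow doubles as a quick reachability check (the OpenAI-style `{"data": [{"id": "..."}]}` response shape assumed in the second command is an assumption to verify against your gateway version):

```shell
# Verify the gateway is reachable on its default port.
curl -s http://localhost:8080/v1/models

# Pretty-print the configured model IDs if jq is installed
# (assumes an OpenAI-style {"data": [{"id": "..."}]} listing).
curl -s http://localhost:8080/v1/models | jq '.data[].id'
```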
2. Execute the CLI
Launch the command line interface with:
npx -y @maximhq/bifrost-cli
The CLI will guide you through the setup:
- Base URL: Input the gateway address (default is `http://localhost:8080`).
- Virtual Key: If the gateway utilizes virtual key authentication, enter it here. Bifrost secures this in the OS keyring rather than storing it in plaintext.
- Harness Selection: Select Claude Code from the menu. The CLI can automatically install it via npm if it is missing.
- Model Selection: The tool retrieves available models via the gateway’s `/v1/models` endpoint, offering a searchable list. Choose any model from any configured provider.
Upon confirmation, Claude Code launches with all necessary environment variables and API paths configured automatically—no `export ANTHROPIC_BASE_URL` required.
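For comparison, wiring Claude Code to a gateway by hand typically means exporting variables like the following before every session; the CLI does the equivalent for you. A minimal sketch—the placeholder key value is illustrative, and whether the credential lands in `ANTHROPIC_API_KEY` or another variable may differ by Claude Code version:

```shell
# Manual setup that Bifrost CLI automates:
# point Claude Code's Anthropic client at the local gateway...
export ANTHROPIC_BASE_URL="http://localhost:8080"

# ...and supply a credential. With virtual key authentication
# enabled, this would be the Bifrost virtual key rather than
# a real Anthropic API key.
export ANTHROPIC_API_KEY="your-virtual-key-here"

claude
```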
Executing Claude Code with Non-Anthropic Models
Bifrost handles the translation of Anthropic API requests to other provider formats, enabling Claude Code to function with 20+ supported providers. You can specify targets using the provider/model-name syntax:
- OpenAI: `openai/gpt-5`
- Google Gemini: `gemini/gemini-2.5-pro`
- Groq: `groq/llama-3.3-70b-versatile`
- Mistral: `mistral/mistral-large-latest`
- xAI: `xai/grok-3`
- Self-hosted (Ollama): `ollama/llama3`
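Because Bifrost translates Anthropic-format requests for upstream providers, the same `provider/model-name` syntax works for any client speaking the Anthropic Messages API through the gateway. A hedged sketch—the endpoint path and headers follow the Anthropic Messages API convention, and the key value is a placeholder:

```shell
# Send an Anthropic-style Messages request through the gateway,
# but target an OpenAI model via the provider/model-name syntax.
curl -s http://localhost:8080/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: your-key-here" \
  -d '{
    "model": "openai/gpt-5",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
```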
Claude Code operates on three internal tiers: Sonnet (standard), Opus (complex reasoning), and Haiku (speed). Through Bifrost, each tier can be overridden independently to utilize distinct providers. For instance, a team might assign GPT-5 to the Sonnet tier, Gemini 2.5 Pro for Opus-level tasks, and a Groq-hosted model for rapid Haiku operations.
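Claude Code reads its tier assignments from environment variables, so one way to pin tiers to different providers is to set the overrides before launch. A sketch of the mixed setup described above—the variable names below are Claude Code's model-override settings as commonly documented, but treat them (and the Opus-tier mechanism, which varies by version) as assumptions to verify:

```shell
# Route Claude Code's tiers to different providers through Bifrost:
# main (Sonnet-tier) model...
export ANTHROPIC_MODEL="openai/gpt-5"

# ...and the fast (Haiku-tier) model used for lightweight tasks.
export ANTHROPIC_SMALL_FAST_MODEL="groq/llama-3.3-70b-versatile"
```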
Switching models mid-session is seamless using the `/model` command:
/model openai/gpt-5
/model gemini/gemini-2.5-pro
/model bedrock/claude-sonnet-4-5
Note: Non-Anthropic models must support tool use capabilities, as Claude Code depends on this for file and terminal operations.
Streamlined Cloud Integration: Bedrock, Vertex, and Azure
For enterprises leveraging cloud infrastructure, Bifrost CLI automates the complexities of authentication and routing.
- AWS Bedrock: Bifrost’s Bedrock passthrough manages AWS authentication. By setting `CLAUDE_CODE_SKIP_BEDROCK_AUTH=1`, Bifrost handles credentials and cross-region routing via its adaptive load balancer.
- Google Vertex AI: Similar automation applies to Vertex endpoints, transparently handling GCP OAuth and project settings.
- Azure: Since Claude Code lacks native Azure passthrough, Bifrost routes traffic through an Anthropic-compatible endpoint, translating requests for Azure-hosted models internally.
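A sketch of the Bedrock wiring with the gateway running locally. `CLAUDE_CODE_USE_BEDROCK` and `ANTHROPIC_BEDROCK_BASE_URL` are Claude Code's own Bedrock settings; pointing the base URL at the gateway is the assumed passthrough setup here:

```shell
# Tell Claude Code to use its Bedrock code path...
export CLAUDE_CODE_USE_BEDROCK=1

# ...but skip local AWS auth, since Bifrost holds the credentials.
export CLAUDE_CODE_SKIP_BEDROCK_AUTH=1

# Route Bedrock traffic through the gateway instead of AWS directly.
export ANTHROPIC_BEDROCK_BASE_URL="http://localhost:8080"
```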
This functionality is vital for regulated sectors requiring data to remain within specific cloud environments, supported by Bifrost's in-VPC deployment option.
Advanced UI: Tabbed Sessions and MCP Support
Bifrost CLI features a persistent tabbed terminal UI, allowing developers to manage multiple agent sessions in parallel. A status bar at the bottom indicates session states (active, idle, alert). Users can toggle tabs with `Ctrl+B` or launch new sessions with different models instantly.
For Claude Code, the CLI auto-registers the MCP Gateway endpoint. This grants access to all configured MCP tools within the coding session without manual `claude mcp add-json` commands. If a virtual key is active, authenticated MCP access is configured automatically, facilitating interactions with external databases, filesystems, and custom business logic through MCP servers.
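For reference, registering the gateway's MCP endpoint by hand would look roughly like this; the CLI performs the equivalent automatically. The server name, endpoint path, and JSON shape below are illustrative assumptions:

```shell
# Manual equivalent of the auto-registration: add Bifrost's MCP
# gateway as an HTTP MCP server in Claude Code.
claude mcp add-json bifrost '{
  "type": "http",
  "url": "http://localhost:8080/mcp"
}'
```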
Enterprise-Grade Control and Insights
Managing costs and usage across a development team requires robust oversight. Bifrost delivers this through multiple layers:
- Virtual Keys: Assign virtual keys to developers or teams with specific budgets, rate limits, and model permissions.
- Budget Enforcement: Implement hierarchical cost controls at the key, team, or customer level to prevent budget overruns.
- Native Observability: Track every request with native Prometheus metrics and OpenTelemetry tracing. Gain real-time insights into token usage, latency, and provider health.
- Compliance Logging: Bifrost Enterprise offers immutable audit trails for comprehensive compliance tracking.
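In practice, each developer's requests carry their virtual key so the gateway can attribute usage and enforce the budgets and rate limits above. A hedged sketch of an attributed request—the `x-bf-vk` header name and the key value are assumptions to check against the Bifrost documentation:

```shell
# Each request is attributed to a developer or team via their
# virtual key, letting the gateway enforce budgets and rate limits.
curl -s http://localhost:8080/v1/chat/completions \
  -H "content-type: application/json" \
  -H "x-bf-vk: dev-alice-key" \
  -d '{
    "model": "openai/gpt-5",
    "messages": [{"role": "user", "content": "hi"}]
  }'
```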
Conclusion
Bifrost is open source on GitHub and requires only two commands to initialize. Teams requiring enterprise governance, adaptive load balancing, SSO, and in-VPC deployments for their Claude Code operations should book a Bifrost demo to see how it integrates with existing infrastructure.