Bifrost serves as the open-source AI gateway for deploying Claude Code across enterprises, enabling centralized cost governance, access controls, and immutable audit trails.
By the end of 2026, Gartner predicts that 40% of enterprise applications will integrate task-specific AI agents, a dramatic leap from less than 5% today. Coding agents such as Claude Code represent a substantial portion of this shift. Yet when Claude Code scales beyond individual developers to hundreds of engineers across multiple teams and projects, a governance gap emerges: every API call is direct with no inherent cost limits, no team-level spend tracking, and no immutable record of activity. Bifrost, an open-source AI gateway written in Go by Maxim AI, sits between Claude Code and your model providers to enforce these controls at the infrastructure layer. This guide walks through routing Claude Code through a gateway and configuring the cost, access, and audit systems necessary for safe deployment at organizational scale.
Why Governance for Claude Code Breaks at Scale
A single developer running Claude Code presents minimal governance challenges; costs are consolidated on one invoice and any errors have limited impact. But introduce a hundred engineers running it concurrently across multiple teams, repositories, and initiatives, and that same setup fractures into opaque costs, no attribution by team, and no enforcement mechanism for platform teams. Claude Code governance refers to the suite of controls that restore transparency and policy enforcement when an agent sends requests directly to a provider by default.
Scale surfaces consistent governance gaps across organizations:
- No spending limits. Any developer's credentials can drive costs unbounded, with no visibility until the billing cycle closes.
- No team-level attribution. A single invoice from your provider cannot reveal which team, initiative, or engineer is responsible for the spend.
- No model restrictions. Every developer retains access to any model, from the least to the most expensive, regardless of task complexity.
- No verifiable audit records. There is no cryptographically signed trail of who sent what request to which model at what time, making SOC 2, HIPAA, GDPR, and ISO 27001 compliance difficult to demonstrate.
These gaps are manageable while a single developer or application relies on Claude Code. They become system-level problems when Claude Code becomes shared infrastructure across the organization, and they become non-negotiable blockers when an auditor or a Fortune 500 procurement team scrutinizes the deployment. The 2026 Gartner Hype Cycle for Agentic AI emphasizes that governance, security, and cost-containment capabilities must develop alongside core agent technologies. Centralizing Claude Code through a gateway is the standard approach to closing these gaps, and the governance layer is what transforms a developer tool into managed infrastructure.
Running Claude Code Through a Gateway
When all Claude Code traffic is routed through a single enforcement point, every request inherits that point's centralized policies. Claude Code operates across three model tiers (Sonnet, Opus, and Haiku) and reads provider settings from a settings.json file. Pointing it at the gateway means setting the base URL to your Bifrost instance and using a gateway-issued virtual key instead of Anthropic credentials.
"env": {
"ANTHROPIC_BASE_URL": "http://localhost:8080/anthropic",
"ANTHROPIC_AUTH_TOKEN": "your-virtual-key",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "claude-sonnet-4-6",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "claude-haiku-4-6"
}
The virtual key is sent automatically in the Authorization: Bearer header by Claude Code, and Bifrost recognizes it for request routing and credential validation. Using the ANTHROPIC_AUTH_TOKEN approach means no Anthropic account credentials are needed; billing and access flow solely through the gateway's virtual key. The complete configuration is available in the Claude Code integration guide.
This setup requires minimal disruption: engineers continue using Claude Code exactly as before, while every request now flows through a control plane. Because the same tiers can be remapped to any available provider, platform teams can route Claude Code to Anthropic models on AWS Bedrock, Google Vertex AI, or Azure for residency compliance, all without modifying developer machines.
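For teams that provision developer machines with a script, the settings change above can be sketched as a small merge into an existing settings file. This is a minimal sketch under stated assumptions: the function names are hypothetical, the URL and key are the placeholder values from the snippet above, and the file is assumed to be plain JSON with an "env" object as shown.

```python
import json
from pathlib import Path

# Placeholder values copied from the snippet above; replace with your own
# gateway URL and gateway-issued virtual key.
GATEWAY_ENV = {
    "ANTHROPIC_BASE_URL": "http://localhost:8080/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "your-virtual-key",
}


def merge_gateway_env(settings: dict, gateway_env: dict) -> dict:
    """Return a copy of settings with gateway_env merged into its "env" block.

    Keys outside "env", and unrelated keys inside it, are preserved.
    """
    merged = dict(settings)
    env = dict(merged.get("env", {}))
    env.update(gateway_env)
    merged["env"] = env
    return merged


def apply_to_file(path: Path, gateway_env: dict) -> None:
    """Read a JSON settings file (if it exists), merge, and write it back."""
    settings = json.loads(path.read_text()) if path.exists() else {}
    merged = merge_gateway_env(settings, gateway_env)
    path.write_text(json.dumps(merged, indent=2) + "\n")
```

Pass `apply_to_file` the path of the settings.json file your installation uses. Merging rather than overwriting keeps any other settings a developer has already configured.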
Real-Time Cost Governance for Claude Code
In Bifrost, cost governance operates in real time, not as a post-hoc report. The foundation is the virtual key, a gateway-issued credential that ties to a budget, rate limits, an allowlist of models, and a routing policy, with no direct dependency on the underlying provider credentials. Once a budget is exhausted, subsequent requests are rejected before incurring further cost.
The budget structure is hierarchical. The policy framework moves from Customer (organization) to Team to User to Virtual Key, with each layer maintaining its own independent budget:
- Organization-level budget: the overall cap for the entire account or business unit.
- Team-level budget: a sub-allocation cut from the organization cap for a department or cost center.
- Per-virtual-key budget: the ceiling for a specific engineer, service, or repository.
For any Claude Code request to succeed, it must clear every applicable budget and rate limit in the chain. When the request completes, the cost is deducted at each relevant tier. For example, if a ten-person team has a $500 monthly budget and each engineer has a $75 individual limit, either threshold can trigger rejection: one engineer can be cut off at $75 while the team still has headroom, and because ten engineers at $75 each would total $750, the team can hit its $500 cap before any individual limit is reached. Using virtual keys, you can set budgets and rate limits with reset cycles that span a day, week, month, or year.
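The chain check described above can be illustrated with a short sketch. It is not Bifrost's implementation: the `Budget` class and both helper functions are hypothetical names, and the sketch only models the two rules stated here, namely that a request is rejected once any tier in its chain is exhausted and that a completed request's cost is deducted at every tier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Budget:
    """One tier in the chain (organization, team, or virtual key)."""
    name: str
    limit_usd: float
    spent_usd: float = 0.0

    def exhausted(self) -> bool:
        return self.spent_usd >= self.limit_usd


def first_exhausted(chain: List[Budget]) -> Optional[str]:
    """Return the name of the first exhausted tier, or None if all have headroom."""
    for budget in chain:
        if budget.exhausted():
            return budget.name
    return None


def record_cost(chain: List[Budget], cost_usd: float) -> None:
    """After a request completes, deduct its cost at every tier in the chain."""
    for budget in chain:
        budget.spent_usd += cost_usd


team = Budget("team", limit_usd=500.0)
engineers = [Budget(f"engineer-{i}", limit_usd=75.0) for i in range(10)]

# Each engineer spends $50: under the $75 personal limit, but 10 x $50 = $500
# exhausts the shared team budget.
for key in engineers:
    chain = [team, key]
    assert first_exhausted(chain) is None
    record_cost(chain, 50.0)

blocked_by = first_exhausted([team, engineers[0]])
print(blocked_by)  # team
```

In this run no engineer has reached the $75 limit, yet the next request from any of them is rejected because the team tier is exhausted.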
Two more mechanisms specifically target Claude Code expenses:
- Semantic caching. When a new query is semantically similar to one already answered, the response is served from cache instead of being re-executed, decreasing both token consumption and response time.
- Code Mode for MCP. When Claude Code invokes tools via the MCP gateway, Code Mode permits the model to synthesize multiple tool calls into a single code block instead of making round-trips, often resulting in significant token savings. Token cost reductions of 92% or higher are documented at scale using the MCP gateway approach.
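To make the semantic caching idea concrete, here is a toy sketch of the lookup logic. Everything in it is an assumption for illustration: `toy_embed` is a bag-of-words stand-in for a real embedding model, the 0.8 similarity threshold is arbitrary, and the class and function names are hypothetical rather than part of any gateway API.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Optional, Tuple


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # arbitrary value chosen for the sketch
        self.entries: List[Tuple[Counter, str]] = []

    def lookup(self, prompt: str) -> Optional[str]:
        """Return the response cached for the most similar prompt, if close enough."""
        query = toy_embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def store(self, prompt: str, response: str) -> None:
        self.entries.append((toy_embed(prompt), response))


def complete(cache: SemanticCache, prompt: str, call_model: Callable[[str], str]) -> str:
    """Serve from cache on a hit; otherwise call the model and cache the result."""
    cached = cache.lookup(prompt)
    if cached is not None:
        return cached
    response = call_model(prompt)
    cache.store(prompt, response)
    return response
```

A linear scan over entries is enough to show the idea; a production cache would typically use a vector index and would also need an eviction or expiry policy, neither of which is modeled here.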
Since each request carries its virtual key ID, provider, model, token consumption, and calculated cost, platform teams get per-team and per-developer cost visibility that no single provider invoice can deliver.
Access and Usage Policy Controls
Governance also covers which teams can access Claude Code, which models they can target, and what external tools the agent is permitted to execute. Bifrost enforces these decisions at the gateway layer rather than relying on distributed developer configurations.
The key access controls for Claude Code deployments include:
- Model allowlists per virtual key. Limit expensive model tiers to selected teams. Requests to unauthorized models fail immediately before incurring cost.
- Role-based access control. RBAC comes with pre-built Admin, Developer, and Viewer roles plus unlimited custom roles, managing who can edit gateway settings, adjust budgets, and change guardrails.
- OpenID Connect and directory integration. Enterprise deployments can use OIDC with Okta, Microsoft Entra, Keycloak, Zitadel, and Google Workspace, with automatic role syncing from identity provider groups, streamlining onboarding and offboarding.
- MCP tool-level access. Per-virtual-key MCP tool configuration controls which tools a Claude Code session is permitted to call, enabling scenarios like "read files but not delete."
Key revocation is immediate. When a virtual key is disabled, every active Claude Code session using that key loses access at once, which matters when an engineer leaves or a credential is suspected of being compromised. The governance capability documentation maps each control to a specific deployment pattern.
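The per-key checks above (revocation, model allowlist, tool allowlist) can be sketched as a single authorization function that runs before any provider call. The `VirtualKeyPolicy` type, the `authorize` function, and the tool names are hypothetical; only the model name is taken from the configuration shown earlier.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class VirtualKeyPolicy:
    """Hypothetical per-key policy: allowed models, allowed MCP tools, active flag."""
    allowed_models: Set[str]
    allowed_tools: Set[str] = field(default_factory=set)
    active: bool = True


def authorize(policy: VirtualKeyPolicy, model: str, tool: Optional[str] = None) -> str:
    """Return "allow" or a denial reason, evaluated before any cost is incurred."""
    if not policy.active:
        return "deny: key revoked"
    if model not in policy.allowed_models:
        return "deny: model not allowed"
    if tool is not None and tool not in policy.allowed_tools:
        return "deny: tool not allowed"
    return "allow"


# "Read files but not delete": read_file is allowed, delete_file is not.
policy = VirtualKeyPolicy({"claude-sonnet-4-6"}, {"read_file"})
print(authorize(policy, "claude-sonnet-4-6", "read_file"))    # allow
print(authorize(policy, "claude-sonnet-4-6", "delete_file"))  # deny: tool not allowed

policy.active = False  # revocation takes effect on the very next request
print(authorize(policy, "claude-sonnet-4-6", "read_file"))    # deny: key revoked
```

Checking the active flag first means a revoked key is rejected regardless of what it was previously allowed to do.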
Immutable Audit Logs for Compliance
Audit logging creates a tamper-evident record of every Claude Code interaction that audit and security teams can rely on. Bifrost writes audit data to an append-only store with cryptographic immutability verification, generating evidence trails suited to SOC 2, GDPR, HIPAA, and ISO 27001 compliance requirements.
The audit log records the events most critical for regulated AI:
- Authentication and authorization events, capturing which user made each request.
- Policy adjustments, ensuring every change to budgets, routing, or access rules is traceable.
- Model access and data events, recording who queried which model and at what time.
Log retention is configurable; a common configuration keeps logs for one year and moves them to cold storage after 90 days. In compliance-heavy contexts, storing full request and response data can itself be a liability rather than a control, so Bifrost permits disabling full-content capture per deployment while preserving status, timing, and model metadata.
Audit logs stay accessible outside the gateway itself. Exports push records to Elastic, Splunk, Datadog, S3-compatible stores, and webhook endpoints, allowing security operations to handle Claude Code audits within their existing SIEM infrastructure. A minimal audit setup looks like:
"enterprise": {
"audit_logs": {
"enabled": true,
"retention": { "duration": "365d", "archive_after": "90d" },
"immutability": { "enabled": true, "verification_method": "cryptographic_hash" }
}
}
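The configuration above names a "cryptographic_hash" verification method without describing its internals. One common way to make an append-only log verifiable is a hash chain, where each record stores the hash of the previous one. The sketch below illustrates that general technique and is not a description of how Bifrost implements it; the record layout and function names are assumptions.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, event: Dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding of the event."""
    payload = prev_hash + json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log: List[Dict], event: Dict) -> None:
    """Append an event, linking it to the hash of the record before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)})


def verify(log: List[Dict]) -> bool:
    """Recompute every link; any edited, removed, or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in log:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

Because each hash depends on everything before it, changing one historical record invalidates that record and every later link, which is what lets an auditor detect tampering by re-running `verify`.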
Deploying Claude Code Governance Across Your Organization
A phased approach lets platform teams enforce controls gradually without disrupting workflows. The typical sequence involves pointing Claude Code at the gateway, issuing virtual keys by team or project, establishing conservative budgets, and enabling audit logging before rolling out broader access.
How do I measure Claude Code spending by team?
Create a dedicated virtual key per team or initiative and require it in every Claude Code configuration. Each key produces per-request telemetry including token counts and costs, which roll up automatically to team and organization budgets, so attribution no longer depends on reconciling a provider invoice after the fact.
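If you export the per-request telemetry, the roll-up itself is a simple aggregation. The sketch below assumes hypothetical field names (`virtual_key_id`, `cost_usd`) and a hypothetical key-to-team mapping; check your gateway's actual export schema before relying on it.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def spend_by_team(records: Iterable[Mapping], key_to_team: Mapping[str, str]) -> Dict[str, float]:
    """Sum request costs per team; keys with no known team go to "unassigned"."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        team = key_to_team.get(record["virtual_key_id"], "unassigned")
        totals[team] += record["cost_usd"]
    return dict(totals)


# Illustrative data only; the IDs and costs are made up for the example.
records = [
    {"virtual_key_id": "vk-platform", "cost_usd": 0.25},
    {"virtual_key_id": "vk-payments", "cost_usd": 0.5},
    {"virtual_key_id": "vk-platform", "cost_usd": 1.0},
]
key_to_team = {"vk-platform": "platform", "vk-payments": "payments"}

print(spend_by_team(records, key_to_team))  # {'platform': 1.25, 'payments': 0.5}
```

Routing unknown keys to an "unassigned" bucket makes gaps in the key-to-team mapping visible instead of silently dropping their spend.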
Can I route Claude Code to non-Anthropic models?
Yes. Claude Code's Sonnet, Opus, and Haiku tiers can be mapped to any provider Bifrost supports, including Anthropic models on AWS Bedrock, Vertex AI, and Azure. Routing rules and model allowlists determine which mappings each virtual key can use.
How does this work in regulated or disconnected environments?
Bifrost deployments stay within your own infrastructure. In-VPC mode keeps all prompts and responses inside your private network. The Bifrost Enterprise tier supplies the compliance controls that regulated industries require. For high availability, clustering offers peer-to-peer failover with zero-downtime updates. At 5,000 requests per second, Bifrost adds only 11 microseconds of latency, so centralized control adds negligible overhead.
This same pattern extends to other coding agents such as Codex CLI and Gemini CLI, giving platform teams a single unified control surface across their entire coding-agent fleet.
Moving Claude Code to Managed Infrastructure with Bifrost
Deploying Claude Code at enterprise scale is fundamentally a governance and cost challenge, not primarily a model challenge. Bifrost, the open-source gateway, transforms a provider-direct coding agent into controlled infrastructure with hierarchical budgets, model and tool allowlists, and append-only audit logs, all without altering the developer experience. Your engineers maintain their productivity while platform, finance, and compliance teams gain the controls and visibility they require.
To understand how Bifrost can bring cost controls, governance, and audit capabilities to your Claude Code infrastructure, book a demo with the Bifrost team.