DEV Community

Kuldeep Paul

Claude Code Governance: How an AI Gateway Secures Agent Access

Effective Claude Code governance requires a gateway-layer control point. Bifrost delivers that control through virtual keys, credential isolation, MCP tool filtering, and immutable audit trails for every agent session.

From a single terminal session, Claude Code reads full repositories, runs shell commands, and invokes external MCP tools, all authenticated with a raw provider API key. When developers share the same key, the same model permissions, and an unrestricted tool surface, a single leaked credential or an over-privileged session crosses from a billing problem into a security incident. Claude Code governance is how platform teams insert identity, access control, and audit between the agent and the provider, so every request is scoped, attributable, and recoverable after the fact. Bifrost, the Go-based open-source AI gateway built by Maxim AI, enforces those controls at the gateway layer through virtual keys, credential isolation, MCP tool filtering, guardrails, and audit logs. This guide explains what governance actually requires and how Bifrost applies it without touching the developer workflow.

The Five Security Controls That Claude Code Governance Requires

At its core, Claude Code governance means running every agent request through a control point that authenticates the caller, keeps real provider credentials off developer machines, limits which models and tools each session can reach, and captures each action in a tamper-proof log. An AI gateway is that control point; the Claude Code client itself does not need to change.

The risks that make this necessary are well established. The OWASP Top 10 for LLM Applications lists excessive agency and sensitive information disclosure among the most critical LLM security risks, and an autonomous coding agent that holds file access, shell access, and tool-calling privileges exercises that agency on every run. Five controls address those risks:

  • Credential isolation: provider keys are held centrally in the gateway, not distributed to individual developer machines
  • Identity and attribution: each request is tied to a developer, team, or project rather than a shared API key
  • Access scoping: per-key policies define which models, providers, and MCP tools any given session is allowed to reach
  • Containment: rate limits and guardrails bound the damage a compromised or runaway session can cause
  • Audit and evidence: every request produces a record of the session, model, token counts, and user that can be presented for compliance purposes

All five controls are applied through gateway-level governance, which means they take effect on every Claude Code session uniformly, regardless of developer, provider, or deployment environment.

Why Anthropic's Native Controls Leave Gaps

Out of the box, Claude Code authenticates directly with a single provider key per developer. Anthropic's Team and Enterprise plans include SSO and admin controls, but they stop short of the fine-grained, cross-provider access control and request-level audit logs that platform teams need. Direct-to-provider usage leaves four security gaps:

  • Broadly scoped credentials: a raw provider key stored on a laptop can be copied, committed to version control, or reused in ways the platform team cannot track
  • Uniform access across roles: every developer, from intern to staff engineer, reaches the same provider catalog, model set, and tool surface
  • No MCP tool boundary: a connected MCP server exposes all of its tools to every session
  • No per-request identity or audit record: the provider console shows aggregate usage, not who ran which session, against which model, at what cost

Anthropic's cost documentation reports that 90 percent of users stay under $30 per active day. That still leaves a long tail: the remaining 10 percent have no built-in ceiling, and individual sessions can run to hundreds of dollars. A flat, identical-access model cannot contain that tail. The gateway layer is where these gaps close, specifically through virtual keys that attach identity and policy to every request before any provider receives it.

Routing Claude Code Through Bifrost

Bifrost sits in the request path as an HTTP endpoint: Claude Code sends its API calls to the gateway, which applies policy before forwarding them to any upstream provider. Connecting Claude Code to the gateway takes two environment variables; no modifications to the Claude Code binary, extensions, or workflow are needed.

# Route Claude Code through the Bifrost gateway
export ANTHROPIC_BASE_URL="http://localhost:8080/anthropic"

# Authenticate using a Bifrost virtual key (no Anthropic account required)
export ANTHROPIC_AUTH_TOKEN="your-virtual-key"

# Start Claude Code
claude

Bifrost exposes an Anthropic-compatible API surface, so streaming responses, tool calls, and extended thinking all continue to function without modification. The full setup, including settings.json configuration and provider passthrough patterns for Bedrock, Vertex, and Azure, is covered in the Claude Code integration guide.

Does the gateway put provider credentials on developer machines?

No. With the recommended ANTHROPIC_AUTH_TOKEN method, Claude Code transmits only the Bifrost virtual key. Real provider credentials are stored centrally in the gateway and injected per-request. Rotating a provider credential happens once in the gateway and applies from the next request onward, with no environment variable changes on developer machines. Revoking a virtual key likewise takes effect on that consumer's next request.
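
One way to confirm this on a developer machine is a small preflight check before launching claude. The sketch below is an illustration only and is not part of Bifrost or Claude Code: check_gateway_env is a hypothetical helper, and it assumes ANTHROPIC_API_KEY is the variable a raw provider key would otherwise be stored in.

```shell
# Hypothetical preflight helper (not shipped with Bifrost or Claude Code).
# Succeeds only when the gateway URL and a virtual key are set
# and no raw provider key is present in the environment.
check_gateway_env() {
  if [ -z "${ANTHROPIC_BASE_URL:-}" ]; then
    echo "ANTHROPIC_BASE_URL is not set: requests would bypass the gateway" >&2
    return 1
  fi
  if [ -z "${ANTHROPIC_AUTH_TOKEN:-}" ]; then
    echo "ANTHROPIC_AUTH_TOKEN is not set: no virtual key to authenticate with" >&2
    return 1
  fi
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is set: a raw provider key is on this machine" >&2
    return 1
  fi
  echo "ok"
}
```

A wrapper script could then run check_gateway_env && claude so that a session only starts when the machine is pointed at the gateway.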

What does this add to the developer's daily workflow?

In practice, nothing. Only the base URL and the virtual key differ from a direct-to-Anthropic setup. Developers run the same claude command and keep all existing session features, including /model switching. Bifrost's sustained performance benchmarks report 11 microseconds of added per-request overhead at 5,000 requests per second.

Virtual Keys: Where the Security Boundary Is Defined

Virtual keys are the core governance primitive in the Bifrost AI gateway. Each key represents one authenticated consumer, whether that is an individual developer, a squad, a CI pipeline, or an external tenant, and carries a policy defining exactly what that consumer can do:

  • Provider and model scoping: a key configured for Sonnet and Haiku cannot call Opus or reach an unlisted provider; attempting to switch to an out-of-scope model via /model returns a clear error instead of silently succeeding
  • MCP tool filtering: tool filtering on a per-virtual-key basis determines which tools each agent session can call, so a developer building a support agent sees CRM and ticketing tools, while an SRE sees Kubernetes and observability tools
  • Tenant isolation: virtual keys scoped per team or customer keep access, budgets, and usage data cleanly separated across organizational boundaries

MCP tool filtering is the most direct answer to the excessive agency risk: the full set of connected tools remains available in the gateway, but each session's visible surface is constrained to what that role is authorized to use. A single GitHub or database MCP server no longer functions as an all-or-nothing grant to every developer.
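
The filtering idea can be sketched as a plain allow-list check. This is a conceptual illustration in POSIX shell, not Bifrost's implementation or configuration format: in_list and visible_tools are hypothetical helpers, and the tool names in the usage note are made up.

```shell
# Conceptual sketch only; Bifrost's real policy engine is not shown here.

# Returns 0 if $1 appears in the space-separated list $2.
in_list() {
  for item in $2; do
    [ "$item" = "$1" ] && return 0
  done
  return 1
}

# Prints, one per line, the tools from $2 (everything connected to the
# gateway) that also appear in $1 (the allow-list attached to one key).
visible_tools() {
  for tool in $2; do
    if in_list "$tool" "$1"; then
      echo "$tool"
    fi
  done
}
```

For example, visible_tools "crm_lookup ticket_create" "crm_lookup kubectl_apply ticket_create" prints crm_lookup and ticket_create but not kubectl_apply: the full tool set stays connected, while the session only sees what its key allows. The same in_list check works for model names.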

Rate Limits and Guardrails as Containment Controls

Even with access scoped correctly, a compromised key or a misbehaving automated session needs to be bounded. Bifrost provides two containment mechanisms at the gateway layer: rate limits and guardrails.

Rate limits operate at the virtual key level, capping requests per minute for each consumer. This covers subagent loops, automation scripts that run unbounded, and leaked keys being driven programmatically. Setting a tighter per-minute ceiling on CI automation keys than on interactive developer keys means one failed pipeline cannot consume the team's entire throughput allocation.
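
To make the per-minute ceiling concrete, here is a minimal fixed-window limiter. It illustrates the general technique under stated assumptions and says nothing about how Bifrost implements its limits: LIMIT, window_start, count, and allow_request are all hypothetical names, and the timestamp is passed in as an argument so the logic can be exercised without waiting.

```shell
# Fixed-window rate limiter sketch (generic technique, not Bifrost's code).
# Allows at most LIMIT requests in each 60-second window for one key.
LIMIT=3          # hypothetical per-minute ceiling
window_start=0   # epoch second at which the current window began
count=0          # requests admitted in the current window

# Usage: allow_request <epoch_seconds>
# Returns 0 if the request is admitted, 1 if it exceeds the ceiling.
# State lives in shell variables, so call it directly, not in a subshell.
allow_request() {
  now="$1"
  if [ $((now - window_start)) -ge 60 ]; then
    window_start="$now"   # start a fresh window
    count=0
  fi
  if [ "$count" -ge "$LIMIT" ]; then
    return 1              # ceiling reached: reject
  fi
  count=$((count + 1))
  return 0
}
```

With LIMIT=3, three calls inside the same minute succeed, a fourth is rejected, and the first call of the next window is admitted again. A CI key would simply carry a lower LIMIT than an interactive key.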

Guardrails evaluate prompts and completions as they transit the gateway. Secrets detection scans for API keys, tokens, and credentials that a developer might paste into context or that a model might reproduce in its output. Custom regex patterns allow teams to redact or block organization-specific sensitive strings. Content-safety enforcement through AWS Bedrock Guardrails, Azure Content Safety, and Patronus AI applies consistently across all Claude Code sessions, configured once in the gateway rather than per developer.
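
The regex-based redaction described above can be approximated in a few lines of sed. This is a sketch, not Bifrost's guardrail configuration: redact_secrets is a hypothetical helper, and the two patterns (the sk-ant- prefix used by Anthropic API keys and the AKIA prefix of AWS access key IDs) are illustrative rather than exhaustive.

```shell
# Illustrative redaction filter; patterns are examples, not a full ruleset.
# Reads text on stdin and writes it to stdout with matches replaced.
redact_secrets() {
  sed -E \
    -e 's/sk-ant-[A-Za-z0-9_-]{8,}/[REDACTED]/g' \
    -e 's/AKIA[A-Z0-9]{16}/[REDACTED]/g'
}
```

Piping a prompt through it, as in echo "$prompt" | redact_secrets, replaces matching strings before the text goes any further. A gateway applies the same idea to both directions of traffic, so a secret is caught whether a developer pastes it in or a model reproduces it.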

Building an Audit and Compliance Record

Governance that cannot be demonstrated to auditors has limited value. Audit logs in Bifrost generate an immutable, request-level record of every Claude Code session, covering model, token counts, tool call activity, and the virtual key or user behind each request. This record is the kind of request-level evidence that SOC 2, HIPAA, GDPR, and ISO 27001 audits ask for.

For workloads in regulated industries, observability adds Prometheus metrics and OpenTelemetry distributed traces, each tagged with virtual key, team, model, and tool identifiers. Security and platform teams can query exactly which model a developer or agent used, when, and how many tokens it consumed. Bifrost can be deployed in-VPC or in air-gapped environments so no inference traffic or credential exits the organization's own infrastructure; SSO through OIDC maps virtual keys to directory identities in Okta or Microsoft Entra, closing the loop between organizational identity management and per-request attribution.
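
As an illustration of the kind of question request-level records can answer, the sketch below totals tokens per virtual key. The input format is an assumption made for this example, not Bifrost's actual log schema: it imagines a tab-separated export with timestamp, virtual key, model, and token count columns, and tokens_by_key is a hypothetical helper.

```shell
# Assumed input, one request per line (NOT Bifrost's real export format):
#   timestamp<TAB>virtual_key<TAB>model<TAB>tokens
# Output: virtual_key<TAB>total_tokens, sorted by key.
tokens_by_key() {
  awk -F '\t' '
    { total[$2] += $4 }                              # sum tokens per key
    END { for (k in total) print k "\t" total[k] }
  ' | sort
}
```

Fed three records where vk-dev used 1200 and 50 tokens and vk-ci used 300, it reports 300 for vk-ci and 1250 for vk-dev. The same grouping works by model or by team once those fields are present on every record.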

Start Governing Claude Code Agent Access

Claude Code governance is, first, a security problem. The provider console provides a spending record; it does not keep credentials off laptops, restrict tool access by role, stop a runaway session, or prove who ran what. Placing Bifrost between Claude Code and the provider adds virtual keys, tool filtering, guardrails, and audit logs behind the same Anthropic-compatible endpoint the agent already targets, with no change to how developers work. The Bifrost governance resource page documents the full control set, and the LLM Gateway Buyer's Guide covers how gateway options compare on governance and security depth.

To apply Claude Code governance to your engineering organization's actual usage, book a Bifrost demo with the team.
