Kuldeep Paul

Posted on Jun 1

How to Restrict GPT-5 Access to a Single Team with Virtual Keys

Bifrost, the open-source AI gateway, restricts frontier models like GPT-5 to specific teams using virtual keys with model filtering, budgets, and rate limits. Control AI costs and enforce governance at the gateway layer.

GPT-5 is among the most capable and expensive models available through OpenAI's API. When access is unrestricted across an organization, token spend becomes unpredictable and usage patterns grow inconsistent. Bifrost, the open-source AI gateway built in Go and available on GitHub, provides the most direct path for enterprises to restrict GPT-5 access to a single team while maintaining a unified API across every provider.

Using virtual keys, you define exactly which teams can call which models, attach budgets and rate limits to each team's access, and reject any request that falls outside your boundaries. This post covers how to implement this pattern step by step: creating a team, issuing a model-scoped virtual key, and then enforcing the restriction across every request.

What Are Virtual Keys in Bifrost

Virtual keys are the foundational governance entity in Bifrost. A virtual key is a credential that applications present on each request, and it carries its own set of permissions, budgets, and rate limits. Rather than distributing raw OpenAI API keys to every team, you issue virtual keys that specify exactly which models and providers each consumer is allowed to access.

Each virtual key supports several controls that determine access:

Access control: model and provider filtering, so a key can be restricted to a specific set of models such as GPT-5 only
Cost management: an independent budget checked in addition to any team or customer budget
Rate limiting: token-based and request-based throttling applied at the virtual key level
Team attachment: a virtual key can be attached to one team or to one customer, or left unattached, but never to both at once
Status control: enable or disable a key instantly without rotating provider credentials

Because Bifrost resolves model access at the virtual key layer, restricting GPT-5 to a single team becomes a configuration change rather than a code change in every downstream application.

Why Restrict GPT-5 Access to a Single Team

Restricting GPT-5 to one team is fundamentally a cost and governance decision. Frontier models like GPT-5 have driven significant increases in coding, agent-building, and reasoning workloads since release, and enterprise reporting indicates that the economics of running frontier models remain demanding for both providers and customers. When every team has unrestricted access to the most expensive model, spend is difficult to forecast and even harder to attribute.

Scoping GPT-5 to a single team directly addresses several concrete problems:

Predictable spend: only one team can incur GPT-5 token costs, and that team operates within a fixed monthly budget
Clear attribution: usage maps to a known group, which simplifies internal chargeback, cost allocation, and reporting
Reduced blast radius: a misconfigured client elsewhere in the organization cannot accidentally route traffic to GPT-5
Staged rollout: a pilot team can validate GPT-5's value in production before broader access is granted to the organization

This pattern is common when a frontier model is approved for one specific use case, such as advanced research or agentic coding, while the rest of the organization continues to use lower-cost alternatives. Bifrost makes the boundary explicit and automatically enforced at the gateway. For a broader view of how access and cost controls work together, the Bifrost governance overview explains how virtual keys, teams, and budgets fit into a hierarchical governance model.

How to Restrict GPT-5 Access with Virtual Keys in Bifrost

To restrict GPT-5 access to a single team, create a team in Bifrost, then create a virtual key that allows only GPT-5 from OpenAI and attach it to that team. The steps below use the Bifrost governance API; the same actions are available in the web UI.

Step 1: Create the team

A team groups virtual keys and supports department-level budget management. Create the team that will own GPT-5 access:

curl -X POST http://localhost:8080/api/governance/teams \
  -H "Content-Type: application/json" \
  -d '{
    "name": "AI Research Team",
    "budget": { "max_limit": 500.00, "reset_duration": "1M" }
  }'

Teams support independent budgets but do not carry rate limits; rate limiting is applied at the virtual key level. See the budget and limits documentation for the full set of options.

Step 2: Create a virtual key that allows only GPT-5

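Create a virtual key with a provider configuration that lists GPT-5 as the only allowed model, then attach it to the team using team_id. The team_id value below stands in for the identifier of the team created in Step 1; substitute your own. The allowed_models array is what enforces the restriction to GPT-5: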

curl -X POST http://localhost:8080/api/governance/virtual-keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GPT-5 Research Key",
    "description": "GPT-5 access scoped to the AI Research Team",
    "provider_configs": [
      {
        "provider": "openai",
        "weight": 1.0,
        "allowed_models": ["gpt-5"]
      }
    ],
    "team_id": "team-ai-research-001",
    "is_active": true
  }'

With this configuration, the virtual key can only call GPT-5 through OpenAI. A request for any other model, or any other provider, is rejected immediately. As long as no other virtual key lists GPT-5 in its allowed_models, no other team can reach the model.
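One way to check the restriction is to send the same request with different model names and compare the HTTP status codes. The sketch below assumes the gateway runs at http://localhost:8080 and that the virtual key is stored in a hypothetical VK environment variable. chat_payload and try_model are illustrative helper names, not part of Bifrost. The script prints whatever status the gateway returns rather than assuming a particular rejection code.

```shell
# Illustrative helpers (hypothetical names, not part of Bifrost).

# Build a minimal chat-completions request body for the given model.
chat_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "ping"}]}' "$1"
}

# Send the request with the virtual key and print only the HTTP status code.
# Assumes VK holds the virtual key; BIFROST_URL is an optional override.
try_model() {
  curl -s -o /dev/null -w '%{http_code}\n' \
    -X POST "${BIFROST_URL:-http://localhost:8080}/v1/chat/completions" \
    -H "Authorization: Bearer ${VK}" \
    -H "Content-Type: application/json" \
    -d "$(chat_payload "$1")"
}

# Usage (requires a running gateway):
#   export VK="<GPT5_VIRTUAL_KEY>"
#   try_model gpt-5         # the allowed model
#   try_model gpt-4o-mini   # not in allowed_models, so it should be rejected
```

Running both calls side by side makes a misconfigured allowed_models list easy to spot before the key is handed to the team.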

Step 3: Restrict the key to specific provider credentials (optional)

If you maintain separate OpenAI API keys per cost center or business unit, you can pin the virtual key to specific provider credentials with key_ids. This ties GPT-5 usage to a designated billing key:

{
  "provider_configs": [
    {
      "provider": "openai",
      "weight": 1.0,
      "allowed_models": ["gpt-5"],
      "key_ids": ["openai-research-key"]
    }
  ]
}

When key_ids is set, the virtual key can use only those provider keys. An empty array or an omitted field denies all keys, and ["*"] allows every configured key. This level of control over credential management ensures GPT-5 spend remains on a single, auditable provider key.
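The key_ids rule is easy to misread, so here is a small model of it in shell. This is only a sketch of the behavior described above, not Bifrost's implementation. provider_key_allowed is a hypothetical function name, and openai-default-key is an invented key ID used for contrast.

```shell
# Illustrative model of the key_ids rule; not Bifrost's code.
# Usage: provider_key_allowed <provider_key_id> [key_ids entries...]
provider_key_allowed() {
  wanted=$1
  shift
  # Empty or omitted key_ids: no provider key may be used.
  if [ "$#" -eq 0 ]; then
    return 1
  fi
  for entry in "$@"; do
    # "*" allows every configured key; otherwise require an exact match.
    if [ "$entry" = "*" ] || [ "$entry" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}

provider_key_allowed openai-research-key openai-research-key && echo "pinned key: allowed"
provider_key_allowed openai-default-key openai-research-key || echo "other key: denied"
provider_key_allowed openai-default-key "*" && echo "wildcard: allowed"
provider_key_allowed openai-default-key || echo "empty key_ids: denied"
```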

Adding Budgets and Rate Limits to the GPT-5 Key

A model restriction controls which model a team can call. Budgets and rate limits control how much. Attaching both to the GPT-5 virtual key transforms a binary access grant into a bounded one. The request below is the Step 2 creation call with budget and rate_limit fields added directly on the virtual key:

curl -X POST http://localhost:8080/api/governance/virtual-keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GPT-5 Research Key",
    "provider_configs": [
      { "provider": "openai", "weight": 1.0, "allowed_models": ["gpt-5"] }
    ],
    "team_id": "team-ai-research-001",
    "budget": { "max_limit": 300.00, "reset_duration": "1M" },
    "rate_limit": {
      "token_max_limit": 200000,
      "token_reset_duration": "1h",
      "request_max_limit": 500,
      "request_reset_duration": "1m"
    },
    "is_active": true
  }'

This configuration applies three independent controls to GPT-5 access:

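Budget: a dollar cap that resets on your chosen duration (1m, 1h, 1d, 1w, 1M, or 1Y); the example uses 1M for a monthly reset, which is distinct from the lowercase 1m used for the request window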
Token limit: a ceiling on tokens consumed per time period
Request limit: a ceiling on requests per time period

The virtual key's budget is checked together with the team's budget, so the GPT-5 key cannot exceed either its own cap or the AI Research Team's department-level budget. This hierarchical cost control across the virtual key, team, and customer levels is central to how Bifrost keeps frontier-model spend bounded and predictable.
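To make "checked together" concrete, the sketch below models a request that passes only when both the virtual key's budget and the team's budget still have headroom. It is an assumption-laden illustration, not Bifrost's implementation: budget_allows is a hypothetical name, amounts are held as integer cents, and the strict "spent is less than limit" comparison is my own choice.

```shell
# Illustrative model of hierarchical budget checks; not Bifrost's implementation.
# Amounts are integer cents so plain shell arithmetic is enough.

# Usage: budget_allows <vk_spent> <vk_limit> <team_spent> <team_limit>
budget_allows() {
  vk_spent=$1 vk_limit=$2 team_spent=$3 team_limit=$4
  # The virtual key's own cap must have headroom...
  [ "$vk_spent" -lt "$vk_limit" ] || return 1
  # ...and so must the team's cap.
  [ "$team_spent" -lt "$team_limit" ] || return 1
  return 0
}

# Caps from the examples: $300.00 on the key, $500.00 on the team.
VK_LIMIT=30000
TEAM_LIMIT=50000

budget_allows 12000 "$VK_LIMIT" 20000 "$TEAM_LIMIT" && echo "both under cap: allowed"
budget_allows 30000 "$VK_LIMIT" 31000 "$TEAM_LIMIT" || echo "key cap reached: blocked"
budget_allows 10000 "$VK_LIMIT" 50000 "$TEAM_LIMIT" || echo "team cap reached: blocked"
```

The third case is the one that surprises people: the key has spent only a third of its own cap, but other keys on the same team have used up the team budget, so the request is still blocked.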

Enforcing the Restriction Across Every Request

Creating a scoped virtual key is only effective if every request is required to present a valid key. Bifrost can enforce virtual key authentication on all inference traffic, which closes the pathway where a client calls the gateway without a key. Enable enforcement in the client configuration:

curl -X PUT http://localhost:8080/api/config \
  -H "Content-Type: application/json" \
  -d '{ "client_config": { "enforce_auth_on_inference": true } }'

When enforcement is enabled, any inference request that does not present a virtual key is rejected. Applications send their virtual key on each call, either in the x-bf-vk header or, as in the example below, as the bearer token in the OpenAI-style Authorization header:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <GPT5_VIRTUAL_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-5", "messages": [{"role": "user", "content": "..."}]}'
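The same call can carry the key in the x-bf-vk header instead. The sketch below wraps both header forms in one helper so client scripts can switch between them. It assumes the gateway accepts either form, as the text above suggests, and that the key is in a hypothetical VK environment variable. auth_header and send_chat are illustrative names, not part of Bifrost.

```shell
# Build the auth header in one of the two forms discussed above.
# Usage: auth_header <bearer|vk> <virtual_key>
auth_header() {
  case "$1" in
    bearer) printf 'Authorization: Bearer %s' "$2" ;;
    vk)     printf 'x-bf-vk: %s' "$2" ;;
    *)      echo "unknown header style: $1" >&2; return 1 ;;
  esac
}

# Send a GPT-5 chat request using the chosen header style.
# Assumes VK holds the virtual key and the gateway runs on localhost:8080.
send_chat() {
  curl -s -X POST "http://localhost:8080/v1/chat/completions" \
    -H "$(auth_header "$1" "$VK")" \
    -H "Content-Type: application/json" \
    -d '{"model": "gpt-5", "messages": [{"role": "user", "content": "..."}]}'
}

# Usage (requires a running gateway):
#   export VK="<GPT5_VIRTUAL_KEY>"
#   send_chat vk       # key in x-bf-vk
#   send_chat bearer   # key in Authorization: Bearer
```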

A second behavior reinforces the boundary. When a client lists available models using a virtual key, Bifrost returns only the providers and models that key is permitted to use. Teams without the GPT-5 virtual key never see GPT-5 in the model list, which reduces accidental requests and keeps error metrics meaningful across supported providers.

Common Questions About Restricting GPT-5 Access

How does Bifrost restrict GPT-5 to one team?

Bifrost restricts GPT-5 to one team through a virtual key whose allowed_models list contains only gpt-5, attached to that team with team_id. As long as no other virtual key includes GPT-5 in its allowed models, no other team can reach the model. Because access is resolved at the gateway layer, the restriction holds regardless of which application sends the request.

Can I set a spending cap on GPT-5 usage?

Yes. Attach a budget to the GPT-5 virtual key with a max_limit and a reset_duration. The virtual key budget is checked alongside the team budget, so usage stops when either cap is reached. You can also apply token and request rate limits on the same key for more granular control.

Can a team use GPT-5 alongside other approved models?

Yes. List every approved model in the allowed_models array on that team's virtual key, for example ["gpt-5", "gpt-4o-mini"]. The key then permits only those models and rejects everything else.
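The matching behavior can be pictured as a simple membership test. The sketch below is a toy model in shell, not Bifrost's implementation. model_allowed is a hypothetical name, and exact string matching is an assumption on my part. It does not cover how the gateway treats an empty list.

```shell
# Illustrative model of allowed_models filtering; not Bifrost's implementation.
# Usage: model_allowed <requested_model> <allowed_models entries...>
model_allowed() {
  requested=$1
  shift
  for allowed in "$@"; do
    # Assumption: entries are compared as exact strings.
    if [ "$allowed" = "$requested" ]; then
      return 0
    fi
  done
  return 1
}

model_allowed gpt-5 gpt-5 gpt-4o-mini && echo "gpt-5: allowed"
model_allowed gpt-4o-mini gpt-5 gpt-4o-mini && echo "gpt-4o-mini: allowed"
model_allowed gpt-4o gpt-5 gpt-4o-mini || echo "gpt-4o: rejected"
```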

Does this require self-hosting?

Bifrost is open source and available on GitHub for self-hosting. Governance features including virtual keys, teams, and budgets are part of the open-source gateway. For in-VPC deployments, role-based access control, and immutable audit logs required by compliance frameworks, the enterprise tier extends the same governance model with additional capabilities.

Getting Started with GPT-5 Access Control in Bifrost

Restricting GPT-5 access to a single team with virtual keys gives platform teams a precise, enforceable boundary around frontier-model spend. You define the allowed model on a virtual key, attach it to one team, add a budget and rate limits, and enforce authentication so every request flows through governance. Because Bifrost sits in front of every provider through a single OpenAI-compatible API, this same pattern extends to any model, any provider, and any number of teams as your access policies grow.

To see how Bifrost can centralize GPT-5 access control and cost governance across your AI infrastructure, book a demo with the Bifrost team.

DEV Community