Debby McKinney

Self-Host Your LLM Gateway or Try the Managed Version (Bifrost OSS & Enterprise)

Two Ways to Run Bifrost

Bifrost is an open-source LLM gateway (MIT licensed) built for production: 11μs overhead, 5K+ RPS, 50x faster than Python alternatives.

You can:

  1. Self-host the open-source version (free forever)
  2. Try the managed version with enterprise features (14 days free)

What's in Open Source (Self-Hostable)

Everything you need for production LLM infrastructure:

  • Multi-provider routing (OpenAI, Anthropic, Azure, Vertex, Ollama)
  • Adaptive load balancing (auto-detects degraded keys)
  • Semantic caching (40% cost reduction; see the sketch after this list)
  • Virtual keys with budgets and rate limits
  • MCP (Model Context Protocol) support
  • Basic observability
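
To see what semantic caching means in practice, send the same prompt twice and compare latencies. A minimal Python sketch, assuming a gateway already running on localhost:8080 (see the quick start below) with caching enabled; the actual speedup and hit rate depend on your cache configuration.

import time
import requests  # third-party: pip install requests

URL = "http://localhost:8080/v1/chat/completions"
payload = {
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "What is an LLM gateway?"}],
}

# The first call reaches the provider; a repeat of the same prompt
# should be answered from Bifrost's semantic cache.
for attempt in ("cold", "warm"):
    start = time.time()
    resp = requests.post(URL, json=payload, timeout=60)
    resp.raise_for_status()
    print(f"{attempt}: {time.time() - start:.2f}s")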

Deploy anywhere:

git clone https://github.com/maximhq/bifrost
docker compose up

Runs on your infrastructure. Data never leaves your VPC.

maximhq / bifrost

Fastest LLM gateway (50x faster than LiteLLM) with adaptive load balancer, cluster mode, guardrails, 1000+ models support & <100 µs overhead at 5k RPS.

Bifrost


The fastest way to build AI applications that never go down

Bifrost is a high-performance AI gateway that unifies access to 15+ providers (OpenAI, Anthropic, AWS Bedrock, Google Vertex, and more) through a single OpenAI-compatible API. Deploy in seconds with zero configuration and get automatic failover, load balancing, semantic caching, and enterprise-grade features.

Quick Start


Go from zero to production-ready AI gateway in under a minute.

Step 1: Start Bifrost Gateway

# Install and run locally
npx -y @maximhq/bifrost

# Or use Docker
docker run -p 8080:8080 maximhq/bifrost

Step 2: Configure via Web UI

# Open the built-in web interface
open http://localhost:8080

Step 3: Make your first API call

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello, Bifrost!"}]
  }'

That's it! Your AI gateway is running with a web interface for visual configuration, real-time monitoring…
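
Because the endpoint is OpenAI-compatible, existing SDKs work by pointing them at the gateway. A minimal Python sketch: the Anthropic model name is illustrative, the placeholder API key assumes no virtual-key auth is enforced locally, and both providers are assumed to be configured in the web UI.

from openai import OpenAI  # pip install openai

# Point the standard OpenAI client at Bifrost instead of api.openai.com.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="placeholder")

# Switch providers by changing the model prefix; Bifrost handles routing.
for model in ("openai/gpt-4o-mini", "anthropic/claude-3-5-sonnet-20240620"):
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Hello, Bifrost!"}],
    )
    print(model, "->", reply.choices[0].message.content)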


What the Managed Version Adds

Built for teams needing governance, compliance, and scale.

1. Hierarchical Governance

Customer (org-wide budget)
  ↓
Team (department budget)
  ↓
Virtual Key (user/app budget + rate limits)
  ↓
Provider Config (per-provider limits)

Budget example:

{
  "customer": "acme-corp",
  "budget": 10000,
  "teams": [
    {"name": "engineering", "budget": 5000},
    {"name": "marketing", "budget": 2000}
  ]
}

When engineering hits $5K, requests stop automatically.
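
The enforcement itself is Bifrost's job, but a config like this is easy to sanity-check before applying. A small Python sketch (a hypothetical helper, not a Bifrost API) that verifies team budgets fit inside the org-wide budget:

def check_budgets(config: dict) -> None:
    """Fail fast if team budgets exceed the customer budget."""
    allocated = sum(team["budget"] for team in config["teams"])
    if allocated > config["budget"]:
        raise ValueError(
            f"Teams allocated ${allocated}, over the ${config['budget']} customer budget"
        )

check_budgets({
    "customer": "acme-corp",
    "budget": 10000,
    "teams": [
        {"name": "engineering", "budget": 5000},
        {"name": "marketing", "budget": 2000},
    ],
})  # passes: $7,000 of $10,000 allocated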

2. Enterprise Security

Vault Integration:

  • HashiCorp Vault
  • AWS Secrets Manager
  • Google Secret Manager
  • Azure Key Vault

API keys sync automatically: rotate a key in your vault and Bifrost picks up the new value.

SSO:

  • SAML 2.0
  • OAuth 2.0 / OpenID Connect
  • Active Directory / LDAP
  • Supports: Azure AD, Okta, Google Workspace, Auth0

3. Multi-Tenancy

Serve multiple organizations through one instance:

  • Per-customer budgets
  • Isolated usage tracking
  • Separate audit logs
  • Custom policies per customer

Perfect for SaaS platforms or agencies managing multiple clients.

4. Persistent Storage

Choose your backend:

  • PostgreSQL
  • MySQL

Query historical data. Build analytics. Export for compliance.

5. Advanced Observability

Every request logged with:

  • Token counts + costs
  • Virtual key used
  • Team/customer attribution
  • Model/provider routing
  • Full audit trail
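
As an illustration only (field names here are hypothetical; the actual log schema is Bifrost's), a single request record might carry attribution along these lines:

# Hypothetical shape of one logged request; field names are illustrative.
log_entry = {
    "virtual_key": "vk-eng-backend",
    "customer": "acme-corp",
    "team": "engineering",
    "model": "openai/gpt-4o-mini",
    "provider": "openai",
    "tokens": {"prompt": 12, "completion": 48},
    "cost_usd": 0.00004,  # placeholder value
}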

6. Production Features

  • Horizontal scaling across regions
  • Zero-downtime deployments
  • Health checks + readiness probes
  • Circuit breakers
  • Automatic failover

What to Choose

Self-host open source if:

  • You're under 100 RPS
  • Single team using it
  • Manual budget tracking is fine
  • Community support works for you

Managed version if:

  • Multiple teams/departments
  • Need budget enforcement per team
  • Compliance requirements (SOC 2, GDPR, HIPAA)
  • Want SSO + vault integration
  • Serving multiple customers (multi-tenant)
  • Need 24/7 support

Comparison

Feature | Open Source | Managed
--- | --- | ---
Performance | 11μs, 5K RPS | Same
Multi-provider | ✓ | ✓
Semantic caching | ✓ | ✓
Load balancing | ✓ | ✓
Virtual keys | ✓ | ✓
Budget limits | Virtual key only | Customer → Team → VK
Rate limiting | Basic | Token + request limits
Vault integration | — | ✓ (4 providers)
SSO | — | ✓ (SAML, OAuth, LDAP)
Multi-tenancy | — | ✓
PostgreSQL/MySQL | — | ✓
Audit compliance | Basic | Enterprise-grade
Support | Community | 24/7

Try Both

Open Source:


git clone https://github.com/maximhq/bifrost
cd bifrost
docker compose up

Visit localhost:8080 and start routing requests.

Managed (14 Days Free):

Explore Now

No credit card. Full enterprise features. Deploy in your VPC or use our cloud.

Et Voilà


Docs: docs.getbifrost.ai
GitHub: github.com/maximhq/bifrost

Built by Maxim AI – we also build evaluation and observability tools for production AI systems.
