If you're building LLM apps at scale, your gateway shouldn't be the bottleneck. That's why we built Bifrost: a fully self-hosted LLM gateway written from scratch in Go and optimized for speed, scale, and flexibility.
Bifrost is designed to behave like a core infra service. It adds minimal overhead even under heavy load (~11µs mean per request at 5K RPS) and gives you fine-grained control over providers, monitoring, and transport.
Key features:
- Built in Go, optimized for low-latency, high-RPS workloads
- ~11µs mean overhead at 5K RPS (40x lower than LiteLLM)
- ~9.5x faster and ~54x lower P99 latency vs LiteLLM
- Works out of the box via `npx @maximhq/bifrost`
- Supports OpenAI, Anthropic, Mistral, Ollama, Bedrock, Groq, Perplexity, Gemini and more
- Unified interface across providers with automatic request transformation (see the sketch after this list)
- Built-in support for MCP tools and an MCP server
- Visual Web UI for real-time monitoring and configuration
- Prometheus scrape endpoint for metrics
- HTTP transport today, with gRPC support coming soon
- Self-hosted, Apache 2.0 licensed
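
To give a rough idea of what the unified interface looks like from the client side, here's a minimal Go sketch that sends a chat completion request to a locally running gateway (started with `npx @maximhq/bifrost`). The port (8080), route (`/v1/chat/completions`), and the `provider/model` string are illustrative assumptions, not the definitive API; check the docs for the exact endpoints and naming.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Minimal request shapes for an OpenAI-style chat completions call.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

func main() {
	// Assumes a local Bifrost instance on port 8080 exposing an
	// OpenAI-compatible chat completions route; port, path, and the
	// model identifier below are assumptions for illustration.
	reqBody, _ := json.Marshal(chatRequest{
		Model: "openai/gpt-4o-mini", // hypothetical provider/model string
		Messages: []message{
			{Role: "user", Content: "Say hello from Bifrost"},
		},
	})

	resp, err := http.Post(
		"http://localhost:8080/v1/chat/completions",
		"application/json",
		bytes.NewReader(reqBody),
	)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Decode the response generically and print it.
	var out map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out)
}
```

Since routing and request transformation happen inside the gateway, switching providers should mostly be a matter of changing the model string or the gateway config rather than rewriting client code.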
If you're hitting performance ceilings with tools like LiteLLM, or just want something reliable for production, give it a shot.
GitHub: https://github.com/maxim-ai/bifrost
Docs: https://www.getmaxim.ai/bifrost
👉 We’re live on Product Hunt! Would love your support or feedback.
https://www.producthunt.com/products/maxim-ai/launches/bifrost-2