If you're building LLM apps at scale, your gateway shouldn't be the bottleneck. That's why we built Bifrost: a fully self-hosted LLM gateway written from scratch in Go and optimized for speed, scale, and flexibility.
Bifrost is designed to behave like a core infra service. It adds minimal overhead even under heavy load (~11µs mean per request at 5K RPS) and gives you fine-grained control over providers, monitoring, and transport.
Key features:
- Built in Go, optimized for low-latency, high-RPS workloads
- ~11µs mean overhead at 5K RPS (40x lower than LiteLLM)
- ~9.5x faster and ~54x lower P99 latency vs LiteLLM
- Works out of the box via `npx @maximhq/bifrost`
- Supports OpenAI, Anthropic, Mistral, Ollama, Bedrock, Groq, Perplexity, Gemini and more
- Unified interface across providers with automatic request transformation (see the sketch after this list)
- Built-in support for MCP tools and an MCP server
- Visual Web UI for real-time monitoring and configuration
- Prometheus scrape endpoint for metrics
- HTTP transport today, with gRPC support coming soon
- Self-hosted, Apache 2.0 licensed
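
To give a rough idea of what the unified interface looks like from the client side, here's a minimal Go sketch that sends a chat completion request to a locally running gateway (started with `npx @maximhq/bifrost`). The port (8080), route (`/v1/chat/completions`), and the `provider/model` string are illustrative assumptions, not the definitive API; check the docs for the exact endpoints and naming.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Minimal request shapes for an OpenAI-style chat completions call.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

func main() {
	// Assumes a local Bifrost instance on port 8080 exposing an
	// OpenAI-compatible chat completions route; port, path, and the
	// model identifier below are assumptions for illustration.
	reqBody, _ := json.Marshal(chatRequest{
		Model: "openai/gpt-4o-mini", // hypothetical provider/model string
		Messages: []message{
			{Role: "user", Content: "Say hello from Bifrost"},
		},
	})

	resp, err := http.Post(
		"http://localhost:8080/v1/chat/completions",
		"application/json",
		bytes.NewReader(reqBody),
	)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Decode the response generically and print it.
	var out map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out)
}
```

Since routing and request transformation happen inside the gateway, switching providers should mostly be a matter of changing the model string or the gateway config rather than rewriting client code.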
If you're hitting performance ceilings with tools like LiteLLM, or just want something reliable for production, give it a shot.
GitHub: https://github.com/maxim-ai/bifrost
Docs: https://www.getmaxim.ai/bifrost
👉 We’re live on Product Hunt! Would love your support or feedback.
https://www.producthunt.com/products/maxim-ai/launches/bifrost-2