Everyone calls their product a "gateway" now. LiteLLM markets itself as both a proxy and a gateway. Portkey is a gateway. Helicone's docs use proxy and gateway interchangeably. There's a well-cited Medium post by Bijit Ghosh that ranks on Google for this comparison; its high-level definitions are correct, but it stops before the implementation details that tell you what to actually choose and deploy.
Here's the precise version: three different layers, concrete Go code for each, and a decision framework based on team size.
TL;DR:
- Proxy = transport layer. Pipes requests from your app to the provider
- Router = decision layer. Chooses which model or provider handles the request
- Gateway = policy layer. Auth, rate limits, budget enforcement, audit trails
- They're not separate products — they're three layers of the same stack
The Proxy: Transport Layer
A proxy intercepts your HTTP request and forwards it to the provider. Your app changes one thing: the base_url.
// Before
client := openai.NewClient(apiKey)

// After — same SDK, same code, different URL
client := openai.NewClient(
	apiKey,
	openai.WithBaseURL("https://proxy.your-company.com/v1"),
)
A minimal Go proxy handler:
func (p *Proxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// Swap client key → upstream provider key
	r.Header.Set("Authorization", "Bearer "+p.providerKey)

	target, _ := url.Parse("https://api.openai.com")
	proxy := httputil.NewSingleHostReverseProxy(target)
	proxy.ServeHTTP(w, r)
}
That's the core. A proxy doesn't decide anything — it doesn't choose GPT-4o over GPT-4o-mini, doesn't enforce rate limits. It pipes traffic. Everything else is built on top of this.
The Router: Decision Layer
A router decides which model and provider handle each request. It returns a routing decision; the proxy executes it. The router is pure business logic — no HTTP, no transport — which makes it testable independently and swappable without touching the proxy.
Cost-based routing (most valuable):
func (r *Router) Route(req *ChatRequest) RoutingDecision {
	complexity := r.estimateComplexity(req)
	switch {
	case complexity < 0.3:
		// Short, simple: classification, extraction, booleans
		return RoutingDecision{Model: "gpt-4o-mini", Provider: "openai"}
	case complexity < 0.7:
		// Medium: summarization, structured output
		return RoutingDecision{Model: "gpt-4o", Provider: "openai"}
	default:
		// Complex: multi-step reasoning, code generation
		return RoutingDecision{Model: "claude-opus-4-6", Provider: "anthropic"}
	}
}
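`estimateComplexity` is doing the real work there, and the snippet doesn't define it. A rough token-count-plus-keywords heuristic is one plausible starting point; this is a sketch (the `ChatRequest` shape, keyword list, and thresholds are illustrative assumptions), and real routers often use a small classifier model instead.

```go
package main

import (
	"fmt"
	"strings"
)

// ChatRequest is a minimal stand-in for the request type used above.
type ChatRequest struct {
	Messages []string
}

// estimateComplexity is a hypothetical heuristic: longer prompts and
// reasoning-flavored keywords push the score toward 1.0.
func estimateComplexity(req *ChatRequest) float64 {
	text := strings.ToLower(strings.Join(req.Messages, " "))
	words := len(strings.Fields(text))

	score := float64(words) / 500.0 // ~500 words saturates the length signal
	for _, kw := range []string{"step by step", "refactor", "prove", "debug"} {
		if strings.Contains(text, kw) {
			score += 0.25
		}
	}
	if score > 1.0 {
		score = 1.0
	}
	return score
}

func main() {
	short := &ChatRequest{Messages: []string{"Is this email spam? Yes or no."}}
	hard := &ChatRequest{Messages: []string{"Refactor this service step by step and prove the invariant holds."}}
	fmt.Printf("%.2f %.2f\n", estimateComplexity(short), estimateComplexity(hard)) // 0.01 0.77
}
```

Even a crude score like this is enough to keep classification traffic off your most expensive model; the thresholds are what you tune against real traffic.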
Failover routing:
var providerChain = []RoutingDecision{
	{Model: "gpt-4o", Provider: "openai"},
	{Model: "claude-sonnet-4-6", Provider: "anthropic"},
	{Model: "gemini-1.5-pro", Provider: "google"},
}

func (r *Router) RouteWithFailover(req *ChatRequest) RoutingDecision {
	for _, candidate := range providerChain {
		if r.circuit.IsAvailable(candidate.Provider) {
			return candidate
		}
	}
	// Every circuit is open: fall through to the last provider anyway
	return providerChain[len(providerChain)-1]
}
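`r.circuit` is doing quiet work in that loop. The original doesn't show its implementation; a minimal consecutive-failure breaker (open after N failures, retry after a cooldown) might look like this, with the threshold and cooldown values as illustrative assumptions:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// CircuitBreaker tracks per-provider health. Thresholds are illustrative.
type CircuitBreaker struct {
	mu        sync.Mutex
	failures  map[string]int
	openUntil map[string]time.Time
	threshold int
	cooldown  time.Duration
}

func NewCircuitBreaker() *CircuitBreaker {
	return &CircuitBreaker{
		failures:  map[string]int{},
		openUntil: map[string]time.Time{},
		threshold: 3,
		cooldown:  30 * time.Second,
	}
}

// IsAvailable reports whether the provider's circuit is closed.
func (c *CircuitBreaker) IsAvailable(provider string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return time.Now().After(c.openUntil[provider])
}

// RecordFailure opens the circuit once threshold consecutive failures hit.
func (c *CircuitBreaker) RecordFailure(provider string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.failures[provider]++
	if c.failures[provider] >= c.threshold {
		c.openUntil[provider] = time.Now().Add(c.cooldown)
		c.failures[provider] = 0
	}
}

// RecordSuccess resets the consecutive-failure count.
func (c *CircuitBreaker) RecordSuccess(provider string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.failures[provider] = 0
}

func main() {
	cb := NewCircuitBreaker()
	cb.RecordFailure("openai")
	cb.RecordFailure("openai")
	cb.RecordFailure("openai")
	fmt.Println(cb.IsAvailable("openai"), cb.IsAvailable("anthropic")) // false true
}
```

The proxy layer would call `RecordFailure` on 5xx responses and timeouts, and `RecordSuccess` otherwise, so the router's availability check stays a cheap map lookup.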
Metadata-based routing (route by feature tag your app sets):
func (r *Router) RouteByTag(req *ChatRequest, headers http.Header) RoutingDecision {
	switch headers.Get("X-Feature") {
	case "support-bot":
		return RoutingDecision{Model: "gpt-4o-mini", Provider: "openai"}
	case "code-review":
		return RoutingDecision{Model: "claude-sonnet-4-6", Provider: "anthropic"}
	default:
		return r.Route(req)
	}
}
The Gateway: Policy Layer
A gateway adds policy enforcement above the router and proxy. The defining characteristic: the gateway has a concept of identity. It knows which team or user is sending each request and enforces rules based on that identity.
In Go, a gateway is a middleware chain wrapping the proxy:
func BuildGateway(proxy http.Handler) http.Handler {
	return chain(
		AuthMiddleware,      // validate key → resolve tenant identity
		RateLimitMiddleware, // per-tenant request + token rate limits
		BudgetMiddleware,    // per-team monthly spend enforcement
		AuditMiddleware,     // log every request with identity + decision
		proxy,
	)
}
func AuthMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		key := r.Header.Get("Authorization")
		tenant, err := db.LookupTenant(key)
		if err != nil {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		// Attach identity for downstream middlewares, then swap in the
		// tenant's upstream provider key
		r = r.WithContext(context.WithValue(r.Context(), tenantKey, tenant))
		r.Header.Set("Authorization", "Bearer "+tenant.ProviderKey)
		next.ServeHTTP(w, r)
	})
}
func BudgetMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		tenant := r.Context().Value(tenantKey).(*Tenant)
		if tenant.MonthlySpend >= tenant.BudgetLimit {
			http.Error(w, `{"error":"budget_exceeded"}`, http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}
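That check only works if something keeps `MonthlySpend` current. A common pattern is to record cost after each response, from the token counts in the provider's usage block. A sketch with an in-memory counter: the `Tenant` fields, `RecordUsage`, and the price table are illustrative assumptions, and a real gateway would persist spend rather than hold it in memory.

```go
package main

import (
	"fmt"
	"sync"
)

// Tenant mirrors the struct implied above; fields are illustrative.
type Tenant struct {
	mu           sync.Mutex
	MonthlySpend float64
	BudgetLimit  float64
}

// RecordUsage adds one request's cost, computed from token counts and a
// per-model price table ($ per 1M tokens: [input, output]; placeholder rates).
func (t *Tenant) RecordUsage(model string, promptTokens, completionTokens int) {
	prices := map[string][2]float64{
		"gpt-4o-mini": {0.15, 0.60},
		"gpt-4o":      {2.50, 10.00},
	}
	p := prices[model]
	cost := float64(promptTokens)*p[0]/1e6 + float64(completionTokens)*p[1]/1e6
	t.mu.Lock()
	t.MonthlySpend += cost
	t.mu.Unlock()
}

// OverBudget is the predicate BudgetMiddleware would consult.
func (t *Tenant) OverBudget() bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.MonthlySpend >= t.BudgetLimit
}

func main() {
	ten := &Tenant{BudgetLimit: 0.01}
	ten.RecordUsage("gpt-4o", 2000, 1000) // 0.005 + 0.010 = $0.015
	fmt.Println(ten.OverBudget())         // true
}
```

Recording after the response means a tenant can overshoot its budget by one in-flight request; enforcing mid-stream is possible but rarely worth the complexity.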
A proxy is stateless with respect to the caller. A gateway is not.
How Products Map to These Layers
| Product | Proxy | Router | Gateway | Cost Intelligence |
|---|---|---|---|---|
| LiteLLM | ✓ | ✓ (100+ providers) | Partial | — |
| Helicone | ✓ | — | Partial | Basic |
| Portkey | ✓ | ✓ | ✓ (enterprise) | Basic |
| Langfuse | — (async only) | — | — | Basic |
| Preto | ✓ | ✓ | ✓ | ✓ (recommendations) |
One thing to know about Langfuse: it's an async observer, not an in-path proxy. That means zero added latency, but also no caching, routing, or real-time budget enforcement. It's a deliberate architectural choice (a different layer entirely), and a fine one if post-hoc observability is all you need.
What You Actually Need
One team, one model, under $2K/month → direct SDK calls. Add a proxy for logging once you have real traffic to observe.
Multiple models, cost visibility needed → proxy + router. One URL change gives you per-request cost attribution and the ability to route simple tasks to cheaper models. Teams typically see 20–40% cost reduction within the first week of enabling model routing.
Multiple teams, budget enforcement needed → gateway. The moment two teams share an API key and neither can see what the other spends, you have a governance problem. A bill spike hits. Nobody knows which team caused it. Nobody can be held accountable.
Compliance requirements (SOC 2, HIPAA, GDPR) → gateway with audit logging and PII controls. A gateway gives you the per-identity audit trail to prove compliance, not just assert it.
We're building Preto.ai — all three layers (proxy + router + gateway) plus cost intelligence in one URL change. Free up to 10K requests.