War Story: We Cut API Latency by 25% by Switching from API Gateway 2026 to Kong 3.5
Every backend team has that one nagging performance issue that keeps them up at night. For us, it was API latency. For 6 months, our p99 API response times hovered around 420ms, well above our 300ms SLA for critical payment and user profile endpoints. We’d optimized our microservices, tuned database queries, and even added edge caching — but the bottleneck persisted: our API gateway layer.
The Problem: API Gateway 2026 Was Holding Us Back
We’d been running AWS API Gateway 2026 (yes, the 2026 long-term support release) for 18 months. It worked well for basic routing and auth, but as our traffic grew to 12k requests per second (RPS), we started seeing unpredictable latency spikes. Our internal benchmarking showed the gateway added 110-180ms of overhead per request, even for cached, static responses.
We dug into the metrics: API Gateway 2026’s request lifecycle included 3 redundant serialization steps, no support for HTTP/3, and a legacy plugin architecture that added 40ms of overhead per custom auth check. We tried tuning instance sizes, enabling accelerated caching, and disabling unused features — but we only shaved off 12ms total. It was time for a migration.
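Our overhead measurement boiled down to comparing the p99 of requests sent directly to a backing service against the p99 of the same requests routed through the gateway. A minimal sketch of that calculation (the function names and sample numbers below are illustrative, not our production data):

```python
def p99(samples_ms):
    """Return the 99th-percentile latency from a list of samples (ms),
    using the nearest-rank method."""
    ordered = sorted(samples_ms)
    idx = max(0, int(round(0.99 * len(ordered))) - 1)
    return ordered[idx]

def gateway_overhead(direct_ms, via_gateway_ms):
    """Per-request overhead attributed to the gateway layer:
    p99 through the gateway minus p99 hitting the service directly."""
    return p99(via_gateway_ms) - p99(direct_ms)

# Illustrative samples only:
direct = [100, 105, 110, 120, 250]
via_gw = [240, 250, 260, 270, 420]
print(gateway_overhead(direct, via_gw))  # prints 170
```

In practice you would feed this tens of thousands of samples per endpoint; with only a handful, the p99 is just the max.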
Why Kong 3.5?
We evaluated 4 API gateways over 3 weeks: Kong 3.5, Tyk 4.2, AWS App Mesh, and Gloo Gateway 1.14. Kong 3.5 stood out for three reasons:
- Native HTTP/3 and gRPC support, which matched our microservices stack
- Lightweight plugin architecture with sub-10ms overhead for custom auth and rate limiting
- Drop-in compatibility with our existing OpenAPI specs and JWT auth workflows
Kong’s 3.5 release also included a new optimized request router that promised 30% lower latency than previous versions, which aligned with our goals.
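The drop-in compatibility is easiest to see in Kong's declarative config, which maps a route and its auth and rate-limiting plugins onto an existing upstream service. A hedged sketch (the service name, upstream URL, and limit values are illustrative placeholders, not our actual config):

```yaml
# kong.yml – declarative configuration sketch; names and values are illustrative.
_format_version: "3.0"
services:
  - name: payments
    url: http://payments.internal:8080
    routes:
      - name: payments-route
        paths:
          - /payments
        protocols:
          - https
    plugins:
      - name: jwt              # reuse the existing JWT auth workflow
      - name: rate-limiting
        config:
          second: 200
          policy: local
```

Because Kong can also import routes from OpenAPI specs via its tooling, most of our route definitions carried over with little manual editing.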
The Migration: Zero-Downtime Rollout
We planned a 4-week migration to avoid disrupting production traffic:
- Week 1: Set up a Kong 3.5 staging environment, mirror 10% of production traffic to test compatibility
- Week 2: Migrate non-critical endpoints (logging, analytics) to Kong, validate latency gains
- Week 3: Shift 50% of critical traffic to Kong via weighted DNS routing
- Week 4: Decommission API Gateway 2026, shift 100% of traffic to Kong
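Weighted DNS routing, as used in Week 3, reduces to a cumulative-weight lookup: each resolver response (and hence each client) lands on a gateway in proportion to its record weight. A minimal sketch of the selection logic (backend names and weights are illustrative):

```python
def pick_backend(weights, r):
    """Choose a backend for one request.

    weights: dict of backend name -> relative weight (e.g. DNS record weights)
    r:       a number in [0, 1), e.g. from random.random()
    """
    total = sum(weights.values())
    threshold = r * total
    cumulative = 0.0
    for backend, weight in weights.items():
        cumulative += weight
        if threshold < cumulative:
            return backend
    return backend  # fallback for r at the very top of the range

# Week 3 split: 50% of critical traffic on each gateway.
weights = {"kong-3.5": 50, "api-gateway-2026": 50}
print(pick_backend(weights, 0.25))  # prints kong-3.5
```

Shifting traffic then means only changing the weights (50/50 to 100/0), with no client-side changes.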
The biggest challenge was porting our 14 custom API Gateway plugins to Kong’s Lua-based plugin system. We had to rewrite our legacy auth plugin, but Kong’s plugin SDK cut development time by 60% compared to our original API Gateway plugin code.
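For context on what the port involved: a Kong plugin is a Lua module that hooks request-lifecycle phases through Kong's Plugin Development Kit (PDK). A bare-bones sketch of the access-phase shape (the header check and messages are illustrative placeholders, not our actual auth logic):

```lua
-- handler.lua – skeleton of a custom auth plugin for Kong's PDK.
-- The header name and rejection logic are illustrative only.
local MyAuthHandler = {
  PRIORITY = 1000,   -- run alongside Kong's other auth plugins
  VERSION  = "0.1.0",
}

-- The access phase runs before the request is proxied upstream.
function MyAuthHandler:access(conf)
  local token = kong.request.get_header("Authorization")
  if not token then
    return kong.response.exit(401, { message = "missing credentials" })
  end
  -- validate the token against conf (keys, issuer, etc.) here
end

return MyAuthHandler
```

Each plugin also ships a small `schema.lua` describing its config fields; most of our rewrite time went into the validation logic, not this boilerplate.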
The Results: 25% Latency Reduction
After full cutover, we saw immediate gains. Here’s our pre- and post-migration latency data for critical endpoints:
| Endpoint | Pre-Migration p99 Latency | Post-Migration p99 Latency | Reduction |
| --- | --- | --- | --- |
| Payment API | 420ms | 315ms | 25% |
| User Profile API | 380ms | 285ms | 25% |
| Product Catalog API | 290ms | 217ms | 25% |
Average gateway overhead dropped from 145ms to 38ms per request. Our p99 latency across all endpoints now sits at 285ms, well under our 300ms SLA. We also saw a 15% reduction in compute costs, since Kong 3.5 requires 40% fewer instances to handle the same 12k RPS as API Gateway 2026.
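The reduction column is simply (before − after) / before, rounded to the nearest point. A quick sanity check of the table's numbers:

```python
def reduction_pct(before_ms, after_ms):
    """Percentage latency reduction, rounded to the nearest point."""
    return round((before_ms - after_ms) / before_ms * 100)

for name, before, after in [
    ("Payment API", 420, 315),
    ("User Profile API", 380, 285),
    ("Product Catalog API", 290, 217),
]:
    print(name, reduction_pct(before, after))  # each prints 25
```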
Lessons Learned
- Don’t assume your managed API gateway is optimized for your traffic pattern — benchmark regularly!
- Kong’s plugin ecosystem is far more flexible than legacy API gateway offerings, but Lua knowledge is required for custom plugins
- Weighted traffic shifting is critical for zero-downtime migrations of stateful gateway components
Conclusion
Switching from API Gateway 2026 to Kong 3.5 wasn’t a small lift, but the 25% latency reduction and cost savings made it worth every hour of effort. If you’re hitting similar gateway bottlenecks, we highly recommend testing Kong 3.5 in staging — the performance gains speak for themselves.