One night ~100K requests landed on our API in under a minute.
The stack looked like this:
API Gateway → Lambda → RDS
It worked fine… until it didn’t:
- Lambda maxed out concurrency.
- RDS was about to collapse under too many connections.
- Cold starts spiked latency.
The fix wasn’t fancy, just an old systems trick: put a buffer in the middle.
We moved to API Gateway → SQS → Lambda, with a few AWS knobs (config sketches below):
- Reserved concurrency (cap Lambda at safe levels)
- DLQ (so we didn’t lose poison messages)
- CloudWatch alarms (queue depth + message age)
- RDS Proxy (to stop Lambda → DB connection storms)
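For context, the buffer itself is just a standard SQS queue with a redrive policy pointing at a dead-letter queue, plus an event source mapping so Lambda polls the queue. A minimal sketch; the queue names, account ID, region, `maxReceiveCount`, and batch size below are placeholders, not our exact values:

```bash
# Dead-letter queue for poison messages
aws sqs create-queue --queue-name orders-dlq

# Main queue with a redrive policy: after 5 failed receives, the message moves to the DLQ
aws sqs create-queue \
  --queue-name orders-queue \
  --attributes '{"RedrivePolicy":"{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:orders-dlq\",\"maxReceiveCount\":\"5\"}"}'

# Let Lambda poll the queue in small batches
aws lambda create-event-source-mapping \
  --function-name ProcessOrders \
  --event-source-arn arn:aws:sqs:us-east-1:123456789012:orders-queue \
  --batch-size 10
```

API Gateway can drop requests onto the queue through a direct AWS service integration, so nothing heavyweight sits in the synchronous path.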
Here’s what the reserved concurrency config looks like in practice:
```bash
aws lambda put-function-concurrency \
  --function-name ProcessOrders \
  --reserved-concurrent-executions 50
```
That way, even if 100K requests pile up, we never overwhelm the DB.
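The queue-depth and message-age alarms from the list above are ordinary CloudWatch metric alarms on the SQS queue. A rough sketch, with the queue name, thresholds, and SNS topic ARN as placeholders:

```bash
# Alert when the oldest message has been waiting more than ~5 minutes
aws cloudwatch put-metric-alarm \
  --alarm-name orders-queue-message-age \
  --namespace AWS/SQS \
  --metric-name ApproximateAgeOfOldestMessage \
  --dimensions Name=QueueName,Value=orders-queue \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 5 \
  --threshold 300 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:on-call

# Alert when the backlog grows faster than Lambda drains it
aws cloudwatch put-metric-alarm \
  --alarm-name orders-queue-depth \
  --namespace AWS/SQS \
  --metric-name ApproximateNumberOfMessagesVisible \
  --dimensions Name=QueueName,Value=orders-queue \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 5 \
  --threshold 10000 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:on-call
```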
⚡ But this only works for asynchronous APIs (where the client can accept a 202 Accepted response).
If the API must be synchronous, you need other tools: rate limiting, provisioned concurrency, or even containers.
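For the synchronous path, the equivalent knobs might look like this; the alias name and the limits below are illustrative, not prescriptive:

```bash
# Keep warm instances ready so synchronous calls skip cold starts
# ("live" alias and a count of 20 are placeholder values)
aws lambda put-provisioned-concurrency-config \
  --function-name ProcessOrders \
  --qualifier live \
  --provisioned-concurrent-executions 20

# Throttle clients at the edge with an API Gateway usage plan
aws apigateway create-usage-plan \
  --name orders-api-plan \
  --throttle burstLimit=200,rateLimit=100
```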
I wrote a full breakdown with diagrams, configs, and lessons learned here: Read on Medium
Question for Dev.to readers:
How do you handle sudden API spikes in your setup: buffer with queues, scale containers, or something else?