Subodh Shetty

Posted on • Originally published at Medium

How I handled 100K requests hitting AWS Lambda at once

One night ~100K requests landed on our API in under a minute.
The stack looked like this:

API Gateway → Lambda → RDS

It worked fine… until it didn’t:

  • Lambda maxed out concurrency.
  • RDS was about to collapse under too many connections.
  • Cold starts started spiking latency.

The fix wasn’t fancy, just an old systems trick: put a buffer in the middle.

We moved to API Gateway → SQS → Lambda, with a few AWS knobs:

  • Reserved concurrency (cap Lambda at safe levels)
  • DLQ (so we didn’t lose poison messages)
  • CloudWatch alarms (queue depth + message age)
  • RDS Proxy (to stop Lambda → DB connection storms)
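Wiring the queue, DLQ, and Lambda consumer together is a few CLI calls. This is a sketch, not our exact setup: the queue names, region, account ID, and `maxReceiveCount` below are illustrative.

```shell
# Dead-letter queue first, so the main queue can reference its ARN
aws sqs create-queue --queue-name ProcessOrders-dlq

# Main queue: after 5 failed receives, a message moves to the DLQ
aws sqs create-queue --queue-name ProcessOrders-queue \
  --attributes '{"RedrivePolicy":"{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:ProcessOrders-dlq\",\"maxReceiveCount\":\"5\"}"}'

# Connect the queue to the Lambda consumer in small batches
aws lambda create-event-source-mapping \
  --function-name ProcessOrders \
  --event-source-arn arn:aws:sqs:us-east-1:123456789012:ProcessOrders-queue \
  --batch-size 10
```

Small batch sizes keep each invocation short and make the reserved-concurrency cap the real throughput limit.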

Here’s what the reserved concurrency config looks like in practice:

```shell
aws lambda put-function-concurrency \
  --function-name ProcessOrders \
  --reserved-concurrent-executions 50
```

That way, even if 100K requests pile up, we never overwhelm the DB.
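The queue-age alarm mentioned above can be sketched like this; the queue name, thresholds, and SNS topic are placeholders, not our production values:

```shell
# Alarm when the oldest message has waited more than 5 minutes,
# i.e. the consumer is falling behind the backlog
aws cloudwatch put-metric-alarm \
  --alarm-name ProcessOrders-message-age \
  --namespace AWS/SQS \
  --metric-name ApproximateAgeOfOldestMessage \
  --dimensions Name=QueueName,Value=ProcessOrders-queue \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 5 \
  --threshold 300 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts
```

A second alarm on `ApproximateNumberOfMessagesVisible` catches queue depth the same way.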


⚡ But this only works for asynchronous APIs (where the client can accept a 202 ACK).
If the API must be synchronous, you need other tools: rate limiting, provisioned concurrency, or even containers.
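For the synchronous case, two of those tools are also one-liners. A rough sketch, assuming a `live` alias and a REST API; the alias, API ID, and limits are illustrative:

```shell
# Keep 25 pre-initialized execution environments warm to avoid cold starts
aws lambda put-provisioned-concurrency-config \
  --function-name ProcessOrders \
  --qualifier live \
  --provisioned-concurrent-executions 25

# Throttle the API Gateway stage so excess bursts are shed at the front door
aws apigateway update-stage \
  --rest-api-id abc123 \
  --stage-name prod \
  --patch-operations \
    op=replace,path=/*/*/throttling/rateLimit,value=1000 \
    op=replace,path=/*/*/throttling/burstLimit,value=2000
```

Throttled clients get a 429 they can retry, which is often preferable to timing out against a saturated database.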

I wrote a full breakdown with diagrams, configs, and lessons learned here: Read on Medium


Question for Dev.to readers:
How do you handle sudden API spikes in your setup: buffer with queues, scale containers, or something else?
