The Mystery of the 37-Second Lambda Delay (And How AWS EventBridge Fooled Us)

#aws #eventbridge #lambda #serverless

We’ve all been there. Everything works flawlessly in your SQA environment, but the moment your code hits UAT, it behaves like it’s wading through molasses.

Recently, we ran into a bizarre ghost in our AWS infrastructure: a Node.js Lambda function, triggered every 10 minutes by an Amazon EventBridge scheduled rule, consistently took 37 seconds past the scheduled minute to log its very first line.

We initially thought it was a classic cold start or VPC network issue, but the real culprit turned out to be much sneakier. Here is how we realized EventBridge completely fooled us.

The Problem: The 37-Second Wall

In SQA, the Lambda was invoked within 2 to 3 seconds of the scheduled minute. In UAT, it took 37 seconds.

We tried changing the trigger to run every single minute, but the 37-second delay still happened. We even threw Provisioned Concurrency at it to force the containers to stay warm, but it made absolutely no difference. The START RequestId log line stubbornly refused to print until the 37th second of the minute.

Our code wasn't running slowly; nothing was running at all. The whole delay happened before the invocation even began.

How We Debugged It

The breakthrough came when we stopped looking at the CloudWatch timestamps and looked inside the actual event object payload passed into the Node.js handler:

{
  "source": "aws.events",
  "time": "2026-06-14T18:00:37Z",
  "resources": ["arn:aws:events:...:rule/ten-min-cron"]
}

When we looked at that "time" field generated by EventBridge, the lightbulb finally went on. The timestamp read exactly 37 seconds past the minute.

EventBridge wasn't even sending the event to our Lambda until the 37th second. Our Lambda wasn't lagging; it started running almost immediately after AWS handed it the job.
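If you want to run the same check yourself, here is a minimal sketch of the comparison we did by eye. It assumes the event carries the standard "time" field shown above. The helper name measureScheduleDelay is ours, made up for illustration, and is not part of any AWS SDK.

```javascript
// Sketch: separate "EventBridge fired late" from "Lambda started late".
// `measureScheduleDelay` is a hypothetical helper, not an AWS API.
function measureScheduleDelay(event, now = new Date()) {
  const firedAt = new Date(event.time); // when EventBridge generated the event
  if (Number.isNaN(firedAt.getTime())) {
    throw new Error(`Unparseable event.time: ${event.time}`);
  }
  return {
    // Seconds past the top of the minute at which the rule actually fired.
    secondsPastMinute: firedAt.getUTCSeconds(),
    // Gap between the event's timestamp and this code running.
    // event.time only has one-second resolution, so treat this as approximate.
    handlerLagMs: now.getTime() - firedAt.getTime(),
  };
}

// Lambda-style handler: log both numbers on every invocation.
// Export it the way your runtime expects (for example, exports.handler = handler).
async function handler(event) {
  const delay = measureScheduleDelay(event);
  console.log(JSON.stringify({ eventTime: event.time, ...delay }));
  return delay;
}
```

A large secondsPastMinute with a small handlerLagMs points at the schedule, not at your function. The opposite pattern would point back at cold starts or networking.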

The Realization: EventBridge Jitter & The Redo

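As it turns out, AWS documents that EventBridge scheduled rules only have one-minute granularity: a rule fires at some point within its scheduled minute, not on the precise 0th second. Our working theory for why is the "thundering herd" problem. If millions of customer crons all fired at exactly 12:00:00.000, downstream services worldwide would get hammered, so it makes sense for the scheduler to stagger executions across the minute. AWS doesn't publish the mechanism, but a stable per-rule offset is exactly what we observed.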
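To make the staggering idea concrete, here is a toy model of how a scheduler could hand each rule a stable offset within the minute. To be clear, this is our illustration, not AWS's implementation: ruleInstanceId is a hypothetical identifier, and hashing it with FNV-1a is just one simple way to get a repeatable spread.

```javascript
// Toy model only: AWS does not document how it picks a rule's firing offset.
// `ruleInstanceId` is a hypothetical stand-in for some internal identifier.

// 32-bit FNV-1a hash over the string's character codes (fine for ASCII ids).
function fnv1a(text) {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, 32-bit wrap
  }
  return hash >>> 0; // reinterpret as unsigned
}

// Same id in, same offset out: the offset is stable for the life of the id.
function offsetSecondsFor(ruleInstanceId) {
  return fnv1a(ruleInstanceId) % 60;
}

// Count how many hypothetical rules land on each second of the minute.
function histogram(ruleCount) {
  const buckets = new Array(60).fill(0);
  for (let i = 0; i < ruleCount; i++) {
    buckets[offsetSecondsFor(`rule-${i}`)] += 1;
  }
  return buckets;
}
```

In a model like this, a rule keeps its offset for as long as its identifier lives, and a rule recreated under a new identifier gets a fresh roll of the dice. That matches the behavior we saw, though it doesn't confirm that AWS works this way.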

Our UAT cron rule just happened to get dealt a brutal 37-second delay slot by AWS's internal scheduling engine when the infrastructure was first built.

To test the theory, we completely destroyed our UAT infrastructure and stood it up again from scratch. The newly created EventBridge rule appeared to land in a completely different internal bucket, and the delay instantly dropped to 8 seconds.

The Final Takeaway

If you are triggering Lambdas via an ALB or API Gateway, each invocation is driven by a live request, and AWS routes it in milliseconds. But with EventBridge crons, the second within the minute at which your rule fires is out of your hands, and you are completely at the mercy of the schedule lottery.

Our code wasn't broken, and our network was fine. It was just luck of the draw with AWS's background clock engine. If your background tasks can handle running a few seconds late, save yourself the headache and just let EventBridge do its thing!
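One way to make a task tolerant of a late start is to stop treating "now" as the schedule time and instead derive the nominal slot from the event itself. The sketch below assumes a clock-aligned cron (like our 10-minute rule) and an interval that divides evenly into an hour. The names nominalSlot and intervalMinutes are hypothetical.

```javascript
// Sketch: recover the nominal schedule slot from a jittered event time.
// `nominalSlot` and `intervalMinutes` are hypothetical names for illustration.
// Assumes the rule is aligned to the clock (e.g. minutes 0, 10, 20, ...).
function nominalSlot(eventTime, intervalMinutes = 10) {
  const firedMs = new Date(eventTime).getTime();
  if (Number.isNaN(firedMs)) {
    throw new Error(`Unparseable event time: ${eventTime}`);
  }
  const intervalMs = intervalMinutes * 60 * 1000;
  // Round down to the start of the interval (intervals counted from the Unix epoch, UTC).
  const slotStartMs = Math.floor(firedMs / intervalMs) * intervalMs;
  return {
    slotStart: new Date(slotStartMs).toISOString(),
    slotEnd: new Date(slotStartMs + intervalMs).toISOString(),
  };
}

// Usage: an event that fired 37 seconds late still maps to the 18:00 slot.
const slot = nominalSlot("2026-06-14T18:00:37Z");
console.log(slot.slotStart); // 2026-06-14T18:00:00.000Z
console.log(slot.slotEnd);   // 2026-06-14T18:10:00.000Z
```

If the job queries "everything since the last run", using slotStart as the boundary keeps the windows from drifting or overlapping when the firing offset changes, for example after a rule is recreated.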