My cron job was silently failing on Cloudflare. The bug wasn't where I looked.

#cloudflare #webdev #serverless #debugging

The deploy was green. The build passed. And my data just... stopped updating. No crash. No red. No alert. Just a table that quietly stopped getting new rows, which is the worst kind of bug, because nothing tells you it's happening. You find out when you notice the numbers look stale and think "huh, that's weird," three days later.
Here's the trap I fell into, and the debugging lesson I wish I'd had tattooed on my arm before I started.

The setup

I had a small cron Worker running on Cloudflare. Every few hours it pulled a list of items from an external API and upserted them into Postgres. Boring. Reliable. Ran fine for weeks.
Then I shipped one new feature: for each item, fetch an extra bit of metadata from a second endpoint before saving. One more fetch() per item. Felt harmless.

The next run, my upserts returned 0 rows. Every batch. Silently.

The actual error

It took digging into the logs to find the real message, because the failure never bubbled up to anything I was watching:

Too many subrequests by single Worker invocation.

A subrequest is any outbound fetch() from your Worker: every API call, every database round-trip, all of it. And on Cloudflare's free plan, you get 50 external subrequests per invocation. That's it. Cross the line and every subsequent fetch() throws, including the ones writing to your database.

Why my first fix was wrong (and it'll be yours too)

Here's the part I'm a little embarrassed about.
I already had batching logic. My upserts went out in groups of 25. I'd written that ages ago and felt clever about it. So when I saw "too many subrequests," my brain went straight there: the batches are too big, lower the batch size.

I spent a solid hour tuning batch sizes. 25 to 15. 15 to 10. Still failing.

Because the batches were never the problem.

The new metadata feature fired one fetch per item (100 items, 100 subrequests), and it did all of them before a single upsert ran. I'd blown the entire 50-request budget during the enrichment loop. By the time the (carefully batched, very clever) database writes started, the Worker was already over its cap. Every write failed.
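To make the arithmetic concrete, here is a rough tally of one invocation. The 100 items, the batch size of 25, and the cap of 50 come from the story above; the single list fetch and the "one subrequest per upsert batch" model are assumptions for illustration, and countSubrequests is a made-up helper, not anything Cloudflare provides.

```typescript
// Free-plan cap described above.
const SUBREQUEST_LIMIT = 50;

// Hypothetical model of one cron run: one list fetch, optional per-item
// enrichment fetches, and one database round-trip per upsert batch.
function countSubrequests(
  itemCount: number,
  batchSize: number,
  enrichPerItem: boolean,
): number {
  const listFetch = 1; // assumed: a single call pulls the item list
  const enrichment = enrichPerItem ? itemCount : 0; // one fetch() per item
  const upserts = Math.ceil(itemCount / batchSize); // one round-trip per batch
  return listFetch + enrichment + upserts;
}

const baseline = countSubrequests(100, 25, false); // 1 + 0 + 4 = 5
const withEnrichment = countSubrequests(100, 25, true); // 1 + 100 + 4 = 105
const smallerBatches = countSubrequests(100, 10, true); // 1 + 100 + 10 = 111

console.log({ baseline, withEnrichment, smallerBatches, SUBREQUEST_LIMIT });
```

Under this model, shrinking the batch size from 25 to 10 actually raises the total, which is why tuning batches never got anywhere.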

I was optimizing the visible, satisfying-to-tune loop. The real cost was a quiet for loop in a different file that I'd added without thinking of it as "network" at all.

The lesson that outlived the bug

When you hit a resource cap, count the resource. Not the thing that looks expensive.

"Subrequests" doesn't feel like a thing you count. It feels like infrastructure. But the limit is a literal integer, and the fix started the moment I stopped guessing and actually tallied every fetch() across the whole invocation, DB calls and the new loop included, instead of staring at the one piece of code that looked heavy.

The expensive-looking code and the code that's actually blowing your budget are seldom the same code. The batching was a red herring precisely because it looked like the optimization-worthy part.
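One way to do that tally without guessing is to route every outbound call through a small counting wrapper. This is a minimal sketch, not the code from my Worker: createCountingFetch, SubrequestTally, and the labels are all hypothetical names, and the example URL in the usage comment is a placeholder.

```typescript
// Running totals for one invocation, overall and per call site.
interface SubrequestTally {
  total: number;
  byLabel: Record<string, number>;
}

// Hypothetical helper: wraps any fetch-like function and counts each call
// under a label you choose, then passes the arguments through unchanged.
function createCountingFetch<Args extends unknown[], R>(
  realFetch: (...args: Args) => R,
  tally: SubrequestTally,
): (label: string, ...args: Args) => R {
  return (label: string, ...args: Args): R => {
    tally.total += 1;
    tally.byLabel[label] = (tally.byLabel[label] ?? 0) + 1;
    return realFetch(...args);
  };
}

// Usage inside a Worker (sketch only):
//   const tally: SubrequestTally = { total: 0, byLabel: {} };
//   const countedFetch = createCountingFetch(
//     (url: string, init?: RequestInit) => fetch(url, init),
//     tally,
//   );
//   await countedFetch("enrich", "https://api.example.com/items/1/meta");
//   console.log(JSON.stringify(tally)); // log once at the end of the run
```

Logging the tally once per run would have shown the enrichment label dwarfing the upsert label in about thirty seconds. Note that this only counts calls you route through the wrapper; a database client that issues its own requests would need the same treatment.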

How I actually fixed it

A few options, depending on your situation:

Cut the subrequests. Did I need that metadata for every item on every run? No. I dropped the per-item enrichment way down and the budget problem evaporated.
Move the heavy fetching off the Worker. A CI runner (GitHub Actions, etc.) has no subrequest cap. Enrichment that doesn't have to live in the request path doesn't need to be in the Worker.
Pay. Cloudflare's paid plan ($5/mo) bumps the limit to 10,000 subrequests per invocation, and as of early 2026 you can configure it up to 10 million. For a lot of side projects that one line is the cheapest fix you'll ever buy.
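If you want to keep some enrichment on the free plan, one variation on the first option is to give the enrichment loop an explicit budget and defer whatever doesn't fit. This is a sketch under assumptions, not what I shipped: planEnrichment is a hypothetical helper, and the reserved count (one list fetch plus a little headroom) is a number I picked for illustration.

```typescript
// Hypothetical helper: decide how many items can be enriched this run
// once the list fetch, headroom, and upsert batches are accounted for.
function planEnrichment<T>(
  items: T[],
  limit: number, // the platform's subrequest cap
  batchSize: number, // upsert batch size; one round-trip per batch assumed
  reserved: number, // subrequests held back (list fetch, headroom)
): { enrich: T[]; defer: T[] } {
  const upserts = Math.ceil(items.length / batchSize);
  const budget = Math.max(0, limit - reserved - upserts);
  return { enrich: items.slice(0, budget), defer: items.slice(budget) };
}

// 100 items, cap of 50, batches of 25, 6 reserved (1 list fetch + 5 headroom).
const items = Array.from({ length: 100 }, (_, i) => `item-${i}`);
const plan = planEnrichment(items, 50, 25, 6);

// 50 - 6 reserved - 4 upsert batches leaves room for 40 enrichment fetches.
console.log(plan.enrich.length, plan.defer.length);
```

The deferred items would need to be picked up on a later run (for example by enriching the stalest items first), which is bookkeeping this sketch leaves out.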

I went with the first option, because the honest answer was that I didn't need most of those calls in the first place.

The takeaway

Two things to steal from my afternoon:

Serverless platforms cap outbound requests, and the failure can be completely silent. If you're on Cloudflare Workers free tier, that number is 50 external subrequests per invocation. Know your platform's number before you add a loop of fetch() calls.
When you're over a limit, don't optimize what looks expensive. Count the actual resource. The bug is usually hiding in the code you didn't think of as "that kind of code."
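On the silent-failure half of that, a cheap guard is to make "wrote nothing" an error instead of a normal outcome. A minimal sketch, assuming you can get a written-row count back from your upsert call; assertRowsWritten is a hypothetical name.

```typescript
// Hypothetical guard: throw when a run that had rows to write wrote none,
// so the run ends in an error instead of finishing quietly.
function assertRowsWritten(attempted: number, written: number): void {
  if (attempted > 0 && written === 0) {
    throw new Error(
      `Upsert wrote 0 of ${attempted} rows; check the logs for "Too many subrequests"`,
    );
  }
}

// Usage (sketch): call once after all batches have been attempted.
//   assertRowsWritten(items.length, totalRowsWritten);
```

An uncaught error at the end of the run should at least show up as a failed invocation in the logs, which is more than a table of stale rows will ever tell you.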

The most dangerous loop in your project is the one you didn't notice you wrote.

Top comments (1)

Mustafa ERBAY • Jun 4

One thing I’ve learned from distributed systems is that resource limits are often easier to understand than resource consumption. Most people know the limit. Far fewer know where the budget is actually being spent.
CPU, memory, connections, tokens, API quotas, subrequests — the pattern is usually the same. The bottleneck rarely hides inside the component everyone is watching.