Yesterday’s Cloudflare outage wasted a few hours of my time — not because the outage was confusing, but because my monitoring stack gave me zero context about what was actually failing.
Everything lit up red. Every alert fired.
But nothing told me whether the problem was my origin, Cloudflare’s edge, DNS, SSL, or routing.
My servers were completely fine the whole time.
The real issue ended up being Cloudflare’s Bot Management system (a feature file doubled in size and tripped them up).
The bigger discovery: most monitoring tools cannot tell the difference between an origin outage and a CDN/edge outage.
So I built a simple tool today to diagnose exactly that. Paste a URL and it checks origin health, the Cloudflare/Vercel/AWS edge, DNS, SSL expiry, and CDN failure patterns.
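To make the "origin vs. edge" idea concrete, here is a minimal Python sketch of how such a check could work. It is not the tool's actual code. The function names `probe` and `classify` are made up for illustration, and it assumes you know your origin's address and that the origin serves a publicly trusted certificate for your hostname. The idea is to request the same hostname twice, once through the public edge and once straight at the origin, and compare the results.

```python
import socket
import ssl
from typing import Optional


def probe(connect_host: str, hostname: str, path: str = "/",
          timeout: float = 5.0) -> Optional[int]:
    """Send one HTTPS GET and return the status code, or None on any failure.

    connect_host is where the TCP connection goes (the edge or the origin);
    hostname is used for SNI, certificate checks, and the Host header.
    """
    ctx = ssl.create_default_context()
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hostname}\r\n"
        "Connection: close\r\n\r\n"
    )
    try:
        with socket.create_connection((connect_host, 443), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=hostname) as tls:
                tls.sendall(request.encode("ascii"))
                status_line = tls.makefile("rb").readline().decode("latin-1")
        # A status line looks like "HTTP/1.1 200 OK"; take the middle field.
        return int(status_line.split()[1])
    except (OSError, ValueError, IndexError):
        # Covers DNS errors, timeouts, TLS failures, and malformed replies.
        return None


def is_healthy(status: Optional[int]) -> bool:
    """Treat any answer below 500 as 'the server is up' (an assumption)."""
    return status is not None and status < 500


def classify(edge_status: Optional[int], origin_status: Optional[int]) -> str:
    """Turn the two probe results into a one-line diagnosis."""
    edge_ok = is_healthy(edge_status)
    origin_ok = is_healthy(origin_status)
    if edge_ok and origin_ok:
        return "all healthy"
    if origin_ok:
        return "origin healthy, edge failing"
    if edge_ok:
        return "edge healthy, origin not reachable directly"
    return "origin failing (edge may be failing too)"
```

A call might look like `classify(probe("example.com", "example.com"), probe("203.0.113.10", "example.com"))`, where `203.0.113.10` stands in for your real origin address. Treating every status below 500 as healthy is a simplification. An origin that is firewalled to accept only CDN traffic will always look unreachable from outside, so a real check would have to run from an allowed network.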
I built this out of frustration, but if you’re interested I’d love to hear how your monitoring handled the outage yesterday.
Top comments (3)
Super interesting build! Yesterday’s outage highlighted how most monitoring stacks can’t distinguish between edge failures and actual server issues, and that’s a huge problem for anyone running sites on platforms like Cloudways, DigitalOcean, Vultr, or even managed hosts like Kinsta. A tool that separates origin health from CDN or Cloudflare-level failures is exactly what was missing, because most of us wasted time debugging perfectly healthy servers while the edge was the real culprit.
Appreciate that - yeah the other day really exposed how fragile most monitoring setups are when the edge falls over.
Ours treated the Cloudflare issue exactly the same as an origin outage, so we ended up debugging stuff that wasn’t broken. It felt like we were flying blind for hours.
The real win is being able to see “origin green, edge red” at a glance, so you don’t waste an afternoon on false alarms and you can update your users before they inform you 😭😭🤣
Curious how your stack handled it - did you get any useful signal or was it also just a wall of red?
Love this. The hardest part of yesterday’s outage wasn’t Cloudflare going down — it was my monitoring treating an edge failure like an origin failure. Context matters. Being able to instantly see “origin healthy, edge broken” is the missing piece in most alerting stacks. Really cool solution.