Sitewatch

Posted on Mar 31

We Stopped Trusting Uptime Metrics. Here's What We Monitor Instead.

#webdev #devops #monitoring #javascript

We're going to make a claim that might sound controversial:

Uptime monitoring and website monitoring are not the same thing. Most people use the terms interchangeably. They shouldn't.

Uptime monitoring answers: "Did the server respond?"
Website monitoring answers: "Does the site work?"

Those sound similar. They're not. And the gap between them is where most silent failures live.

What uptime monitoring actually does

Let's be specific about what a typical uptime monitoring tool checks. There's no ambiguity here — this is the standard model that tools like UptimeRobot, Pingdom, Better Stack, and most others follow:

1. Send an HTTP request to your URL
2. Receive a response
3. Check the status code
4. If it's 200 OK → mark as healthy
5. If it's 5xx or timeout → mark as down
6. Report uptime percentage

That's it. That's the check.
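To make that concrete, here is a minimal sketch of the decision logic in Python. It is not how any particular tool implements it. The function names are ours, `None` stands in for a timeout, and the "unknown" bucket is an assumption, since the steps above only cover 200 and 5xx/timeout.

```python
from typing import Optional, Sequence


def classify_uptime(status_code: Optional[int]) -> str:
    """Classify one probe. None stands for a timeout or connection failure."""
    if status_code is None or 500 <= status_code <= 599:
        return "down"
    if status_code == 200:
        return "healthy"
    # Assumption: the steps above do not say what happens for 3xx/4xx,
    # and real tools differ, so this sketch reports them separately.
    return "unknown"


def uptime_percentage(results: Sequence[str]) -> float:
    """Percentage of probes that were not classified as 'down'."""
    if not results:
        return 0.0
    up = sum(1 for result in results if result != "down")
    return 100.0 * up / len(results)
```

Notice what the function never looks at: the response body, the assets it references, or where the request came from.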

It tells you whether your server is alive and responding to requests. It does that well. And for a long time, that was enough — because if the server was up, the site worked.

That's no longer true.

What uptime monitoring does NOT check

Here's a non-exhaustive list of things a standard uptime check will miss:

Asset integrity. Your HTML loads, but the JavaScript bundle it references returns 404 because you deployed and the CDN still serves old HTML pointing to deleted files. The page is a blank white screen. Status code on the document: 200.

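MIME types. Your server returns a JS file with Content-Type: text/html. If the file is loaded as a module script, or the response carries X-Content-Type-Options: nosniff, the browser downloads it, reads the header, and refuses to execute it. The only trace is an error in the console. Your SPA never boots. Status: 200.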

Redirect chains. Your site redirects /page → /page/ → /page → /page/. The browser gives up after about 20 hops and shows an error. An uptime tool that doesn't follow redirects, or that counts any 3xx as a successful response, sees a server that answered and moves on.

Content correctness. Someone changed your homepage in the CMS at 2am. Or your CDN started serving content from the wrong origin after a migration. The page loads — it's just the wrong page. Status: 200.

Third-party dependencies. Stripe's JS fails to load. Your checkout renders as an empty div. Your auth provider is down. Users can't log in. Your page loaded fine — the broken resource is on someone else's domain.

Regional divergence. Your site works from US-East. In Frankfurt, the CDN edge serves stale cached assets from three deploys ago. Single-region uptime checks never see this.

Every single one of these gets a healthy-looking response on the URL being checked, usually 200 OK. Every single one passes a standard uptime check. Every single one is broken for users.
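Of the failures above, the MIME type mismatch is one of the easiest to test mechanically. The sketch below is an assumption-laden illustration, not a complete validator: `asset_mime_ok` and `JS_MIME_TYPES` are names we made up, the set is a subset of the MIME types browsers accept for JavaScript, and the asset type is guessed from the file extension.

```python
from urllib.parse import urlsplit

# Subset of the MIME types browsers treat as JavaScript (not exhaustive).
JS_MIME_TYPES = {
    "text/javascript",
    "application/javascript",
    "application/x-javascript",
    "application/ecmascript",
    "text/ecmascript",
}


def mime_essence(content_type: str) -> str:
    """Strip parameters such as '; charset=utf-8' and normalize case."""
    return content_type.split(";", 1)[0].strip().lower()


def asset_mime_ok(url: str, content_type: str) -> bool:
    """Check that a .js/.mjs/.css asset is served with a matching type."""
    path = urlsplit(url).path.lower()  # ignore any query string
    essence = mime_essence(content_type)
    if path.endswith((".js", ".mjs")):
        return essence in JS_MIME_TYPES
    if path.endswith(".css"):
        return essence == "text/css"
    return True  # other asset types are out of scope for this sketch
```

A JS bundle served as `text/html; charset=utf-8` fails this check even though the response status is 200.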

What website monitoring checks instead

Website monitoring starts where uptime monitoring stops. Instead of asking "did the server respond?", it asks "did the response constitute a working page?"

That means:

Treating the HTML document as a manifest, not a destination. The document is step one. Website monitoring follows up — do the JS and CSS bundles it references actually load? Do they return the right MIME types? Do they return valid status codes?
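A rough sketch of the "manifest" idea, using only Python's standard library. The names `AssetCollector`, `extract_assets`, and `broken_assets` are ours, and `status_of` is a hypothetical callback standing in for a real HTTP request so the logic can be exercised offline.

```python
from html.parser import HTMLParser
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collects script and stylesheet URLs referenced by an HTML document."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.assets: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "script" and attributes.get("src"):
            self.assets.append(urljoin(self.base_url, attributes["src"]))
        elif tag == "link" and attributes.get("href"):
            rel = (attributes.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.assets.append(urljoin(self.base_url, attributes["href"]))


def extract_assets(html: str, base_url: str) -> List[str]:
    """Return absolute URLs of the JS and CSS files the document references."""
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.assets


def broken_assets(
    assets: List[str], status_of: Callable[[str], Optional[int]]
) -> List[Tuple[str, Optional[int]]]:
    """Return (url, status) for every asset that did not come back as 200."""
    broken = []
    for url in assets:
        status = status_of(url)  # hypothetical: performs the actual request
        if status != 200:
            broken.append((url, status))
    return broken
```

In a real monitor `status_of` would issue a request per asset. Here it can be as simple as a dictionary lookup, which keeps the parsing and the reporting logic testable on their own.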

Following redirect chains to completion. Not just the first hop. The full chain. Does it resolve? Does it loop? Does it end somewhere unexpected?
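One way to sketch this, assuming a hypothetical `fetch(url)` callback that returns the status code and the `Location` header without following redirects itself. The verdict strings and the 20-hop default are our choices for illustration.

```python
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical signature: fetch(url) -> (status_code, location_header_or_None)
Fetch = Callable[[str], Tuple[int, Optional[str]]]

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(
    start: str, fetch: Fetch, max_hops: int = 20
) -> Tuple[str, List[str], Optional[int]]:
    """Follow a redirect chain and return (verdict, chain, final_status)."""
    chain = [start]
    seen = {start}
    url = start
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return ("resolved", chain, status)
        next_url = urljoin(url, location)  # Location may be relative
        chain.append(next_url)
        if next_url in seen:
            return ("loop", chain, None)
        seen.add(next_url)
        url = next_url
    return ("too_many_hops", chain, None)
```

Recording the whole chain, not just the verdict, is what lets you answer the third question: a chain that resolves with 200 can still end on a URL you did not expect.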

Fingerprinting content over time. If the response body changes without a deployment, you want to know. Content fingerprinting with SHA-256 hashes catches silent CMS edits, CDN origin drift, and cache poisoning.
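The core of fingerprinting fits in a few lines with Python's `hashlib`. This is a simplified sketch: `changed_without_deploy` and its `deployed_since_last_check` flag are hypothetical, and pages with per-request content (timestamps, CSRF tokens) would need normalizing before hashing, which is not shown here.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """SHA-256 hex digest of a response body."""
    return hashlib.sha256(body).hexdigest()


def changed_without_deploy(
    previous_fingerprint: str,
    current_body: bytes,
    deployed_since_last_check: bool,
) -> bool:
    """True when the content changed and no deployment explains it."""
    if deployed_since_last_check:
        return False  # a change after a deploy is expected
    return fingerprint(current_body) != previous_fingerprint
```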

Checking from multiple regions. CDN failures are regional by nature. A site can be perfect in one location and broken in another. Website monitoring runs the full check suite from multiple geographic locations independently.
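Once each region reports a fingerprint for the same URL, spotting divergence can be as simple as a majority vote. This sketch assumes exactly that: the region names are made up, and treating the most common fingerprint as "correct" is our simplification (it is ambiguous when regions split evenly).

```python
from collections import Counter
from typing import Dict, List


def divergent_regions(fingerprints: Dict[str, str]) -> List[str]:
    """Regions whose fingerprint differs from the most common one."""
    if not fingerprints:
        return []
    majority, _count = Counter(fingerprints.values()).most_common(1)[0]
    return sorted(
        region for region, value in fingerprints.items() if value != majority
    )
```

For example, `divergent_regions({"us-east": "a1", "us-west": "a1", "eu-central": "9f"})` returns `["eu-central"]`.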

Classifying root cause. Not just "something is wrong" but "why." Is this a deployment artifact? A CDN cache issue? A DNS drift? An application error? Root cause classification turns an alert into actionable information.

A practical comparison

Here's the same scenario through both lenses:

Scenario: You deploy a Nuxt 3 app to Vercel. The build succeeds. But a CDN edge node in Europe still serves the previous HTML, which references app.7fb2e.js. That file was deleted in the new build.

| Check | Uptime monitoring | Website monitoring |
|---|---|---|
| HTTP status | 200 OK ✅ | 200 OK (document) |
| Asset check | Not checked | `app.7fb2e.js` → 404 ❌ |
| MIME validation | Not checked | N/A (asset missing) |
| Multi-region | Single region ✅ | EU: broken ❌, US: ok ✅ |
| Alert | None | Incident: deployment artifact missing, EU region |
| Root cause | N/A | CDN cache stale, Vercel ISR |
| User impact | Unknown | European users see blank page |

Same site. Same moment. Completely different picture.

They're complementary, not competing

We want to be clear about something: we're not saying uptime monitoring is useless. It's not. You absolutely need to know when your server is down. A 503 or a timeout is a real outage and you need to know about it immediately.

The problem is treating uptime monitoring as sufficient. It's one layer. It catches one type of failure. The failures it misses — the "200 OK but broken" failures — are increasingly the ones that actually hurt users.

The monitoring stack for a modern web application should have both:

Uptime monitoring → Is the server alive? Is it responding? Is the infrastructure running?

Website monitoring → Does the site work? Are assets loading? Is the content correct? Does it work everywhere?

If you only have the first, you have a blind spot. And it's the kind of blind spot where your dashboard says 99.9% uptime while your users are staring at blank pages.

This is why we built Sitewatch

Sitewatch is website monitoring. Not uptime monitoring. The distinction matters to us because it's the entire reason the product exists.

We check asset integrity, MIME types, redirect chains, content fingerprints, and multi-region consistency. When something breaks, we classify root cause — infrastructure, application, or content delivery — and provide fix guidance tailored to your tech stack.

Every check runs a 2-of-3 retry confirmation model so you don't get woken up at 3am for a transient CDN hiccup. Repeat alerts for the same incident are deduplicated with SHA-256 fingerprinting and a 30-minute cooldown.
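To illustrate the general idea (this is a sketch of the technique, not Sitewatch's actual code), here is one plausible reading of both mechanisms in Python. We assume "2-of-3" means an incident is confirmed when at least two of three attempts fail, and that the alert fingerprint is a hash over the site, failure type, and detail. All names here are hypothetical.

```python
import hashlib
from typing import Callable, Dict


def confirmed_failure(
    probe_fails: Callable[[], bool], attempts: int = 3, required: int = 2
) -> bool:
    """Confirm an incident only if `required` of `attempts` probes fail."""
    failures = 0
    for attempt in range(attempts):
        if probe_fails():
            failures += 1
        if failures >= required:
            return True
        remaining = attempts - attempt - 1
        if failures + remaining < required:
            return False  # cannot reach the threshold anymore
    return False


class AlertDeduper:
    """Suppresses repeat alerts for the same incident within a cooldown."""

    def __init__(self, cooldown_seconds: float = 30 * 60) -> None:
        self.cooldown_seconds = cooldown_seconds
        self._last_sent: Dict[str, float] = {}

    @staticmethod
    def key(site: str, failure_type: str, detail: str) -> str:
        raw = "\n".join((site, failure_type, detail)).encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def should_send(
        self, site: str, failure_type: str, detail: str, now: float
    ) -> bool:
        """`now` is a timestamp in seconds, e.g. from time.time()."""
        alert_key = self.key(site, failure_type, detail)
        last = self._last_sent.get(alert_key)
        if last is not None and now - last < self.cooldown_seconds:
            return False
        self._last_sent[alert_key] = now
        return True
```

Passing `now` in explicitly, instead of reading the clock inside the class, keeps the cooldown logic easy to test.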

Free tier for 1 site. No credit card. Takes 30 seconds: getsitewatch.com

Uptime monitoring tells you the server is alive. Website monitoring tells you the site works. They're not the same question — and the answer to one doesn't imply the answer to the other.

Top comments (1)

Kai Alder • Apr 4

This distinction matters more than people realize. Had a production incident last year where our status page was showing 100% uptime while users were getting blank screens. Turned out a deploy had introduced a circular import that broke the main bundle — server returned 200, HTML loaded, but the app never mounted.

The MIME type issue you mentioned is sneaky too. I've seen this happen when nginx gets misconfigured after a cert renewal or when someone changes the static file config without realizing it affects Content-Type headers.

One thing I'd add: monitoring synthetic user flows alongside asset checks catches even more. We run a simple "can a user log in and reach the dashboard" check every 5 min. It's caught auth service degradations that no uptime tool would notice since the login page itself loads fine.

Do you find the 30-min cooldown sufficient for most teams? Curious if shorter windows ever make sense for high-traffic sites where even 30 min of silent failure is significant.