I Ran the Numbers on SaaS Downtime Costs — Here's What I Found

#webdev #saas #devops #monitoring

Most developers know downtime is bad. What I didn't expect was how bad it looks once you actually sit down and work through the math for a small SaaS.

Here's what the data says.

The Baseline: SMBs Average $8,000/Hour

Gartner's oft-quoted "$5,600 per minute" figure is real, but it's enterprise scale. Datto's 2023 State of the Channel Report surveyed small and medium businesses specifically.

The number: $8,000 per hour for SMBs.

A lean indie SaaS sits well below that average, so here's how the cost breaks down at that scale for a 3-hour outage:

Category	Calculation	Estimate
Direct revenue loss	(MRR ÷ 730 hrs) × 3 hrs	~$20 on $5k MRR
Churn risk	68% consider switching × affected users × ARPU	Hundreds → thousands
Engineering time	(detection + response, ~4 hrs) × hourly rate	$400–$800 at $100–$200/hr
Trust/reputation	Unmeasurable in-session, measurable in 90-day renewals	?

The direct revenue number looks tiny. The churn risk and the compounding trust erosion are where small SaaS businesses actually bleed.
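To make the table concrete, here's a rough Python sketch of the same arithmetic. The function names are mine, and the 40 affected users and $25 ARPU are made-up illustration values, not figures from any of the reports. The churn number is an upper bound: it assumes everyone who considers switching actually leaves and counts one month of their revenue.

```python
HOURS_PER_MONTH = 730  # average hours in a month, same divisor as the table


def direct_revenue_loss(mrr: float, outage_hours: float) -> float:
    """Revenue attributable to the hours the product was down."""
    return mrr / HOURS_PER_MONTH * outage_hours


def churn_risk(affected_users: int, arpu: float, switch_rate: float = 0.68) -> float:
    """Monthly revenue exposed if every user who considers switching leaves."""
    return affected_users * switch_rate * arpu


def engineering_cost(hours: float, hourly_rate: float) -> float:
    """Cost of the time spent detecting and responding."""
    return hours * hourly_rate


direct = direct_revenue_loss(mrr=5_000, outage_hours=3)
at_risk = churn_risk(affected_users=40, arpu=25)  # hypothetical inputs
eng = engineering_cost(hours=4, hourly_rate=100)

print(f"Direct revenue loss:        ${direct:,.2f}")
print(f"Revenue at risk from churn: ${at_risk:,.2f}")
print(f"Engineering time:           ${eng:,.2f}")
```

With these inputs the direct loss comes out around $20, while the churn exposure and engineering time are each an order of magnitude larger. Swap in your own MRR, user count, and ARPU to see where your numbers land.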

The Detection Gap Is the Real Problem

Here's the stat I keep coming back to from Splunk + Oxford Economics' 2024 research (2,000 executives across 53 countries):

41% of tech companies say customers often or always detect downtime before their internal team does.

Think about what that means in practice. Your user opens your app, gets an error, closes it. Maybe they tweet. Maybe they email you. Maybe they just... leave.

New Relic's Observability Forecast 2025 adds to this: 41% of IT leaders identify service issues through manual checks, customer complaints, or incident tickets — after the fact.

Without continuous monitoring, it takes somewhere between 3 and 6 hours on average before you find out about an outage, and it's usually a user who tells you. With monitoring, detection time is bounded by your check interval: about a minute if you check every minute.

That gap is the entire ballgame.
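How fast "with monitoring" really is depends on how often you check and how many failed checks you require before alerting. Here's a back-of-the-envelope sketch for a simple polling monitor. The function name, the 10-second timeout, and the two-failure confirmation rule are assumptions for illustration, not settings from any particular tool.

```python
def worst_case_detection_seconds(interval_s: float,
                                 failures_to_alert: int = 1,
                                 timeout_s: float = 10.0) -> float:
    """Upper bound on the time from outage start to alert.

    Worst case: the outage begins right after a passing check, then
    `failures_to_alert` consecutive checks must fail, and the last one
    takes the full timeout before giving up.
    """
    return interval_s * failures_to_alert + timeout_s


for interval in (60, 300):
    delay = worst_case_detection_seconds(interval, failures_to_alert=2)
    print(f"{interval:>3}s interval -> alert within {delay / 60:.1f} min")
```

Even the slower setting, a 5-minute interval with two confirming failures, alerts in roughly ten minutes in the worst case. Compare that with hours of finding out from a support email.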

The Average Outage Is Longer Than You Think

Cockroach Labs' State of Resilience 2025 found the average outage lasts 196 minutes before resolution. That's over three hours.

What Users Do During an Outage (It's Not Good)

Three behaviors happen in sequence when a user hits a broken product:

1. They don't wait. Google's research: 53% of mobile users abandon a site that takes more than 3 seconds to load. An outage is worse than slow.

2. They don't know it's temporary. Without a public status page, there's no signal that distinguishes "back in 5 minutes" from "this product is dead." Users fill that vacuum with the worst-case interpretation.

3. 68% consider switching. That's from a 2023 Zealousys survey on SaaS customer behavior after outages. After one incident. Not a pattern.

The Real Incidents for Reference

Some concrete examples to anchor the numbers:

CrowdStrike, July 2024 — Faulty sensor update, ~8.5M Windows endpoints affected. Fortune 500 losses: $5.4B (Parametrix). Delta alone: $500M.
GitHub, 2024 — 124 incidents, ~800 hours of degraded performance across the year.
J.Crew, Black Friday 2023 — 5-hour outage, ~$775K in estimated lost sales.

These aren't "this could happen to you" scare stories. The underlying failure modes — bad config push, dependency timeout, unmonitored endpoint — are the same failure modes that take down a solo SaaS at 2am.

A Simple Monitoring Setup

If you're not monitoring your endpoints yet, here's the minimum viable setup that closes the detection gap:

HTTP monitoring — ping your core endpoints (dashboard, API, login) on a 1–5 minute interval
Alert routing — Slack, Discord, email, or Telegram — whatever you actually check
Public status page — even a simple one tells users something is happening and you're on it

This doesn't need to be complex. It needs to exist.
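To show how little code the first two pieces take, here's a minimal sketch using only the Python standard library. It's an illustration under assumptions, not a production monitor: the function names are mine, `send_alert` assumes a Slack-style incoming webhook that accepts a JSON body with a `text` field, and any URLs you pass in are your own.

```python
import json
import urllib.request


def check_endpoint(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL responds successfully within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # URLError, HTTPError (4xx/5xx), and timeouts are all OSError subclasses
        return False


def send_alert(webhook_url: str, message: str) -> None:
    """POST a message to a Slack-style incoming webhook (assumed format)."""
    body = json.dumps({"text": message}).encode("utf-8")
    req = urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req, timeout=10).close()


def run_once(endpoints, state, check=check_endpoint, alert=print):
    """Check every endpoint once and alert only when its status changes."""
    for url in endpoints:
        up = check(url)
        was_up = state.get(url, True)  # assume "up" until proven otherwise
        if up != was_up:
            alert(f"{'RECOVERED' if up else 'DOWN'}: {url}")
        state[url] = up
    return state
```

To use it, keep one `state` dict around and call `run_once` with your dashboard, API, and login URLs every minute or so, from a loop with `time.sleep(60)` or from a scheduler, passing something like `alert=lambda msg: send_alert(your_webhook_url, msg)`. Alerting only on status changes keeps a long outage from flooding your channel. One caveat: a monitor running on the same server as your app goes down with it, so run it somewhere else.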

The Bottom Line

For a small SaaS, the cost of downtime isn't a huge per-minute dollar figure. It's slower and more damaging: churn risk that compounds, trust that erodes, detection windows that stay open for hours because there's nothing watching.

The $8,000/hour SMB average is the headline number. The 41% customer-detects-first rate is the structural problem. Closing the detection gap is the fix.

If you want to set up monitoring without spending enterprise money on it, I built Stillup — uptime monitoring + public status pages, free plan available, no credit card needed.