Most developers know downtime is bad. What I didn't expect, once I actually sat down and worked through the math for a small SaaS, was how bad.
Here's what the data says.
The Baseline: SMBs Average $8,000/Hour
Gartner's oft-quoted "$5,600 per minute" figure is real, but it's enterprise scale. Datto's 2023 State of the Channel Report surveyed small and medium businesses specifically.
The number: $8,000 per hour for SMBs.
Break that down for a lean indie SaaS:
| Category | Calculation | Estimate |
|---|---|---|
| Direct revenue loss | (MRR ÷ 730 hrs) × 3 hrs | ~$20 on $5k MRR |
| Churn risk | 68% consider switching × affected users × ARPU | Hundreds → thousands |
| Engineering time | 4 hrs (detection + response) × hourly rate | $400–$800 at $100–$200/hr |
| Trust/reputation | Unmeasurable in-session, measurable in 90-day renewals | ? |
The direct revenue number looks tiny. The churn risk and the compounding trust erosion are where small SaaS businesses actually bleed.
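To make the table's arithmetic reproducible, here is a minimal sketch of the three measurable rows. The function names are my own, and the inputs in the example (50 affected users, $25 ARPU, a $100/hr rate) are hypothetical placeholders, not figures from any of the reports cited here. Note that the churn figure is exposure (revenue held by users who might consider switching), not a churn forecast.

```python
HOURS_PER_MONTH = 730  # average hours in a month, the divisor used in the table


def direct_revenue_loss(mrr: float, outage_hours: float) -> float:
    """Revenue attributable to the outage window: (MRR / 730) * hours."""
    return mrr / HOURS_PER_MONTH * outage_hours


def churn_exposure(affected_users: int, arpu: float,
                   consider_switching: float = 0.68) -> float:
    """Monthly revenue held by affected users who may consider switching.

    This is an upper bound on what is at risk, not a prediction of churn.
    """
    return consider_switching * affected_users * arpu


def engineering_cost(hours: float, hourly_rate: float) -> float:
    """Time spent on detection and response, priced at an hourly rate."""
    return hours * hourly_rate


# Hypothetical inputs: $5k MRR, a 3-hour outage, 50 affected users at $25 ARPU.
print(f"Direct revenue loss: ${direct_revenue_loss(5_000, 3):.2f}")
print(f"Churn exposure:      ${churn_exposure(50, 25):.2f} per month")
print(f"Engineering time:    ${engineering_cost(4, 100):.2f}")
```

Swap in your own MRR, user counts, and rates. The point of running it is to see how quickly the churn line outgrows the direct revenue line.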
The Detection Gap Is the Real Problem
Here's the stat I keep coming back to from Splunk + Oxford Economics' 2024 research (2,000 executives across 53 countries):
41% of tech companies say customers often or always detect downtime before their internal team does.
Think about what that means in practice. Your user opens your app, gets an error, closes it. Maybe they tweet. Maybe they email you. Maybe they just... leave.
New Relic's Observability Forecast 2025 adds to this: 41% of IT leaders identify service issues through manual checks, customer complaints, or incident tickets — after the fact.
Without continuous monitoring, an outage runs somewhere between 3 and 6 hours on average before your team notices it, which leaves plenty of time for a user to find it first. With monitoring, you're talking under a minute.
That gap is the entire ballgame.
The Average Outage Is Longer Than You Think
Cockroach Labs' State of Resilience 2025 found the average outage lasts 196 minutes before resolution. That's over three hours.
More from that same study:
- Only 20% of organizations describe themselves as fully prepared for outages
- 39% describe their outage handling as "reactive" with no formal protocols
- Only 2% can resolve an unplanned outage in 60 seconds or less
- Large enterprises are 49% more likely to have continuous monitoring than smaller orgs
That last one lands differently when you're a small team. The monitoring gap is, almost by definition, a small company problem.
What Users Do During an Outage (It's Not Good)
Three behaviors happen in sequence when a user hits a broken product:
1. They don't wait. Google's research: 53% of mobile users abandon a site that takes more than 3 seconds to load. An outage is worse than slow.
2. They don't know it's temporary. Without a public status page, there's no signal that distinguishes "back in 5 minutes" from "this product is dead." Users fill that vacuum with the worst-case interpretation.
3. 68% consider switching. That's from a 2023 Zealousys survey on SaaS customer behavior after outages. After one incident. Not a pattern.
Real Incidents for Reference
Some concrete examples to anchor the numbers:
- CrowdStrike, July 2024 — Faulty sensor update, ~8.5M Windows endpoints affected. Fortune 500 losses: $5.4B (Parametrix). Delta alone: $500M.
- GitHub, 2024 — 124 incidents, ~800 hours of degraded performance across the year.
- J.Crew, Black Friday 2023 — 5-hour outage, ~$775K in estimated lost sales.
These aren't "this could happen to you" scare stories. The underlying failure modes — bad config push, dependency timeout, unmonitored endpoint — are the same failure modes that take down a solo SaaS at 2am.
A Simple Monitoring Setup
If you're not monitoring your endpoints yet, here's the minimum viable setup that closes the detection gap:
- HTTP monitoring — ping your core endpoints (dashboard, API, login) on a 1–5 minute interval
- Alert routing — Slack, Discord, email, or Telegram — whatever you actually check
- Public status page — even a simple one tells users something is happening and you're on it
This doesn't need to be complex. It needs to exist.
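As a rough illustration of the first two bullets, here is a minimal sketch using only the Python standard library. Everything named here (`http_probe`, `EndpointMonitor`, `run_forever`, the two-failure threshold, the example URLs) is my own assumption for illustration, not how any particular monitoring product works. The `alert` callable defaults to `print`; in practice you would pass in something that posts to whichever channel you actually check.

```python
import http.client
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable


def http_probe(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError (4xx/5xx) and timeouts are all OSError subclasses.
        return False


@dataclass
class EndpointMonitor:
    url: str
    probe: Callable[[str], bool] = http_probe
    alert: Callable[[str], None] = print
    failures_before_alert: int = 2  # assumed threshold to avoid alerting on one blip
    consecutive_failures: int = 0
    is_down: bool = False

    def check_once(self) -> bool:
        """Probe the endpoint once; alert on the down and recovery transitions."""
        ok = self.probe(self.url)
        if ok:
            if self.is_down:
                self.alert(f"RECOVERED: {self.url}")
            self.consecutive_failures = 0
            self.is_down = False
        else:
            self.consecutive_failures += 1
            if (self.consecutive_failures >= self.failures_before_alert
                    and not self.is_down):
                self.is_down = True
                self.alert(
                    f"DOWN: {self.url} failed "
                    f"{self.consecutive_failures} checks in a row"
                )
        return ok


def run_forever(monitors: list[EndpointMonitor], interval_seconds: float = 60) -> None:
    """Check every monitor, sleep, repeat. Intended to run as a long-lived process."""
    while True:
        for monitor in monitors:
            monitor.check_once()
        time.sleep(interval_seconds)
```

To use it, build one `EndpointMonitor` per core endpoint (for example a login page and an API health route) and pass the list to `run_forever`. With a 60-second interval and a two-failure threshold, the worst-case time to the first alert is roughly two intervals plus the probe timeout. One caveat with this design: a script running on the same server as your app goes down with it, so the checker needs to live somewhere else.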
The Bottom Line
For a small SaaS, the cost of downtime isn't a huge per-minute dollar figure. It's slower and more damaging: churn risk that compounds, trust that erodes, detection windows that stay open for hours because there's nothing watching.
The $8,000/hour SMB average is the headline number. The 41% customer-detects-first rate is the structural problem. Closing the detection gap is the fix.
If you want to set up monitoring without spending enterprise money on it, I built Stillup — uptime monitoring + public status pages, free plan available, no credit card needed.