You've seen it everywhere. On hosting pages, SaaS pricing tables, cloud provider dashboards:
"99.9% uptime guaranteed"
Sounds impressive. Almost perfect. Like, what's 0.1%?
A lot, actually. Let me show you the math — and more importantly, what it means for your users, your revenue, and your sleep schedule.
The Math Nobody Does
99.9% uptime means your service is unavailable for 0.1% of the time.
Here's what 0.1% looks like across different time windows:
| Time Period | Allowed Downtime |
|---|---|
| Per day | 1 minute 26 seconds |
| Per week | 10 minutes 4 seconds |
| Per month | 43 minutes 49 seconds |
| Per year | 8 hours 45 minutes |
That last one is the one that should make you pause. 8 hours and 45 minutes of downtime per year — and your SLA is technically fine the whole time.
The Full SLA Cheat Sheet
Most people only know the "three nines" (99.9%). Here's the complete picture:
| SLA | Downtime/Year | Downtime/Month | Downtime/Day |
|---|---|---|---|
| 99% | 3 days 15 hrs | 7 hrs 18 min | 14 min 24 sec |
| 99.5% | 1 day 19 hrs | 3 hrs 39 min | 7 min 12 sec |
| 99.9% | 8 hrs 45 min | 43 min 49 sec | 1 min 26 sec |
| 99.95% | 4 hrs 22 min | 21 min 54 sec | 43 sec |
| 99.99% | 52 min 35 sec | 4 min 22 sec | 8.6 sec |
| 99.999% | 5 min 15 sec | 26 sec | 0.86 sec |
The jump from 99.9% to 99.99% (one extra "9") cuts your annual downtime budget from about 8.8 hours to under 53 minutes. That's a 10x difference.
Calculate Your Own Uptime
The formula is simple:
Downtime = Total Time × (1 - Uptime % / 100)
For example, a year has 365.25 × 24 = 8,766 hours.
At 99.9%:
8,766 hours × 0.001 = 8.766 hours ≈ 8 hrs 45 min
Or in JavaScript, if you want to build it yourself:
function calculateDowntime(uptimePercent, periodHours) {
  // Fraction of the period the service may be down (99.9 -> 0.001)
  const downtimeRatio = 1 - (uptimePercent / 100);
  const downtimeHours = periodHours * downtimeRatio;
  const downtimeMinutes = downtimeHours * 60;
  const downtimeSeconds = downtimeMinutes * 60;
  return {
    hours: Math.floor(downtimeHours),
    minutes: Math.floor(downtimeMinutes % 60),
    seconds: Math.floor(downtimeSeconds % 60),
  };
}

// 99.9% uptime over a year (8766 hours)
console.log(calculateDowntime(99.9, 8766));
// → { hours: 8, minutes: 45, seconds: 57 }
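The same formula also runs in reverse: given the downtime you actually observed, you can work out the uptime you achieved. This is a minimal sketch; `achievedUptime` and its parameters are illustrative names, not part of any library.

```javascript
// Hypothetical helper: convert observed downtime into an uptime percentage.
function achievedUptime(downtimeMinutes, periodHours) {
  const periodMinutes = periodHours * 60;
  if (periodMinutes <= 0) {
    throw new RangeError("periodHours must be positive");
  }
  // Clamp so downtime longer than the period reports 0%, not a negative value.
  const ratio = Math.min(Math.max(downtimeMinutes / periodMinutes, 0), 1);
  return (1 - ratio) * 100;
}

// A 30-day month has 720 hours. Two outages totalling 90 minutes:
const uptime = achievedUptime(90, 720);
console.log(uptime.toFixed(3) + "%"); // 99.792%
```

Ninety minutes in a 30-day month already puts you below three nines, since the 99.9% budget for that month is only about 43 minutes.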
If you'd rather skip the math, tools like AlertSleep's uptime calculator let you punch in any percentage and get the breakdown instantly.
"But Our SLA Excludes Planned Maintenance"
This is the clause that quietly turns "99.9%" into "something much lower."
Many SLAs include language like:
"Uptime calculations exclude scheduled maintenance windows, force majeure events, and incidents caused by the customer."
In practice, this means a vendor can take their service down for a 4-hour maintenance window every month and still advertise "99.9% uptime" — because those hours simply don't count.
Always check:
- Does the SLA count maintenance windows as downtime?
- How much advance notice is required for scheduled maintenance?
- What's the compensation if they breach the SLA? (Hint: it's usually service credits, not money)
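To see how much an exclusion clause can hide, add the excluded maintenance hours back in. The sketch below assumes an average month of 730.5 hours (8,766 / 12) and a hypothetical vendor that uses its full 99.9% budget plus one 4-hour maintenance window; the function name is illustrative.

```javascript
// Hypothetical helper: availability as the customer experiences it, when the
// SLA percentage is computed only over non-maintenance time.
function effectiveUptime(slaPercent, periodHours, excludedMaintenanceHours) {
  const countedHours = periodHours - excludedMaintenanceHours;
  // Downtime the SLA still permits within the hours that "count".
  const allowedUnplannedHours = countedHours * (1 - slaPercent / 100);
  const totalDownHours = excludedMaintenanceHours + allowedUnplannedHours;
  return (1 - totalDownHours / periodHours) * 100;
}

const AVG_MONTH_HOURS = 8766 / 12; // 730.5
console.log(effectiveUptime(99.9, AVG_MONTH_HOURS, 4).toFixed(2) + "%"); // 99.35%
```

Under these assumptions, the advertised 99.9% looks more like 99.35% from the customer's side.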
What Does Downtime Actually Cost?
Here's where it gets real. Abstract percentages become concrete when you map them to your business.
A rough formula:
Cost of Downtime = Hours Down × (Lost Revenue/hr + Productivity Cost/hr) + Reputation Damage
For an e-commerce site doing $100k/day in revenue:
Revenue per hour = $100,000 / 24 ≈ $4,167/hr
At 99.9% uptime → 8.77 hours of downtime/year
→ Lost revenue: 8.77 × $4,167 ≈ $36,500/year
And that's before counting the customer support tickets, the social media complaints, and the users who never come back.
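The lost-revenue arithmetic generalizes to any SLA tier. A minimal sketch, assuming revenue is spread evenly across the day (real traffic rarely is) and ignoring productivity and reputation costs; `annualLostRevenue` is an illustrative name.

```javascript
const HOURS_PER_YEAR = 365.25 * 24; // 8,766

// Hypothetical helper: revenue lost per year if the whole downtime budget is used.
function annualLostRevenue(dailyRevenue, uptimePercent) {
  const revenuePerHour = dailyRevenue / 24;
  const downtimeHours = HOURS_PER_YEAR * (1 - uptimePercent / 100);
  return revenuePerHour * downtimeHours;
}

for (const sla of [99, 99.5, 99.9]) {
  const loss = Math.round(annualLostRevenue(100000, sla));
  console.log(`${sla}% -> $${loss.toLocaleString("en-US")}/year`);
}
// 99% -> $365,250/year
// 99.5% -> $182,625/year
// 99.9% -> $36,525/year
```

Peak-hour outages cost more than this average suggests, so treat the result as a floor rather than an estimate.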
The "Five Nines" Problem
You'll sometimes see "five nines" (99.999%) thrown around by cloud providers. It sounds incredible — only 5 minutes of downtime per year.
But here's the uncomfortable truth: achieving five nines is mostly about architecture, not monitoring.
Five nines requires:
- Multi-region active-active deployments
- Zero-downtime deployments (blue/green or canary)
- Automatic failover with sub-second detection
- Chaos engineering to test failure scenarios
Most startups and even mid-size companies realistically operate at 99.5% to 99.95%. And that's fine — if you know it and plan for it.
The Difference Between Measured and Actual Uptime
Here's a subtle but important distinction.
Your hosting provider might achieve 99.99% uptime at the infrastructure level. But your application might only hit 99.5% because of:
- Memory leaks that require weekly restarts
- Slow database queries that cause timeouts (HTTP 504 — is that "downtime"?)
- Third-party API dependencies that go down
- SSL certificate expiry (this kills more sites than you'd think)
- Your own deployment going wrong at 2am
Your uptime is at best as good as the weakest link in the chain, and usually worse, because the failures of every dependency add up. The only way to know your real uptime, as opposed to your provider's uptime, is to monitor from the outside.
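When every component is required to serve a request, availabilities multiply, so the combined figure ends up lower than any single link. A minimal sketch, assuming failures are independent (often an optimistic assumption) and using made-up component figures:

```javascript
// Availability of a chain where every component must be up (serial dependency).
// Assumes independent failures; the component numbers below are illustrative.
function compositeAvailability(percentages) {
  return percentages.reduce((acc, p) => acc * (p / 100), 1) * 100;
}

const chain = {
  hosting: 99.99,
  database: 99.95,
  paymentApi: 99.9,
  ownDeploys: 99.9,
};

const combined = compositeAvailability(Object.values(chain));
console.log(combined.toFixed(2) + "%"); // 99.74%
```

Four components that each look excellent on paper combine to roughly 99.74%, which is close to 23 hours of downtime per year.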
What to Actually Monitor
Most developers start monitoring too late and measure too little. Here's a baseline:
Minimum viable monitoring:
- [ ] HTTP status check every 1-5 minutes
- [ ] Response time tracking (a 503 that takes 30s is worse than a fast 503)
- [ ] SSL certificate expiry alert (set to 30 days before)
- [ ] Domain expiration alert (set to 60 days before)
Level up:
- [ ] Multi-region checks (your site might be down only for users in one region, such as US East)
- [ ] API endpoint monitoring (not just the homepage)
- [ ] Port monitoring for non-HTTP services
Alert channels that actually wake you up:
- SMS/phone call for critical alerts (email is too easy to miss at 3am)
- Slack/Teams for the team
- Status page for your users so they know you know
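The HTTP status and response-time checks from the baseline list could be sketched as follows. This assumes Node.js 18 or later (global `fetch` and `AbortSignal.timeout`); the 10-second timeout, the 2-second "slow" threshold, the state names, and the function names are all illustrative choices, not a standard.

```javascript
// Classify one probe result. Thresholds and state names are illustrative assumptions.
function classifyProbe({ status, elapsedMs, error }, slowMs = 2000) {
  if (error || status >= 500) return "down";      // timeouts, 502/503/504, etc.
  if (status >= 400) return "failing";            // reachable, but the check URL is rejected
  return elapsedMs > slowMs ? "degraded" : "up";  // a slow success still hurts users
}

// Run a single external check against a URL.
async function probe(url, timeoutMs = 10000) {
  const started = Date.now();
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return classifyProbe({ status: res.status, elapsedMs: Date.now() - started });
  } catch (err) {
    // Network errors and timeouts both land here.
    return classifyProbe({ error: err, elapsedMs: Date.now() - started });
  }
}

// Example (hypothetical URL): probe("https://example.com/health").then(console.log);
```

Keeping the classification separate from the network call makes the "is a 504 downtime?" decision explicit and easy to change. Run the probe on a schedule from a machine outside your own infrastructure, otherwise it goes down with the thing it is watching.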
The Real Takeaway
99.9% uptime is not "always online." It's a budget — a budget of how much downtime your users are willing to accept before they find an alternative.
The question isn't "what SLA does my provider offer?" The question is:
What uptime does your business actually need — and how will you know when you're not hitting it?
The first step is measuring. You can't improve what you can't see.
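Treating the SLA as a budget also makes it easy to track. A minimal sketch, assuming you already record downtime in seconds from your monitoring; `errorBudget` is an illustrative name.

```javascript
// Hypothetical helper: how much of the downtime budget is left for a period.
function errorBudget(targetPercent, periodSeconds, downtimeSoFarSeconds) {
  const budgetSeconds = periodSeconds * (1 - targetPercent / 100);
  const remainingSeconds = budgetSeconds - downtimeSoFarSeconds;
  return {
    budgetSeconds: Math.round(budgetSeconds),
    remainingSeconds: Math.round(remainingSeconds),
    breached: remainingSeconds < 0,
  };
}

const THIRTY_DAYS = 30 * 24 * 60 * 60; // 2,592,000 seconds
console.log(errorBudget(99.9, THIRTY_DAYS, 20 * 60));
// { budgetSeconds: 2592, remainingSeconds: 1392, breached: false }
```

A 20-minute outage leaves about 23 minutes of budget for the rest of a 30-day window at 99.9%.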
If you're building something people depend on, set up external uptime monitoring today — not after the first outage. Tools like AlertSleep start free and take about 2 minutes to configure.
What SLA does your app target? And are you actually measuring it? Drop it in the comments.