Firat Celik

Posted on Apr 26

How I Monitor 80+ Cloud Services in Real-Time (And Get Notified Before My Users Do)

#programming #productivity #api #fullstack

A deep dive into building StatusWatch: push notifications for service outages, built with Flutter and a real-time status aggregation backend.

The moment that started it all

It was a Friday evening. A client messaged me: "The checkout isn't working."

Stripe had been down for 38 minutes. I had no idea.

I did what most developers do: opened five browser tabs, checked Stripe's status page, and scrolled Twitter looking for complaints. By the time I confirmed the issue and messaged the client back, 45 minutes had passed.

That was embarrassing. More importantly, it was preventable.

I started looking for a mobile app that would just... tell me when my stack breaks. Something that monitors the services I actually use and sends a push notification the moment something goes wrong. I couldn't find exactly what I wanted, so I built it.

This is the story of how StatusWatch works under the hood.

The core problem: status pages are not built for developers

Every major cloud provider publishes a status page. AWS has one. GitHub has one. Stripe has one. They're public, they're (usually) accurate, and almost nobody checks them proactively.

Why? Because checking 10 different status pages manually every time something feels off is not a workflow. It's a punishment.

The real need is simple: when something in my stack breaks, I want to know immediately, not when a user tells me.

StatusWatch solves this by polling the official status APIs of 80+ services, normalizing the data into a unified format, and pushing a notification to your phone the moment an incident is detected.

Architecture overview

At a high level, the system has three layers:

[Status APIs] → [Aggregation Backend] → [Flutter Mobile App]

Let me walk through each one.

Layer 1: Polling the status APIs

Most major services expose their status data in one of two formats:

Atlassian Statuspage (the most common): used by GitHub, Stripe, Vercel, Cloudflare, Supabase, Linear, and dozens more. The API returns a structured JSON response with components, incidents, and scheduled maintenances.

GET https://www.githubstatus.com/api/v2/summary.json

A simplified response looks like this:

{
  "status": {
    "indicator": "minor",
    "description": "Minor Service Outage"
  },
  "components": [
    {
      "name": "Git Operations",
      "status": "degraded_performance"
    },
    {
      "name": "API Requests",
      "status": "operational"
    }
  ],
  "incidents": [
    {
      "name": "Increased error rates on Git push",
      "status": "investigating",
      "created_at": "2025-04-10T14:22:00Z"
    }
  ]
}
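
As a rough illustration of consuming that response, here is a minimal Python sketch. It is not the StatusWatch backend code: the function names `fetch_summary` and `extract_problems` are hypothetical, and only the fields shown in the simplified response above are assumed to exist.

```python
import json
from urllib.request import urlopen

# URL taken from the example above.
SUMMARY_URL = "https://www.githubstatus.com/api/v2/summary.json"


def fetch_summary(url: str = SUMMARY_URL, timeout: float = 10.0) -> dict:
    """Download and decode a Statuspage summary document."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def extract_problems(summary: dict) -> dict:
    """Pull out the overall indicator, non-operational components, and listed incidents."""
    return {
        "indicator": summary.get("status", {}).get("indicator", "unknown"),
        "affected_components": [
            component["name"]
            for component in summary.get("components", [])
            if component.get("status") != "operational"
        ],
        "open_incidents": [incident["name"] for incident in summary.get("incidents", [])],
    }
```

Calling `extract_problems(fetch_summary())` would return a small dictionary that is easier to compare between polling cycles than the full response.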

Custom status APIs — services like AWS, Google Cloud, and Azure have their own formats. AWS Health Dashboard, for example, returns RSS/Atom feeds per region and service. These require individual parsers.
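
To give a feel for what an individual parser involves, here is a small Python sketch that reads items from an RSS feed using only the standard library. The feed content in `SAMPLE_RSS` is made up for illustration, and `parse_rss_items` is a hypothetical helper; real provider feeds differ in their details and usually need extra handling.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed content for illustration only.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Service Status</title>
    <item>
      <title>Increased API error rates</title>
      <pubDate>Thu, 10 Apr 2025 14:22:00 GMT</pubDate>
      <description>We are investigating elevated error rates.</description>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text: str) -> list:
    """Return one dict per <item>, with the publication date as an ISO 8601 string."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        published = item.findtext("pubDate")
        items.append(
            {
                "title": item.findtext("title", default=""),
                "description": item.findtext("description", default=""),
                # RSS dates use the RFC 822 format, which email.utils can parse.
                "published": parsedate_to_datetime(published).isoformat() if published else None,
            }
        )
    return items
```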

The normalization challenge

Each service returns data in a slightly different shape. To display everything uniformly in the app, I normalize every response into a shared schema:

{
  "service_id": "github",
  "name": "GitHub",
  "overall_status": "degraded",
  "components": [...],
  "active_incidents": [...],
  "last_checked": "2025-04-26T09:15:00Z"
}

This normalization layer is what makes it possible to show a consistent green/yellow/red dashboard regardless of which service you're looking at.
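
A minimal sketch of that normalization step for Statuspage-style responses might look like the following. The field names come from the shared schema above; the `NormalizedStatus` class, the `normalize_statuspage` function, and the indicator-to-status mapping are assumptions made for illustration, not the app's actual rules.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional

# Assumed mapping from Statuspage indicators to the unified status values.
INDICATOR_TO_STATUS = {
    "none": "operational",
    "minor": "degraded",
    "major": "outage",
    "critical": "outage",
}


@dataclass
class NormalizedStatus:
    service_id: str
    name: str
    overall_status: str
    components: list
    active_incidents: list
    last_checked: str


def normalize_statuspage(
    service_id: str, name: str, summary: dict, now: Optional[datetime] = None
) -> NormalizedStatus:
    """Convert a Statuspage summary dict into the shared schema."""
    checked_at = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    indicator = summary.get("status", {}).get("indicator", "none")
    return NormalizedStatus(
        service_id=service_id,
        name=name,
        overall_status=INDICATOR_TO_STATUS.get(indicator, "unknown"),
        components=[
            {"name": c["name"], "status": c["status"]}
            for c in summary.get("components", [])
        ],
        active_incidents=[
            {"name": i["name"], "status": i["status"]}
            for i in summary.get("incidents", [])
            if i.get("status") != "resolved"
        ],
        last_checked=checked_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
    )
```

`asdict(normalize_statuspage("github", "GitHub", summary))` then yields a plain dictionary in the shape shown above. Each custom-format service would need its own function producing the same dataclass.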

Layer 2: Detecting state changes and triggering notifications

Polling alone isn't enough. The key logic is detecting when a service's status changes — from operational to degraded, from degraded to outage, and back to operational.

Previous state: operational
Current state:  degraded
→ Trigger notification: "GitHub is experiencing degraded performance"

Previous state: degraded  
Current state:  operational
→ Trigger notification: "GitHub is back to normal ✓"

Each service's last known state is stored server-side. On every polling cycle, the new state is compared against the stored state. If they differ, a notification is queued.
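
The comparison itself can be sketched in a few lines of Python. This is an illustration under assumptions: the in-memory dictionary stands in for whatever server-side storage is actually used, and `StateStore` and `detect_transition` are hypothetical names. The message wording follows the two examples above.

```python
from typing import Optional


def detect_transition(service_name: str, previous: str, current: str) -> Optional[str]:
    """Return a notification message if the status changed, otherwise None."""
    if previous == current:
        return None
    if current == "operational":
        return f"{service_name} is back to normal ✓"
    if current == "degraded":
        return f"{service_name} is experiencing degraded performance"
    if current == "outage":
        return f"{service_name} is experiencing an outage"
    return f"{service_name} status changed to {current}"


class StateStore:
    """Keeps the last known status per service (in memory, for illustration)."""

    def __init__(self) -> None:
        self._last = {}

    def update(self, service_id: str, service_name: str, current: str) -> Optional[str]:
        """Record the new status and return a message to queue, if any."""
        previous = self._last.get(service_id)
        self._last[service_id] = current
        if previous is None:
            # First observation: nothing to compare against, so stay silent.
            return None
        return detect_transition(service_name, previous, current)
```

Staying silent on the first observation is a design choice in this sketch: it avoids sending a burst of notifications when a service is added or the store is empty after a restart.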

Polling intervals

Polling frequency depends on the plan. The free tier uses a 5-minute polling interval. Pro users get near-instant detection: polling runs every 60 seconds, and for major services like AWS, GitHub, and Stripe, we watch for webhook-based updates where available.
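
One simple way to express the two intervals is a lookup table plus a "which services are due" check, sketched below in Python. The interval values come from the paragraph above; the `due_services` function and the idea of tracking a last-polled timestamp per service are assumptions for illustration, and the webhook path is not covered.

```python
# Polling intervals in seconds, per plan.
POLL_INTERVAL_SECONDS = {"free": 300, "pro": 60}


def due_services(last_polled: dict, tier: str, now: float) -> list:
    """Return the IDs of services whose last poll is at least one interval old.

    last_polled maps service ID to the time of its last poll, in seconds.
    """
    interval = POLL_INTERVAL_SECONDS[tier]
    return sorted(
        service_id
        for service_id, polled_at in last_polled.items()
        if now - polled_at >= interval
    )
```

With `last_polled = {"github": 0.0, "stripe": 200.0}` and `now = 300.0`, only `github` is due on the free interval, while both services are due on the Pro interval.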

Push notifications

Notifications are delivered via FCM (Firebase Cloud Messaging) for Android and APNs for iOS. When a state change is detected, the backend sends a targeted push to all users who have that service in their watchlist.

The notification payload is minimal and actionable:

Title: ⚠️ Stripe — Incident Detected
Body: Payment API is experiencing elevated error rates
Tap: Opens incident timeline in StatusWatch
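
For the FCM side, the request body could be assembled roughly as below. This Python sketch only builds the JSON message in the shape used by the FCM HTTP v1 API and does not send anything. The topic naming scheme (`service_<id>`), the `data` keys, and the `build_fcm_message` name are assumptions; the article does not say whether delivery uses topics or individual device tokens, and APNs delivery is out of scope here.

```python
import json


def build_fcm_message(service_id: str, title: str, body: str) -> dict:
    """Build an FCM HTTP v1 message addressed to a per-service topic."""
    return {
        "message": {
            # Hypothetical convention: every watcher of a service subscribes to its topic.
            "topic": f"service_{service_id}",
            "notification": {"title": title, "body": body},
            # FCM requires data values to be strings; the app would read these on tap.
            "data": {"service_id": service_id, "screen": "incident_timeline"},
        }
    }


def to_json(message: dict) -> str:
    """Serialize the message, keeping emoji and other non-ASCII characters readable."""
    return json.dumps(message, ensure_ascii=False)
```

The resulting JSON would be posted to the project's `messages:send` endpoint with an OAuth 2.0 access token, which is omitted here.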

Layer 3: The Flutter app

The mobile app is built with Flutter, which gives us a single codebase for iOS and Android with native performance.

The dashboard

The main screen is a flat list of your followed services, each showing:

Service name and logo
Current status (green dot, yellow dot, red dot)
Active incident summary if one exists

The status colors follow a simple convention: green = operational, yellow = degraded performance, red = major outage or partial outage.
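
Expressed as code, the convention is a small lookup. The Python sketch below uses the Statuspage component status names as keys. The grey fallback for anything else (for example maintenance) and the "worst color wins" rule for a whole service are assumptions for illustration.

```python
# Color convention from the text, keyed by Statuspage component status.
STATUS_COLOR = {
    "operational": "green",
    "degraded_performance": "yellow",
    "partial_outage": "red",
    "major_outage": "red",
}

# Higher number means more severe.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def dot_color(status: str) -> str:
    """Color for a single status; unknown statuses fall back to grey."""
    return STATUS_COLOR.get(status, "grey")


def service_color(component_statuses: list) -> str:
    """Color for a whole service: the most severe color among its components."""
    known = [c for c in map(dot_color, component_statuses) if c in SEVERITY]
    return max(known, key=SEVERITY.get, default="grey")
```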

Incident timeline

Tapping any service opens its incident timeline, a chronological feed of status updates from "Investigating" through "Identified", "Monitoring", and finally "Resolved". This is pulled directly from the status API and gives you the full picture without opening a browser.
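
Ordering the updates is the only real logic in this view. A minimal Python sketch, assuming each update carries `status`, `body`, and an ISO 8601 `created_at` field (the helper names are hypothetical):

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; the trailing Z is rewritten for older Python versions."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def build_timeline(updates: list) -> list:
    """Return display lines for the incident updates, oldest first."""
    ordered = sorted(updates, key=lambda update: parse_ts(update["created_at"]))
    return [
        f'{parse_ts(u["created_at"]):%H:%M} {u["status"].capitalize()}: {u["body"]}'
        for u in ordered
    ]
```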

Uptime heatmap (Pro)

For Pro users, each service shows a 90-day uptime heatmap — a grid where each cell represents one day, colored by uptime percentage. This gives a quick visual sense of a service's reliability history.
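
The data behind such a grid can be computed from recorded outage intervals. The Python sketch below is one way to do it, not the app's implementation: it assumes outages are stored as non-overlapping `(start, end)` pairs of timezone-aware datetimes, buckets by UTC day, and uses color thresholds that are purely illustrative.

```python
from datetime import date, datetime, timedelta, timezone

ONE_DAY = timedelta(days=1)


def daily_uptime(outages: list, day: date) -> float:
    """Uptime percentage for one UTC day, given non-overlapping (start, end) outages."""
    day_start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    day_end = day_start + ONE_DAY
    down = timedelta()
    for start, end in outages:
        # Count only the part of the outage that falls inside this day.
        overlap_start = max(start, day_start)
        overlap_end = min(end, day_end)
        if overlap_end > overlap_start:
            down += overlap_end - overlap_start
    return 100.0 * (1 - down / ONE_DAY)


def heatmap(outages: list, end_day: date, days: int = 90) -> list:
    """Return (date, uptime percentage) pairs, oldest first, ending at end_day."""
    first = end_day - timedelta(days=days - 1)
    return [
        (first + timedelta(days=i), daily_uptime(outages, first + timedelta(days=i)))
        for i in range(days)
    ]


def cell_color(uptime_pct: float) -> str:
    """Assumed thresholds for coloring a cell."""
    if uptime_pct >= 99.9:
        return "green"
    if uptime_pct >= 99.0:
        return "yellow"
    return "red"
```

A 72-minute outage is 5% of a day, so that day's cell would come out at 95% and be colored red under these thresholds.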

State management

The app uses Riverpod for state management. Each service's status is a stream that updates when the backend pushes new data. The UI rebuilds reactively — no manual refresh needed.

The services we monitor

One of the most time-consuming parts of building StatusWatch was the initial service integration work. Not every service has a clean, documented status API. Here's a breakdown of the current coverage:

Category	Services
Cloud	AWS, GCP, Azure, DigitalOcean, Hetzner, Fly.io, Railway
DevOps	GitHub, GitLab, CircleCI, Render, Linear, Jira
Hosting	Vercel, Netlify, Heroku, Cloudflare, Firebase
Database	Supabase, MongoDB Atlas, Neon, Redis Cloud, Upstash
Payments	Stripe, Paddle, Twilio, SendGrid, Resend
Monitoring	Datadog, Sentry, Grafana Cloud, PagerDuty, New Relic
AI	OpenAI, Anthropic, Hugging Face, Together AI

80+ services total, and the list grows with every release.

What I learned building this

Status APIs are inconsistently documented. Some services have beautifully documented status APIs. Others you have to reverse-engineer from their status page's network requests. A few don't expose machine-readable status data at all, which means falling back to HTML scraping: fragile, but sometimes the only option.

Push notification timing is everything. A notification that arrives 20 minutes late is almost useless. The entire value proposition is knowing before your users do. This made the polling architecture a first-class concern from day one, not an afterthought.

Normalization is harder than it looks. "Degraded performance" on Stripe is not the same operational severity as "degraded performance" on GitHub. Building a useful unified model required understanding each service's incident taxonomy, not just their API shape.

Free tier design matters for conversion. The 5-service limit on the free tier is intentional. Most solo developers track 5–8 services. Hitting that limit naturally creates the moment where a user thinks "I need one more", which is where the Pro upgrade conversation starts.

What's next

A few things on the roadmap:

Android widgets: glanceable status on your home screen without opening the app
More services: always expanding the list based on user requests
Custom webhook endpoints: for services not on the official list
Team accounts: shared watchlists and notification routing for small DevOps teams

Try it

StatusWatch is free to download. The free tier covers 5 services with 5-minute notification delays, enough to see if it fits your workflow before committing to Pro.

If your stack touches AWS, Stripe, Supabase, OpenAI, or Vercel, you'll probably want to know about outages before your users do.

[Download StatusWatch on the App Store]

Built by Firat Celik. Feedback welcome: drop a comment or find me on X [@Bi9x6].

Tags: flutter, devops, sre, devtools, buildinpublic, mobile, backend, cloudinfrastructure