The Importance of Tracking Third-Party Status Pages
As a TechOps engineer, you are responsible for keeping track of the external services your team uses. In a modern team, that covers almost every critical service, whether it comes from a cloud provider or a SaaS vendor.
Part of your incident management strategy should be to monitor the status of these services and be alerted when they are down or experiencing issues.
No service in your stack lives by itself. While debugging an outage in your application, you will need to know the status of the services it depends on.
Most third-party services have a public status page where they post updates about outages and maintenance.
Tracking the status of these services should be a crucial part of your incident management process.
Which Status Pages Should You Track?
The actual pages will depend on which services your applications depend on. The following lists have some of the most common ones.
Cloud Providers
- Microsoft Azure. Note that Microsoft only publishes incidents with widespread impact on its public status page. For incidents affecting services in your specific account, monitor your Azure health dashboard.
- Google Cloud Platform
- Amazon Web Services
- DigitalOcean
- Hetzner
- Fly.io
- Render
- Railway
- Linode
- Vercel
- Netlify
- Cloudflare
CI/CD
Source Code Repositories
Hosted Databases
- MongoDB Atlas
- AWS RDS
- Google Cloud SQL for PostgreSQL
- Azure SQL Managed Instance
- PlanetScale
- Supabase
- Render PostgreSQL
CDN and DNS
LLMs and AI Assistants
Observability and Monitoring
Artifact Repositories
SMTP Providers
Payment Gateways
How to Track Status Pages
Tracking status pages can be done in multiple ways:
- Manually subscribing to each status page through its RSS feed, webhook, Slack, or email notifications.
- Using a status page monitoring/aggregator tool.
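If you want something between those two options, a small polling script can be a starting point. The sketch below is a minimal illustration, not a production monitor. It assumes the provider hosts its page on Atlassian Statuspage, which serves a JSON summary at `/api/v2/status.json`; pages built on other platforms will need a different parser. The function names and the `status.example.com` URL in the usage note are hypothetical.

```python
import json
import urllib.request

# Assumption: the status page is hosted on Atlassian Statuspage, which
# serves a JSON summary at this path. Other platforms use other formats.
STATUS_PATH = "/api/v2/status.json"


def parse_status(payload: str) -> tuple[str, str]:
    """Return (indicator, description) from a Statuspage-style JSON document."""
    data = json.loads(payload)
    status = data.get("status", {})
    return status.get("indicator", "unknown"), status.get("description", "")


def is_degraded(indicator: str) -> bool:
    """Statuspage reports "none" when all systems are operational.

    Anything else, including a missing indicator, is treated as worth a look.
    """
    return indicator != "none"


def fetch_status(base_url: str, timeout: float = 10.0) -> tuple[str, str]:
    """Download and parse the status summary for one status page."""
    url = base_url.rstrip("/") + STATUS_PATH
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_status(resp.read().decode("utf-8"))


def check_all(pages: dict[str, str], fetch=fetch_status) -> dict[str, str]:
    """Return a message for every page that is degraded or unreachable."""
    problems = {}
    for name, base_url in pages.items():
        try:
            indicator, description = fetch(base_url)
        except (OSError, ValueError) as exc:
            # Network failures raise OSError; malformed JSON raises ValueError.
            problems[name] = f"could not check status: {exc}"
            continue
        if is_degraded(indicator):
            problems[name] = f"{indicator}: {description}"
    return problems
```

To use it, call `check_all({"example": "https://status.example.com"})` from a cron job or scheduled task and forward any returned messages to your alerting channel. Polling every few minutes is usually enough, since status pages themselves tend to lag behind the actual incident.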
This article was first published on https://dev.to/talonx/a-list-of-status-pages-every-techops-engineer-should-know-28kd.