Why You Should Care
If you spend a chunk of your morning clicking through AWS consoles, log dashboards, and status pages just to check if everything's okay—this might help.
I was working with a team that had the same problem. Turns out, the issue wasn't information overload. It was too many places to look.
The Situation
I joined a project as an external consultant. The ask was simple: "Help us improve our monitoring setup."
The basics were already in place:
- CloudWatch ✓
- Logs ✓
- Dashboards ✓
Incidents were rare—maybe once every few months. Nothing was broken.
But something felt off. The team couldn't shake the feeling that they weren't doing monitoring "right."
What I Found
Through interviews, a few concerns came up:
- Alert thresholds might be too loose
- Metrics weren't well organized
- External API monitoring was lacking
- Cost visibility needed work
All valid. But one thing caught my attention:
"Every morning, someone spends about 30 minutes manually checking different screens."
The Real Problem
I dug into that 30 minutes.
Most of it wasn't thinking time. It was navigation time.
- Open AWS console
- Go to dashboard
- Change time range
- Switch to another service
- Open external API status page
- Check cost explorer
- ...
The information was all there. It was just scattered across too many places.
The Insight
This wasn't a "not enough monitoring" problem.
It was a "too much context switching" problem.
Current state (Pull model):
You → check → CloudWatch, Logs, External APIs, Costs...
Goal (Push model):
Auto-generated report → arrives → You
The fix? Don't reduce what you look at. Reduce where you look.
The Plan
Shift from Pull to Push.
Instead of visiting dashboards every morning, have a report delivered to you.
I've done this before on another project using Jenkins. A scheduled job generated a morning summary and posted it to chat. Once it was set up, the morning check became "read what arrived."
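To make the push idea concrete, here is a minimal sketch of the kind of script a scheduled job could run. It is not the script from that earlier project. The function names, the report layout, and the `{"text": ...}` JSON payload (the shape Slack-style incoming webhooks accept) are all assumptions for illustration, and the sample values are placeholders.

```python
import json
import urllib.request


def build_report(sections: dict[str, list[str]]) -> str:
    """Render collected check results as one plain-text chat message."""
    lines = ["Morning report"]
    for title, items in sections.items():
        lines.append("")
        lines.append(f"[{title}]")
        # State an empty section explicitly so silence is never ambiguous.
        lines.extend(f"- {item}" for item in items or ["no data collected"])
    return "\n".join(lines)


def post_to_chat(webhook_url: str, text: str) -> int:
    """POST the report as JSON to a chat webhook and return the HTTP status.

    The payload shape is an assumption; adjust it to whatever your chat
    tool's incoming webhook expects.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Placeholder data: a real job would fill these sections from its own checks.
sample_sections = {
    "CloudWatch": ["placeholder: alarm summary goes here"],
    "External APIs": ["placeholder: status page summary goes here"],
    "Costs": [],
}
```

A scheduler such as Jenkins would run the script each morning and call `post_to_chat(url, build_report(sections))` with a webhook URL supplied through its own configuration. Keeping the formatting separate from the posting means the report can be previewed locally before anything is sent.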
Next Steps
- Inventory — List everything being checked and where
- Prioritize — Pick the essential metrics only
- Automate — Build a Jenkins job to generate and post a report
- Iterate — Add or remove based on real usage
Start small. Don't aim for perfect.
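The first two steps can start as a plain list in code. The sketch below is one hypothetical way to record the inventory and filter it down. The `Check` class, the entries, and especially the `essential` flags are made up for illustration. Deciding what is essential is a conversation with the team, not something a script works out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str        # what someone looks at
    location: str    # where they have to go to see it
    essential: bool  # agreed with the team (hypothetical values below)


# Hypothetical inventory based on the places mentioned in this post.
INVENTORY = [
    Check("Service dashboard", "CloudWatch", True),
    Check("Application logs", "Logs", False),
    Check("External API status", "External status page", True),
    Check("Spend", "Cost Explorer", False),
]


def locations(inventory: list[Check]) -> list[str]:
    """Distinct places someone has to visit, in first-seen order."""
    return list(dict.fromkeys(check.location for check in inventory))


def essential_checks(inventory: list[Check]) -> list[Check]:
    """The subset that goes into the first version of the report."""
    return [check for check in inventory if check.essential]
```

Counting `locations(INVENTORY)` puts a number on the "too many places" problem, and `essential_checks(INVENTORY)` is the starting scope for the automated report.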
Why Not Just Skip the Morning Check?
Good question.
Monitoring has two jobs:
| Role | What it does | Examples |
|---|---|---|
| Incident response | Know when things break | Alerts, API checks |
| Baseline awareness | Know what "normal" looks like | Morning check |
Alerts tell you when things break. But if you've lost your sense of "normal," you won't know how serious an alert really is.
The morning check isn't about catching problems. It's about staying calibrated.
Making it easier isn't cutting corners. It's making sure it actually keeps happening.
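One way a report could support that calibration is to show each number next to its recent baseline. This is a sketch under assumptions, not part of the plan above: the function name is hypothetical, and using the mean and standard deviation of recent daily values is just one simple definition of "normal."

```python
from statistics import mean, stdev


def describe_against_baseline(today: float, history: list[float]) -> str:
    """Describe today's value relative to recent history.

    `history` needs at least two values, because the sample standard
    deviation is undefined for fewer.
    """
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Every past value was identical, so a deviation ratio is meaningless.
        return f"{today:g} (baseline {baseline:g}, no variation in history)"
    deviation = (today - baseline) / spread
    return f"{today:g} (baseline {baseline:g}, {deviation:+.1f} standard deviations)"
```

A line like this in the morning report keeps "normal" in front of the reader every day, so a later alert can be read against it.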
Wrap Up
This isn't a silver bullet. I'm not even 100% sure it's the right call.
But the logic feels sound:
- With incidents this rare, a heavy investment in new monitoring doesn't fit the project's current phase
- Losing awareness of "normal" is a real risk
- So make the habit sustainable
Implementation details coming later. For now, the direction is set.
I write about these kinds of engineering decisions and thought processes on my blog.
https://tielec.blog/