Open your Grafana. Your Datadog. Your CloudWatch.
Count the dashboards.
Now count the ones anyone opened in the last 30 days.
That ratio of total to recently viewed usually lands between 4:1 and 10:1.
In a recent audit, a platform team had 340 dashboards. 41 had been viewed in the last 30 days. The other 299 were still querying metrics, still costing money, still alerting, still confusing new engineers on call.
The accumulation pattern is identical every time:
→ Each new service ships with a "starter" dashboard nobody ever customizes
→ Every incident creates 2-3 dashboards that are "really useful"
→ Every quarterly review creates 5 more "leadership-ready" dashboards
→ Every new hire builds their own because they don't know the existing ones exist
Nobody deletes anything. Observability debt compounds like financial debt.
The damage:
- Real signals drown in unused noise
- Alert fatigue: teams mute critical alerts because they're buried among alerts from 40 broken dashboards
- Your data-ingest bill scales with total metrics, not used metrics (Datadog charges per custom metric whether dashboards display them or not)
- On-call runbooks point to dashboards that stopped working 6 months ago
The fix is embarrassingly simple and painful:
→ Query dashboard view counts (both the Grafana and Datadog APIs expose this; see the sketch after this list)
→ Delete everything with 0 views in 60 days. No exceptions. Yes, the one you built "just in case."
→ Adopt the USE + RED frameworks. One dashboard per service. Golden signals only. (Concrete example below.)
→ Link runbooks from alerts DIRECTLY to the SLO dashboard, not to a folder
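
A minimal sketch of steps 1 and 2, assuming a Grafana instance reachable at GRAFANA_URL with an API token in GRAFANA_TOKEN. The search and delete endpoints are standard Grafana HTTP API; the view-count lookup is a placeholder you have to fill in, because per-dashboard view counts come from Grafana Enterprise usage insights (or your own proxy access logs), not the OSS API:

```python
# Hedged audit sketch. GRAFANA_URL / GRAFANA_TOKEN are assumed env vars;
# views_last_60d() is a PLACEHOLDER you must wire up yourself.
import os
import requests

GRAFANA_URL = os.environ["GRAFANA_URL"]   # e.g. https://grafana.example.com
HEADERS = {"Authorization": f"Bearer {os.environ['GRAFANA_TOKEN']}"}
DRY_RUN = True  # flip to False only after you've read the kill list

def list_dashboards():
    # Standard Grafana search endpoint: returns uid + title per dashboard.
    r = requests.get(f"{GRAFANA_URL}/api/search", headers=HEADERS,
                     params={"type": "dash-db", "limit": 5000})
    r.raise_for_status()
    return r.json()

def views_last_60d(uid: str) -> int:
    # PLACEHOLDER: point this at wherever your view counts actually live
    # (Grafana Enterprise usage insights, reverse-proxy logs, etc.).
    raise NotImplementedError

def main():
    for dash in list_dashboards():
        uid, title = dash["uid"], dash["title"]
        if views_last_60d(uid) == 0:
            print(f"stale: {title} ({uid})")
            if not DRY_RUN:
                # Standard Grafana delete-by-uid endpoint.
                requests.delete(f"{GRAFANA_URL}/api/dashboards/uid/{uid}",
                                headers=HEADERS).raise_for_status()

if __name__ == "__main__":
    main()
```

Run it with DRY_RUN = True first. Read the kill list out loud in standup. Then flip the flag.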
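And "golden signals only" means three panels, not thirty. A sketch of the RED half for one service, assuming Prometheus-style metric names (http_requests_total and http_request_duration_seconds_bucket are placeholders; substitute whatever your services actually emit):

```python
# One service, three golden-signal queries. Metric names are assumptions.
SERVICE = "checkout"

RED_PANELS = {
    # Rate: requests per second
    "rate": f'sum(rate(http_requests_total{{service="{SERVICE}"}}[5m]))',
    # Errors: fraction of requests returning 5xx
    "errors": (
        f'sum(rate(http_requests_total{{service="{SERVICE}",code=~"5.."}}[5m]))'
        f' / sum(rate(http_requests_total{{service="{SERVICE}"}}[5m]))'
    ),
    # Duration: p99 latency from the request-duration histogram
    "duration": (
        f'histogram_quantile(0.99, sum by (le) '
        f'(rate(http_request_duration_seconds_bucket{{service="{SERVICE}"}}[5m])))'
    ),
}
```

Same three queries for every service. When every dashboard looks identical, on-call stops guessing which one is canonical.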
Result: cleaner signals, 15-30% fewer custom metrics on Datadog, and an on-call rotation that actually sleeps at night.
If you have a dashboard folder from 2022 titled "Temp — will organize later," repost. You know exactly who this is for.
