DEV Community

Dev Interrupted

The Problem with MTTR: Learning from Incident Reports w/ Courtney Nash

Tracking Mean Time To Restore (MTTR) is standard industry practice for incident response and analysis, but should it be? 

Courtney Nash, an Internet Incident Librarian, argues that MTTR is not a reliable metric, and we think she's got a point.

We caught up with Courtney at the DevOps Enterprise Summit in Las Vegas, where she was making her case against MTTR in favor of alternative metrics (SLOs and cost of coordination data), practices (Near Miss analysis), and mindsets (humans are the solution, not the problem) to help organizations better learn from their incidents.
