Tom Harada

Operational Excellence Tip #1

I like to call this technique "Top Errors Zero." Like Inbox Zero, the goal is an empty list: you find the root cause of each of your service's or app's top errors, fix it, and watch it drop off the list.

First you have to record the errors. If the messages contain customer-specific information or IDs, you may also need to group them so that the same underlying error is counted as one. Once the errors are recorded, surface them in a dashboard: for example, a "Top errors" table sorted by count in descending order. Optionally, add a time chart showing when each error occurs. In its simplest form, this is just grouping all API calls by error message.
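
To make the grouping step concrete, here is a minimal Python sketch. It assumes your error messages are already available as plain strings; the `normalize` and `top_errors` helpers, the placeholder tokens, and the sample messages are all made up for illustration and are not tied to any particular logging or dashboard tool.

```python
import re
from collections import Counter

# Hypothetical patterns for customer-specific values; adjust to your own log format.
_UUID = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)
_NUMBER = re.compile(r"\d+")


def normalize(message: str) -> str:
    """Replace IDs with placeholders so the same error groups together."""
    message = _UUID.sub("<uuid>", message)
    return _NUMBER.sub("<n>", message)


def top_errors(messages, limit=10):
    """Return (normalized message, count) pairs, highest count first."""
    counts = Counter(normalize(m) for m in messages)
    return counts.most_common(limit)


# Made-up sample data standing in for recorded API errors.
sample = [
    "User 1042 not found",
    "User 77 not found",
    "Timeout calling billing service",
    "User 9 not found",
    "Timeout calling billing service",
    "Invalid token",
]

for message, count in top_errors(sample):
    print(f"{count:>5}  {message}")
```

Replacing every run of digits is a blunt rule, and it will also merge messages that differ only by a meaningful number (an HTTP status code, say), so in practice you would probably tune the patterns to your own messages.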

Operational Excellence Tip #1 is: review this table regularly (e.g., weekly) and drive these errors to zero (e.g., as part of your on-call or ops work).

This doesn't cover bad UX or situations where you aren't raising errors in the right place. But in general, you'll be surprised how much quality improves for your customers when you drive your top errors down to zero.

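Update: Pareto charts are also useful here. Sorting errors by count and tracking the cumulative share shows which few errors or issues account for most of the total, and so which ones to investigate first.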
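
As a rough sketch of the numbers behind a Pareto chart, the Python below sorts error counts in descending order and computes the cumulative percentage for each row. The `pareto_table` and `vital_few` helpers, the 80% threshold, and the sample counts are assumptions chosen for illustration; plotting is left to whatever charting tool you already use.

```python
def pareto_table(counts):
    """Return (message, count, cumulative percent) rows, highest count first."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(count for _, count in ordered)
    if total == 0:
        return []
    rows = []
    running = 0
    for message, count in ordered:
        running += count
        rows.append((message, count, round(100 * running / total, 1)))
    return rows


def vital_few(counts, threshold=80.0):
    """Return the top errors whose cumulative share first reaches the threshold."""
    selected = []
    for message, _count, cumulative in pareto_table(counts):
        selected.append(message)
        if cumulative >= threshold:
            break
    return selected


# Made-up counts, e.g. taken from a "Top errors" table.
sample_counts = {
    "User <n> not found": 50,
    "Timeout calling billing service": 30,
    "Invalid token": 15,
    "Disk full": 5,
}

for message, count, cumulative in pareto_table(sample_counts):
    print(f"{count:>5}  {cumulative:>5}%  {message}")

print(vital_few(sample_counts))
```

With these sample counts, the first two errors already make up 80% of the total, so they would be the bucket to investigate first.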
