DEV Community

Discussion on: How does your team handle critical production errors?

Nico S___

We get notified of issues in production in several ways: Sentry, Pingdom, and our Customer Success Team.

When an issue occurs in production, we go through a predefined process:

  • Assign a production incident marshal to drive the effort (this is the customer success team lead)
  • Work with a product team member to investigate the issue
  • Recruit help from others when needed
  • Work towards a resolution
  • Create a Production Incident Report
  • Review the report in a Production Incident Retrospective
  • Schedule actions that came up from the retrospective
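
For anyone who wants to track a process like this in code, here is a minimal sketch that enforces the order of the steps listed above. Every name in it (`Step`, `Incident`, `complete`, and so on) is hypothetical and made up for illustration; it is not the team's actual tooling. "Recruit help from others when needed" is left out of the required sequence because it is optional.

```python
from dataclasses import dataclass, field
from enum import Enum


class Step(Enum):
    """Required steps, in the order listed above (hypothetical names)."""
    ASSIGN_MARSHAL = 1
    INVESTIGATE = 2
    RESOLVE = 3
    WRITE_REPORT = 4
    RETROSPECTIVE = 5
    SCHEDULE_ACTIONS = 6


@dataclass
class Incident:
    title: str
    marshal: str  # assumed here to be the customer success team lead
    completed: list = field(default_factory=list)

    def next_step(self):
        """Return the first step not yet completed, or None when all are done."""
        for step in Step:  # Enum members iterate in definition order
            if step not in self.completed:
                return step
        return None

    def complete(self, step):
        """Mark a step done, refusing to skip ahead in the process."""
        expected = self.next_step()
        if step is not expected:
            raise ValueError(f"expected {expected}, got {step}")
        self.completed.append(step)

    @property
    def closed(self):
        """An incident is closed once every required step is completed."""
        return self.next_step() is None


# Example usage with made-up data
incident = Incident("Checkout returns 500", marshal="CS team lead")
incident.complete(Step.ASSIGN_MARSHAL)
incident.complete(Step.INVESTIGATE)
print(incident.next_step())
```

Refusing out-of-order steps is a design choice for the sketch: it keeps the report and retrospective from being skipped once the fix is out.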

Works incredibly well.