How does everybody document and handle incident response, especially on smaller teams? For example, if everybody on a sub-team is on vacation, how do you document the steps to debug, troubleshoot, and fix issues, and where does that documentation live? Do you give non-developers the ability to roll back deploys if issues arise, or provide ways to monitor your applications remotely?
I'm trying to design and implement a procedure so that, if an issue arises and the developers closest to it aren't available, we route it to the right person and give them all the resources they need to resolve it, or to escalate it to us on vacation if the issue is critical.