Originally published on Failure is Inevitable.
Is love in the air? We think so. While we don’t have chocolate or flowers for you, we have something just as sweet. Here are some of the most exciting Tweets, content, and events happening in the SRE and resilience engineering community this February.
"I'm Just Doing my Job," An SRE Myth: Blameless SRE Darrell Pappa writes about how organizations can become more customer-centric. Featured in SRE Weekly #256.
On Not Being a Cog in the Machine: Honeycomb’s first SRE, Fred Hebert, shares his thoughts on human processes, socio-technical systems, and observability.
Communication Tool Down? Here are 3 Ways to Handle it: Learn how to work through a communication tooling failure via chaos engineering, eliminating SPOFs, and more.
Slack’s Outage on January 4th 2021: Laura Nolan writes an in-depth retrospective on Slack’s recent incident.
4 Tips on Preparing for a [Great] Failure: SRE techniques for mitigating the impact of system failure, including building runbooks, assessing with SLOs, monitoring metrics, and more.
How Cloud Services Platform Teams Can Drive The Adoption Of Effective SRE Practices: Tina Huang writes about using cloud transformations to drive SRE adoption.
Teams have a new tool in their tool belts. Blameless Runbook Documentation is available for early access.
Runbooks are an industry best practice: they let teams codify the incident response process so that it is repeatable and consistent. These sets of instructions allow teams to resolve incidents faster, with greater confidence and less toil.
Fill out this form to see Runbook Documentation in action.
Blameless Bi-Weekly Demo March 2 at 8 AM PST: Check out a live demo of Blameless as we walk you through operations best practices, and get your questions answered.
If you’re looking to share your insights with the SRE and resilience engineering community, we’d love to partner with you on content. Fill out our form here and we’ll reach out!