DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

How a Simple Python Validator Prevents Config Outages

Comments 2
3 min read
The Spot Instance That Killed Our Payments Service (And Why It Took Us 47 Minutes to Find It)

4
Comments 2
6 min read
SRE Explained: Because 'It Works on My Machine' is Not an SLO 🎯

3
Comments
9 min read
Why Your Monitoring Is Failing in Microservices (And What Actually Works)

1
Comments
3 min read
SLI/SLO Framework

Comments
4 min read
Capacity Planning Toolkit

Comments
3 min read
On-Call Management Kit

Comments
4 min read
Postmortem Framework

Comments
4 min read
Runbook Template Library

Comments
3 min read
Chaos Engineering Toolkit

Comments
4 min read
Platform Developer Portal

Comments
3 min read
The AI Incident Report Template I Actually Use for Wrong Answers and Tool Failures

5
Comments
3 min read
3am Incident Response: What I Learned from 200+ Pages

Comments
2 min read
Runbook Automation: From 45-Minute Fixes to 90-Second Recoveries

Comments
2 min read