DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

Augment a PagerDuty Incident with Root Cause

4
Comments
7 min read
SREview Issue #3

3
Comments
2 min read
Nobody likes to wait in a Queue

4
Comments
2 min read
Using Automation and SLOs to Create Margin in your Systems

4
Comments
4 min read
Bringing Operational Excellence to Dev with Github's Lauren Rubin

4
Comments
33 min read
SLO Adoption at Twitter

2
Comments
7 min read
How SLIs Help You Understand Users' Needs

4
Comments
5 min read
SRE, DevOps Authors

9
Comments
1 min read
Promoting Continuous Learning with SRE

3
Comments
4 min read
Teamwork and Culture in the Era of Remote Work

6
Comments
4 min read
Managing Burnout During COVID-19

4
Comments
8 min read
You've Nailed Incident detection, what about Incident Resolution?

5
Comments
6 min read
SREview Issue #2 June 2020

2
Comments
2 min read
Reduce Engineering Problems with a Resiliency Mindset

3
Comments
8 min read
How DevOps and SRE Fit Together

9
Comments
5 min read
Hints For Engineers During Outages

2
Comments
1 min read
How SLOs Help Evernote's SRE Team Manage Tech Debt

6
Comments
6 min read
How to master at SRE recruiting?

3
Comments
1 min read
+Con Online 2020

3
Comments
1 min read
What are you monitoring

5
Comments
2 min read
Disaster recovery of single node Kubernetes control plane

3
Comments
2 min read
High available Kubernetes cluster with single control plane node

6
Comments
4 min read
Load balancing algorithms

9
Comments
1 min read
Which Kubernetes Container Probe Should I Use?

6
Comments
4 min read
Cloud Native Computing Minsk Digest #7

7
Comments
3 min read