DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

How We Built and Use Runbook Documentation at Blameless

16
Comments 2
5 min read
SigNoz : Open-source alternative to DataDog

24
Comments 2
3 min read
Lessons from Slack, GCP and Snowflake outages

4
Comments
3 min read
Deep Dive into Docker Internals - Union Filesystem

30
Comments
10 min read
SRE2AUX: How Flight Controllers were the first SREs

3
Comments
20 min read
Overview of Incident Lifecycle in SRE

1
Comments
11 min read
My DevOps learning path

3
Comments
5 min read
How do you wrap your head around observability?

49
Comments 13
1 min read
Introduce Chaos Platform 2.0 for Azure

7
Comments
2 min read
What Is Nix and Why You Should Use It

9
Comments
7 min read
Top Reliability and Scaling Practices from Experts at Citrix, Greenlight Financial, and Incognia

2
Comments
14 min read
Reliability as an Inseparable Part of Software Engineering

3
Comments
5 min read
Getting Started as an SRE? Here are 3 Things You Need to Know.

5
Comments
5 min read
How They SRE

8
Comments 1
1 min read
The Key Differences between SLI, SLO, and SLA in SRE

16
Comments
9 min read
How to Backup your Applications Data to S3 with Walrus

6
Comments
2 min read
What is the right AWS Kubernetes distribution for you?

4
Comments
5 min read
Resilience Engineering – Don't Be Afraid to Show Your Vulnerable Side!

4
Comments
4 min read
The True Cost of Building your Own Incident Management System (IMS)

2
Comments
5 min read
Communication Tool Down? Here are 3 Ways to Handle it

3
Comments
5 min read
GCP DevOps Certification - Pomodoro Ten

4
Comments
3 min read
Azure Front Door: An Overview

6
Comments
3 min read
Managing health checks at scale

6
Comments
5 min read
"I'm Just Doing my Job," An SRE Myth

3
Comments
5 min read
Running the AWS CLI Across Multiple Accounts the Easy Way

6
Comments
3 min read