DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

How a Kubernetes Autoscaling Incident Took Down Our API — and How I Now Debug It in Minutes

2 min read
Kubernetes In-Place Pod Resize

3 min read
How to Prevent Repeating the Same Engineering Discussions in a Team

3 min read
Why Your Engineering Wiki is a Graveyard (And How to Fix It)

3 min read
Datadog: Observability Lessons from 50+ AWS Apps

7 min read
How to Make Engineering Knowledge Searchable (A Complete Guide)

3 min read
Lessons in Testing, Performance, and Legacy Systems from /dev/mtl 2025

7 min read
Shift-Left Reliability

4 min read
Turning block/goose into an AI SRE Agent

3 min read
How We Architected Context: The Connect-Link-Query Pattern

2 min read
Rightsizing Kubernetes Requests with the In-Place Vertical Pod Autoscaler

3 min read
The Limitations of Text Embeddings in RAG Applications: A Deep Engineering Dive

19 min read
AWS Security Series: AWS Access Key is Compromised. Now What? An Incident Response Playbook.

3 min read
Kubernetes Is Not a Container Platform (And That Changes Everything)

1 min read
Project: One App — Three Probes — Real Failures

3 min read