DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

I Audited 40 Monitoring Setups. Here Are the 3 Blind Spots That Existed in All of Them

2 min read
5 Critical APM Monitoring Mistakes That Cost Companies $100k+/Year (And How to Fix Them)

6 min read
What “Read-Only Fridays” Quietly Reveal About Your Platform

1 min read
Setup NUT on Proxmox

3 min read
Chapter 2: Infrastructure as Code

8 min read
API Uptime SLA: What 99.9% Really Means for Your Application

6 min read
Your Traces Look Fine. Your Revenue Isn’t.

2 min read
What Really Breaks in Large-Scale Cloud Migrations: The Solution!

4 min read
LGTM != Production Ready: Why your CI pipeline is missing the most important step

3 min read
Why AI SRE tools don't work (and what we're doing differently)

4 min read
Linux Privileges: Peeling Back the Curtain of How Linux Really Handles Users, Privileges, and Processes

5 min read
Docker Monitoring Without a Platform: docker stats + cgroups (DevOps)

3 min read
Reliability vs Uptime: Why Availability Fails at Scale

3 min read
SRE is the BEST Thing Ever

4 min read
Getting Started with cURL

4 min read