DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

Importance of Graceful Shutdown in Kubernetes

3
Comments
7 min read
Root Cause Analysis (RCA): entendendo a causa raiz de incidentes

8
Comments
2 min read
🚀 Mini Monitoring App in Go with Prometheus, Grafana & CI/CD

Comments 1
3 min read
The 67-Second OpenTelemetry Problem

Comments
4 min read
🔮 Une nouvelle manière de vulgariser la programmation : plonge dans le monde magique de Grand Père Kernel

1
Comments
2 min read
The Human-in-the-Loop Factor: Partnering With Amazon Q During a Production Incident

2
Comments
11 min read
ComunicaOps: Criando Alicerces para Construção de Plataformas

3
Comments
2 min read
Blue/Green e Canary no Kubernetes com Argo Rollouts [Lab Session]

15
Comments
11 min read
Amazon Cognito Observability Best Practices with Datadog

1
Comments
5 min read
Build C Projects Like a Pro: A Guide to Idiomatic Makefiles

1
Comments 2
7 min read
Amazon API Gateway Observability Best Practices with Datadog

1
Comments
4 min read
Cost-Tracking and Model-Spend Monitoring with LiteLLM

1
Comments 2
2 min read
AI-Powered Kubernetes Debugging with Python and Ollama

1
Comments
6 min read
Understanding `kube-system` in Kubernetes: A City Analogy You’ll Never Forget

6
Comments
2 min read
🚀 The Ultimate DevOps Emoji Glossary

2
Comments
2 min read