DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

Kubernetes Node Affinity and Anti-Affinity: Scheduling Workloads effectively

6
Comments
4 min read
How to Deploy and Manage Kubernetes Add-Ons across multiple Clusters

3
Comments
2 min read
Observability vs. Monitoring

2
Comments
2 min read
In 2025, I resolve to be proactive about reliability

Comments
6 min read
Error Budgets in Practice: A Data-Driven Approach to Risk and Release Management

9
Comments
11 min read
If you have bugs, you need a Bug Warden

Comments
5 min read
we are doing DevOps job market Q&A with folks from Google, AWS, Microsoft etc.

2
Comments
1 min read
In 2025, I resolve to spend less time troubleshooting

Comments
12 min read
What IBM's SRE Expert Wants You to Know About Observability - A Beginner's Guide

1
Comments
3 min read
SRE for the SaaS

Comments
1 min read
Automation for the People

1
Comments
2 min read
Rely.io October 2024 Product Update Roundup

1
Comments
4 min read
AIOps Powered by AWS: Developing Intelligent Alerting with CloudWatch & Built-In Capabilities

8
Comments
5 min read
How to Configure a Remote Data Store for Prometheus

1
Comments
6 min read
Day 10: ls -l *

Comments
3 min read