DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

The Importance of Using Granted for Managing Multiple AWS Accounts

1
Comments
2 min read
Virtualization - The Basics

3
Comments 3
3 min read
AWS: Your Ally in Amplifying Reliability with GenAI

5
Comments
5 min read
How to avoid "Zabbix poller processes more than 75% busy" problems

1
Comments
2 min read
AWS Cost Optimization: Periodic Deletion of ECR Container Images

10
Comments
5 min read
How to transfer a forked repository whose original is private on GitHub

Comments
2 min read
On-Call Cookbook

1
Comments 1
3 min read
One Year of DevOps at Idus: Reflections and Learnings

Comments
4 min read
The basics of Istio mirroring

2
Comments 1
5 min read
AWS Cert Manager integration with Prometheus with Domain Name

3
Comments
3 min read
Terraform Dynamic Blocks: Advanced Use Cases and Examples

5
Comments
9 min read
How to Release a Service

Comments
2 min read
How to easily start Backstage

2
Comments
3 min read
Demystifying Service Level acronyms and Error Budgets

Comments
9 min read
“Automating VPC Peering in AWS with Terraform”

Comments
3 min read
What are SLI, SLO and SLA, and Why are they important in SRE?

Comments
3 min read
Kubernetes (on-prem) master node and worker node associations.

Comments
1 min read
SQL Server service status monitoring on Windows with Prometheus.

Comments
1 min read
Amazon Forecast: Best Practices and Anti-Patterns implementing AIOps

6
Comments
4 min read
Networking 101: Back to School

4
Comments 1
6 min read
How to delete all AWS resources using aws-nuke

7
Comments
2 min read
Defining SLOs - "Let Go!"

2
Comments
2 min read
Executing bash script commands in a sub-shell to manage status code and output

1
Comments
2 min read
SRE vs DevOps vs SysAdmin

2
Comments 1
3 min read
LLMs in Amazon Bedrock: Observability Maturity Model

14
Comments
7 min read