DEV Community

Site Reliability Engineering

Posts

6 Best Free OnCall Software in 2024, Open-Source and SaaS

Comments
4 min read
Basic Linux Syntax Frequently Used by Writer

1
Comments 3
2 min read
Rolling Out a Robust On-Call Process to Your Team

Comments
4 min read
Configure an Intuitive Service Dashboard & Reduce Response Time

Comments
3 min read
Hiteshwar shares his thoughts on being an SRE

Comments
4 min read
Suppressing Alert Noise during Scheduled Maintenance

Comments
3 min read
Simple Log Monitors Using monitro.dev

Comments 3
1 min read
Understanding the Platform Engineering Maturity Model: A Path to Optimized Operations

1
Comments
6 min read
Volume Testing With Apache JMeter On Windows

7
Comments
5 min read
Improve App Availability with Preemptible Pods and PriorityClasses

1
Comments
1 min read
Assessing DevOps Performance - DORA Metrics

1
Comments
9 min read
Journey of Streamlining Oncall and Incident Management

Comments
10 min read
Next Wave, Second Wave, it's still...DevOps to me

4
Comments
3 min read
Understanding the Kubernetes Readiness Probe: A Tool for Application Health

Comments
6 min read
From ground to production: Deploying Workload Identities on AKS

3
Comments 1
8 min read
Platform Engineering: The Next Evolution of DevOps?

3
Comments
6 min read
How to become a good DevOps Engineer

4
Comments 2
3 min read
The Basics of Istio Mirroring

2
Comments 1
5 min read
OTEL Demo with EKS and New Relic

7
Comments
4 min read
Top 5 BetterStack Alternatives For Status Page In 2024

Comments
4 min read
Terraform Dynamic Blocks: Advanced Use Cases and Examples

5
Comments
9 min read
From your source code to zero-downtime, high availability, and secure production deployment in no time

1
Comments
1 min read
The Importance of Using Granted for Managing Multiple AWS Accounts

1
Comments
2 min read
Virtualization - The Basics

3
Comments 3
3 min read
AWS: Your Ally in Amplifying Reliability with GenAI

4
Comments
5 min read
How to Avoid "Zabbix poller processes more than 75% busy" Problems

Comments
2 min read
AWS Cost Optimization: Periodic Deletion of ECR Container Images

8
Comments
5 min read
How to transfer a forked repository whose original is private on GitHub

Comments
2 min read
On-Call Cookbook

1
Comments 1
3 min read
One Year of DevOps at Idus: Reflections and Learnings

Comments
4 min read
AWS Cert Manager integration with Prometheus with Domain Name

2
Comments
3 min read
How to Release Service

Comments
2 min read
How to easily start Backstage

2
Comments
3 min read
Demystifying Service Level acronyms and Error Budgets

Comments
9 min read
“Automating VPC Peering in AWS with Terraform”

Comments
3 min read
What are SLI, SLO and SLA, and Why are they important in SRE?

Comments
3 min read
Kubernetes (on-prem) master node and worker node associations

Comments
1 min read
SQL Server service status monitoring on Windows with Prometheus

Comments
1 min read
Amazon Forecast : Best Practices and Anti-Patterns implementing AIOps

6
Comments
4 min read
How to delete all AWS resources using aws-nuke

2
Comments
2 min read
Defining SLOs - "Let Go!"

2
Comments
2 min read
Executing bash script commands in a sub-shell to manage status code and output

1
Comments
2 min read
Networking 101: Back to School

4
Comments 1
6 min read
SRE vs DevOps vs SysAdmin

1
Comments 1
3 min read
Roles and Responsibilities Matrix

Comments
5 min read
LLMs in Amazon Bedrock: Observability Maturity Model

13
Comments
7 min read
On The Importance of End-to-End Monitoring for IoT

2
Comments
2 min read
DevOps and SRE: A Collaborative Journey Towards Reliable Software Delivery

Comments
4 min read
Roles and Responsibilities Matrix (Portuguese edition)

2
Comments
6 min read
Docker Log Observability: Analyzing Container Logs in HashiCorp Nomad with Vector, Loki, and Grafana

8
Comments
8 min read
How to send Alerts and Notifications with Telegram

7
Comments
3 min read
Kubectl Port-forward Flow Explained

Comments
3 min read
2024 Site Reliability Engineering: Key Trends and Focus Areas for SREs

Comments
7 min read
Inside the Kubernetes Control Plane

21
Comments 2
5 min read
Expand your root EBS Volume attached to your Windows EC2

Comments
2 min read
ARM vs x86 in Docker

2
Comments
6 min read
Effortless Database Scaling: Migrate from RDS to Aurora Serverless V2

Comments
2 min read
Why Should DevOps/SRE learn Golang?

Comments
4 min read
Reciprocity, Companion Planting & DevSecOps

1
Comments
3 min read
Why Do Teams Need SLOs, SLIs, and Error Budgets?

Comments
4 min read