DEV Community
Site Reliability Engineering
Building a Multi-Tenant gRPC Development Platform with Ambassador and AWS EKS
Brian Annis for Place Exchange
Jul 7 '20
#sre #kubernetes #devops #grpc
6 reactions
9 min read

Kafka Chaos Engineering With Litmus
Konda Reddy L
Jul 6 '20
#sre #litmuschaos #kubernetes #kafka
33 reactions
10 min read

Blameless' SRE Journey
Hannah Culver for Blameless
Jul 6 '20
#sre #startup #devops
8 reactions
8 min read

LitmusChaos in CNCF Sandbox
Uma Mukkara
Jul 2 '20
#kubernetes #litmuschaos #chaosengineering #sre
12 reactions
3 min read

Twitter's Reliability Journey
Hannah Culver for Blameless
Jun 30 '20
#devops #sre
5 reactions
6 min read

SRE Leaders Panel: Work as Done vs. Work as Imagined
Hannah Culver for Blameless
Jun 29 '20
#devops #devrel #sre #startup
3 reactions
26 min read

Top Practices for Runbook Automation
Hannah Culver for Blameless
Jun 26 '20
#sre #devops
16 reactions
1 comment
6 min read

Incident Postmortem Template
Adrian Hornsby for AWS
Jun 26 '20
#aws #devops #sre #tutorial
10 reactions
6 min read

SRE: A Human Approach to Systems
Hannah Culver for Blameless
Jun 25 '20
#devops #sre
8 reactions
7 min read

Leverage JIRA with Squadcast throughout the incident lifecycle
Asutosh for Squadcast
Jun 24 '20
#sre #devops #bestpractices #incidentmanagement
1 reaction
3 min read

Chaos Workflows with Argo and LitmusChaos
Karthik Satchitanand for LitmusChaos
Jun 23 '20
#sre #litmuschaos #kubernetes #chaosengineering
31 reactions
1 comment
8 min read

3 Common API Integration Mistakes and How to Avoid Them
Matt Hawkins for Hoss
Jun 23 '20
#showdev #api #devops #sre
4 reactions
4 min read

Best Practices for Effective Incident Management
Hannah Culver for Blameless
Jun 19 '20
#sre #devops
7 reactions
9 min read

Introducción a IAM - Día #1 de caminando con un SRE (Introduction to IAM: Day 1 of Walking with an SRE)
Falcon
Jun 19 '20
#aws #sre #devops
4 reactions
6 min read

The Chaos Engineering Collection
Adrian Hornsby for AWS
Jun 16 '20
#aws #devops #sre #computerscience
19 reactions
2 min read

Creating your own Chaos Monkey with AWS Systems Manager Automation
Adrian Hornsby for AWS
Jun 16 '20
#aws #devops #sre #computerscience
17 reactions
13 min read

Chaos Engineering for cloud-native systems
Uma Mukkara
Jun 14 '20
#chaosengineering #litmuschaos #kubernetes #sre
30 reactions
4 min read

Caminando con un SRE (Walking with an SRE)
Falcon
Jun 11 '20
#devops #sre #aws #googlecloud
4 reactions
2 min read

Slashing Buildkite deployment time by 75%
Rushikesh Magar for Place Exchange
Jun 3 '20
#buildkite #sre #docker #devops
10 reactions
5 min read

Towards More Effective Incident Postmortems
Anu-angie for Squadcast
Jun 3 '20
#sre #incidentmanagement #bestpractices
2 reactions
10 min read

Site Reliability Engineering: Afrontando el riesgo y los desastres (Site Reliability Engineering: Facing Risk and Disasters)
Juan A. Reséndiz
Jun 1 '20
#sre #techicaldebt #softwaredesign #architecture
17 reactions
12 min read

Prometheus blackbox_exporter; Unconventional Way
Sami Alhaddad
May 13 '20
#prometheus #monitoring #observability #sre
6 reactions
2 min read

Chaos Engineering - How to safely inject failure?
Adrian Hornsby for AWS
May 11 '20
#aws #sre #computerscience #devops
4 reactions
6 min read

Feelings during incident response
Mads Hartmann for Glitch
May 8 '20
#devops #sre #podcast
23 reactions
3 min read

A Reading List & Repo List for Learning DevOps, SRE, and Automation(w/Python)
Ari
May 7 '20
#aws #devops #sre #python
14 reactions
1 comment
2 min read

Falando sobre SRE - Parte 01 - Uma breve introdução (Talking about SRE - Part 01 - A Brief Introduction)
Jhonatan Morais
May 3 '20
#sre #ops #devops #reliability
8 reactions
7 min read

Chaos Engineering - What and who is a chaos engineer?
Adrian Hornsby for AWS
Apr 29 '20
#devops #aws #sre #computerscience
16 reactions
2 comments
4 min read

Why You Need A Microservice Catalog
John Laban
Apr 28 '20
#microservices #sre
5 reactions
9 min read

Have there been more reliability incidents lately?
Ben Halpern
Apr 23 '20
#discuss #sre
16 reactions
14 comments
1 min read

6 Responsibilities of a Devops Engineer
Gourav Shah for School of Devops
Apr 25 '20
#devops #sre
7 reactions
2 min read

Retrying groups of tightly coupled tasks in Ansible
jeff
Apr 18 '20
#sre #devops
13 reactions
2 comments
3 min read

Cleaning up Zookeeper Logs and Snapshots
Bryan Sazon
Apr 16 '20
#zookeeper #devops #sre #clickhouse
8 reactions
1 min read

How does deployment work at your organization?
Ben Halpern
Apr 5 '20
#discuss #devops #sre
71 reactions
73 comments
1 min read

Visualize Google Cloud Billing data in Grafana with BigQuery
Bryan Sazon
Apr 14 '20
#devops #grafana #googlecloud #sre
3 reactions
2 comments
2 min read

go apps + jaeger tracing
Alex Leonhardt
Apr 11 '20
#go #observability #sre #devops
9 reactions
2 comments
1 min read

April Fools and the Broken Promises of One-off Hacks
Ben Halpern for The DEV Team
Apr 1 '20
#meta #sre #devops
129 reactions
8 comments
4 min read

DevOps Engineer vs. SRE?
Edvin
Apr 5 '20
#discuss #sre #devops
10 reactions
6 comments
1 min read

Ask DEV: LightWeight APM for Kubernetes using OpenTelemetry?
Pranay Prateek
Apr 2 '20
#discuss #sre #devops #kubernetes
5 reactions
2 min read

Dreams and Nightmares of Ops
Jay Gordon for Microsoft Azure
Mar 20 '20
#oncall #sre #chaosengineering
34 reactions
2 comments
10 min read

Have you considered Site Reliability Engineering as a path?
Ben Halpern
Mar 13 '20
#codenewbie #beginners #career #sre
66 reactions
12 comments
1 min read

Towards Operational Excellence - Part 3
Adrian Hornsby for AWS
Mar 10 '20
#aws #devops #sre #computerscience
7 reactions
11 min read

Towards Operational Excellence - Part 2
Adrian Hornsby for AWS
Mar 10 '20
#aws #devops #sre #computerscience
7 reactions
11 min read

SRE in layman's terms (4 core concepts)
Chen
Mar 2 '20
#sre #devops #engineering #beginners
6 reactions
4 min read

Why I started developing my new software project by building a Continuous Deployment pipeline
Leonid Belkind for StackPulse
Mar 1 '20
#devops #sre #architecture #pipeline
7 reactions
1 comment
7 min read

List of DevOps/SRe Conferences in 2020
Adriano Canofre
Feb 27 '20
#devops #sre #conferences
6 reactions
1 comment
1 min read

Deploy an Angular App Using Google Cloud Run
Marouen Helali
Feb 25 '20
#angular #sre #devops #tutorial
11 reactions
4 comments
4 min read

How does your team handle critical production errors?
Ben Halpern
Feb 17 '20
#discuss #sre #devops
9 reactions
5 comments
1 min read

Folks, what are some conferences in DevOps/SRE space that you look forward to?
Pranay Prateek
Feb 10 '20
#kubernetes #sre #devops
7 reactions
1 comment
1 min read

7 Site Reliability lessons from Google and Amazon
Raoul Meyer
Jan 30 '20
#sre #devops
53 reactions
6 min read

My quest for identity in Software Engineering
Alex
Jan 29 '20
#serverless #jamstack #devops #sre
7 reactions
15 min read

Towards Operational Excellence - Part 1
Adrian Hornsby for AWS
Jan 21 '20
#aws #devops #sre #computerscience
20 reactions
10 min read

Molly Struve had a long winding journey to SRE... and other things I learned recording her DevJourney
Tim Bourguignon
Jan 21 '20
#devjourney #career #learning #sre
5 reactions
3 min read

Beyond Blameless
Senior Oops Engineer
Jan 16 '20
#sre #devops
10 reactions
6 min read

DevOps vs. Site Reliability Engineering (SRE)
Mike Pfeiffer for CloudSkills.io
Jan 15 '20
#devops #sre
53 reactions
31 min read

SLOs with Stackdriver Service Monitoring
Yuri Grinshteyn
Jan 7 '20
#devops #sre #stackdriver #slos
7 reactions
8 min read

The Future of Monitoring is Autonomous
gdcohen
Jan 6 '20
#devops #kubernetes #sre
10 reactions
6 min read

Resources to learn about DevOps cultural concepts and some tools
Edson C. (aka tuxpilgrim)
Dec 30 '19
#devops #sre
7 reactions
1 min read

The Night Before Code Freeze
Brandon Weaver
Dec 6 '19
#devops #sre #ops
53 reactions
1 comment
4 min read

How To Get AWS Lambda Logs Into CloudWatch
Lou (Open Up The Cloud)
Nov 11 '19
#aws #devops #cloud #sre
8 reactions
6 min read

Rapid Docker on AWS: How to monitor the application?
Andreas Wittig
Nov 8 '19
#aws #docker #devops #sre
10 reactions
4 min read