SRE book notes: The Evolution of Automation at Google

#sre #automation #books

These are the notes from Chapter 7, "The Evolution of Automation at Google", of the book Site Reliability Engineering: How Google Runs Production Systems.

This post is part of a series. The previous post can be found here:

SRE book notes: Monitoring Distributed Systems

Hercules Lemke Merscher ・ Jan 18 ・ 2 min read


Doing automation thoughtlessly can create as many problems as it solves.

It isn’t appropriate to automate every component of every system, and not everyone has the ability or inclination to develop automation at a particular time. Some essential systems started out as quick prototypes, not designed to last or to interface with automation.

Automate Yourself Out of a Job: Automate ALL the Things!

We graduated from optimizing our infrastructure for a lack of failover to embracing the idea that failure is inevitable, and therefore optimizing to recover quickly through automation.
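To make "optimizing to recover quickly through automation" a bit more concrete for myself, here is a minimal sketch of a reconcile loop that replaces failed replicas instead of repairing them by hand. It is not from the book: the `Pool` class, the `reconcile` function, and the `replica-N` naming are all hypothetical, and a real system would probe health over the network rather than read a flag.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Pool:
    """Hypothetical pool of interchangeable replicas: name -> is healthy."""
    healthy: Dict[str, bool] = field(default_factory=dict)
    next_id: int = 0

    def add_replica(self) -> str:
        # Replicas get a fresh name every time; nothing is ever "repaired".
        name = f"replica-{self.next_id}"
        self.next_id += 1
        self.healthy[name] = True
        return name


def reconcile(pool: Pool, desired: int) -> List[str]:
    """Remove unhealthy replicas, then add new ones until `desired` exist.

    Returns the names of the replicas that were removed.
    """
    failed = [name for name, ok in pool.healthy.items() if not ok]
    for name in failed:
        del pool.healthy[name]
    while len(pool.healthy) < desired:
        pool.add_replica()
    return failed


# Usage: start three replicas, simulate one failure, let automation recover.
pool = Pool()
for _ in range(3):
    pool.add_replica()
pool.healthy["replica-1"] = False  # simulated failure (assumption for the demo)
replaced = reconcile(pool, desired=3)
print(replaced, sorted(pool.healthy))
```

The point of the sketch is the design choice, not the code itself: failure is treated as a normal input to the loop, so recovery needs no human in the path.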

A team not running automation has no incentive to build systems that are easy to automate.

The most functional tools are usually written by those who use them.

Shipping and iterating rapidly might allow you to implement functionality faster, yet rarely makes for a resilient system.

A post worth reading, from the Engine Yard blog:

Pets vs. Cattle – EngineYard

The difference between the pre-virtualisation model and the post-virtualisation model can be thought of as the difference between pets and cattle.
