The software development life cycle has come a long way, from a strictly sequential model (Waterfall) to iterative approaches such as Agile and DevOps. Interestingly, SRE was born at Google in 2003, before the DevOps movement began (around 2007-2008), to build reliability and resiliency into Google's entire infrastructure. In its SRE book, Google describes how the collaborative efforts of DevOps engineers, SREs, and other engineers such as application security engineers are vital for maintaining a product like Gmail.
Looking at the above example, it is safe to say that our growing dependency on applications is what has propelled the wide-scale adoption of DevOps and SRE. Whether it is to streamline business functions or to launch an app that simplifies our lives, we need reliable and scalable systems at every step.
DevOps is a software development approach that shifts organizational culture towards agility, automation, and collaboration. It aims to eliminate silos and bridge the gaps between development and operations.
In this process, code goes through the iterative steps of Continuous Development, Continuous Integration, Continuous Testing, Continuous Feedback, Continuous Monitoring, Continuous Deployment, and Continuous Operations, popularly known as the ‘7Cs of the DevOps Lifecycle.’
Site Reliability Engineering or SRE plays a more comprehensive role in streamlining the end-user experience and is more concerned with incorporating software development practices into IT operations.
To put it simply, SRE asks: if a developer were handling IT operations, where could automation be brought into the picture? It expects automation to fix many of the problems that arise while managing applications in production.
SRE uses three service-level measures to track application performance:
- Service level agreements (SLAs): define the appropriate reliability, performance, and latency of the application, as desired by the end user.
- Service level objectives (SLOs): the target goals set by the SRE team to meet the expectations of the SLAs.
- Service level indicators (SLIs): measure specific metrics (like system latency, system throughput, lead time, mean time to restore (MTTR), deployment frequency, availability, and error rate) to check conformance to the SLOs.
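The relationship between an SLI and an SLO can be sketched with a small availability check. This is a minimal illustration, not a standard API: the names `AvailabilityReport` and `availability_report`, and the idea of expressing the allowance as an "error budget" of failed requests, are assumptions introduced here for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class AvailabilityReport:
    sli: float             # measured fraction of successful requests
    error_budget: int      # failed requests the SLO tolerates at this volume
    budget_remaining: int  # tolerated failures not yet used (negative = overspent)
    slo_met: bool          # True while failures stay within the budget


def availability_report(total_requests: int, failed_requests: int,
                        slo: float) -> AvailabilityReport:
    """Compare a measured availability SLI against an SLO target."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be a fraction in (0, 1]")

    # SLI: the metric actually observed in production.
    sli = (total_requests - failed_requests) / total_requests

    # Budget: how many failures the SLO allows. Rounding first guards
    # against floating-point noise such as 999.9999999 flooring to 999.
    error_budget = math.floor(round(total_requests * (1.0 - slo), 6))

    return AvailabilityReport(
        sli=sli,
        error_budget=error_budget,
        budget_remaining=error_budget - failed_requests,
        slo_met=failed_requests <= error_budget,
    )


if __name__ == "__main__":
    # Hypothetical traffic numbers, chosen only for illustration.
    print(availability_report(1_000_000, 400, slo=0.999))
```

With a 99.9% SLO over one million requests, the budget is 1,000 failed requests; 400 failures leave 600 in reserve, while 1,500 failures would overspend the budget and signal that reliability work should take priority.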
- Both methodologies are focused on monitoring production and ensuring the operations management works as smoothly as expected.
- One of their fundamental principles is breaking silos. They aim to bring together all the stakeholders in application development (Dev team + Ops team), and they believe in the model of ‘shared responsibility’ and ‘shared ownership.’
- Their common goal is to simplify operations in distributed systems.
DevOps focuses on building the core of the product, which is the reason the product is developed in the first place. It works on customer requirements, meaning the different needs and specifications, and takes an agile approach to software development with a continuous process of build, test, and deployment.
SRE teams narrow their focus to whether the core is really implemented, that is, whether the product meets the customer's expectations. They monitor application performance metrics and give the DevOps team feedback on the changes that need to be implemented.
The DevOps team is more experimental in nature. They write code and constantly test it for bugs, or add new features. They develop the core design of the product, give shape to it, and push it to production.
The SRE team, on the other hand, is investigative in nature. They constantly monitor the relevant metrics and give feedback on possible lines of improvement. They are more concerned with the experience of the end user. They analyze every problem to see how often it occurs, and they find ways of automating repetitive operations.
Their goal is to find new ways of dealing with bugs that keep recurring.
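The frequency analysis described above can be sketched as a simple tally of incident causes. This is only an illustrative sketch: the function name `recurring_causes`, the cause labels, and the threshold of three occurrences are all hypothetical choices, not part of any SRE tooling.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def recurring_causes(incident_causes: Iterable[str],
                     min_occurrences: int = 3) -> List[Tuple[str, int]]:
    """Return causes seen at least `min_occurrences` times, most frequent first.

    Causes that keep coming back are the natural candidates for automation.
    """
    counts = Counter(incident_causes)
    return [(cause, n) for cause, n in counts.most_common()
            if n >= min_occurrences]


if __name__ == "__main__":
    # Hypothetical incident log, one cause label per incident.
    log = [
        "disk_full", "cert_expired", "disk_full", "bad_deploy",
        "disk_full", "cert_expired", "cert_expired", "cert_expired",
    ]
    for cause, n in recurring_causes(log):
        print(f"{cause}: {n} incidents")
```

One-off causes such as `bad_deploy` in this sample drop out, leaving a short list of repeat offenders to investigate and automate first.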
Be it DevOps or SRE, both exist to automate manual processes. This is not just about saving time on tasks: anything done manually is prone to errors.
In DevOps, automation means automating deployment (of tasks and new features). In SRE, automation means automating redundancy: the team converts manual tasks into programmatic ones to keep the tech stack up and running.
Every set of tasks has a goal associated with it. The goal of DevOps is to develop a template that drives activities towards collaboration, while the SRE team focuses on formulating prescriptive measures to enhance the reliability of every deployed application.
Both SRE and DevOps share the goal of breaking siloed workflows, bringing automation to recurring manual tasks, and incorporating constant monitoring. Some prime areas where they face challenges are:
- CI/CD pipeline management: Implementing different automated tests at different stages of the pipeline to keep the code free of errors.
- Monitoring and Alerting: The core function is to help us increase the reliability of our applications. Gaining 360-degree visibility into the system helps in diagnosing the health of services and gaining vital analytics.
- Incident Management: Understanding the cause of service failures, gauging the severity of a bug, and getting alerted immediately when requests start failing all require prompt communication.
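Getting alerted when requests start failing can be sketched as a failure-rate check over a sliding window of recent requests. This is a simplified illustration under stated assumptions: the class name `FailureRateAlert`, the window size, and the threshold are hypothetical, and a real setup would normally rely on a monitoring platform rather than hand-written code.

```python
from collections import deque


class FailureRateAlert:
    """Signal an alert when the failure rate over recent requests is too high."""

    def __init__(self, window_size: int = 100, threshold: float = 0.05) -> None:
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        # A bounded deque drops the oldest outcome automatically.
        self._outcomes = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, succeeded: bool) -> bool:
        """Record one request outcome; return True if an alert should fire."""
        self._outcomes.append(succeeded)
        # Wait for a full window so one early failure does not trigger an alert.
        if len(self._outcomes) < self._outcomes.maxlen:
            return False
        failure_rate = self._outcomes.count(False) / len(self._outcomes)
        return failure_rate > self.threshold


if __name__ == "__main__":
    alert = FailureRateAlert(window_size=10, threshold=0.2)
    # Hypothetical request outcomes: eight successes, then failures.
    for ok in [True] * 8 + [False] * 3:
        if alert.record(ok):
            print("ALERT: failure rate above threshold")
```

Using a window of recent requests rather than an all-time average means the alert reacts to what is happening now, and it clears on its own once healthy requests push the failures out of the window.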
Platforms that offer managed microservices and managed Kubernetes help us maintain the lifecycle of applications by addressing the above-mentioned challenges. The managed services market is projected to reach $274 billion by 2026, which shows their potential in simplifying the manageability of applications.
With global tech giants like Google, Amazon, and Netflix pioneering the adoption of DevOps and SRE, their ROI has grown by leaps and bounds. Furthermore, their robust, always-available infrastructure suggests that these methodologies are here for the long run.