DEV Community

Luis Serra


Why ephemeral and immutable infrastructure are so important in Cloud Native environments

In the cloud, we know exactly what we want a server to be, and if we want to change that we simply terminate it and launch a new server with a new AMI. This is enabled by a change in how you think about managing your resources in the cloud or a virtualised environment. Also it allows us to fail as early in the process as possible and by doing so mitigate the inherent risk in making changes.
Greg Orzell in “Building with Legos” a Netflix Tech Blog article

Introduction

For years, infrastructure management was based on various processes and routines that required manual intervention by engineers or technicians. While these practices were effective, the development landscape has undergone significant changes in recent years. Agile methodologies, shorter development cycles, an increased focus on time to market, distributed systems, and scaled environments have made it challenging for traditional infrastructure management to keep pace. Cloud transformation and the cloud-native trend were the final push that made the need for change evident.

A new, more agile approach to infrastructure management was needed to respond to these challenges. Instead of treating infrastructure as unique, valuable “pets” that required significant time, effort, and resources to maintain, a more standardised, commoditised approach was needed. By viewing infrastructure as replaceable “cattle,” organisations can standardize their systems, reduce the risks associated with manual management, and ensure their infrastructure is equipped to meet the demands of modern development.

The pets vs cattle analogy was popularised by Randy Bias to explain the difference between traditional and new approaches to server management.

In the old way of doing things, we treat our servers like pets, for example Bob the mail server. If Bob goes down, it’s all hands on deck. The CEO can’t get his email and it’s the end of the world. In the new way, servers are numbered, like cattle in a herd. For example, www001 to www100. When one server goes down, it’s taken out back, shot, and replaced on the line.

In this article, we delve into the challenges of utilising mutable and long-lived infrastructure and its effect on cloud-native transformations. We also explore the benefits of adopting an immutable and ephemeral infrastructure approach.

To provide practical insights, we will illustrate each topic with a real-world scenario from our experiences at xgeeks, demonstrating how utilising immutable and ephemeral infrastructure has aided one of our clients in achieving a cloud-native transformation.

Let’s get deep into the constraints of long-lived and mutable infrastructure

To understand the real benefits of immutable and ephemeral infrastructure, we need to get deep into the main challenges and constraints of a long-lived and mutable infrastructure in an agile development world:

  • Increased operational complexity and, consequently, reduced reliability. The growth of distributed service architectures and dynamic scaling significantly increases maintenance and monitoring requirements, mainly because the runtime environment keeps changing. Maintenance and configuration processes applied machine by machine are not compatible with flexible, continuously changing environments.
  • Slower deployments, a direct consequence of the previous point. As multiple configurations and processes make the infrastructure unpredictable, information about it becomes less accurate and consistent. Time is then wasted fixing configuration issues and debugging the runtime environment because of possible configuration drift.
  • Painful monitoring. Imagine searching for errors on a system that has been running for a long time, with several processes running and several configuration changes applied over time.
  • Finally, fire drills or out-of-control events: interventions, updates or patches that you don’t fully control, such as a cloud provider reboot or a zone outage. These increase the cost of on-call teams, who are paged to get your infrastructure up and running again.

At the outset, our client’s scenario presented several challenges in implementing agile development processes. Despite initial efforts, the corporation struggled to achieve the desired results due to infrastructure issues.

Previously, the company delivered its product every three months, which left time to correct any configuration drift manually. However, with an increased push for more frequent delivery, the simple task of managing the four virtual machines that hosted the backend and frontend became a significant challenge. Configuration drift caused by independently configured instances, and later resource starvation caused by missing log rotation that led to database problems, were just a few of the difficulties faced.
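To make configuration drift concrete, here is a minimal sketch of how it could be detected across a small fleet by comparing configuration fingerprints against a baseline. The host names, configuration keys and helper functions are hypothetical and only illustrate the idea; this is not the tooling used in the client project.

```python
import hashlib
import json


def fingerprint(config: dict) -> str:
    """Return a stable hash of a configuration mapping (key order is ignored)."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(baseline: dict, fleet: dict) -> list:
    """Return the sorted names of hosts whose config differs from the baseline."""
    expected = fingerprint(baseline)
    return sorted(name for name, cfg in fleet.items() if fingerprint(cfg) != expected)


# Hypothetical baseline and four independently managed virtual machines.
baseline = {"java": "17", "log_rotation": True}
fleet = {
    "vm-1": {"java": "17", "log_rotation": True},
    "vm-2": {"log_rotation": True, "java": "17"},   # same values, different key order
    "vm-3": {"java": "17", "log_rotation": False},  # log rotation was switched off
    "vm-4": {"java": "11", "log_rotation": True},   # runtime was never upgraded
}

drifted = find_drift(baseline, fleet)
print(drifted)  # ['vm-3', 'vm-4']
```

With long-lived machines a check like this has to run continuously. With immutable machines the fingerprint is fixed when the image is built, so drift has far fewer ways to appear.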

So, what exactly is immutable and ephemeral infrastructure?

To understand immutable infrastructure, first, we need to understand what immutable means. “Immutable” refers to something that cannot be changed, altered, or modified.

In the context of software development and infrastructure, “immutable” is used to describe systems, components, or resources that remain unchanged during their entire lifecycle. This means that once they are deployed, they cannot be updated or modified in any way. Instead, a new version of the system, component or resource must be created if changes are needed.
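The same idea can be sketched in code, as an analogy rather than a real provisioning API. The `ServerSpec` class and its fields are hypothetical: a deployed specification cannot be modified, and any change produces a new version instead.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical description of a server; frozen means it cannot be mutated."""
    image: str
    instance_type: str
    version: int = 1


def with_change(spec: ServerSpec, **changes) -> ServerSpec:
    """Build a new spec with the requested changes and a bumped version number."""
    return replace(spec, version=spec.version + 1, **changes)


v1 = ServerSpec(image="app-image-1", instance_type="small")
v2 = with_change(v1, image="app-image-2")

print(v1)  # the original is untouched
print(v2)  # the change lives in a new version

try:
    v1.image = "hotfix"  # in-place modification is rejected
except FrozenInstanceError:
    print("in-place change rejected")
```

Keeping `v1` around unchanged is also what makes a rollback simple: the previous version still exists exactly as it was deployed.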

Now it is time to talk about ephemeral infrastructure, but first let’s look at what the term means. “Ephemeral” refers to something that is short-lived or temporary and does not persist for a long time.

In the context of infrastructure, the term “ephemeral infrastructure” refers to computing resources or components that are created dynamically and destroyed as needed, rather than being persistent and long-lived. This allows for greater flexibility, scalability, and ease of management in cloud-based or other dynamic computing environments.

As we have seen, the two concepts differ in their design principles. While immutable infrastructure prioritizes stability through unchanging components, ephemeral infrastructure values flexibility through its ability to be easily replaced. By combining these two, an infrastructure is created that can quickly scale, deploy, and recover in response to changes in demand or conditions.
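A minimal sketch of how the two properties combine during a deployment: instances are never patched, they are replaced one at a time by new ones built from a new image. The `Instance` type, the `launch` helper and the image names are hypothetical stand-ins for whatever a real platform provides.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical running instance; frozen, so it can only be replaced."""
    instance_id: str
    image: str


_ids = itertools.count(1)


def launch(image: str) -> Instance:
    """Simulate launching a fresh instance from an image."""
    return Instance(instance_id=f"i-{next(_ids):04d}", image=image)


def rolling_replace(fleet: list, new_image: str) -> list:
    """Replace every instance, launching the new one before removing the old."""
    current = list(fleet)
    for old in fleet:
        current.append(launch(new_image))  # capacity briefly goes up by one
        current.remove(old)                # the old instance is terminated, not patched
    return current


fleet_v1 = [launch("app-image-1") for _ in range(3)]
fleet_v2 = rolling_replace(fleet_v1, "app-image-2")
print([i.image for i in fleet_v2])  # ['app-image-2', 'app-image-2', 'app-image-2']
```

A real rollout would also wait for health checks before removing each old instance; that step is left out to keep the sketch short.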

Coming back to our scenario, it became evident that those virtual machines needed to be transformed into immutable and ephemeral components. The persistence of these machines was hindering the client’s deployment process, so we needed to find a way to make these instances reproducible and externalize any non-reproducible elements.

What are the main advantages of using this type of infrastructure?

Now, let’s delve into the advantages of this method and why it helps organisations with their cloud-native transformation.

  • First, simplified operations. Automated deployment techniques replace outdated resources with updated versions, ensuring your systems remain in their original “known-good” state.
  • Second, continuous and faster deployment. You stay aware of what is running and how it behaves. Updating becomes a regular, ongoing process with fewer errors in production, and all updates can be tracked through source control and CI/CD processes.
  • Next, fewer errors and increased reliability. New instances can be brought up almost instantly and their lifecycle is much shorter, which reduces the risk of data loss or corruption, the risk of configuration drift, the vulnerability surface, and the effort required to meet service level agreements. This helps organizations maintain a high level of reliability and stability, even as their workloads change and evolve over time.
  • Another advantage is preparation for fire drills, or cloud-ready components. Once you know the desired state of each machine, operations like reboot and recovery become routine, and when cloud reboots happen you can be much more confident that your underlying instances will be handled gracefully, with minimal, if any, application downtime.
  • Improved scalability follows from the previous advantage: it becomes easy to scale up or down as needed, without having to worry about the underlying hardware. This allows organisations to respond quickly to changing demands and to take advantage of new market opportunities.
  • And finally, a potential reduction of costs. Immutable infrastructure is ready to be dynamic, which is very important when provisioning infrastructure with a cloud provider. It also reduces expenses related to the upkeep and upgrading of conventional, persistent servers.
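Knowing the desired state of each machine is what makes recovery mechanical. The following is a minimal sketch of a reconciliation step, assuming a hypothetical `Node` type and a simple health flag: it compares the desired number of nodes with what is observed and decides what to keep, what to terminate and how many replacements to launch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical node as seen by a monitoring system."""
    name: str
    healthy: bool


def reconcile(desired: int, observed: list) -> tuple:
    """Return (nodes to keep, nodes to terminate, number of nodes to launch)."""
    keep = [n for n in observed if n.healthy][:desired]
    terminate = [n for n in observed if n not in keep]
    launch_count = desired - len(keep)
    return keep, terminate, launch_count


observed = [Node("www001", True), Node("www002", False), Node("www003", True)]
keep, terminate, launch_count = reconcile(3, observed)

print([n.name for n in keep])       # ['www001', 'www003']
print([n.name for n in terminate])  # ['www002']
print(launch_count)                 # 1
```

The same function covers scaling: lowering `desired` moves surplus healthy nodes into the terminate list, and raising it increases the launch count.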

Seems good so far, right? But now you may be asking how we implemented it, so let’s get back to our scenario. To begin with, we moved the database instance to a Platform-as-a-Service (PaaS) solution to reduce the risk of downtime, which simplified operations and increased reliability. Then, we followed three steps to make the machines immutable resources: externalization of configurations, packaging, and provisioning. First, we transferred all configuration management responsibilities to HashiCorp tools, Consul and Vault, to achieve service discovery, configuration management, health checks, and secure storage of sensitive data. Second, we used Packer, also from HashiCorp, to create pre-configured virtual machine templates that can be deployed quickly, saving time and reducing manual configuration errors. Finally, we established a deployment process for these machines using Terraform from HashiCorp, a leading Infrastructure as Code tool for provisioning.
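The first step, externalization of configurations, can be illustrated from the application’s point of view. This is a minimal sketch in which environment variables stand in for an external store such as Consul or Vault; the key names and the `load_config` helper are hypothetical and are not the client’s actual setup. The point is that the packaged image carries no environment-specific values and fails fast when one is missing.

```python
import os

# Hypothetical settings the application expects to receive from outside the image.
REQUIRED_KEYS = ("DB_URL", "LOG_LEVEL")


def load_config(env=None) -> dict:
    """Read required settings from the environment, failing fast if any is missing."""
    source = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if key not in source]
    if missing:
        raise RuntimeError(f"missing configuration: {', '.join(missing)}")
    return {key: source[key] for key in REQUIRED_KEYS}


# The same image can run anywhere; only the injected values differ.
staging = load_config({"DB_URL": "postgres://staging-db/app", "LOG_LEVEL": "debug"})
production = load_config({"DB_URL": "postgres://prod-db/app", "LOG_LEVEL": "info"})
print(staging["LOG_LEVEL"], production["LOG_LEVEL"])  # debug info
```

Failing at startup when a value is missing matches the idea of failing as early in the process as possible, rather than discovering a misconfigured machine in production.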

After all these steps, the cloud was only one command away because our infrastructure was now reproducible. The cloud migration happened some months later, along with containerisation and much more.

Immutable and ephemeral infrastructure can be found in all sizes and forms

So far we have repeatedly mentioned the terms infrastructure, machines and servers, but what can be turned into immutable and ephemeral infrastructure? Nearly everything can be, but let’s delve deeper.

Virtualization was the catalyst for the growth of immutable and ephemeral infrastructure. It was easy to create new servers, firewalls, etc. on a hypervisor, and if something went wrong, a new machine could be brought online with just a few clicks.

However, virtual machines became cumbersome due to their heavy weight and numerous layers of management, including the kernel, operating system, packages and dependencies, applications, and more. To address these issues, newer concepts such as containerization emerged, resulting in smaller, lighter, and simpler components for our infrastructure.

With the advent of tools such as Kubernetes, Apache Mesos, Nomad, OpenShift, and others, the concept of immutable and ephemeral infrastructure gained a new perspective. Not only can our servers be transformed into immutable and ephemeral components, but our services and applications can also be made easily replaceable.

Finally, cloud providers delivered the finishing touch to the world of immutable infrastructure. With the ability to provision infrastructure through simple API requests, nearly everything can be made immutable. Resources such as servers, firewalls, load balancers, applications, functions and more can now be set up quickly, efficiently, and most importantly, automatically, allowing us to keep pace with our company’s evolving requirements and demands.

To conclude our scenario, our client now runs immutable infrastructure resources of all sizes and forms. After the cloud transformation, business increased, and with containerization already in place, a Kubernetes implementation was just around the corner. At the moment, we have pods running our client’s applications, virtual machines for specific workloads and even serverless functions to automate some processes. The key behind all these changes is an immutable and ephemeral infrastructure, which gave our client the opportunity to follow the market with flexibility, speed, stability and reduced costs.
