Are you daunted by horrifying terms like Docker, image registry, container orchestration, Docker Swarm, Kubernetes, kubectl? Then this article is for you: I'll try to simplify and explore containers.
The concept of containerization in the shipping industry dates back to before 1956. In those days, all shipping was done using loose boxes and barrels, and loading and unloading those units onto ships, boats and trucks was back-breaking, time-consuming work. Then Malcolm McLean, a United States businessman and entrepreneur, revolutionized international trade and the transport industry by developing the modern shipping container.
The shipping container provided a standardized unit of goods transportation that could be moved by truck, train and giant ships. Tools were developed to handle containers, and everyone knew how to use and store them, resulting in a tremendous improvement in the shipping industry's efficiency and margins.
This is effectively what containers are doing for the software industry today. So let's look at what containers are in IT terms.
A container is a sandboxed environment for an application to execute in. It encompasses everything the application needs: code libraries, runtime, dependencies, files, images, compiled application code ready to run, and configuration.
Containers first came into the picture in 1979 and have come a long way since. In 2013, Docker came onto the scene and mobilized widespread organizational adoption of containers. Today there are several container providers, such as Docker, Linux containers (LXC and LXD), CoreOS, Hyper-V containers and Windows Server containers.
In the days before virtualization, the operating system installed on a physical server had the entire system's resources available to it, and the application and its dependent libraries were installed on that system. This worked well but lacked optimal resource utilization, which became a key driver for the concept of virtualization.
Then came virtualization, in which a hypervisor is installed on the physical server and has access to all of the server's resources. It allocates those resources in the form of virtual machines, and a guest operating system installed on each virtual machine controls the resources allocated to it. This approach made a larger pool of computing resources available to a larger group of people without compromising security and confidentiality, and since resources are utilized more efficiently, multi-tenancy brought cost-effectiveness.
Containers are one step ahead of virtual machines. At their core, containers are very similar to virtual machines, except that only the operating system is virtualized rather than the entire system.
Thus, with container technology, a single operating system on a host machine can run many different applications. This makes containers much more efficient, fast and lightweight in comparison to virtual machines. In simple terms, think of a container as a running virtual machine without the overhead of spinning up an entire operating system; instead it shares the host operating system's kernel. This makes containers exceptionally lightweight, often only a few megabytes, since they do not need to reproduce an entire operating system, and hence they boot up in just a few seconds.
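To make the lightweight claim concrete, here is a minimal sketch using the Docker SDK for Python (the `docker` package); the image and command are illustrative, and a local Docker daemon is assumed:

```python
# pip install docker  (requires a local Docker daemon)
import docker

client = docker.from_env()  # connect to the local Docker daemon

# Run a throwaway container from a tiny image; it shares the host
# kernel, so it starts in seconds instead of booting a full guest OS.
output = client.containers.run(
    "alpine:latest",                       # illustrative image, a few MB
    ["echo", "hello from a container"],
    remove=True,                           # clean up after it exits
)
print(output.decode())
```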
Before going further into the advantages of containers, let's first look at some of the problems that contributed to their rise.
1. Portability: one container for all environments - Think of a scenario where an application works well in the development environment but breaks in staging or production. This happens due to drift in runtime, dependencies or configuration across environments; the more environments there are, the greater the chances of drift. A container (for example, a Docker container) solves this problem by packaging the runtime, code, configuration and dependencies as a single consistent deployment unit. This artifact, created as part of the build process, can then be delivered to any machine or environment with confidence that it will run correctly, because it brings along everything it needs.
2. Evolution of microservices - Microservice architecture is gaining a lot of popularity these days. Giants like Facebook and Amazon have adopted microservices at scale. There are three primary advantages to adopting them:
a. Certain applications are easier to build and maintain when they are broken into small pieces or services rather than maintained as a monolithic application.
b. If any module or service needs an upgrade, it can be upgraded without impacting the whole application.
c. If any module or service goes down, the whole application remains largely unaffected because it is loosely coupled.
When microservices are deployed on virtual machines, a lot of resources like RAM, processor and disk space are wasted, because each service gets its own instance yet never fully utilizes it. So it's clearly not an ideal way to deploy a microservice architecture; think of a large application where 50+ microservices need to be deployed, where one instance per service would be infeasible due to the resource wastage. Containers, on the contrary, allow much higher density of applications on the same hardware, resulting in better resource utilization and cost optimization, as the sketch below illustrates.
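As a rough illustration, here is a minimal sketch using the Docker SDK for Python; the service names, images and resource limits are all made up. It packs several small services onto one host by giving each container a slice of memory and CPU instead of a whole virtual machine:

```python
import docker

client = docker.from_env()

# Hypothetical microservices; in a VM-per-service model each of these
# would need its own mostly idle instance.
services = ["users", "orders", "payments"]

for name in services:
    client.containers.run(
        "myorg/" + name + ":latest",  # hypothetical image names
        name=name,
        detach=True,                  # run in the background
        mem_limit="256m",             # cap memory per service
        nano_cpus=500_000_000,        # 0.5 CPU per service
    )
```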
Container Orchestration
Containers, due to their lightweight and portable nature, have made building and scaling cloud-native applications easier. It is easy to deploy and manage a few of them, but complexity grows exponentially as the environment grows. Think of a large application with hundreds of containers and services; in such cases automation, or a tool, is required to place the containers appropriately and start and stop them as required, or in other words to manage the container lifecycle. This is where container orchestration tools come into the picture. Some examples are AWS Elastic Container Service, AWS Elastic Kubernetes Service, Apache Mesos, Docker Swarm and Azure Container Service.
A container orchestrator is a tool that can manage and automate tasks like:
- Host provisioning and container deployment
- Configuration and scheduling
- Allocation of host server resources to containers
- Managing container availability
- Spinning up or removing containers according to workload across the infrastructure
- Routing traffic and load balancing
- Monitoring container health and replacing unhealthy containers
- Managing and securing communication between containers
AWS Services for Containers
AWS has several offerings to help accelerate container deployment. Let's look at them one by one.
Amazon Elastic Container Registry (ECR)
Elastic Container Registry is a fully managed AWS service that eases the management, storage and deployment of Docker images. It is the place where container images are stored and from which they are distributed to be instantiated as new containers. Think of AMIs for EC2 instances: similarly, a container image is a base file that describes everything that should be created when it executes.
ECR is natively integrated with Elastic Container Service and Elastic Kubernetes Service, which makes deployment workflows simple, and with AWS IAM, which controls access to each repository. It uses S3 in the backend to store images durably.
It also eliminates the need to manage, scale and ensure the availability of the infrastructure hosting a private repository, and instead offers a highly available and scalable platform to reliably host container images.
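As a minimal sketch (the repository name is illustrative), the boto3 calls below create an ECR repository and fetch the authorization token that `docker login` needs before images can be pushed to it:

```python
import base64
import boto3

ecr = boto3.client("ecr")

# Create a repository to hold images (name is illustrative).
repo = ecr.create_repository(repositoryName="my-app")
print(repo["repository"]["repositoryUri"])

# Fetch an authorization token; docker login uses the decoded
# user:password pair against the returned proxy endpoint.
auth = ecr.get_authorization_token()["authorizationData"][0]
user, password = base64.b64decode(auth["authorizationToken"]).decode().split(":")
print(auth["proxyEndpoint"])
```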
Amazon Elastic Container Service (ECS)
As containers gained popularity and enterprises started using them at scale, AWS introduced its container management service for Docker containers, called Elastic Container Service (ECS).
This AWS-managed Docker container orchestration service eliminates the need to operate your own orchestration software and offers managed clusters of AWS EC2 instances on which to run applications in Docker containers.
ECS is widely used within Amazon itself in the backend to power services such as AWS Batch, Amazon Polly, Amazon Lex and Amazon SageMaker, which means it is tried and tested for availability, security, reliability and scale.
It is a regional AWS service that integrates natively with several AWS services, like AWS IAM, AWS Secrets Manager, AWS CloudFormation, Elastic Load Balancing, Amazon Route 53 and AWS App Mesh, bringing rich observability, security and traffic controls to applications.
Since launch, ECS has grown so fast that currently, every hour, five times more containers are launched on it than EC2 instances.
AWS ECS offers two container hosting models, based on whether the user wants to manage the infrastructure or not:
- EC2 launch type
- Fargate: serverless offering
EC2 launch type - Infrastructure-first approach
In this model, ECS lets the user define and manage a cluster, or group, of EC2 instances over which ECS deploys and manages containers. These container instances are the same as any other EC2 instances and can be on-demand or spot instances. The only difference is that they run an ECS agent, which controls their lifecycle and takes care of communication between the ECS service and the instance, reporting the status of running containers and managing the launch of new ones. This agent can be installed manually on the instances or come pre-baked in an AMI.
The instances on which the containers run appear in the EC2 instance list, and the user can SSH into them. Here the user is responsible for monitoring, patching, scaling and security.
In this approach, ECS runs tasks based on the available infrastructure, as explained in the steps below:
- When a task needs to be deployed via the ECS service, the backend fetches a list of all available instances. Instances that are already running a task and have consumed their resources, or that are of an inappropriate size or configuration, are filtered out.
- Instances are further filtered based on placement constraints and strategy, such as spreading containers across Availability Zones.
- Finally, tasks are deployed on the selected instances.
Likewise, if a placement constraint requires the r5 instance family and no r5 instances are available, the deployment fails.
Needless to say, infrastructure sits at the center of this deployment model: the application is deployed based on infrastructure availability and suitability.
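To make this concrete, here is a hedged boto3 sketch of running a task on an EC2-backed cluster with a spread strategy and an r5 instance-family constraint; the cluster and task definition names are hypothetical:

```python
import boto3

ecs = boto3.client("ecs")

# Run one task on an EC2-backed cluster; names are hypothetical.
resp = ecs.run_task(
    cluster="my-ec2-cluster",
    taskDefinition="my-app:1",
    launchType="EC2",
    count=1,
    # Spread tasks across Availability Zones...
    placementStrategy=[
        {"type": "spread", "field": "attribute:ecs.availability-zone"},
    ],
    # ...and only place them on r5-family instances; if none are
    # available, the task fails to place.
    placementConstraints=[
        {"type": "memberOf", "expression": "attribute:ecs.instance-type =~ r5.*"},
    ],
)
print(resp["tasks"][0]["lastStatus"] if resp["tasks"] else resp["failures"])
```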
Let's now look at the deployment steps in the AWS console for the EC2 launch type. First, an ECS cluster needs to be provisioned to run tasks on. The following inputs are taken from the user:
- Cluster name
- Instance provisioning model: on-demand or spot instances
- EC2 instance type: for example m5.large, r5.xlarge or t3.large
- Instance count: how many instances will be part of the cluster
- EC2 AMI ID: the user can choose from a list of available AMIs that already include the ECS agent
- EBS volume size: in the EC2 launch type the user can choose to have persistent storage
- Key pair: the EC2 launch type allows SSH into the instances, which gives granular control
- VPC, subnets and CIDR range
- Security group inbound rules
- IAM role for container instances: the ECS agent needs to communicate with the ECS service, and the "ecsInstanceRole" is required to enable these calls
Once the above attributes are defined, ECS creates a CloudFormation template to provision the resources. The instances are launched in an Auto Scaling group to ensure that the defined desired count of healthy instances is always available.
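For reference, a minimal boto3 sketch of creating such a cluster (the cluster name is illustrative); instances then join it when the ECS agent on each instance is pointed at the cluster:

```python
import boto3

ecs = boto3.client("ecs")

# Create an empty cluster; EC2 instances join it when the ECS agent
# on each instance is configured with ECS_CLUSTER=demo-cluster
# (e.g. via /etc/ecs/ecs.config in the instance user data).
cluster = ecs.create_cluster(clusterName="demo-cluster")
print(cluster["cluster"]["clusterArn"])
```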
Once the cluster is ready, we need to define a "task definition", which is the blueprint of the application. The following attributes are defined:
- First, we have to decide whether we want our task to be compatible with the EC2 or the Fargate launch type.
- Name of the task definition
- Task role, if the container needs to interact with any other AWS service, for example if the container app needs to put data in an S3 bucket
- Network mode: there are several modes, which call for a separate detailed discussion, but in general only the default mode is supported on Windows, while on Linux, bridge mode allows one-to-one mapping of instance and container ports
- Execution role for the task: "ecsTaskExecutionRole"
- Task size: optional for the EC2 launch type
- Container definition: container details like name, container image repository path, memory limit, port mapping, health check, container environment details, network settings, logging, etc.
With the above attributes the task definition is created; next, a task is run from the task definition. A task is an instance of a task definition. Here the task name, the EC2 instance cluster, the number of tasks to run, and task placement across instances and AZs are defined, and the task is ready to run.
In a nutshell, task definitions and tasks are the mechanism for telling the ECS service which specific container image to run, with how much CPU and memory, with which network settings, and where it should be placed. Once this information is given and the containers are spun up, the ECS service manages them by automatically scaling them and placing them effectively across the available compute resources.
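Putting the pieces together, here is a hedged boto3 sketch of registering a task definition and running a task from it on the EC2-backed cluster; the image URI, names and sizes are all illustrative:

```python
import boto3

ecs = boto3.client("ecs")

# Register a task definition: the blueprint for the container(s).
# Image URI, names and sizes are illustrative.
ecs.register_task_definition(
    family="my-app",
    requiresCompatibilities=["EC2"],
    containerDefinitions=[
        {
            "name": "web",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
            "memory": 256,  # hard memory limit in MiB
            "portMappings": [{"containerPort": 80, "hostPort": 80}],
            "essential": True,
        }
    ],
)

# Run one task (an instance of the task definition) on the cluster.
ecs.run_task(
    cluster="demo-cluster",
    taskDefinition="my-app",
    launchType="EC2",
    count=1,
)
```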
So, to summarize: in the ECS EC2 model the hosting approach is infrastructure-centric. AWS manages the control plane, but the user is still responsible for the data plane, or in other words for managing the EC2 instances and for developing and managing the containers. This is useful when the user wants to define, control and manage the infrastructure; however, AWS offers another model that eliminates the infrastructure management overhead. Let's look at it next.
Fargate - The game changer - Application-first approach
ECS made it swift to run Docker containers in the cloud, but as we saw, we still have to manage the underlying EC2 instances. To eliminate this overhead, AWS introduced a serverless compute engine called Fargate in November 2017. With Fargate there are no clusters of instances to provision, patch and manage. We only define the containers and the compute resources they require, and AWS takes care of the compute as and when required. All we do is build the container image, define the CPU and memory requirements, define IAM policies and networking, and launch.
It is natively integrated with AWS ecosystem services such as Amazon VPC, AWS IAM, CloudWatch and the load balancers. Fargate is now the default and most popular choice for running Docker containers on AWS. This feature switched the approach to application-first: the requirements are now owned by application developers rather than dictated by infrastructure availability, and the infrastructure responds to the application's requirements.
With plain ECS the control plane is always managed by the ECS service, but with Fargate the data plane is also managed by AWS.
Let's now look at the deployment steps in the AWS console for the Fargate launch type to understand the difference.
There are only four objects to define:
1. Container definition: choose the container image
2. Task definition: define the task definition name
   - The only network mode supported for Fargate is "awsvpc"
   - Execution role for the task: "ecsTaskExecutionRole"
   - Task size: task memory and CPU; billing happens based on these attributes
3. Service: service name, desired task count, security group and application load balancer
4. Cluster: cluster name, VPC and subnets
Once the above attributes are defined, the application is ready to run.
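Here is a minimal boto3 sketch of the same Fargate flow; the image URI, role ARN, subnet and security group IDs are placeholders. Note the awsvpc network mode and the task-level CPU and memory, which are what Fargate bills on:

```python
import boto3

ecs = boto3.client("ecs")

# Fargate task definitions must use awsvpc networking and declare
# task-level CPU/memory; placeholders throughout.
ecs.register_task_definition(
    family="my-fargate-app",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",      # 0.25 vCPU
    memory="512",   # MiB
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    containerDefinitions=[
        {
            "name": "web",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
            "portMappings": [{"containerPort": 80}],
            "essential": True,
        }
    ],
)

# No instances to manage: just point the task at subnets and a
# security group (placeholder IDs) and AWS provisions the compute.
ecs.run_task(
    cluster="demo-cluster",
    taskDefinition="my-fargate-app",
    launchType="FARGATE",
    count=1,
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "securityGroups": ["sg-0123456789abcdef0"],
            "assignPublicIp": "ENABLED",
        }
    },
)
```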
Let's look at the key differences between the EC2 and Fargate launch types and their suitability in different scenarios.
Pricing
ECS with the EC2 launch type is charged according to the EC2 resources used.
ECS with Fargate is billed according to the vCPU and memory requirements defined in the task.
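As a back-of-the-envelope illustration of the Fargate model (the per-unit rates below are placeholders, not current AWS prices):

```python
# Placeholder rates, NOT current AWS prices; check the AWS pricing page.
VCPU_PER_HOUR = 0.04   # $/vCPU-hour (hypothetical)
GB_PER_HOUR = 0.004    # $/GB-hour (hypothetical)

def fargate_task_cost(vcpu: float, memory_gb: float, hours: float) -> float:
    """Cost of one Fargate task, billed on task-level CPU and memory."""
    return hours * (vcpu * VCPU_PER_HOUR + memory_gb * GB_PER_HOUR)

# e.g. a 0.25 vCPU / 0.5 GB task running for 24 hours
print(round(fargate_task_cost(0.25, 0.5, 24), 4))
```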
Amazon Elastic Kubernetes Service (EKS)
Amazon Elastic Kubernetes Service is a managed AWS service that lets you run Kubernetes on AWS without having to manage the underlying Kubernetes control plane.
Kubernetes, or "K8s", was developed by Google based on its experience of running containers; it was released as open source in 2014 and later donated to the Cloud Native Computing Foundation.
It is now an open-source container orchestration platform which, like any other container management platform, handles scheduling, scaling, distributing load across containers, replacing failed containers, and so on.
EKS is natively integrated with AWS services like IAM, VPC and CloudWatch, and provides a secure and highly scalable way to run applications on a Kubernetes cluster.
EKS is highly available: by design it deploys the Kubernetes control plane servers across AZs, monitors them, and automatically replaces unhealthy servers. It uses a blue-green approach to provide zero downtime during patching.
Offers a serverless option: Fargate is the serverless compute option for deploying containers on EKS.
Built with the community: EKS runs upstream Kubernetes, meaning the full-scale open-source version runs on AWS, which allows quick migration from on-premises to the cloud without any refactoring.
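For completeness, a hedged boto3 sketch of creating an EKS control plane; the role ARN and subnet IDs are placeholders, and worker nodes or Fargate profiles are added separately:

```python
import boto3

eks = boto3.client("eks")

# Create the managed Kubernetes control plane; AWS runs and patches it.
# Role ARN and subnet IDs are placeholders.
eks.create_cluster(
    name="demo-eks",
    roleArn="arn:aws:iam::123456789012:role/eksClusterRole",
    resourcesVpcConfig={
        "subnetIds": ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"],
    },
)

# Cluster creation is asynchronous; check until the status is ACTIVE.
status = eks.describe_cluster(name="demo-eks")["cluster"]["status"]
print(status)
```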
So far, we have touched upon the concept of containerization and the various services AWS offers to deploy containers at scale. Each of these components deserves a separate deep-dive article, which I will try to cover in subsequent posts.