Article by Jay Allen
One of the great things about AWS is the vast array of features available to software developers. Sadly, one of the most confusing things about AWS is...the vast array of features available to developers!
AWS provides multiple methods for deploying applications into the cloud. Two of these methods - AWS Lambda and Docker - have grown rapidly in popularity over the past several years. In this article, we compare the benefits of each and discuss when you might want to choose one over the other.
AWS Lambda is a "serverless" service that enables running code in the cloud. With Lambda, application developers can package code written in a variety of programming languages - including Java, Go, C#, Python, PowerShell, Node.js, and Ruby - into a callable function that complies with their language's Lambda interface. They can then upload these Lambda functions to their AWS accounts, where they can be executed from anywhere over the Internet.
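To make "a callable function that complies with the Lambda interface" concrete, here's a minimal sketch of what that looks like in Python. The two-argument `(event, context)` signature is what Lambda's Python runtime expects; the function name `lambda_handler` is just a common convention, and the `name` field in the event is a made-up input for this example.

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda calls: `event` carries the input, `context` runtime metadata."""
    # "name" is a hypothetical input field chosen for this example.
    name = event.get("name", "world")
    return {"message": f"Hello, {name}!"}


if __name__ == "__main__":
    # Local smoke test: call the handler directly with a sample event and no context.
    print(json.dumps(lambda_handler({"name": "Lambda"}, None)))
```

Because the handler is an ordinary function, you can exercise it locally like this before you ever upload it.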
The word "serverless" is a bit of a misnomer here; obviously, AWS didn't find some magical way to run code without compute capacity! "Serverless" here means that the compute power used to run this code doesn't run in your AWS account. Rather, it's executed on one of a series of computing clusters run by AWS itself. This frees development teams up to focus on the business logic of their application rather than on managing compute capacity.
Lambda functions can be called, or invoked, through a variety of methods. One of the most common is by connecting your Lambda functions to AWS API Gateway, which exposes them as REST API calls. Lambda functions can also be used to implement customization and back-end processing logic for a large number of AWS services, including Amazon DynamoDB, Amazon Kinesis, and Amazon Simple Queue Service, among others. Lambda functions may also execute as scheduled tasks, and can even be executed directly from the AWS Command Line Interface (CLI) and the AWS Console.
AWS Lambda can be thought of as the original serverless technology on AWS, though it wasn't the first serverless technology on the block. That honor may go to Google's App Engine, which has been doing its thing since 2008. (Lambda, announced in late 2014 and generally available in 2015, is comparatively a youngin'.) But it helped inspire a boom in the serverless technology industry that continues to this day.
In the bad ol' days of software deployment, developers threw their code onto clusters of production servers that might all have wildly different configurations. A web application might work for one user and then fail for a second user if the server to which the request was routed lacked a certain shared library or configuration setting.
Docker was created specifically to resolve this nightmare scenario. A Docker container is a unit of software that contains everything - code, dependent libraries, and configuration files - that an application requires to run. The container is then deployed to and run on a virtual machine.
The utility of Docker containers lies in their "build once, run anywhere" nature. Once you test a Docker container and verify that it functions as expected, that same container will run the same way on any system with a compatible container runtime to which you deploy it.
Unlike Lambda, Docker isn't inherently "serverless". Docker is best thought of as a packaging and deployment mechanism. There are multiple ways on AWS to run a Docker container, including:
- Elastic Container Service. ECS is AWS's scalable, enterprise-grade solution for running Docker containers. Containers can be deployed either on an Amazon EC2 cluster hosted in your AWS account or using Fargate, AWS's serverless container deployment solution. (For more, check out my recent article on using EC2 clusters vs. Fargate for your Docker deployments.)
- Elastic Beanstalk. AWS's "all-in-one" deployment technology will run your Docker container on a Docker-enabled EC2 instance.
- As an AWS Lambda function. Here's where things get really confusing! Yes, you can implement code in a Docker container and expose it via a Lambda function. I'll talk a little about why you might want to do this below.
Both AWS Lambda and Docker containers are solid choices for deploying microservices architectures on AWS:
- Lambda functions map handily to REST API endpoints. You can use Lambda functions in conjunction with AWS API Gateway to quickly build out a REST API complete with advanced features such as user authentication and API throttling.
- Docker makes it easy to implement REST APIs using your favorite runtime or REST API framework - such as Node.js, Flask, Django, and many others. Because a Docker container is a deployable unit, you can easily partition your REST APIs into logical units and manage them through separate CI/CD pipelines.
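For the Lambda side of that comparison, here's a rough sketch of a function sitting behind an API Gateway REST endpoint. It assumes API Gateway's Lambda proxy integration, which passes fields such as `httpMethod` and `pathParameters` in the event and expects a `statusCode`/`headers`/`body` dictionary back. The `/items/{id}` route and the item data are invented for illustration.

```python
import json


def _response(status, payload):
    """Build a response in the shape the proxy integration expects."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),  # body must be a string, not a dict
    }


def lambda_handler(event, context):
    """Handle a hypothetical GET /items/{id} request forwarded by API Gateway."""
    if event.get("httpMethod") != "GET":
        return _response(405, {"error": "method not allowed"})
    # pathParameters is None when the route has no path variables.
    item_id = (event.get("pathParameters") or {}).get("id")
    if item_id is None:
        return _response(400, {"error": "missing id"})
    return _response(200, {"id": item_id, "name": f"item-{item_id}"})
```

Features like authentication and throttling are configured on API Gateway itself, so they don't show up in the function code.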
But this raises the perennial question: Which one is better?
The first thing to point out is that this isn't necessarily an either/or question. Both Lambda and Docker are powerful technologies that development teams may choose to combine within a single project. For example, you may decide to implement your microservice as a series of Docker containers, and then use Amazon Simple Queue Service in conjunction with AWS Lambda functions to implement a loosely coupled communications framework between services.
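As a sketch of the Lambda half of that combination: when an SQS queue triggers a Lambda function, the function receives a batch of messages under the event's `Records` key, with each message's text in `body`. The `handle_message` function and the `order_id` field below are hypothetical stand-ins for whatever your services actually exchange.

```python
import json


def handle_message(payload):
    """Hypothetical business logic: here it just returns the order id it was given."""
    return payload["order_id"]


def lambda_handler(event, context):
    """Process each SQS message delivered to the function in this batch."""
    processed = []
    for record in event.get("Records", []):
        payload = json.loads(record["body"])  # SQS delivers the message text in "body"
        processed.append(handle_message(payload))
    return {"processed": processed}
```

The containerized service on the other end only needs to put JSON messages on the queue; it never has to know that a Lambda function is what picks them up.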
But let's set that aside for now and focus on a narrower question: Which technology should you choose when implementing a microservices architecture?
As with most things in the world of the Cloud, there's no clear-cut answer here. But let's look at a few factors you should consider when making this decision for your own project.
When it comes to choice of programming languages and frameworks, Docker is the clear winner. AWS Lambda's support for programming languages is limited to the languages for which it defines an integration API. Docker, meanwhile, can host any language or framework that can run on a Dockerized Linux or Windows operating system.
The language and framework issue leads me to another issue: cloud lock-in. AWS Lambda isn't an industry standard - it's AWS's proprietary serverless tech. If you need to move to a new cloud provider (Azure, GCP) for any reason, your code may require significant rework to function on the new provider's equivalent serverless solution.
If you still want to leverage Lambda but are concerned about portability, I'd recommend following AWS's recommendations around Lambda code design. You can easily separate your function's execution logic out from the Lambda execution environment. This reduces your dependency on Lambda and makes your code more portable.
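Here's one way that separation might look in Python. This is only a sketch of the general idea, not AWS's reference layout, and `calculate_total` and its inputs are made up for the example. The business logic lives in a plain function that knows nothing about Lambda, and the handler is a thin adapter around it.

```python
def calculate_total(prices, tax_rate):
    """Core logic: no Lambda or AWS types appear here, so it runs anywhere."""
    subtotal = sum(prices)
    return round(subtotal * (1 + tax_rate), 2)


def lambda_handler(event, context):
    """Thin adapter: unpack the Lambda event, call the core logic, shape the result."""
    total = calculate_total(event["prices"], event.get("tax_rate", 0.0))
    return {"total": total}
```

If you later move to a container or another provider's functions service, only the adapter needs rewriting. `calculate_total` can be called unchanged from a Flask route, a CLI script, or a unit test.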
If your microservice could potentially be called hundreds of thousands or even millions of times a day (or even an hour), you'll want to ensure it can scale automatically to meet user demand. Fortunately, both AWS Lambda and Docker offer plenty of options to create a highly scalable microservice.
AWS Lambda creates an instance of your function to serve traffic to users. As that instance reaches capacity, Lambda will automatically create new instances of your function to meet demand. Lambda can "burst" to between 500 and 3,000 instances, depending on the region, to handle sudden traffic influxes, and can then scale up by 500 new instances every minute.
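To get a feel for what those numbers mean in practice, here's a back-of-the-envelope calculation. It takes the burst and per-minute figures quoted above at face value (they may not match current Lambda quotas) and ignores any concurrency limit on your account. The function name is my own.

```python
import math


def minutes_to_reach(target, burst_limit=3000, per_minute=500):
    """Minutes of sustained scaling needed, after the initial burst, to reach `target` instances."""
    if target <= burst_limit:
        return 0  # the initial burst alone covers the demand
    return math.ceil((target - burst_limit) / per_minute)


if __name__ == "__main__":
    print(minutes_to_reach(5000))                   # region with a 3,000 burst limit
    print(minutes_to_reach(5000, burst_limit=500))  # region with a 500 burst limit
```

Under these assumptions, reaching 5,000 concurrent instances takes about 4 minutes after the burst in a 3,000-burst region and about 9 minutes in a 500-burst region, so the region you deploy to can matter for very spiky traffic.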
AWS also provides multiple options for scaling Docker containers. Containers deployed using Fargate, AWS's serverless container deployment solution, can be configured to scale out based on Amazon CloudWatch alarms. If you're deploying Docker containers to an EC2 cluster in your AWS account, you can even scale out the size of your cluster.
In general, both AWS Lambda and Docker containers can be configured to provide the performance required by most applications.
However, I'd be remiss if I didn't note the infamous Lambda cold start issue. As I said above, Lambda will create a new instance of your function when it needs to scale out. This process takes time: the Lambda function code has to be downloaded to an EC2 instance in AWS's Lambda server farm, and the execution environment and its associated dependencies also take time to load and start. This is known as a cold start. It hits Java and .NET applications particularly hard, as both have weighty runtime environments.
Fortunately, as Mike Roberts at Symphonia points out, cold start isn't an issue for high-demand applications. It only becomes a factor in low-execution environments - e.g., when using a Lambda function as a callback from another AWS service, such as CodePipeline.
When it comes to dependency management - libraries that your application depends upon - Docker is king. As I discussed earlier, a Docker container is a self-contained package containing everything your application needs to run.
It's also possible to ship dependencies with your AWS Lambda functions as part of the function's ZIP file. However, things get complicated when you need to package OS-native dependencies. Furthermore, Lambda deployment packages max out at 250MB unzipped, which can be an issue when packaging large dependency frameworks.
Fortunately, AWS Lambda's support for Docker containers means you can get the best of both worlds. By implementing your functions as Docker containers, you can package any dependency your application requires and ensure it always runs as intended. Docker containers on AWS Lambda can be up to 10GB in size, which is plenty of space for the vast majority of applications.
If your code is doing some sort of batch processing - processing DynamoDB events, filtering an Amazon Kinesis stream, generating large images, etc. - you'll need to concern yourself with execution times. Lambda functions can only run for up to 15 minutes before the service will time out. By contrast, Docker containers have no built-in limitations on workload runtimes.
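If you do run batch work on Lambda, one common way to live with the time limit is to check how much time is left and stop cleanly before the service cuts you off. The sketch below uses the `get_remaining_time_in_millis()` method that Lambda's Python context object provides. The 10-second safety margin, the doubling "work", and the `FakeContext` class (a stand-in for local testing) are all assumptions of mine.

```python
def process_batch(items, context, safety_margin_ms=10_000):
    """Process items until done or until remaining time drops below the margin.

    Returns (results, leftover) so the caller can re-queue what was not processed.
    """
    results = []
    for index, item in enumerate(items):
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            return results, items[index:]
        results.append(item * 2)  # placeholder for the real per-item work
    return results, []


class FakeContext:
    """Stand-in for the Lambda context object, for local testing only."""

    def __init__(self, readings):
        self._readings = iter(readings)

    def get_remaining_time_in_millis(self):
        return next(self._readings)
```

The leftover items can go back onto a queue or into a follow-up invocation. With a long-running Docker container, none of this bookkeeping is needed.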
As I mentioned earlier, Docker provides a simple and easy-to-understand deployment model that enables packaging a single microservice into a single Docker container. This is where AWS Lambda has often been at a disadvantage: since Lambda is a function-based service, it's proven more challenging to manage an entire service or application as a collection of interconnected Lambda functions.
Fortunately, new tools have come out over the past several years to address exactly this problem. AWS's Serverless Application Model (SAM) enables developers to design, develop, and deploy entire serverless apps directly onto AWS using Lambda and CloudFormation. Other tools, such as the open-source project Serverless, aim to create similar zero-infrastructure deployment experiences for serverless applications on AWS and other cloud providers.
In general, a "serverless" solution is going to cost you more than a non-serverless solution. We at TinyStacks discovered this recently when we moved all of our container workloads from Fargate to our own ECS EC2 clusters, resulting in a cost savings of 40%.
While we haven't done any direct cost comparisons with AWS Lambda, evidence from others suggests that it's one of the least cost-effective solutions going. An analysis this year by Eoin Shanaghy and Steef-Jan Wiggers on InfoQ found that running a workload on AWS Lambda can cost up to 7.5 times more than running the same workload on AWS Fargate with spot capacity. Given that we manage to run our workloads at a 40% discount on EC2 clusters compared to AWS Fargate, this shows you just how pricey Lambda really is.
For large-scale microservice workloads, we've found running Docker containers on our own tightly managed EC2 cluster using ECS to be the ideal solution.
You may get good mileage from using Lambda selectively for smaller-scale workloads. However, we would recommend implementing your code in Docker containers wherever possible - even when Lambda is your preferred deployment mechanism. Docker containers not only port well across cloud providers but can also be used with numerous AWS services. This makes it easy to change your deployment and hosting strategy in response to your company's changing needs.