Understanding Concurrency And Throttling In AWS Lambda Functions

#aws #devops #cloud #lambda

If you are writing Lambda functions for production, you need to think about how your functions scale and how usage affects their performance.
Yeah, I know. You are probably thinking that because Lambda is a "serverless" service, you don't have to worry about provisioning and scaling. Sorry to burst your bubble: you definitely have to think about scaling when deploying production-level Lambda functions.

Concurrency in AWS Lambda is the number of requests that your function is processing at the same time.
Each invocation runs in its own execution environment (a container). When a request arrives and no idle environment is available, Lambda spins up a new one behind the scenes; once an environment finishes a request, it can be reused for a later one. So if more than one invocation hits your function at the same moment, additional environments are spun up to handle these "concurrent" requests.
Concurrency is a very important aspect to consider, because running out of it can cause your application to fail through a mechanism called throttling.
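A rough way to reason about how much concurrency a function needs is to multiply the average number of requests per second by the average duration of a request in seconds. The sketch below is only an estimate for steady traffic; the function name `estimate_concurrency` and the sample workload are illustrative assumptions, not part of any AWS API.

```python
import math


def estimate_concurrency(requests_per_second: float, avg_duration_seconds: float) -> int:
    """Estimate concurrent executions as request rate x average duration, rounded up."""
    if requests_per_second < 0 or avg_duration_seconds < 0:
        raise ValueError("rate and duration must be non-negative")
    # Round first so floating-point noise does not push the result up by one.
    return math.ceil(round(requests_per_second * avg_duration_seconds, 9))


if __name__ == "__main__":
    # Hypothetical workload: 200 requests per second, 0.5 s per request.
    print(estimate_concurrency(200, 0.5))  # prints 100
```

The same request rate needs more concurrency when the function is slower, which is why shortening execution time is often the cheapest way to stay under the limit.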

By default, each AWS account is allocated 1,000 units of concurrency per region. This capacity is shared by all the functions in your account in that region. This means that no function in your account can have more than 1,000 invocations running at the same time, and if several functions are invoked at once, they all draw from the same 1,000 units.
Of course, this is a soft limit and can be raised by filing a support ticket with AWS.
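To check the limit that actually applies to your account, you can read it from the Lambda API. The following is a minimal sketch, assuming a boto3 Lambda client is passed in; the helper name `concurrency_limits` is made up for illustration, while `get_account_settings` and the two `AccountLimit` keys come from the Lambda API.

```python
def concurrency_limits(lambda_client) -> dict:
    """Return the account-wide and unreserved concurrency for the client's region."""
    limits = lambda_client.get_account_settings()["AccountLimit"]
    return {
        "total": limits["ConcurrentExecutions"],
        "unreserved": limits["UnreservedConcurrentExecutions"],
    }


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   print(concurrency_limits(boto3.client("lambda", region_name="us-east-1")))
```

Because the limit is per region, the client's region decides which pool you are looking at.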

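Throttling in AWS Lambda occurs when function invocations exceed the number of available concurrency units. When that happens, the extra invocations do not run, and the caller gets a throttling error: an HTTP 429 TooManyRequestsException, typically with the message "Rate Exceeded".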
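A caller that invokes a function directly usually wants to handle a throttle rather than fail outright. One common approach, sketched below under stated assumptions, is to retry with exponential backoff. `ThrottledError` and `invoke_with_backoff` are hypothetical names for this example; with boto3 you would catch the Lambda client's `TooManyRequestsException` instead.

```python
import time


class ThrottledError(Exception):
    """Stand-in for the throttling error a caller receives (hypothetical name)."""


def invoke_with_backoff(invoke, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call invoke(); on ThrottledError wait base_delay * 2**attempt, then retry."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return invoke()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttle to the caller
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter is a design choice for this sketch: it lets you test the retry logic without actually waiting.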

Lambda concurrency can be broadly divided into three categories: unreserved concurrency, reserved concurrency, and provisioned concurrency.

Unreserved concurrency: This is the default category that your functions belong to. With unreserved concurrency, all Lambda functions in your account share a common concurrency pool. Let's say you have three functions in your account. If function A is handling 900 simultaneous invocations and function B is handling 100, function C is going to get throttled and cannot be invoked, because all the concurrency units allocated to your account are being used up by functions A and B.
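The three-function scenario can be sketched as a toy model in Python. This is a simplification, not how Lambda schedules work internally: it assumes requests are admitted in the order the functions are listed, and the demand of 50 for function C is a made-up number for illustration.

```python
def share_pool(demand: dict[str, int], pool: int = 1000) -> dict[str, dict[str, int]]:
    """Grant concurrency in listing order; whatever does not fit is throttled."""
    remaining = pool
    outcome = {}
    for name, wanted in demand.items():
        granted = min(wanted, remaining)
        remaining -= granted
        outcome[name] = {"running": granted, "throttled": wanted - granted}
    return outcome


if __name__ == "__main__":
    # A and B consume the whole shared pool, so nothing is left for C.
    for name, result in share_pool({"A": 900, "B": 100, "C": 50}).items():
        print(name, result)
```

In this model function C ends up with zero running invocations even though it did nothing wrong, which is the "noisy neighbour" problem of a shared pool.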

Reserved concurrency: This is how you guarantee concurrency for a Lambda function. Remember that with unreserved concurrency, if one function uses up the shared pool, the other functions in your account will be throttled by the Lambda service. With reserved concurrency, you deduct some concurrency units from the overall capacity and allocate them to one function. This gives the function exclusive access to the reserved units.

For instance, say we have functions A, B and C, and we give function A 200 reserved concurrency units. We now have 800 concurrency units left for functions B and C to share, while function A has exclusive access to its 200 units. Keep in mind that reserved concurrency is also a ceiling: your function cannot go above its allotted capacity. In our example, if function A receives more than 200 simultaneous invocations, the extra invocations are throttled and get the same throttling error as before.
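The arithmetic in this example can be written down as a small sketch. The helper names `unreserved_pool` and `throttled_count` are hypothetical, and the figure of 250 simultaneous invocations is made up to show what happens above the ceiling.

```python
def unreserved_pool(account_limit: int, reserved: dict[str, int]) -> int:
    """Return the units left in the shared pool after all reservations."""
    total_reserved = sum(reserved.values())
    if total_reserved > account_limit:
        raise ValueError("reservations exceed the account limit")
    return account_limit - total_reserved


def throttled_count(concurrent_requests: int, ceiling: int) -> int:
    """Return how many simultaneous requests exceed a function's ceiling."""
    return max(0, concurrent_requests - ceiling)


if __name__ == "__main__":
    reserved = {"A": 200}
    print(unreserved_pool(1000, reserved))        # units B and C share: 800
    print(throttled_count(250, reserved["A"]))    # A at 250 requests: 50 throttled
```

The reservation cuts both ways: function A can never be starved by B and C, but it also can never borrow from their 800 units.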

Provisioned concurrency: This is a newer category that mostly deals with latency and helps prevent cold starts by keeping dedicated execution environments initialized ahead of time. In essence, provisioned concurrency is like reserved concurrency, except that it also pre-initializes a number of execution environments equal to the concurrency units you specify. As invocations increase, requests served by these environments do not pay the cold-start penalty, because the environments were provisioned beforehand. This essentially eliminates cold starts up to the provisioned amount.
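Provisioned concurrency is configured per function version or alias. Here is a minimal sketch, assuming a boto3 Lambda client is passed in; the wrapper name `enable_provisioned_concurrency` and the example function and alias names are hypothetical, while `put_provisioned_concurrency_config` and its parameters are the Lambda API call.

```python
def enable_provisioned_concurrency(lambda_client, function_name: str,
                                   qualifier: str, units: int) -> dict:
    """Ask Lambda to keep `units` environments initialized for one version or alias."""
    if units < 1:
        raise ValueError("units must be at least 1")
    return lambda_client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=qualifier,  # a published version or alias, not $LATEST
        ProvisionedConcurrentExecutions=units,
    )


# Usage (assumes boto3 is installed, credentials are configured, and a
# function "orders-api" with an alias "live" exists; both names are made up):
#   import boto3
#   enable_provisioned_concurrency(boto3.client("lambda"), "orders-api", "live", 50)
```

Unlike the other two categories, provisioned environments are billed while they sit idle, so it is worth sizing the number against your real traffic.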
