DEV Community



How to run stateful AWS Lambda functions in any language using custom runtimes

An illustrated explanation of how the Lambda environment works

AWS Lambda used to be limited to a handful of supported languages. Now almost any programming language can be supported through a custom runtime. This article gives an overview of how those custom runtimes work and how they can be used to bypass some Lambda limitations.

What is a Lambda runtime?

Lambda functions run in a sandboxed environment called an execution context. Its resources are set in the function configuration: memory, timeout and the type of runtime. The runtime is a Linux executable that conforms to a certain specification and acts as a wrapper that runs the handler function inside the execution context.

For example, C# handler code is packaged into a ZIP file as compiled assemblies with their dependencies and proj.runtimeconfig.json, which tells the .NET Core runtime what is in the package and how to run it. That .NET Core runtime is provided by AWS and already resides inside the Lambda sandbox when you upload your package.


Rust code, on the other hand, is compiled into a single executable file named bootstrap and is uploaded to a Lambda function that has no built-in language runtime.

In both cases, Lambda runs a file named bootstrap when it wants to invoke the function. The difference is that the .NET Core bootstrap is provided by AWS and the Rust bootstrap is provided by the user.

Function handler invocation

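The Lambda execution environment sets several environment variables that tell the runtime where the Lambda API is and what the environment constraints are. One of them, AWS_LAMBDA_RUNTIME_API, holds the host and port of the API used in the calls below.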
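As a rough illustration, a runtime could collect those settings at startup like this. The variable names are the ones AWS documents for the execution environment; the function name `read_lambda_environment`, the dictionary keys and the fallback values are assumptions made for this sketch.

```python
import os


def read_lambda_environment(env=os.environ):
    """Collect the settings a custom runtime typically needs at startup.

    The fallback values are placeholders so the sketch also runs
    outside Lambda; they are not something Lambda provides.
    """
    return {
        # host:port of the Lambda runtime API
        "runtime_api": env.get("AWS_LAMBDA_RUNTIME_API", ""),
        # handler setting from the function configuration
        "handler": env.get("_HANDLER", ""),
        # directory that holds the unpacked function code
        "task_root": env.get("LAMBDA_TASK_ROOT", ""),
        # memory limit in MB
        "memory_mb": int(env.get("AWS_LAMBDA_FUNCTION_MEMORY_SIZE", "0")),
        "function_name": env.get("AWS_LAMBDA_FUNCTION_NAME", ""),
    }


if __name__ == "__main__":
    for key, value in read_lambda_environment().items():
        print(f"{key} = {value!r}")
```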


From there, the runtime does whatever initialization it needs and starts talking to the Lambda API endpoints over HTTP.


The first API call the runtime makes asks for the next invocation.

The web client inside the runtime sends an HTTP GET request to http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/next and waits for a response. It waits, and waits and waits, until it either times out or the API responds with event details when something invokes that function. If the request does time out, the runtime should resend it, repeating in an infinite loop.
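A minimal sketch of that long poll, using only the Python standard library. The URL and the `Lambda-Runtime-Aws-Request-Id` response header come from the runtime API; the function names and the injectable `opener` parameter (there so the code can be tried without a real Lambda endpoint) are assumptions of this example. It assumes Python 3.10+, where socket timeouts surface as `TimeoutError`.

```python
import json
import urllib.request

API_VERSION = "2018-06-01"


def next_invocation_url(runtime_api):
    """Build the 'next invocation' URL from AWS_LAMBDA_RUNTIME_API."""
    return f"http://{runtime_api}/{API_VERSION}/runtime/invocation/next"


def get_next_invocation(runtime_api, opener=urllib.request.urlopen):
    """Wait for the next event and return (request_id, event).

    urlopen is called without a timeout, so it normally blocks until
    Lambda has an event. If a timeout is raised anyway, the request is
    simply sent again.
    """
    url = next_invocation_url(runtime_api)
    while True:
        try:
            with opener(url) as resp:
                request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
                event = json.loads(resp.read())
                return request_id, event
        except TimeoutError:
            continue  # resend the same GET request
```

The request ID returned here is what goes into the $REQUEST_ID part of the response and error URLs.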


When the next invocation response does arrive, it is up to the runtime to do something with the event details it contains. Most runtimes pass them to a handler function, but if you control the runtime you can do all the work inside the runtime itself.

If there is a response to be sent back to the caller of the Lambda function, the runtime puts the payload inside an HTTP POST request, adds some headers and sends it all to a special response endpoint of the Lambda API: http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/$REQUEST_ID/response. The API responds with HTTP 202 and {"status":"OK"} in the body, and forwards the payload back to the caller.
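The response call could look roughly like this. The endpoint is the one quoted above; the helper names, the JSON encoding of the payload and the Content-Type header are assumptions of this sketch rather than requirements.

```python
import json
import urllib.request


def build_response_request(runtime_api, request_id, payload):
    """Prepare the POST that hands the handler's result back to Lambda."""
    url = (
        f"http://{runtime_api}/2018-06-01/runtime/invocation/"
        f"{request_id}/response"
    )
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


def send_response(runtime_api, request_id, payload,
                  opener=urllib.request.urlopen):
    """Send the result and return the HTTP status (202 is expected)."""
    request = build_response_request(runtime_api, request_id, payload)
    with opener(request) as resp:
        return resp.status
```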


If the runtime or the handler function fails during the invocation, the runtime is expected to report the error to a different endpoint: http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/$REQUEST_ID/error.

Then the runtime restarts the process by calling the next invocation API again.
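Error reporting follows the same pattern. In this sketch the `errorMessage`, `errorType` and `stackTrace` fields and the `Lambda-Runtime-Function-Error-Type` header follow the AWS runtime API documentation; the helper name and the choice of the exception class name as the header value are assumptions.

```python
import json
import traceback
import urllib.request


def build_error_request(runtime_api, request_id, exc):
    """Prepare the POST that reports a failed invocation to Lambda."""
    url = (
        f"http://{runtime_api}/2018-06-01/runtime/invocation/"
        f"{request_id}/error"
    )
    body = {
        "errorMessage": str(exc),
        "errorType": type(exc).__name__,
        "stackTrace": traceback.format_exception(
            type(exc), exc, exc.__traceback__
        ),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        # Header value is an assumption: here just the exception class name.
        headers={"Lambda-Runtime-Function-Error-Type": type(exc).__name__},
        method="POST",
    )
```

The request can be sent with `urllib.request.urlopen(request)` in the same way as a normal response.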
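Put together, the whole cycle is a plain loop. The sketch below leaves out the HTTP details and takes the four steps as functions, so every name in it (`run_event_loop`, `fetch_next`, `send_response`, `send_error`, `max_events`) is hypothetical. A real runtime would loop forever; `max_events` exists only to make the example easy to try.

```python
def run_event_loop(fetch_next, handler, send_response, send_error,
                   max_events=None):
    """Fetch an event, run the handler, report the outcome, repeat."""
    handled = 0
    while max_events is None or handled < max_events:
        request_id, event = fetch_next()        # GET .../invocation/next
        try:
            result = handler(event)
        except Exception as exc:
            send_error(request_id, exc)         # POST .../error
        else:
            send_response(request_id, result)   # POST .../response
        handled += 1
    return handled


if __name__ == "__main__":
    # Local dry run with in-memory stand-ins for the Lambda API.
    queue = [("id-1", {"n": 2}), ("id-2", {"n": "oops"})]
    run_event_loop(
        fetch_next=lambda: queue.pop(0),
        handler=lambda event: event["n"] * 2 + 1,
        send_response=lambda rid, result: print("response", rid, result),
        send_error=lambda rid, exc: print("error", rid, type(exc).__name__),
        max_events=2,
    )
```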


Custom runtimes on GitHub

The most popular custom runtime is for Rust. I am a user of that project and a contributor to it, and can assure you that it works just fine. GitHub has a good choice of other custom runtimes.

If your language of choice has no Lambda runtime yet, you can follow this tutorial and the examples above to build one yourself.

Faster invocations and stateful Lambdas

Lambda functions are stateless by definition. You provide a handler function and the runtime invokes it as one instance or 100 instances at the same time. That is great for scalability, but it doesn't work well for use cases with heavy initialization. Think of functions that load large dependencies or obtain DB connections. Those can be persisted on the local drive, but what if you want to keep them loaded in memory?

Timelapse video example

I worked on a project where a photo was uploaded to S3 every second and had to be appended to a timelapse video stored in S3. We tried to use FFmpeg with C# in a Lambda function, but it was too slow because a lot of data had to be persisted on disk between the calls and there was no way to keep it in memory using the standard .NET Core runtime. The cost/benefit of using Lambdas was not there and we ended up running an EC2 instance for that.

A custom runtime could have solved the problem by downloading the video, storing the end portion of it in memory and flushing encoded chunks only once every few frames, as long as it was within the lifetime of a single Lambda instance.
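The idea can be sketched as a small object that lives in the runtime process and survives from one invocation to the next. Everything here is hypothetical (the `FrameBuffer` class, its `flush` callback and the `flush_every` threshold), and real video encoding is replaced by simple byte concatenation. The buffered frames are lost whenever Lambda shuts down that instance.

```python
class FrameBuffer:
    """Keep recent frames in memory and flush them in batches."""

    def __init__(self, flush, flush_every=5):
        self._flush = flush              # e.g. a function that uploads to S3
        self._flush_every = flush_every
        self._pending = []

    @property
    def pending_count(self):
        return len(self._pending)

    def add_frame(self, frame):
        """Called once per invocation with the newly uploaded photo."""
        self._pending.append(frame)
        if len(self._pending) >= self._flush_every:
            # Stand-in for encoding the frames into a video chunk.
            self._flush(b"".join(self._pending))
            self._pending.clear()


# Created once at runtime startup, before the invocation loop begins,
# so it is shared by every invocation served by this instance.
uploaded_chunks = []
buffer = FrameBuffer(flush=uploaded_chunks.append, flush_every=3)


def handle_event(event):
    """Per-invocation work: only touches the in-memory buffer."""
    buffer.add_frame(event["frame"])
    return {"pending": buffer.pending_count}
```

Because the runtime owns the process, `buffer` stays alive for as long as the same Lambda instance keeps asking for the next invocation.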


There may be many other examples where persisting large amounts of data or serialized objects on disk would be inefficient, so maintaining state between invocations is a make-or-break feature there.

Being "stateful" goes against the grain of Lambda ideology, but who cares about that if it solves your problem?

I am curious about the nature of the sandbox environment used by AWS for Lambdas. Please share in the comments if you know what it is.

Top comments (1)

Max • Edited

Unanswered questions:

  1. Are Lambdas charged only for the time spent handling invocations, or also for CPU time used between invocations? Is it possible to get free CPU time in between?
  2. What is the degree of parallelism (number of cores) available to the runtime?
  3. How does the cost of Lambda compare to the cost of EC2 per unit of computation?