Pubudu Jayawardana for AWS Community Builders

Posted on Feb 22 • Originally published at pubudu.dev

Understanding Lambda Tenant Isolation

#aws #lambda #serverless #saas

Lambda tenant isolation is one of the important security features that came out of the 2025 re:Invent season.

Achieving tenant isolation in SaaS applications is not straightforward, and taking the single-tenant route to solve it introduces its own scaling challenges. This new feature is not a silver bullet, but it does offer much better support for keeping tenants isolated at scale.

In this blog post, I discuss what this feature is and the problems it addresses.

Lambda execution environment

When an invoke request first reaches the AWS Lambda service, Lambda starts a virtual environment on an EC2 worker host. We call this an execution environment. The execution environment downloads the code and its dependencies, processes the request and, if required, returns a response.

One of the key attributes of an execution environment is that it is not removed immediately after processing a single request. It is kept in a 'warm' state so it can serve another incoming request. When the next request comes in, the Lambda service reuses the execution environment that is already available instead of creating a new one.

In short, when you invoke a Lambda function, the Lambda service uses any execution environments that are already available for that function to process the requests, and creates new ones only when none are available.

However, this approach can be a concern for a multi-tenant Lambda function, because a warm execution environment carries over state left behind by previous invocations, such as:

Global variables
Objects initialized outside of the handler
Files saved in the /tmp directory

Multi-tenant Lambda function

In a multi-tenant setup, a Lambda function is shared by more than one tenant. Based on how execution environments behave, the Lambda service will use whichever execution environments are already available for that function, irrespective of the tenant that invokes it. But, as mentioned earlier, an execution environment that retains data between invocations can be a security issue.

Reusing data across invocations in the same execution environment can be a great optimization, as long as that data is accessible only to the tenant it belongs to. However, when multiple tenants use the same execution environments, a tenant can end up with access to data that was never meant for it.

For example, take a single execution environment. During its invocation, tenant 1 might fetch some secrets from Secrets Manager or save some files in the /tmp directory. If the same execution environment is then used for an invocation by tenant 2, tenant 2 will have access to the secrets fetched for tenant 1 and to the files tenant 1 saved in /tmp.
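To make this scenario concrete, here is a minimal sketch that simulates two invocations landing on the same warm execution environment. Everything in it is illustrative: fetch_secret is a hypothetical placeholder rather than a real Secrets Manager call, and a temporary directory stands in for /tmp so the snippet runs anywhere.

```python
import os
import tempfile

# Module-level cache: it lives as long as the execution environment stays warm.
_secret_cache = {}

# Stand-in for /tmp so the sketch runs anywhere; in Lambda this would be "/tmp".
TMP_DIR = tempfile.mkdtemp()


def fetch_secret(tenant_id):
    """Hypothetical placeholder for a Secrets Manager lookup."""
    return f"secret-for-{tenant_id}"


def handler(event, context=None):
    tenant_id = event["tenant_id"]

    # Record what an earlier invocation left behind, before adding our own data.
    leftovers = {
        "cached_secrets": dict(_secret_cache),
        "tmp_files": sorted(os.listdir(TMP_DIR)),
    }

    _secret_cache[tenant_id] = fetch_secret(tenant_id)
    with open(os.path.join(TMP_DIR, f"{tenant_id}.txt"), "w") as f:
        f.write(f"report for {tenant_id}")

    return {"tenant_id": tenant_id, "leftovers": leftovers}


# Two calls in one process mimic two invocations on one warm environment.
first = handler({"tenant_id": "tenant1"})
second = handler({"tenant_id": "tenant2"})
print(second["leftovers"])
```

The second call finds both the secret cached for tenant1 and the file tenant1 wrote, which is exactly the kind of leftover state described above.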

Solution

One solution to this problem is to reset the execution environment just before processing each request, for example by unsetting any global variables and wiping the /tmp directory. However, this approach is not practical at scale.
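As a rough illustration of the reset approach, the sketch below clears a module-level cache and empties a scratch directory at the start of every invocation. The names reset_environment, _cache and TMP_DIR are made up for this example, and a temporary directory stands in for /tmp.

```python
import os
import shutil
import tempfile

TMP_DIR = tempfile.mkdtemp()  # stand-in for /tmp
_cache = {}  # hypothetical module-level state


def reset_environment():
    """Clear state that a previous invocation may have left behind."""
    _cache.clear()
    for name in os.listdir(TMP_DIR):
        path = os.path.join(TMP_DIR, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)


def handler(event, context=None):
    reset_environment()  # must run before any tenant data is touched
    tenant_id = event["tenant_id"]
    _cache[tenant_id] = f"data-for-{tenant_id}"
    with open(os.path.join(TMP_DIR, f"{tenant_id}.txt"), "w") as f:
        f.write("scratch data")
    return {"cache_keys": sorted(_cache), "tmp_files": sorted(os.listdir(TMP_DIR))}
```

Even in this small form the weakness shows: every new global, cached client or scratch file has to be added to the reset by hand, and anything that is forgotten leaks silently. The reset also throws away the warm-state benefits on every request.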

Another option is to use tenant-specific Lambda functions, which is the single-tenant approach. In this case, each tenant gets its own dedicated Lambda function. This solves the problem of unintended access to temporary data, because execution environments are never shared between different Lambda functions.

However, having a Lambda function per tenant is not scalable. With a lot of tenants, you end up with a lot of Lambda functions to manage. An IaC tool like CDK or Terraform makes this possible, but whenever you need to change the source code to add functionality or fix a bug, you still have to update all of these tenant-specific resources, which is not easy. And most of the time, those resources are not fully utilized either.

What if we could have a single Lambda function shared by all of the tenants (the multi-tenant approach), yet keep the isolation we get from a Lambda function per tenant (the single-tenant approach)?

Lambda tenant isolation

With Lambda tenant isolation, you can have a single Lambda function shared by all of the tenants, yet keep the isolation you would get from a Lambda function per tenant. With this new feature, the Lambda service does the heavy lifting by creating execution environments that are dedicated to a specific tenant. This means that execution environments are not shared across tenants, so a tenant cannot reach data left behind by another tenant.

But how does Lambda determine which tenant an incoming request is from? For that, we need to provide a tenant id in the request to Lambda.
If you use the AWS CLI invoke command, you can pass the --tenant-id parameter as follows:

aws lambda invoke \
    --function-name tenant-aware-lambda \
    --payload '{ "name": "Bob" }' \
    --tenant-id t1 \
    response.json

If you call the Lambda API directly, you need to provide the value in the X-Amz-Tenant-Id header as follows:

POST /2015-03-31/functions/tenant-aware-lambda/invocations HTTP/1.1
Host: lambda.eu-central-1.amazonaws.com
Content-Type: application/json
Authorization: AWS4-HMAC-SHA256 Credential=...
X-Amz-Tenant-Id: t1

{
    "name": "Bob"
}
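If you invoke the function from Python, the same idea can be sketched with boto3. This assumes your boto3 version exposes the X-Amz-Tenant-Id header as a TenantId parameter on invoke, so check it against the version you have installed. The helper names build_invoke_kwargs and invoke_for_tenant are made up for this example.

```python
import json


def build_invoke_kwargs(function_name, tenant_id, payload):
    """Assemble keyword arguments for the Lambda Invoke call."""
    return {
        "FunctionName": function_name,
        # Assumed boto3 parameter name for the X-Amz-Tenant-Id header.
        "TenantId": tenant_id,
        "Payload": json.dumps(payload).encode("utf-8"),
    }


def invoke_for_tenant(function_name, tenant_id, payload):
    """Invoke the function for one tenant (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("lambda")
    response = client.invoke(**build_invoke_kwargs(function_name, tenant_id, payload))
    return json.loads(response["Payload"].read())


# Building the arguments needs no AWS access, so it is safe to run anywhere.
kwargs = build_invoke_kwargs("tenant-aware-lambda", "t1", {"name": "Bob"})
print(kwargs)
```

Calling invoke_for_tenant("tenant-aware-lambda", "t1", {"name": "Bob"}) would then be the Python counterpart of the CLI command above.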

Tenant ids are case sensitive. A tenant id can contain alphanumeric characters and has a maximum length of 256 characters. A few special characters are allowed too: hyphens (-), underscores (_), colons (:), equals signs (=), plus signs (+), at signs (@) and periods (.).
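If the tenant id comes from user input, it can help to validate it before calling Lambda. The sketch below encodes only the rules as described in this post (ASCII letters and digits, the listed special characters, 1 to 256 characters). The function name is_valid_tenant_id is hypothetical, and the official documentation remains the source of truth for the exact pattern.

```python
import re

# Letters, digits and the special characters listed above; 1 to 256 characters.
_TENANT_ID_PATTERN = re.compile(r"[A-Za-z0-9_:=+@.\-]{1,256}")


def is_valid_tenant_id(tenant_id):
    """Return True if tenant_id follows the rules described in this post."""
    return (
        isinstance(tenant_id, str)
        and _TENANT_ID_PATTERN.fullmatch(tenant_id) is not None
    )


print(is_valid_tenant_id("tenant-1_a:b=c+d@e.f"))
print(is_valid_tenant_id("tenant#1"))
```

Because tenant ids are case sensitive, "T1" and "t1" count as two different tenants, so avoid changing the case of the value on the way to Lambda.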

One of the key attributes of a tenant id is that it does not need to be pre-registered. We can pass any dynamic value as the tenant id, and the Lambda service takes care of creating and maintaining a pool of execution environments for each value passed. There is also no limit on the number of unique tenant ids we can use.

How to enable this feature in Lambda

If you use the AWS Console, you can go to the Lambda create wizard and enable this in the additional security section.

Options for the CLI, CloudFormation and CDK are also available.

CLI:

aws lambda create-function \
    --function-name tenant-aware-lambda \
    --runtime python3.14 \
    --zip-file fileb://tenant-aware-lambda.zip \
    --handler index.handler \
    --role arn:aws:iam::123456789012:role/execution-role \
    --tenancy-config '{"TenantIsolationMode": "PER_TENANT"}'

CloudFormation:

MyLambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: tenant-aware-lambda
      Runtime: python3.14
      Role: !GetAtt LambdaExecutionRole.Arn
      Handler: index.handler
      TenancyConfig:
        TenantIsolationMode: PER_TENANT
      Code:
        ZipFile: |
          .....
          .....
      Timeout: 10
      MemorySize: 128

CDK:

# Inside the __init__ method of a CDK Stack
# (requires: from aws_cdk import Duration, aws_lambda as _lambda)
tenant_aware_lambda = _lambda.Function(
    self,
    "TenantAwareFunction",
    function_name="tenant-aware-lambda",
    runtime=_lambda.Runtime.PYTHON_3_13,
    handler="index.handler",
    code=_lambda.Code.from_asset("src/lambda/tenant_aware"),
    timeout=Duration.seconds(10),
    tenancy_config=_lambda.TenancyConfig.PER_TENANT,
)

Please note: This feature can be enabled ONLY when the Lambda function is created. You cannot enable this for an existing Lambda function.

Try this yourself

I have created a sample application for you to see how this feature works. You can deploy it to your AWS environment using CDK with Python.

Clone the repository at github.com/pubudusj/lambda-tenant-isolation-demo and follow the steps below:

Create a virtual environment and activate it:

python3 -m venv .venv
source .venv/bin/activate

Install the dependencies:

pip install -r requirements.txt

Deploy the stack:

cdk deploy

This will create two Lambda functions:

A generic Lambda function (without tenant isolation)
A Lambda function with tenant isolation enabled

Both Lambda functions have a global variable counter which increments on each invocation. This is to simulate the shared state across executions. An API Gateway is also created with two endpoints to trigger these Lambda functions:

/execute_generic_lambda?tenant_id=<tenant_id> - invokes the generic Lambda function
/execute_tenant_aware_lambda?tenant_id=<tenant_id> - invokes the tenant-aware Lambda function
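To picture what such a function looks like, here is a minimal sketch of a handler with a global counter. It is not the code from the repository, only an approximation that assumes an API Gateway proxy event with the tenant id in the query string.

```python
import json

# Initialized once per execution environment, then reused while it stays warm.
invocation_count = 0


def handler(event, context=None):
    global invocation_count
    invocation_count += 1

    # API Gateway proxy events carry query parameters under this key.
    params = event.get("queryStringParameters") or {}
    body = {
        "tenant_id": params.get("tenant_id"),
        "invocation_count": invocation_count,
    }
    return {"statusCode": 200, "body": json.dumps(body)}
```

The counter lives outside the handler, so its value survives for as long as the execution environment that holds it does.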

Testing

Generic Lambda function

First, let's test the generic Lambda function. Call the /execute_generic_lambda endpoint with a tenant id:

curl "<APIGW_BASE_URL>/execute_generic_lambda?tenant_id=tenant1"

You will see a response like:

{"tenant_id": "tenant1", "invocation_count": 1}

Now call the same endpoint again but with a different tenant id:

curl "<APIGW_BASE_URL>/execute_generic_lambda?tenant_id=tenant2"

You can see the invocation count keeps increasing regardless of the tenant id:

{"tenant_id": "tenant2", "invocation_count": 2}

This is because the generic Lambda function shares execution environments across all tenants. The global variable counter retains its value across invocations regardless of which tenant made the request. This is the exact problem we discussed earlier - any global state, cached data or files in /tmp are accessible across tenants.

Tenant-aware Lambda function

Now let's test the tenant-aware Lambda function. Call the /execute_tenant_aware_lambda endpoint with a tenant id:

curl "<APIGW_BASE_URL>/execute_tenant_aware_lambda?tenant_id=tenant1"

You will see a response like:

{"tenant": "tenant1", "invocation_count": 1}

Now call the same endpoint again with a different tenant id:

curl "<APIGW_BASE_URL>/execute_tenant_aware_lambda?tenant_id=tenant2"

This time, the invocation count resets:

{"tenant": "tenant2", "invocation_count": 1}

This is tenant isolation in action. Even though both requests are handled by the same Lambda function, Lambda service creates separate execution environments for each tenant. The global variable counter, or any other shared state in the execution environment, is isolated per tenant. Tenant 2 will never see the state left behind by Tenant 1.

Also note that in the tenant-aware Lambda function, we can read the tenant id from the Lambda context object (e.g. context.tenant_id in Python) instead of extracting it from the query parameters. The Lambda service automatically makes the tenant id available in the context when tenant isolation is enabled. This is useful if you need to perform operations based on the tenant id, for example fetching tenant-specific data from other services.

In this example, I have used this tenant id available in the context object to publish a custom CloudWatch metric per tenant. This is helpful to monitor per-tenant invocation patterns, which can be useful for billing or auditing purposes.
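A handler along these lines could look like the sketch below. It is an approximation and not the repository code: the namespace, metric name and dimension name are made up for illustration, and it relies on context.tenant_id being set as described above.

```python
_cloudwatch = None  # created lazily, then reused while the environment is warm


def _get_cloudwatch():
    global _cloudwatch
    if _cloudwatch is None:
        import boto3  # available in the Lambda Python runtime

        _cloudwatch = boto3.client("cloudwatch")
    return _cloudwatch


def build_tenant_metric(tenant_id, namespace="TenantIsolationDemo"):
    """Build put_metric_data arguments for one invocation by one tenant."""
    return {
        "Namespace": namespace,  # hypothetical namespace
        "MetricData": [
            {
                "MetricName": "Invocations",  # hypothetical metric name
                "Dimensions": [{"Name": "TenantId", "Value": tenant_id}],
                "Value": 1,
                "Unit": "Count",
            }
        ],
    }


def handler(event, context):
    # Provided by Lambda when tenant isolation is enabled.
    tenant_id = context.tenant_id
    _get_cloudwatch().put_metric_data(**build_tenant_metric(tenant_id))
    return {"tenant": tenant_id}


print(build_tenant_metric("tenant1"))
```

Keeping the metric construction in its own function makes it easy to check without AWS access. The execution role also needs permission to call cloudwatch:PutMetricData.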

Effect on Lambda concurrency and cold starts

One important thing to understand is how tenant isolation affects Lambda concurrency. Since execution environments are not shared across tenants, each tenant will need their own set of execution environments. This means that the overall number of concurrent execution environments can be higher compared to a non-isolated Lambda function where environments are freely shared.

For example, if you have 10 tenants each making concurrent requests, instead of reusing a pool of warm execution environments, Lambda needs to maintain separate pools per tenant. This can lead to more cold starts, especially for low-traffic tenants.
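A toy model can make the difference visible. The sketch below counts cold starts for waves of concurrent requests, first with one shared pool and then with one pool per tenant. It is a deliberate simplification: environments never expire, and it says nothing about how Lambda actually schedules work.

```python
from collections import Counter


def cold_starts_shared(waves):
    """Cold starts when any warm environment can serve any tenant."""
    warm = 0
    cold = 0
    for wave in waves:
        needed = len(wave)  # every request in a wave runs concurrently
        cold += max(0, needed - warm)
        warm = max(warm, needed)
    return cold


def cold_starts_per_tenant(waves):
    """Cold starts when warm environments are reusable only by the same tenant."""
    warm = Counter()
    cold = 0
    for wave in waves:
        for tenant, needed in Counter(wave).items():
            cold += max(0, needed - warm[tenant])
            warm[tenant] = max(warm[tenant], needed)
    return cold


# Each inner list is one wave of concurrent requests, labelled by tenant id.
waves = [
    ["t1", "t1", "t2"],
    ["t3", "t4"],
    ["t5", "t1", "t2"],
]
print(cold_starts_shared(waves), cold_starts_per_tenant(waves))
```

With these waves the shared pool needs 3 cold starts while the per-tenant pools need 6, because every tenant that shows up for the first time has to start from a cold environment.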

However, this is a trade-off worth making when tenant isolation is critical for your application.

Please note: Make sure to consider the Lambda concurrency limits in your account when enabling tenant isolation, especially when dealing with a large number of tenants.

Integration with API Gateway

As of now, API Gateway is the only integration that supports Lambda tenant isolation.

In the example project, I have mapped the incoming query string tenant_id to the integration request header X-Amz-Tenant-Id for the Lambda service.

tenant_aware_integration = apigw.LambdaIntegration(
    tenant_aware_lambda,
    request_parameters={
        "integration.request.header.X-Amz-Tenant-Id": "method.request.querystring.tenant_id"
    },
)

When using Lambda integration on API Gateway, it gives us a lot of flexibility in choosing what value to map as the X-Amz-Tenant-Id. It could be the source AWS account id, a value from the request body, or a header value. If authentication and authorization are enabled, it could even be a Cognito user group or a claim from a JWT token. This makes API Gateway a convenient place to resolve and pass the tenant id to Lambda.
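To show how the mapping changes with the source of the tenant id, here is a small helper that builds the parameter mappings for a query string, header or path parameter. The helper name tenant_id_mapping is hypothetical. It also returns a matching method-level declaration, on the assumption that API Gateway expects a mapped method.request parameter to be declared on the method request.

```python
TENANT_HEADER = "integration.request.header.X-Amz-Tenant-Id"


def tenant_id_mapping(source, name):
    """Return (integration_params, method_params) for a REST API method.

    source is one of "querystring", "header" or "path";
    name is the parameter that carries the tenant id.
    """
    if source not in {"querystring", "header", "path"}:
        raise ValueError(f"unsupported source: {source}")
    method_param = f"method.request.{source}.{name}"
    integration_params = {TENANT_HEADER: method_param}
    method_params = {method_param: True}  # True marks the parameter as required
    return integration_params, method_params


integration_params, method_params = tenant_id_mapping("header", "x-tenant-id")
print(integration_params)
print(method_params)
```

In CDK, integration_params would go to request_parameters on apigw.LambdaIntegration, and method_params to request_parameters on add_method. Marking the parameter as required means requests without a tenant id can be rejected before they reach Lambda, provided request validation is enabled on the method.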

What I'd like to see next

While Lambda tenant isolation is a solid step forward, there are a few areas where I think it can be even more valuable.

More integration support: Currently, API Gateway is the only integration to Lambda that supports this feature. It would be great to see this extended to other integrations, such as SQS to Lambda via event source mapping.
Native per-tenant metrics: Built-in CloudWatch metrics by tenant id would remove the need to publish custom metrics manually.
Per-tenant concurrency controls: The ability to set concurrency limits per tenant would help prevent a noisy tenant from consuming all the available concurrency.

Conclusion

Tenant isolation is a fundamental requirement in many SaaS applications. While the single-tenant approach provides strong isolation, it comes with significant operational overhead. The multi-tenant approach is operationally simpler but introduces security risks with shared execution environments.

Lambda tenant isolation gives us the best of both worlds. We get a single Lambda function that is easy to manage and deploy, while the Lambda service ensures that the execution environments are isolated per tenant. This removes the risk of data leaking across tenants through a shared execution environment, without the burden of managing separate Lambda functions for each tenant.

This is a great addition to the serverless toolbox. If you are building multi-tenant SaaS applications on AWS Lambda, this feature is worth exploring.

Resources

AWS Lambda tenant isolation documentation: https://docs.aws.amazon.com/lambda/latest/dg/tenant-isolation.html
Launch blog post: https://aws.amazon.com/blogs/aws/streamlined-multi-tenant-application-development-with-tenant-isolation-mode-in-aws-lambda
