🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.
Overview
📖 AWS re:Invent 2025 - What's new with AWS Lambda (CNS376)
In this video, the AWS Lambda product management lead presents 2025's transformational serverless capabilities. Key launches include Lambda Managed Instances, which enable multi-concurrency and EC2 pricing incentives for steady-state workloads, and tenant isolation for SaaS builders. New runtimes (Python 3.14, Java 25, Node.js 24) arrive within 90 days of community release. AWS Transform Custom uses generative AI to reduce runtime upgrade technical debt by up to 85%. SnapStart expands to Python and .NET. Enhanced observability includes CloudWatch metrics for event source polling, Avro format support for Kafka, and Provisioned Mode for SQS. Developer acceleration features include seamless console-to-IDE transition, LocalStack integration for local testing, and remote debugging. The MCP server now supports event-driven architectures with best practices baked in.
; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.
Main Part
New Primitives: Lambda Managed Instances and Tenant Isolation Transform Serverless Capabilities
I lead product management for AWS Lambda in my spare time. In my main time, I'm here blocking you all from lunch. So let's get right to it. Serverless has been around for a decade now, and when a technology manages to stick around that long, biases build up about what it can and cannot do. 2025 in particular has been a transformational year for Lambda and serverless because we've built a bunch of new capabilities that enable entirely new classes of workloads for you to serve with serverless. Let's get to it. I have just 20 short minutes with you, and I want to cover as much as possible.
Before we get into the launches, a key question is why Lambda should exist in this world at all—a lot of customers still ask us that, believe it or not, even after 10 years. The main thing we aspire to give customers and builders is speed. Speed not just in code generation, which generative AI can take care of for you, but across the whole shipping cycle. Code generation is just one piece; you then have to ship faster. If you ship faster, you can build engaging applications, collect more feedback from your users, and iterate on the application.
We try to make sure that an application is well-architected by default. We do this by enabling or taking care of all of the "-ilities" for you: scalability, availability, reliability, and many more such qualities. The primary benefit is that there is no infrastructure to manage. It reduces a tremendous amount of operations for you, which enables you to get more speed.
I've bucketed the launches into three cohorts. First are the new primitives, which allow entirely new classes of workloads that you could not handle before. We're delighted to introduce Lambda Managed Instances. The mental model here is that customers tell us Lambda and serverless really shine when you have needle-point workloads—when you're scaling your traffic from zero by 700 to 800 times, and when scale to zero is important to you. For those use cases, Lambda was really fantastic.
But when your application achieves a really good level of adoption, scale to zero is no longer important. There's always some traffic, always some users to serve. I'm calling that steady-state traffic. For steady-state traffic, what builders wanted was to optimize performance by choosing instances with more network, memory, or compute. Lambda Managed Instances let you do exactly that: you can choose the instances your Lambda function runs on.
They wanted to still benefit from fully managed operations with the same developer experience. You have all of the same event source integrations, patching, routing, and scaling. We do all of that for you with Lambda Managed Instances. And you can optimize costs. If you know Lambda today, you know it is single-concurrency—one execution environment serves only one request. That was inefficient for some classes of workloads. With this, we're also adding the concept of multi-concurrency, so multiple requests can hit the same execution environment to be serviced.
When you layer in EC2's pricing incentives like Savings Plans and Reserved Instances, your savings really magnify. That's the key benefit here. I do realize that adding servers to serverless is a bit of a mental model shift, so we try to make it as intuitive as possible. The only additional thing you have to do is create a capacity provider. There are APIs for it, and console access obviously. In the capacity provider, you give us your preferences: which instance types, how much memory, and what scaling profile you want. Optionally, you can just let Lambda choose for you, and we'll continuously improve the price performance of your workload right off the bat.
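To make the capacity provider idea concrete, here is a rough, purely illustrative sketch of the kind of preferences you would express. Lambda Managed Instances is newly announced, so the field names below are hypothetical placeholders rather than the actual API; the point is simply that you state instance, placement, and scaling preferences (or leave them out and let Lambda choose for you).

```python
# Hypothetical sketch only: the real Lambda Managed Instances API may use
# different names and operations than the placeholder fields shown here.

# Placement and scaling preferences for steady-state traffic. Omitting the
# details would let Lambda continuously optimize price performance for you.
capacity_provider_preferences = {
    "Name": "steady-state-web-api",            # hypothetical capacity provider name
    "InstancePreferences": {                    # hypothetical field
        "Families": ["c7g", "m7g"],             # preferred EC2 instance families
        "AvailabilityZones": ["us-east-1a"],    # e.g. to satisfy data-locality rules
    },
    "ScalingProfile": "BALANCED",               # hypothetical scaling preference
}

print("Capacity provider preferences:", capacity_provider_preferences)
```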
Thinking a little bit about the use cases: steady-state applications are now seamlessly handled. The parts of your app that need needle-point bursts or needle-point traffic management can stay on Lambda as it works today. The functionality that has reached steady state—real popularity, where there are always users to serve—you can move over to Lambda Managed Instances. Performance-critical apps, the real SLA-bound apps that need specialized instances, are now opened up as well.
The variety of applications opens up: media and data processing, web applications, and event-driven applications—you can migrate a whole lot more to serverless now. And for those of you who struggle with regulatory requirements, you may have preferences like only using compute in local zone X or availability zone Y. With Managed Instances, you can express those placement preferences too.

The other primitive is Lambda's tenant isolation feature. This matters especially if you're a SaaS builder—SaaS is just one example—but really anywhere you need isolation between requests from different tenants that hit the same execution environment. Builders have had to create a bunch of custom tooling and processes, which introduces pain in the CI/CD pipelines and the software shipping cycle. If you use a global variable or store something to the execution environment's disk, you still want to reuse that execution environment for the next tenant, because reuse is what improves your utilization, drives down your costs, and increases your margins. However, you then have to clean up the execution environment between tenants, which reduces your velocity.
Well-Architected by Default: Enhanced Runtimes, Observability, and Event-Driven Pattern Improvements
With tenant isolation, the only additional thing you have to do is pass in a unique identifier for the tenant. Sometimes it's a tenant ID, sometimes it's a JWT token—pass in anything to us, and we guarantee a clean, isolated execution environment for that tenant. It improves your CI/CD cycles and helps you achieve isolation without any custom tooling whatsoever.

Next, we'll switch to how we try to make you well-architected by default. New runtimes are a key part of the value proposition we offer Lambda customers. Runtimes are essentially the programming languages you write your functions in. I started my career as a developer, and I like to use the latest features and technologies. Performance fixes are baked in, security fixes are baked in, so new runtimes are highly in demand. We provide them to you with full management throughout the runtime's lifecycle. Over the last couple of weeks, we've announced Python 3.14, Java 25, and Node.js 24, all available with patching and performance optimizations baked in.
This improves developer productivity: you're able to ship safe software faster, and our goal is that within 90 days of a runtime being available in the community, we make it available for use in Lambda. The other thing we do is upgrade the runtime for you—if, for example, a Log4j-style vulnerability requires a runtime patch, we handle it all. However, the effort to upgrade runtimes yourself is high, particularly if you're one of those customers with hundreds of thousands of functions; upgrading is quite cumbersome. That is where we found one of the great use cases for generative AI: it turns out generative AI is pretty good at paying down the technical debt you accumulate by not upgrading. In the keynote, we announced a new capability called AWS Transform Custom. It provides generative AI-based upgrades that are easy to use and that you can bake into your pipelines and software release cycles seamlessly. In previews, we saw that it reduces your tech debt by up to 85 percent—for up to 85 percent of functions, you could accept the generative AI recommendation as is.
Go ahead and use it if you would like to upgrade your runtimes. Next, SnapStart. Serverless introduced the term "cold start" into the dictionary: it happens when you initialize your function with its dependencies. We announced SnapStart for Java a couple of years back, and we've now announced it for Python and .NET. It takes a snapshot of the entire execution environment during function deployment, and on subsequent invokes we restore that snapshot, so your initialization time is dramatically reduced. Cold start times really benefit, which leads to a lot of happy end users of your applications. There are no code changes required and no custom tooling for you to build—it's just a checkbox you enable.
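As a minimal sketch of how little is involved, here is how enabling SnapStart could look from boto3. The function name is a placeholder, and because SnapStart snapshots are taken for published versions, a version is published after the configuration change.

```python
import boto3

lambda_client = boto3.client("lambda")

# Enable SnapStart on an existing function; snapshots apply to published versions.
lambda_client.update_function_configuration(
    FunctionName="my-python-fn",                     # placeholder function name
    SnapStart={"ApplyOn": "PublishedVersions"},
)

# Wait for the configuration update to finish before publishing.
waiter = lambda_client.get_waiter("function_updated_v2")
waiter.wait(FunctionName="my-python-fn")

# Publishing a version triggers initialization and snapshot creation;
# subsequent invokes of this version resume from the snapshot.
version = lambda_client.publish_version(FunctionName="my-python-fn")
print("Published version:", version["Version"])
```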
We are all good engineers who want to test our applications despite the removal of testers from the software cycle. Testing resilience is hard, but AWS has a service called Fault Injection Service, or FIS. It allows you to specify your conditions for testing, such as increasing latency or making sure downstream pieces like storage or databases are not available for a while. This helps you see how your application reacts to those stress conditions. This integration with FIS helps you do exactly that, and eventually it leads to seeing how your application reacts to real production scenarios, which helps you plan and prevent outages better and have more confidence as a developer.
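As a hedged sketch of what a latency-injection experiment could look like from boto3: the experiment template call is the standard FIS API, but the Lambda action name, its parameters, and the action-target key shown here are recollections and may differ from the documented ones, and all ARNs are placeholders.

```python
import boto3

fis = boto3.client("fis")

# Assumed action and parameter names for the FIS Lambda fault actions;
# verify against the FIS documentation before using.
template = fis.create_experiment_template(
    clientToken="lambda-latency-test-1",
    description="Add 1s of latency to half of the function's invocations",
    roleArn="arn:aws:iam::123456789012:role/fis-experiment-role",   # placeholder
    stopConditions=[{"source": "none"}],
    targets={
        "targetFunction": {
            "resourceType": "aws:lambda:function",
            "resourceArns": [
                "arn:aws:lambda:us-east-1:123456789012:function:my-python-fn"
            ],
            "selectionMode": "ALL",
        }
    },
    actions={
        "addDelay": {
            "actionId": "aws:lambda:invocation-add-delay",   # assumed action name
            "parameters": {
                "duration": "PT1S",                           # assumed parameter names
                "invocationPercentage": "50",
            },
            "targets": {"Functions": "targetFunction"},       # assumed target key
        }
    },
)
print("Experiment template:", template["experimentTemplate"]["id"])
```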
No custom tooling is required in your function. The slide shows the type of support FIS adds: you can test for delays, and the graphs show the state of your application with the injected delay, just as you would see it in production. There is no serverless without event-driven patterns, so we had to do something there too. With serverless, one of the key things developers tell us is that they do not have access to the underlying hardware, so observability is hard and needs to be better, especially as we poll event sources like SQS queues or Kafka triggers on customers' behalf.
Lack of visibility into this polling mechanism was a hindrance because what developers had to do was piece together the metrics that Kafka or SQS event sources provide with what the Lambda function provides. What we did was enable additional CloudWatch metrics. You see the poller count, lag, and throttles right there. This helps you detect your issues instantly, improve your time to resolution, and again, no custom tooling is required. Here is a picture of some of the additional metrics that you now get.
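As an illustration of consuming these metrics, here is a small boto3 sketch that pulls an hourly window of one polling metric. The metric and dimension names are assumptions; adapt them to whatever the Lambda event source mapping metrics are actually called in your account.

```python
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")
now = datetime.now(timezone.utc)

# Assumed metric and dimension names for event source mapping polling metrics.
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/Lambda",
    MetricName="PolledEventCount",                     # assumed metric name
    Dimensions=[{
        "Name": "EventSourceMappingUUID",              # assumed dimension name
        "Value": "11111111-2222-3333-4444-555555555555",
    }],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,
    Statistics=["Sum"],
)

# Print the last hour of datapoints in time order.
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"])
```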
Customers that use Kafka in particular often use the Avro format. Avro supports schema evolution and makes serialization and deserialization easier, but we did not support the Avro format before. The pain was that builders had to manually add boilerplate code to every function that processes their Kafka events. A couple of months back, we added the capability to auto-serialize and deserialize Avro events. This also gives you the ability to specify and evolve your schemas, leading to less code—and with less code, as you know, there are fewer errors. Schema evolution is supported, and again, no custom tooling is required.
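For a sense of what this removes from your code, here is a sketch of a Kafka-triggered handler. It assumes the event source mapping is configured with a schema registry so record values arrive already deserialized (treated here as base64-encoded JSON); no Avro library or schema boilerplate lives in the function itself.

```python
import base64
import json

def handler(event, context):
    """Process records from a Kafka event source mapping.

    Assumes Lambda has already deserialized Avro values via the configured
    schema registry, so each record value decodes to plain JSON.
    """
    processed = 0
    for topic_partition, records in event.get("records", {}).items():
        for record in records:
            payload = json.loads(base64.b64decode(record["value"]))
            print(topic_partition, record["offset"], payload)
            processed += 1
    return {"processed": processed}
```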
Some customers wanted even faster scaling, which required specific control over the concurrency of the polling mechanism. Provisioned Mode for SQS enables them to pre-warm capacity to handle any spike instantly. We released Provisioned Mode for Kafka last year at re:Invent, so this is the follow-up for SQS. Again, it helps eliminate delays as spikes happen, helps you meet SLAs, and optimizes costs—with no custom tooling required.
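A hedged sketch of enabling it on an existing mapping: ProvisionedPollerConfig is the parameter used for Kafka's Provisioned Mode, and the assumption here is that the SQS version follows the same shape; the UUID and poller counts are placeholders.

```python
import boto3

lambda_client = boto3.client("lambda")

# Pre-warm a minimum number of pollers on an existing SQS event source mapping.
# Parameter shape assumed from Kafka Provisioned Mode; verify for SQS.
lambda_client.update_event_source_mapping(
    UUID="11111111-2222-3333-4444-555555555555",   # the mapping's UUID (placeholder)
    ProvisionedPollerConfig={
        "MinimumPollers": 5,    # always-warm pollers to absorb spikes instantly
        "MaximumPollers": 50,   # ceiling to keep costs predictable
    },
)
```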
Accelerating Developer Velocity: From Console to IDE, Local Testing to Remote Debugging
Now, accelerating developers is about helping developers ship software faster, not just generate code faster. When we talk to developers, they say they go to the console just to test a feature and do basic Hello World work, but after that they develop in their local machine's IDE, often VS Code, and then test locally—that is what drives faster, accelerated development cycles. Before deploying to production, they want to test in the cloud. These three inflection points are what we capitalized on and converted into three features that help you accelerate your development cycles.
The first is a seamless console-to-IDE transition. It allows you to build a bare-bones application in the console, and with a single click, we package up all of the dependencies for you and it lights up in your local IDE.
You can start coding immediately in your local IDE without any additional manual work or downloading. Second is local testing. LocalStack is an AWS partner that emulates and simulates a bunch of AWS services. An application is not just the function; it comprises storage, databases, networking, and so on. LocalStack allows you to run all of this locally in your IDE. We're now deeply integrated with LocalStack. It helps you develop and test fully offline on your local machine, leading to faster iterations on your business logic without any additional custom tooling.
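As a small example of what the offline loop looks like, here is a boto3 invocation pointed at LocalStack's default edge endpoint; the function name and dummy credentials are placeholders (LocalStack accepts test credentials).

```python
import json
import boto3

# Point boto3 at LocalStack's edge endpoint (port 4566 by default) so the
# same invocation code runs fully offline against the emulated service.
lambda_client = boto3.client(
    "lambda",
    endpoint_url="http://localhost:4566",
    region_name="us-east-1",
    aws_access_key_id="test",        # dummy credentials for LocalStack
    aws_secret_access_key="test",
)

response = lambda_client.invoke(
    FunctionName="my-python-fn",     # placeholder function deployed to LocalStack
    Payload=json.dumps({"ping": "pong"}).encode(),
)
print(json.loads(response["Payload"].read()))
```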
Finally, at this phase, you're at the place where most developers say, "It works on my machine." You know it sucks to be you if it doesn't work on your machine, but you want to do the right thing and test in the cloud before you deploy, to get a feel for what the real world looks like. Remote debugging enables you to do exactly that with just two clicks. You can set your breakpoints and analyze your variables and source code the way you do locally, while the code runs in the cloud. Additional cases you can cover here are your IAM policies and permissions, your roles, your database connectivity, your VPC and network paths, and so on—without any additional custom tooling.
With the MCP server we released in Q2, our idea was to bake best practices in. Customers tell us that generative AI accelerates coding, but the code is often not production ready—there's not much in the way of input validation, error handling, or status codes. Our MCP server bakes those best practices in so your coding assistant can generate better-quality, more consistent code, which also reduces your code review cycles and accelerates your overall velocity. We have support for web apps—the slide shows how we list and deploy them—and we've now added support for event-driven applications. The MCP server can help you set up your Kafka triggers and complete your event-driven architectures as well.
There are some other serverless sessions, here and elsewhere, if you're interested. Some capabilities, like durable functions, were announced earlier this morning; because that was announced only hours before this session, I didn't have permission to cover it here, but tomorrow's session will cover it in great detail. Please join us. Our roadmap is now on GitHub, and we welcome all participation there. Post any feature requests you have, or any curses you want to throw at us—we might delete those posts, but we will read them at least; to delete, you also have to read. Please engage in the discussions there. We're fairly customer obsessed—deeply customer obsessed, rather—and we'd love your input on what we should build next. If you want to get involved in any private or developer previews, we can have that conversation there.
Thank you very much for joining me today. I know I spoke like a train—it was really twenty minutes compressed, with a bunch of stuff I wanted to include but had to cut. Please join us in the other sessions; we would love to have you there. I know I was holding you back from lunch, so stick around or go get lunch—either way, it was great to connect with you today. Thank you so much.
; This article is entirely auto-generated using Amazon Bedrock.