The serverless landscape just shifted beneath our feet. With the release of AWS Lambda Durable Functions in December 2025, Amazon has introduced a feature that developers have been asking for since the dawn of FaaS (Function as a Service).
In short, Durable Functions allow you to write stateful, long-running workflows entirely within your Lambda code using familiar await syntax. You can pause execution for days, weeks, or even up to a year without paying for idle compute, and pick right back up where you left off with local state intact.
Currently available exclusively in the us-east-2 (Ohio) region, this feature is slated for a wider global rollout in Q2 of 2026. But even with its limited availability, it poses a big question for cloud architects: With the ability to handle retries, waits, and complex state management directly in code, is the reign of AWS Step Functions over?
This article breaks down the new functional overlap, the distinct differences in developer ergonomics and pricing, and where each tool belongs in your 2026 serverless stack.
A Quick History Lesson
To understand why this update is such a big deal, we have to look at history. AWS Lambda and AWS Step Functions have been staples of serverless architectures for years. Lambda provided the raw compute, and Step Functions provided the orchestration and low-code interface.
Step Functions was originally introduced to solve a specific pain point: the infamous "Lambda calling Lambda" anti-pattern. Chaining functions together manually in code was brittle, hard to debug, and surprisingly expensive. Step Functions offered a robust, managed alternative designed to handle long-running, multi-step, transactional workloads. It took the burden of state tracking, error handling, and retry logic out of your code and into a managed service platform.
For years, the separation of responsibilities was clear. If you needed business logic, you wrote a Lambda. If you needed to coordinate that logic with a low-code interface, you built a Step Functions state machine. This approach was particularly empowering for non-programmers and operations teams because Step Functions uses a configuration-based workflow language (ASL, the Amazon States Language) paired with an excellent visual editor. It lets you drag and drop hundreds of zero-code integrations, connecting S3 events to DynamoDB puts to SQS queues, without writing a single line of Python or Node.js.
The Friction of State Machines
However, Step Functions hasn't been perfect. For a long time, it was strictly stateless between steps, meaning you had to pass large JSON payloads from one state to the next just to keep track of context.
Furthermore, the "configuration-first" nature of Step Functions created a barrier for pure software engineering workflows. You cannot easily execute a complex Step Function on your laptop. Debugging often involves deploying to the cloud, running an execution, checking the console logs, tweaking the ASL JSON, and redeploying. It works, but the feedback loop is painfully slow compared to local unit testing.
Where They Now Overlap
Functionally, Lambda Durable Functions and Step Functions now occupy the same neighborhood. If you look at their capabilities on paper, the lines are blurry. Both services now offer:
- Zero Cost for Waiting: Both services allow you to pause execution for up to a year without paying for idle compute time.
- Built-in Resilience: Both can catch errors and automatically retry failed steps or sub-routines with exponential backoff.
- State Persistence: Both "remember" exactly where they left off after a sleep period or an unexpected crash, ensuring workflows complete reliably.
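The retry-with-backoff behavior in the list above can be illustrated with a minimal sketch. This is not how either service implements retries internally. The parameter names (`intervalSeconds`, `backoffRate`, `maxDelaySeconds`) loosely mirror the retrier fields in Amazon States Language, and both `backoffDelays` and `retryWithBackoff` are hypothetical helpers written for this illustration.

```typescript
// Hypothetical helper: delay (in seconds) before each retry.
// Retry n waits intervalSeconds * backoffRate^(n-1), capped at maxDelaySeconds.
function backoffDelays(
  intervalSeconds: number,
  backoffRate: number,
  maxRetries: number,
  maxDelaySeconds: number = Infinity,
): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < maxRetries; retry++) {
    delays.push(Math.min(intervalSeconds * backoffRate ** retry, maxDelaySeconds));
  }
  return delays;
}

// Hypothetical helper: run `op`, sleeping between failed attempts.
// `sleep` is injected so the logic can be exercised without real waiting.
async function retryWithBackoff<T>(
  op: () => Promise<T>,
  delays: number[],
  sleep: (seconds: number) => Promise<void>,
): Promise<T> {
  for (const delay of delays) {
    try {
      return await op();
    } catch {
      await sleep(delay);
    }
  }
  // Final attempt: any error now propagates to the caller.
  return op();
}
```

With an interval of 1 second, a backoff rate of 2, and 4 retries, the waits grow as 1, 2, 4, and 8 seconds. In a managed workflow those waits are the part you would not pay idle compute for.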
If both tools can wait for a month and retry a failed API call, how do you choose?
The Great Divide: Code vs. Config
The real difference lies in ergonomics and the development lifecycle.
AWS Lambda Durable Functions is unapologetically code-first. It brings workflow orchestration inside the function. You use standard programming constructs—if statements, for loops, and try/catch blocks—to manage your flow. If you want to wait for a week, you don't drag a "Wait State" box onto a canvas; you simply write await step.sleep('7 days') in your code.
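To make the code-first style concrete, here is a minimal sketch of a workflow that uses an if statement, a for loop, and try/catch around durable steps. The `DurableStep` interface, `immediateStep`, and `onboardingWorkflow` are assumptions invented for this illustration. They only imitate the `step.sleep('7 days')` style described above and are not the actual AWS SDK surface.

```typescript
// Hypothetical interface, NOT the real AWS SDK: it only imitates the
// `await step.sleep('7 days')` style described in this article.
interface DurableStep {
  run<T>(name: string, fn: () => Promise<T>): Promise<T>;
  sleep(duration: string): Promise<void>;
}

interface User {
  id: string;
  emailVerified: boolean;
}

// The orchestration reads top to bottom like ordinary async code.
async function onboardingWorkflow(step: DurableStep, user: User): Promise<string[]> {
  const log: string[] = [];
  if (!user.emailVerified) {
    log.push("skipped: email not verified");
    return log;
  }
  for (let week = 1; week <= 3; week++) {
    await step.sleep("7 days"); // in a durable runtime, this is the unbilled wait
    try {
      log.push(await step.run(`tip-${week}`, async () => `sent tip ${week} to ${user.id}`));
    } catch {
      log.push(`tip ${week} failed`);
    }
  }
  return log;
}

// In-memory stand-in so the sketch runs anywhere: steps execute
// immediately and sleeps return at once. It persists nothing.
const immediateStep: DurableStep = {
  run<T>(_name: string, fn: () => Promise<T>): Promise<T> {
    return fn();
  },
  async sleep(_duration: string): Promise<void> {},
};
```

Calling `onboardingWorkflow(immediateStep, { id: "u1", emailVerified: true })` resolves to three "sent tip" entries. The same function body would span three weeks of wall-clock time under a runtime that actually persists state between steps.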
This unlocks the holy grail of serverless development: Local Debugging. Because it is just code, you can use standard unit testing frameworks (like Jest or Pytest), mock the durable steps, and step through your entire orchestration logic with breakpoints on your own machine before ever touching the cloud. You can share libraries, rely on ordinary variable scoping, and manage complexity using standard software engineering patterns. You can also run full integration tests from your laptop: with temporary credentials generated from the AWS console, your local code can talk to real AWS services over authenticated, encrypted connections.
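As a rough illustration of mocking the durable steps, the sketch below uses a hand-written test double that records each call and can inject a failure. Everything here is hypothetical: `DurableStep` is an assumed interface shaped after the article's `step.sleep` example rather than the real SDK, and `RecordingStep` and `reminderWorkflow` exist only for this example. In practice you would wrap the checks in Jest or another test runner.

```typescript
// Hypothetical interface shaped after the article's `step.sleep` example,
// NOT the real AWS SDK.
interface DurableStep {
  run<T>(name: string, fn: () => Promise<T>): Promise<T>;
  sleep(duration: string): Promise<void>;
}

// Test double: records every call, never waits, and can fail named steps.
class RecordingStep implements DurableStep {
  readonly calls: string[] = [];
  private readonly failing: Set<string>;

  constructor(failing: string[] = []) {
    this.failing = new Set(failing);
  }

  async run<T>(name: string, fn: () => Promise<T>): Promise<T> {
    this.calls.push(`run:${name}`);
    if (this.failing.has(name)) {
      throw new Error(`injected failure in ${name}`);
    }
    return fn();
  }

  async sleep(duration: string): Promise<void> {
    this.calls.push(`sleep:${duration}`);
  }
}

// Workflow under test: wait a day, send a reminder, escalate if that fails.
async function reminderWorkflow(step: DurableStep, userId: string): Promise<string> {
  await step.sleep("1 day");
  try {
    await step.run("send-reminder", async () => `reminded ${userId}`);
    return "reminded";
  } catch {
    await step.run("escalate", async () => `escalated ${userId}`);
    return "escalated";
  }
}
```

A test can then assert both the result and the exact sequence of steps. For example, `new RecordingStep(["send-reminder"])` forces the escalation path without deploying anything or waiting a real day.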
AWS Step Functions remains configuration-first. Where it shines is in Visual Observability. When a complex business process fails at 3 AM, an on-call engineer (who may not be a coder) can look at the Step Functions graph in the AWS Console, see exactly which green box turned red, inspect the input and output data at that exact moment, and understand the failure.
It also excels at Integration. If your workflow is mostly "gluing" AWS services together—e.g., Take file from S3 -> Transcribe it -> Save results to DynamoDB -> Email user via SES—Step Functions lets you do this with essentially zero code. Doing the same in Durable Functions would require importing the AWS SDK for four different services and writing boilerplate code for every single step.
The Pricing Equation
Finally, let's talk about the wallet, because the billing models are fundamentally different.
- Step Functions (Standard Workflows): You are billed based on State Transitions, at roughly $25 per million transitions ($0.025 per 1,000).
- Lambda Durable Functions: You are billed based on Requests + Compute Duration. You pay for the time your code is actually executing on the CPU.
Let's look at a basic example: A workflow that loops 10,000 times to process small items.
- In Step Functions: You pay for every single step in that loop. A loop with just 3 states (Task, Choice, Wait) running 10,000 times equals 30,000 state transitions. That bill can climb surprisingly fast for high-throughput looping.
- In Lambda Durable Functions: You pay for the aggregate compute time it takes to run the loop's logic. Since modern Lambda runtimes pause and resume incredibly efficiently, iterating through a loop in memory with short bursts of CPU activity is virtually free compared to the cost of 30,000 discrete state transitions.
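The Step Functions side of this comparison is simple arithmetic, sketched below using the approximate $25-per-million rate quoted above. The helper `loopTransitionCost` is a hypothetical name, and the calculation deliberately ignores any free tier and the transitions outside the loop. The Lambda side is left out because it depends on memory size, duration, and request counts that the example does not specify.

```typescript
// Approximate Standard Workflows rate quoted in this article.
const USD_PER_MILLION_TRANSITIONS = 25;

// Hypothetical helper: cost of a loop where every state in the loop
// body counts as one billed transition per iteration.
function loopTransitionCost(
  statesPerIteration: number,
  iterations: number,
): { transitions: number; usd: number } {
  const transitions = statesPerIteration * iterations;
  const usd = (transitions * USD_PER_MILLION_TRANSITIONS) / 1_000_000;
  return { transitions, usd };
}

// The article's example: 3 states (Task, Choice, Wait) looping 10,000 times.
const example = loopTransitionCost(3, 10_000);
console.log(example); // { transitions: 30000, usd: 0.75 }
```

A single run costs about $0.75 in transitions, which looks small until the workflow runs thousands of times a day. That multiplication is where the bill climbs.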
Conversely, for a workflow that consists solely of "Wait 6 months, then send one email," both options are cheap. Step Functions bills only a handful of transitions, and Durable Functions bills no compute during the wait, though it involves slightly more management overhead.
Conclusion
So, does AWS Lambda Durable Functions replace Step Functions?
If you are a developer: Yes, mostly.
If you live in your IDE and value unit testing, fast local debugging cycles, and managing complex logic through code constructs rather than JSON configuration, Durable Functions is the tool you have been waiting for. It removes the context-switching tax of managing ASL files and lets you build complex, durable backends using the languages you already love.
If you are an architect or ops-focused: No.
Step Functions remains the king of high-level infrastructure orchestration. Its superior visual observability, hundreds of zero-code integrations, and ability to be easily audited by non-technical stakeholders make it indispensable for "macro" workflows that tie together disparate AWS services.
The sheer number of AWS services can be overwhelming, but much of that comes from how software naturally evolves and the need to keep old things working, which businesses definitely appreciate. Personally, I'll be choosing Durable Functions over Step Functions in most cases going forward because I'm a developer first.