At the re:Invent 2025 keynote in Las Vegas, AWS launched Durable Functions for AWS Lambda. For many serverless developers, this marked a major shift.
For the first time, AWS introduced a code-first approach to building long-running, stateful workflows directly inside Lambda, without requiring developers to define and manage Step Functions state machines.
As serverless applications grow more complex, workflows often need to:
- Pause for minutes or hours
- Wait for external events
- Retry safely after failures
- Resume without losing state
Until December 2025, AWS Step Functions were the primary solution. They are powerful and reliable, but they rely on JSON-based workflow definitions that feel disconnected from application code.
AWS Durable Functions change that model entirely.
This article explains:
- What AWS Durable Functions are
- How they work under the hood
- How they compare to Step Functions
- A full working example (handler + CDK)
- Real execution screenshots
- IAM and deployment configuration explained
What Are AWS Durable Functions?
AWS Durable Functions allow you to write long-running, stateful workflows directly inside a Lambda function using standard code (Node.js and Python for now).
Instead of defining workflows using Amazon States Language (ASL), you write normal async logic using a Durable Functions SDK.
Durable Functions are called durable because they:
- Persist execution state automatically
- Resume from the last completed step
- Support long waits and external events
- Survive Lambda restarts and failures
Think of it like writing a normal function, except AWS guarantees it will never forget where it left off.
Durable Execution SDK: Core APIs
The Durable Execution SDK adds orchestration primitives to Lambda:
| API | Purpose |
|---|---|
| step(name, fn) | Executes business logic with built-in retries and automatic checkpointing. |
| wait(name, duration) | Suspends execution for a specified duration (up to 1 year) without compute charges. |
| waitForCallback(name) | Pauses execution until an external event or human approval signal is received. |
| createCallback() | Creates a callback that external systems can complete. |
| waitForCondition(fn) | Waits for a condition to be met by periodically checking state. |
| parallel(tasks) | Executes multiple branches with durable operations in parallel, with optional concurrency control. |
| invoke(name, payload) | Invokes another durable or non-durable function with the specified input. |
| runInChildContext(name, fn) | Runs a function in a child context with isolated state and execution tracking. |
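To make the "optional concurrency control" mentioned for parallel(tasks) more concrete, here is a minimal sketch of running tasks with a cap on how many are in flight at once. The helper name runWithConcurrency and its signature are assumptions made for illustration; it is not the SDK's implementation, and it leaves out the checkpointing that durable branches get.

```typescript
// Hypothetical helper for illustration; not part of the Durable Execution SDK.
type Task<T> = () => Promise<T>;

async function runWithConcurrency<T>(
  tasks: Task<T>[],
  maxConcurrency: number,
): Promise<T[]> {
  if (!Number.isInteger(maxConcurrency) || maxConcurrency < 1) {
    throw new RangeError("maxConcurrency must be a positive integer");
  }
  const results = new Array<T>(tasks.length);
  let nextIndex = 0;

  // Each worker claims the next unstarted task until none remain.
  const worker = async (): Promise<void> => {
    while (nextIndex < tasks.length) {
      const index = nextIndex++;
      results[index] = await tasks[index]();
    }
  };

  const workerCount = Math.min(maxConcurrency, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // same order as the input tasks
}
```

Results come back in input order regardless of which task finishes first, which keeps downstream logic predictable.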
How Durable Functions Work: The Replay Model
Durable Functions rely on a replay model.
Here’s what happens when your function pauses:
- Your Lambda runs and reaches a wait()
- AWS checkpoints the execution state
- The Lambda invocation ends
- Later, AWS replays the function from the beginning
- Completed steps are skipped
- Execution resumes from the pause point
This allows workflows to pause for minutes, hours, or days without consuming Lambda runtime.
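The sequence above can be sketched with a small in-memory model. Everything here (ReplayContext, SuspendSignal, invoke, the checkpoint Map) is a hypothetical stand-in written for this article, not the real SDK or the Lambda service; it only illustrates how completed steps are skipped when the function runs again from the top.

```typescript
// Hypothetical in-memory model of checkpoint and replay; not the real SDK.
type Outcome<T> =
  | { status: "SUSPENDED" }
  | { status: "SUCCEEDED"; result: T };

// Thrown to end the current invocation when a wait has not completed yet.
class SuspendSignal {}

class ReplayContext {
  // The checkpoint map outlives a single invocation, like durable state.
  private readonly checkpoints: Map<string, unknown>;

  constructor(checkpoints: Map<string, unknown>) {
    this.checkpoints = checkpoints;
  }

  async step<T>(name: string, fn: () => Promise<T>): Promise<T> {
    if (this.checkpoints.has(name)) {
      return this.checkpoints.get(name) as T; // replay: skip the body
    }
    const result = await fn();
    this.checkpoints.set(name, result); // checkpoint the completed step
    return result;
  }

  async wait(name: string): Promise<void> {
    if (this.checkpoints.has(name)) return; // wait already completed
    // Simplification: mark the wait as done so the next invocation passes it.
    this.checkpoints.set(name, true);
    throw new SuspendSignal();
  }
}

async function invoke<T>(
  workflow: (ctx: ReplayContext) => Promise<T>,
  checkpoints: Map<string, unknown>,
): Promise<Outcome<T>> {
  try {
    const result = await workflow(new ReplayContext(checkpoints));
    return { status: "SUCCEEDED", result };
  } catch (err) {
    if (err instanceof SuspendSignal) return { status: "SUSPENDED" };
    throw err;
  }
}

// Example workflow: charge, pause, then notify.
let chargeCalls = 0;
const orderWorkflow = async (ctx: ReplayContext): Promise<string> => {
  const receipt = await ctx.step("charge", async () => {
    chargeCalls++;
    return "receipt-1";
  });
  await ctx.wait("cool-off");
  return ctx.step("notify", async () => `notified for ${receipt}`);
};
```

Calling invoke(orderWorkflow, store) twice with the same Map suspends on the first call and succeeds on the second, while the body of the "charge" step runs only once.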
Pro‑Tip: Determinism Is Critical
Because of replay, orchestration code must be deterministic:
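- No Math.random() during orchestration
- No reading current time directly
- No external API calls in orchestration logic
Put non-deterministic work inside a step instead, so its result is checkpointed once and reused on every replay.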
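A minimal sketch of why this matters, assuming a simplified synchronous stand-in for a checkpointing step (checkpointedStep and checkpointStore are hypothetical names, not SDK APIs). The unsafe version draws a new random value every time the orchestration code runs, while the safe version stores the first draw and returns it on later runs.

```typescript
// Hypothetical stand-in for a checkpointing step; not the real SDK.
const checkpointStore = new Map<string, unknown>();

function checkpointedStep<T>(name: string, fn: () => T): T {
  if (checkpointStore.has(name)) {
    return checkpointStore.get(name) as T; // replay: reuse the stored result
  }
  const value = fn();
  checkpointStore.set(name, value);
  return value;
}

// Unsafe: the random draw happens in orchestration code, so every replay
// may pick a different branch.
function pickVariantUnsafe(): "A" | "B" {
  return Math.random() < 0.5 ? "A" : "B";
}

// Safe: the draw happens inside a step, so replays see the first result.
function pickVariantSafe(): "A" | "B" {
  return checkpointedStep<"A" | "B">("pick-variant", () =>
    Math.random() < 0.5 ? "A" : "B",
  );
}
```

The same reasoning applies to reading the clock or calling an external API: the call itself may vary, but the checkpointed result does not.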
Cold Starts & Replay Performance
At first glance, replay sounds expensive, but AWS optimizes heavily:
- Checkpoint state is cached in the execution environment
- Previously completed steps are skipped, not re-executed
- Cold starts only replay orchestration logic, not step bodies
In practice, replay overhead is typically a matter of milliseconds, even for multi-step workflows.
What Are AWS Step Functions?
AWS Step Functions is a fully managed workflow service where logic is defined using Amazon States Language (ASL).
They provide:
- Visual workflow diagrams
- Built-in retries and error handling
- Native integration with 200+ AWS services
Step Functions are excellent for service-heavy orchestration, but they are configuration-first, not code-first.
Durable Functions vs Step Functions: Limits & Constraints
This is the key question readers ask: “Why would I still use Step Functions?”
Durable Functions: Constraints
- State Size Limit: Durable execution checkpoint data is currently limited (≈256 KB). Large payloads should be stored externally (S3, DynamoDB).
- Visibility: Durable Functions rely on logs and execution history, not a visual graph UI.
- Extremely Long Sequences: Step Functions handle massive branching workflows (thousands of states) more efficiently than deep replay chains.
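One way to work within the state size limit is to return a small reference from a step and keep the large payload elsewhere. The sketch below assumes the approximate 256 KB figure quoted above and uses an in-memory Map as a stand-in for S3 or DynamoDB; toStepOutput, fromStepOutput, and externalStore are hypothetical names, not SDK APIs.

```typescript
// Hypothetical names throughout; the Map stands in for S3 or DynamoDB.
const MAX_INLINE_BYTES = 256 * 1024; // assumed limit, from the approximate figure above

type StepOutput<T> =
  | { kind: "inline"; value: T }
  | { kind: "reference"; key: string };

const externalStore = new Map<string, string>();

// Keep small values inline; move large ones to the external store.
function toStepOutput<T>(
  key: string,
  value: T,
  maxBytes: number = MAX_INLINE_BYTES,
): StepOutput<T> {
  const json = JSON.stringify(value);
  const bytes = new TextEncoder().encode(json).length;
  if (bytes <= maxBytes) return { kind: "inline", value };
  externalStore.set(key, json);
  return { kind: "reference", key };
}

// Resolve either form back into the original value.
function fromStepOutput<T>(output: StepOutput<T>): T {
  if (output.kind === "inline") return output.value;
  const json = externalStore.get(output.key);
  if (json === undefined) {
    throw new Error(`Missing external payload: ${output.key}`);
  }
  return JSON.parse(json) as T;
}
```

In a real workflow you would likely set the threshold below the actual limit to leave headroom for the rest of the checkpoint data.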
Step Functions: Tradeoffs
- JSON-heavy definitions
- Harder to unit test
- Workflow logic split from application code
Both tools remain valuable; it’s about choosing the right abstraction.
Testing Durable Functions (Huge Advantage)
One of the biggest benefits of Durable Functions is their testability.
Because workflows are just code, you can:
- Unit test orchestration logic
- Mock DurableContext
- Use standard tools like Jest or Vitest
```typescript
import { LocalDurableTestRunner } from "@aws/durable-execution-sdk-js-testing";
import { handler } from "../src/handlers/cartReminder";

describe("cartReminder handler", () => {
  beforeAll(async () => {
    // skipTime lets the 24-hour wait complete immediately in tests
    await LocalDurableTestRunner.setupTestEnvironment({ skipTime: true });
  });

  afterAll(async () => {
    await LocalDurableTestRunner.teardownTestEnvironment();
  });

  it("sends a reminder after the wait when the cart is not checked out", async () => {
    const runner = new LocalDurableTestRunner({
      handlerFunction: handler,
    });

    const execution = await runner.run({
      payload: {
        userId: "user-123",
        cartId: "cart-456",
        email: "user@example.com",
      },
    });

    expect(execution.getStatus()).toBe("SUCCEEDED");
    expect(execution.getResult()).toMatchObject({
      userId: "user-123",
      cartId: "cart-456",
      email: "user@example.com",
      reminderSent: true,
      timestamp: expect.any(String),
    });
  });
});
```
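Mocking the context by hand is another option when you want to test orchestration logic without any runner. The sketch below assumes a deliberately tiny interface, MiniContext, which is hypothetical and covers only step and wait; the real DurableContext has more members and may differ in signatures. The function remindIfAbandoned and the fake created by createFakeContext are likewise illustrative names.

```typescript
// Hypothetical minimal slice of a durable context; the real DurableContext
// has more members and may differ in detail.
interface MiniContext {
  step<T>(name: string, fn: () => Promise<T>): Promise<T>;
  wait(name: string, duration: { hours: number }): Promise<void>;
}

// Orchestration logic under test, written against the minimal interface.
async function remindIfAbandoned(
  ctx: MiniContext,
  isCheckedOut: () => Promise<boolean>,
): Promise<boolean> {
  await ctx.wait("wait-before-reminder", { hours: 24 });
  const checkedOut = await ctx.step("check-cart", isCheckedOut);
  if (checkedOut) return false;
  await ctx.step("send-reminder", async () => undefined);
  return true;
}

// Fake that runs steps immediately, skips waits, and records each call.
function createFakeContext(): MiniContext & { calls: string[] } {
  const calls: string[] = [];
  return {
    calls,
    async step<T>(name: string, fn: () => Promise<T>): Promise<T> {
      calls.push(`step:${name}`);
      return fn();
    },
    async wait(name: string, duration: { hours: number }): Promise<void> {
      calls.push(`wait:${name}:${duration.hours}h`);
    },
  };
}
```

Inside a Jest or Vitest test, you would call remindIfAbandoned with a fake context and assert on both the return value and the recorded calls array.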
Full Working Example: Cart Reminder Workflow
This Durable Function:
- Receives an event when a user adds an item to their cart
- Waits 24 hours
- Sends a reminder if the cart hasn't been checked out
- Returns a summary of the action taken
Durable Function Handler (TypeScript)
```typescript
import {
  DurableExecutionHandler,
  withDurableExecution,
} from "@aws/durable-execution-sdk-js";

type CartReminderInput = {
  userId: string;
  cartId: string;
  email: string;
};

type CartReminderResult = {
  cartId: string;
  userId: string;
  email: string; // returned below and asserted in the test
  reminderSent: boolean;
  timestamp: string;
};

export const makeHandler = () => {
  const durableHandler: DurableExecutionHandler<
    CartReminderInput,
    CartReminderResult
  > = async (event, context) => {
    const { cartId, userId, email } = event;

    // Read the clock inside the step: orchestration code must stay deterministic
    await context.step("cart-added", async () => {
      const startTime = new Date().toISOString();
      console.log(`User ${userId} added cart ${cartId} at ${startTime}`);
    });

    // Wait 24 hours before checking cart status
    await context.wait("wait-before-reminder", { hours: 24 });

    // Capture the check time in a step so it is checkpointed and replay-safe
    const timestamp = await context.step("record-check-time", async () =>
      new Date().toISOString()
    );

    // Check if cart was already checked out (mocked logic)
    const cartCheckedOut = false; // Simulate lookup; a real lookup belongs in a step

    if (!cartCheckedOut) {
      await context.step("send-reminder", async (stepContext) => {
        stepContext.logger.info("Sending cart reminder", { userId, cartId, email });
      });
      return { cartId, userId, email, reminderSent: true, timestamp };
    }

    return { cartId, userId, email, reminderSent: false, timestamp };
  };

  return withDurableExecution(durableHandler);
};

// Named export used by the test import and by the CDK `handler: "handler"` setting
export const handler = makeHandler();
```
Deploying with AWS CDK
You can deploy the Durable Function using AWS CDK with just a few steps.
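```typescript
import { Duration } from "aws-cdk-lib";
import * as iam from "aws-cdk-lib/aws-iam";
import * as lambda from "aws-cdk-lib/aws-lambda";
import { NodejsFunction } from "aws-cdk-lib/aws-lambda-nodejs";

// The statements below go inside a Stack constructor, where `this` is the stack.
const customRole = new iam.Role(this, "CustomRole", {
  assumedBy: new iam.ServicePrincipal("lambda.amazonaws.com"),
});

const cartReminderFn = new NodejsFunction(this, "CartReminderFn", {
  runtime: lambda.Runtime.NODEJS_LATEST,
  entry: "../src/handlers/cartReminder/index.ts",
  handler: "handler",
  durableConfig: {
    executionTimeout: Duration.hours(25), // covers the 24-hour wait plus headroom
    retentionPeriod: Duration.days(30),
  },
  role: customRole,
});

// Allow the function to checkpoint and read back its own durable state
customRole.attachInlinePolicy(
  new iam.Policy(this, "DurablePolicy", {
    statements: [
      new iam.PolicyStatement({
        actions: [
          "lambda:CheckpointDurableExecution", // Save progress
          "lambda:GetDurableExecutionState", // Resume function state
        ],
        resources: [`${cartReminderFn.functionArn}:*`],
      }),
    ],
  })
);
```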
- executionTimeout: The amount of time that Lambda allows a durable function to run before stopping it, between 1 second and 366 days. If exceeded, the function will fail.
- retentionPeriod: The duration for which AWS retains the execution history and state after the workflow completes. Useful for logs, audits, or manual retries. It must be between 1 and 90 days.
For this article, I set it to 1 minute
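The ranges quoted above can be checked before deployment with a small helper. This is a sketch under the assumption that those ranges are accurate; DurableSettings and validateDurableSettings are hypothetical names, not part of the CDK or the SDK.

```typescript
// Hypothetical pre-deployment check; ranges are the ones quoted above.
const SECONDS_PER_DAY = 24 * 60 * 60;

interface DurableSettings {
  executionTimeoutSeconds: number;
  retentionPeriodDays: number;
}

// Returns a list of problems; an empty list means the settings are in range.
function validateDurableSettings(settings: DurableSettings): string[] {
  const problems: string[] = [];
  const { executionTimeoutSeconds, retentionPeriodDays } = settings;

  if (executionTimeoutSeconds < 1 || executionTimeoutSeconds > 366 * SECONDS_PER_DAY) {
    problems.push("executionTimeout must be between 1 second and 366 days");
  }
  if (retentionPeriodDays < 1 || retentionPeriodDays > 90) {
    problems.push("retentionPeriod must be between 1 and 90 days");
  }
  return problems;
}
```

For the values used in the CDK snippet (25 hours and 30 days), the helper returns an empty list.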
Test execution
[Screenshots: durable configuration and executions, durable operations, and event history]
When Should You Use Each?
Use Durable Functions when:
- You prefer writing workflows in code
- Logic is Lambda-based
- You want clean unit tests and no extra state machines
Use Step Functions when:
- You need visual debugging and monitoring
- Your workflow spans many AWS services
- You prefer declarative configuration (JSON/ASL)
Final Thoughts
The 2025 launch of AWS Durable Functions gives developers a new, elegant way to build workflows directly inside Lambda. No state machines. No JSON. Just code.
If you're building serverless apps and prefer async/await to YAML and JSON, Durable Functions are made for you.



