DEV Community

AWS Durable Functions vs Step Functions: Is Code-First Orchestration the New Standard?

At the re:Invent 2025 keynote in Las Vegas, AWS launched Durable Functions for AWS Lambda. For many serverless developers, this marked a major shift.

For the first time, AWS introduced a code-first approach to building long-running, stateful workflows directly inside Lambda, without requiring developers to define and manage Step Functions state machines.

As serverless applications grow more complex, workflows often need to:

  • Pause for minutes or hours
  • Wait for external events
  • Retry safely after failures
  • Resume without losing state

Until December 2025, AWS Step Functions were the primary solution. They are powerful and reliable, but they rely on JSON-based workflow definitions that feel disconnected from application code.

AWS Durable Functions change that model entirely.

This article explains:

  • What AWS Durable Functions are
  • How they work under the hood
  • How they compare to Step Functions
  • A full working example (handler + CDK)
  • Real execution screenshots
  • IAM and deployment configuration explained

What Are AWS Durable Functions?

AWS Durable Functions allow you to write long-running, stateful workflows directly inside a Lambda function using standard code (Node.js and Python for now).

Instead of defining workflows using Amazon States Language (ASL), you write normal async logic using a Durable Functions SDK.

Durable Functions are called durable because they:

  • Persist execution state automatically
  • Resume from the last completed step
  • Support long waits and external events
  • Survive Lambda restarts and failures

Think of it like writing a normal function, except AWS guarantees it will never forget where it left off.

Durable Function SDK: Core APIs

The Durable Execution SDK adds orchestration primitives to Lambda:

API                          Purpose
step(name, fn)               Executes business logic with built-in retries and automatic checkpointing.
wait(name, duration)         Suspends execution for a specified duration (up to 1 year) without compute charges.
waitForCallback(name)        Pauses execution until an external event or human approval signal is received.
createCallback()             Creates a callback that external systems can complete.
waitForCondition(fn)         Waits for a condition to be met by periodically checking state.
parallel(tasks)              Executes multiple branches of durable operations in parallel, with optional concurrency control.
invoke(name, payload)        Invokes another durable or non-durable function with the specified input.
runInChildContext(name, fn)  Runs a function in a child context with isolated state and execution tracking.

How Durable Functions Work: The Replay Model

Durable Functions rely on a replay model.

Here’s what happens when your function pauses:

  1. Your Lambda runs and reaches a wait()
  2. AWS checkpoints the execution state
  3. The Lambda invocation ends
  4. Later, AWS replays the function from the beginning
  5. Completed steps are skipped
  6. Execution resumes from the pause point

This allows workflows to pause for minutes, hours, or days without consuming Lambda runtime.
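The six steps above can be illustrated with a toy, synchronous model of checkpoint-and-replay. This is only a sketch of the idea, not the AWS SDK: `ToyContext`, `Suspend`, and `invoke` are hypothetical names, the checkpoint store is a plain object, and the clock is passed in by hand.

```typescript
// Toy model of checkpoint-and-replay. NOT the AWS SDK: every name here is hypothetical.
class Suspend {
  waitName: string;
  constructor(waitName: string) {
    this.waitName = waitName;
  }
}

class ToyContext {
  checkpoints: Record<string, unknown>;
  now: number; // pretend the platform supplies the current time for each invocation

  constructor(checkpoints: Record<string, unknown>, now: number) {
    this.checkpoints = checkpoints;
    this.now = now;
  }

  // Run fn once; on replay, return the stored result instead of re-running it.
  step<T>(name: string, fn: () => T): T {
    if (name in this.checkpoints) return this.checkpoints[name] as T;
    const result = fn();
    this.checkpoints[name] = result;
    return result;
  }

  // First visit: record a wake-up time and end the invocation.
  // On replay: continue only if the wake-up time has passed.
  wait(name: string, ms: number): void {
    const key = `wait:${name}`;
    if (!(key in this.checkpoints)) this.checkpoints[key] = this.now + ms;
    if (this.now < (this.checkpoints[key] as number)) throw new Suspend(name);
  }
}

const sideEffects: string[] = [];

function orchestrate(ctx: ToyContext): string {
  ctx.step("cart-added", () => {
    sideEffects.push("logged cart");
    return true;
  });
  ctx.wait("wait-before-reminder", 24 * 60 * 60 * 1000);
  return ctx.step("send-reminder", () => {
    sideEffects.push("sent reminder");
    return "reminder sent";
  });
}

// One invocation: run from the top, and report a suspension instead of failing.
function invoke(checkpoints: Record<string, unknown>, now: number): string {
  try {
    return orchestrate(new ToyContext(checkpoints, now));
  } catch (e) {
    if (e instanceof Suspend) return "SUSPENDED";
    throw e;
  }
}

const checkpoints: Record<string, unknown> = {};
const first = invoke(checkpoints, 0); // runs the first step, then suspends at the wait
const second = invoke(checkpoints, 25 * 60 * 60 * 1000); // replays from the top, skips the first step
```

The second invocation starts at the top of `orchestrate`, but the first step body does not run again because its result is already in the checkpoint store.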

Pro‑Tip: Determinism Is Critical

Because of replay, orchestration code must be deterministic:

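  • No Math.random() in orchestration code; generate random values inside a step
  • No reading the current time directly (for example, new Date()); capture it inside a step
  • No external API calls in orchestration logic; wrap them in step() so the result is checkpointed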
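To see why this matters, here is a minimal sketch that compares a value read directly in orchestration code with the same value read inside a step. The `step` function and `fakeNow` clock are hypothetical stand-ins written for this example, not SDK APIs.

```typescript
// Hypothetical stand-ins, not SDK APIs: a fake clock and a minimal checkpointing step().
let tick = 0;
const fakeNow = (): number => ++tick; // returns a new value on every call, like Date.now()

const checkpoints: Record<string, unknown> = {};

function step<T>(name: string, fn: () => T): T {
  if (!(name in checkpoints)) checkpoints[name] = fn();
  return checkpoints[name] as T;
}

// One "invocation" of the orchestration body. A replay simply calls it again.
function orchestrate(): { unsafe: number; safe: number } {
  const unsafe = fakeNow(); // re-evaluated on every replay
  const safe = step("read-clock", fakeNow); // evaluated once, then served from the checkpoint
  return { unsafe, safe };
}

const firstRun = orchestrate();
const replay = orchestrate();
```

In this sketch `firstRun.safe` and `replay.safe` are equal, while the `unsafe` values differ. Any branch that depended on `unsafe` could take a different path on replay.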

Cold Starts & Replay Performance

At first glance, replay sounds expensive, but AWS optimizes heavily:

  • Checkpoint state is cached in the execution environment

  • Previously completed steps are skipped, not re-executed

  • Cold starts only replay orchestration logic, not step bodies

In practice, replay overhead is typically on the order of milliseconds, even for multi-step workflows.

What Are AWS Step Functions?

AWS Step Functions are a fully managed workflow service where logic is defined using Amazon States Language (ASL).

They provide:

  • Visual workflow diagrams
  • Built-in retries and error handling
  • Native integration with 200+ AWS services

Step Functions are excellent for service-heavy orchestration, but they are configuration-first, not code-first.
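For contrast with the code-first style, here is roughly what a "wait 24 hours, then send a reminder" workflow looks like in ASL. It is written as a TypeScript object so it can be serialized to the JSON that Step Functions expects. The state names and the function name `SendCartReminder` are placeholders invented for this sketch.

```typescript
// A minimal ASL definition as a TypeScript object.
// State names and "SendCartReminder" are placeholders, not from a real deployment.
const cartReminderStateMachine = {
  Comment: "Wait 24 hours, then invoke a reminder Lambda",
  StartAt: "WaitBeforeReminder",
  States: {
    WaitBeforeReminder: {
      Type: "Wait",
      Seconds: 24 * 60 * 60,
      Next: "SendReminder",
    },
    SendReminder: {
      Type: "Task",
      Resource: "arn:aws:states:::lambda:invoke",
      Parameters: { FunctionName: "SendCartReminder", "Payload.$": "$" },
      Retry: [
        {
          ErrorEquals: ["States.TaskFailed"],
          IntervalSeconds: 2,
          MaxAttempts: 3,
          BackoffRate: 2,
        },
      ],
      End: true,
    },
  },
};

// The JSON document you would supply as the state machine definition.
const definitionJson = JSON.stringify(cartReminderStateMachine, null, 2);
```

The control flow lives in `Next` and `End` fields rather than in the order of statements, which is the main difference from the code-first style.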

Durable Functions vs Step Functions: Limits & Constraints

This is the key question readers ask: “Why would I still use Step Functions?”

Durable Functions: Constraints

State Size Limit: Durable execution checkpoint data is currently limited (≈256 KB). Large payloads should be stored externally (S3, DynamoDB).

Visibility: Durable Functions rely on logs and execution history, not a visual graph UI.

Extremely Long Sequences: Step Functions handle massive branching workflows (thousands of states) more efficiently than deep replay chains.
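One common way to work within the state size limit is to checkpoint a small reference and keep the large payload elsewhere. The sketch below assumes the approximate 256 KB figure quoted above and uses an in-memory object as a stand-in for S3 or DynamoDB. All names in it are hypothetical.

```typescript
// Hypothetical sketch: an in-memory object stands in for S3 or DynamoDB.
const LIMIT_BYTES = 256 * 1024; // approximate limit quoted in the article; treat as an assumption

const externalStore: Record<string, string> = {};

type Checkpointable =
  | { kind: "inline"; value: string }
  | { kind: "ref"; key: string; bytes: number };

// UTF-8 size of a well-formed string, without relying on Node-only APIs.
function utf8Bytes(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) bytes += 1;
    else if (c < 0x800) bytes += 2;
    else if (c >= 0xd800 && c <= 0xdbff) {
      bytes += 4; // surrogate pair encodes one 4-byte code point
      i++;
    } else bytes += 3;
  }
  return bytes;
}

// Return the payload inline when it is small, otherwise store it and return a reference.
function toCheckpointable(key: string, payload: unknown): Checkpointable {
  const json = JSON.stringify(payload);
  const bytes = utf8Bytes(json);
  if (bytes <= LIMIT_BYTES) return { kind: "inline", value: json };
  externalStore[key] = json; // in real code: an upload to S3 or a DynamoDB write
  return { kind: "ref", key, bytes };
}

// Resolve either form back to the original payload.
function fromCheckpointable(c: Checkpointable): unknown {
  return JSON.parse(c.kind === "inline" ? c.value : externalStore[c.key]);
}

const small = toCheckpointable("orders/1", { cartId: "cart-456" });
const large = toCheckpointable("orders/2", { blob: "x".repeat(300 * 1024) });
```

In a real workflow, both the upload and the later download would happen inside steps, so only the small reference ends up in the checkpoint.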

Step Functions: Tradeoffs

  • JSON‑heavy definitions

  • Harder to unit test

  • Workflow logic split from application code

Both tools remain valuable; it’s about choosing the right abstraction.

Testing Durable Functions (Huge Advantage)

One of the biggest benefits of Durable Functions is their testability.

Because workflows are just code, you can:

  • Unit test orchestration logic

  • Mock DurableContext

  • Use standard tools like Jest or Vitest

import { LocalDurableTestRunner } from "@aws/durable-execution-sdk-js-testing";
import { handler } from "../src/handlers/cartReminder";

describe("cartReminder handler", () => {
  beforeAll(async () => {
    await LocalDurableTestRunner.setupTestEnvironment({ skipTime: true });
  });

  afterAll(async () => {
    await LocalDurableTestRunner.teardownTestEnvironment();
  });

  it("sends a reminder after the wait when the cart is not checked out", async () => {
    const runner = new LocalDurableTestRunner({
      handlerFunction: handler,
    });

    const execution = await runner.run({
      payload: {
        userId: "user-123",
        cartId: "cart-456",
        email: "user@example.com",
      },
    });

    expect(execution.getStatus()).toBe("SUCCEEDED");
    expect(execution.getResult()).toMatchObject({
      userId: "user-123",
      cartId: "cart-456",
      email: "user@example.com",
      reminderSent: true,
      timestamp: expect.any(String),
    });
  });
});
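The test above uses the SDK's local runner. If you would rather mock the context by hand, one possible approach is to write the orchestration against a narrow interface and inject a fake. In the sketch below, `MiniContext`, `cartReminder`, and `makeFakeContext` are hypothetical names, and the interface covers only a small, simplified subset of what the real DurableContext offers.

```typescript
// Hypothetical minimal interface: only the two operations this workflow uses.
// The real DurableContext has more methods and richer signatures.
interface MiniContext {
  step<T>(name: string, fn: () => Promise<T>): Promise<T>;
  wait(name: string, duration: { hours: number }): Promise<void>;
}

// Orchestration written against the narrow interface so a fake can be injected.
async function cartReminder(ctx: MiniContext, cartCheckedOut: boolean): Promise<boolean> {
  await ctx.wait("wait-before-reminder", { hours: 24 });
  if (cartCheckedOut) return false;
  await ctx.step("send-reminder", async () => undefined);
  return true;
}

// Fake that runs steps immediately, skips waits, and records what was called.
function makeFakeContext(): { ctx: MiniContext; calls: string[] } {
  const calls: string[] = [];
  const ctx: MiniContext = {
    step: async (name, fn) => {
      calls.push(`step:${name}`);
      return fn();
    },
    wait: async (name, duration) => {
      calls.push(`wait:${name}:${duration.hours}h`);
    },
  };
  return { ctx, calls };
}

const { ctx, calls } = makeFakeContext();
const done = cartReminder(ctx, false).then((sent) => ({ sent, calls }));
```

A fake like this checks the order of operations quickly, but it does not exercise replay, so it complements the local runner rather than replacing it.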


Full Working Example: Cart Reminder Workflow

This Durable Function:

  1. Receives an event when a user adds an item to their cart
  2. Waits 24 hours
  3. Sends a reminder if the cart hasn't been checked out
  4. Returns a summary of the action taken

Durable Function Handler (TypeScript)

import {
  DurableExecutionHandler,
  withDurableExecution,
} from "@aws/durable-execution-sdk-js";

type CartReminderInput = {
  userId: string;
  cartId: string;
  email: string;
};

type CartReminderResult = {
  cartId: string;
  userId: string;
  email: string;
  reminderSent: boolean;
  timestamp: string;
};

export const makeHandler = () => {
  const durableHandler: DurableExecutionHandler<
    CartReminderInput,
    CartReminderResult
  > = async (event, context) => {
    const { cartId, userId, email } = event;

    // Read the clock inside the step: step bodies run once and are skipped
    // on replay, which keeps the orchestration code deterministic.
    await context.step("cart-added", async () => {
      const startTime = new Date().toISOString();
      console.log(`User ${userId} added cart ${cartId} at ${startTime}`);
    });

    // Wait 24 hours before checking cart status
    await context.wait("wait-before-reminder", { hours: 24 });

    // Check if cart was already checked out (mocked logic).
    // A real lookup is an external call and belongs inside a step.
    const cartCheckedOut = false; // Simulate lookup

    let reminderSent = false;
    if (!cartCheckedOut) {
      await context.step("send-reminder", async (stepContext) => {
        stepContext.logger.info("Sending cart reminder", { userId, cartId, email });
      });
      reminderSent = true;
    }

    // Capture the completion time in a step so the value is checkpointed.
    const timestamp = await context.step("record-timestamp", async () =>
      new Date().toISOString()
    );

    return { cartId, userId, email, reminderSent, timestamp };
  };

  return withDurableExecution(durableHandler);
};

// Lambda entry point referenced by the test and by the CDK `handler` setting
export const handler = makeHandler();

Deploying with AWS CDK

You can deploy the Durable Function using AWS CDK with just a few steps.

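import { Duration } from "aws-cdk-lib";
import * as iam from "aws-cdk-lib/aws-iam";
import * as lambda from "aws-cdk-lib/aws-lambda";
import { NodejsFunction } from "aws-cdk-lib/aws-lambda-nodejs";

// The rest of this snippet goes inside a Stack constructor, where `this` is the stack.
const customRole = new iam.Role(this, "CustomRole", {
  assumedBy: new iam.ServicePrincipal("lambda.amazonaws.com"),
});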

const cartReminderFn = new NodejsFunction(this, "CartReminderFn", {
  runtime: lambda.Runtime.NODEJS_LATEST,
  entry: "../src/handlers/cartReminder/index.ts",
  handler: "handler",
  durableConfig: {
    executionTimeout: Duration.hours(25),
    retentionPeriod: Duration.days(30),
  },
  role: customRole,
});

customRole.attachInlinePolicy(
  new iam.Policy(this, "DurablePolicy", {
    statements: [
      new iam.PolicyStatement({
        actions: [
          "lambda:CheckpointDurableExecution", // Save progress
          "lambda:GetDurableExecutionState", // Resume function state
        ],
        resources: [`${cartReminderFn.functionArn}:*`],
      }),
    ],
  })
);

executionTimeout: The maximum time Lambda allows a durable execution to run before stopping it, from 1 second to 366 days. If the timeout is exceeded, the execution fails.

retentionPeriod: How long AWS retains the execution history and state after the workflow completes, which is useful for logs, audits, or manual retries. It must be between 1 and 90 days.

For this article, I set it to 1 minute
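The ranges described above for executionTimeout and retentionPeriod can be checked before deployment. The helper below is a hypothetical sketch, not part of CDK. It takes plain numbers rather than CDK Duration objects so that it stays self-contained.

```typescript
// Hypothetical helper, not part of CDK: checks the ranges described in the article.
const SECONDS_PER_DAY = 24 * 60 * 60;

interface DurableSettings {
  executionTimeoutSeconds: number;
  retentionPeriodDays: number;
}

// Returns a list of problems; an empty list means the settings are within range.
function validateDurableSettings(s: DurableSettings): string[] {
  const errors: string[] = [];
  if (s.executionTimeoutSeconds < 1 || s.executionTimeoutSeconds > 366 * SECONDS_PER_DAY) {
    errors.push("executionTimeout must be between 1 second and 366 days");
  }
  if (s.retentionPeriodDays < 1 || s.retentionPeriodDays > 90) {
    errors.push("retentionPeriod must be between 1 and 90 days");
  }
  return errors;
}

// The values from the CDK snippet: 25 hours and 30 days.
const ok = validateDurableSettings({
  executionTimeoutSeconds: 25 * 60 * 60,
  retentionPeriodDays: 30,
});

// Out-of-range values produce one message per violated rule.
const bad = validateDurableSettings({
  executionTimeoutSeconds: 0,
  retentionPeriodDays: 120,
});
```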

Test execution

[Screenshots: Durable configuration & Executions; Durable Operations; Event History]

When Should You Use Each?

Use Durable Functions when:

  • You prefer writing workflows in code

  • Logic is Lambda-based

  • You want clean unit tests and no extra state machines

Use Step Functions when:

  • You need visual debugging and monitoring

  • Your workflow spans many AWS services

  • You prefer declarative configuration (JSON/ASL)

Final Thoughts

The 2025 launch of AWS Durable Functions gives developers a new, elegant way to build workflows directly inside Lambda. No state machines. No JSON. Just code.

If you're building serverless apps and prefer async/await to YAML and JSON, Durable Functions are made for you.
