I recently posted a small code snippet in a LinkedIn poll and asked what sounded like a simple question:
Is this code deterministic?
Those are usually the dangerous questions.
I asked on purpose. I’ve been spending time talking with folks much smarter than me, reading docs, and honestly leaning on code assistants to sanity-check my thinking as I go. Durable execution has a way of surfacing edge cases you don’t normally think about, and I wanted to learn in public—right alongside everyone else.
The discussion that followed (in the original post) was excellent. It also showed how easy it is to mix together concepts like determinism, replay, retries, and idempotency. This post is my attempt to slow things down and separate those ideas, using the original example and AWS’s guidance on deterministic code in AWS Lambda durable functions.
Here’s the code that started it all:
import { withDurableExecution, DurableContext } from '@aws/durable-execution-sdk-js';

export const handler = withDurableExecution(
  async (event: any, context: DurableContext) => {
    const orders = event.orders.sort((a, b) => a.priority - b.priority);

    const results = [];
    for (const order of orders) {
      const result = await context.step(`process-${order.id}`, async () => {
        return processOrder(order);
      });
      results.push(result);
    }

    return { processed: results.length, timestamp: Date.now() };
  }
);
Most people voted “No, non-deterministic.”
That’s the correct answer—but not always for the reasons people first reach for.
Let’s walk through it.
Problem 1: Equal-priority ordering is under-specified
What’s happening
event.orders.sort((a, b) => a.priority - b.priority);
Modern JavaScript engines (ES2019+) guarantee that Array.prototype.sort() is stable. If two orders have the same priority, their relative order is preserved.
So no, JavaScript isn’t secretly reordering your data.
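A quick illustration of what that stability guarantee means in practice:

const incoming = [
  { id: "B", priority: 1 },
  { id: "A", priority: 1 },
  { id: "C", priority: 0 },
];

incoming.sort((a, b) => a.priority - b.priority);
// → C (priority 0) first, then B, then A: the two priority-1 items keep their incoming order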
Why it still matters (and why this one is subtle)
I’ll be honest: this one felt nit-picky to me at first. If the input is the same and the sort is stable, it feels like everything should be fine.
The important realization is this: a stable sort preserves whatever order the input already had—but it doesn’t explain why that order exists.
In this code, the implicit rule becomes:
“If priorities are equal, keep whatever order the input arrived in.”
If that order is intentional and guaranteed, great. Nothing wrong here.
But if it’s incidental—maybe merged upstream, aggregated from multiple sources, or simply not meant to be meaningful—then the workflow’s step ordering now depends on an accident of the input.
Nothing is broken. But you may have just encoded behavior you didn’t mean to encode.
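To make that concrete, here is the same pair of equal-priority orders arriving from two hypothetical upstream sources. Nothing about the orders themselves changes, yet the processing sequence does:

const fromServiceA = [{ id: "A", priority: 1 }];
const fromServiceB = [{ id: "B", priority: 1 }];

[...fromServiceA, ...fromServiceB].sort((a, b) => a.priority - b.priority); // A, then B
[...fromServiceB, ...fromServiceA].sort((a, b) => a.priority - b.priority); // B, then A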
One more thing worth saying out loud: if the order in which steps are created doesn’t matter, you may not need to sort at all. Sorting only makes sense if you’re enforcing a real business rule like “highest priority goes first.”
Why not just wrap the ordering in a step?
This is a very common reaction—and a reasonable one.
Yes, you could wrap the sort in a step and checkpoint it. That would make the ordering fully durable and replay-stable.
But steps are not free.
They add latency. They cost money. They count toward operation limits. And they exist primarily to protect work that is slow, expensive, or has side effects.
Pure, fast, in-memory logic like sorting is already replayable. Re-running a sort of 10 items during replay is usually far cheaper than checkpointing it. Even with larger lists, the trade-off depends on size, cost, and intent.
The rule of thumb I like is this:
If the logic is pure, fast, and deterministic, don’t rush to wrap it.
If you can’t make it deterministic, or replaying it is expensive, that’s when a step makes sense.
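As a small sketch of that rule: the sort stays plain code, while a genuinely non-deterministic value gets checkpointed. (The batch ID here is just an illustrative stand-in for any value you can't recompute on replay.)

import { randomUUID } from "node:crypto";

// Pure, fast, deterministic for the same input: safe to re-run on replay, no step needed
const sorted = [...event.orders].sort((a, b) => a.priority - b.priority);

// Non-deterministic: checkpoint it once so replay sees the same value
const batchId = await context.step("batch-id", async () => randomUUID());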
Fix
If ordering matters, make it explicit and deterministic, without mutating the input:
const orders = [...event.orders].sort(
  (a, b) => a.priority - b.priority || a.id.localeCompare(b.id)
);
Problem 2: Date.now() is non-deterministic
What’s happening
timestamp: Date.now()
This value is computed at runtime, so every execution produces a different number.
Why it matters
In this handler, the timestamp is just part of the returned response. It doesn’t affect control flow or step scheduling, so it’s harmless today.
But time-based APIs are explicitly called out in the durable execution docs as a common source of non-determinism. If this value later gets stored as workflow state, passed into a step, or used in a conditional, replay behavior can change in ways that are very hard to reason about.
This is less “this is wrong” and more “this is easy to trip over later.”
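As an illustration of how it trips people up, imagine a branch like this outside of any step (the cutoffTime field is made up; the pattern is what matters):

// Risky: Date.now() returns a different value on every replay,
// so this branch can evaluate differently when the workflow is replayed.
if (Date.now() > event.cutoffTime) {
  // skip or reroute processing
}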
Fix
If the timestamp actually matters, capture it once inside a step so it replays consistently:
const timestamp = await context.step("timestamp", async () => Date.now());
Problem 3: Side effects hidden inside a single step
Ben Kehoe correctly points out a subtle but important issue in this code.
What’s happening
await context.step(`process-${order.id}`, async () => {
  return processOrder(order);
});
A durable step can fail and be retried. Once a step completes, it won’t be re-run on replay—but retries can re-execute the step body.
If processOrder performs multiple side effects, a failure partway through can cause those side effects to run again.
Why it matters
This is not a determinism problem.
It’s not a replay problem either.
This is a retry safety problem.
If a step body can’t safely run more than once, retries can produce duplicate effects unless everything inside the step is idempotent.
Fix
Be intentional about retry boundaries and align steps with retry-safe work.
Problematic version:
await context.step(`process-${order.id}`, async () => {
  await chargeCard(order);
  await writeAuditRecord(order);
  await sendConfirmation(order);
});
If this fails after chargeCard, a retry may re-run everything.
Safer version:
await context.step(`charge-${order.id}`, async () => {
  return chargeCard(order, { idempotencyKey: order.id });
});

await context.step(`audit-${order.id}`, async () => {
  return writeAuditRecord(order);
});

await context.step(`notify-${order.id}`, async () => {
  return sendConfirmation(order, { idempotencyKey: order.id });
});
This doesn’t magically make things idempotent. It just limits the blast radius when retries happen.
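What true idempotency looks like depends on your downstream systems. As a hedged sketch only, chargeCard might dedupe on the idempotency key before doing any work; the payments client and its methods here are hypothetical:

// Hypothetical payments client, declared only so the sketch type-checks.
declare const payments: {
  findChargeByIdempotencyKey(key: string): Promise<unknown | null>;
  createCharge(input: { amount: number; idempotencyKey: string }): Promise<unknown>;
};

async function chargeCard(order: any, opts: { idempotencyKey: string }) {
  // If this key was already charged, return the original result instead of charging again,
  // so a retry after a partial failure becomes a no-op.
  const existing = await payments.findChargeByIdempotencyKey(opts.idempotencyKey);
  if (existing) return existing;

  return payments.createCharge({
    amount: order.amount,
    idempotencyKey: opts.idempotencyKey,
  });
}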
Step semantics and retry intent
AWS Lambda durable functions also give you control over how steps retry.
By default, steps use AtLeastOncePerRetry semantics. If a step fails or the Lambda is interrupted, the runtime may re-execute the step body. In this mode, the retry count acts as a lower bound on executions.
If you have a step that must never run more than once, you can use StepSemantics.AtMostOncePerRetry with zero retries. In that case, a failure surfaces as an error instead of re-running the step.
Put simply:
- AtLeastOncePerRetry → max attempts is a lower bound
- AtMostOncePerRetry → max attempts is an upper bound
Neither is “better.” They just encode different assumptions.
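In code, that intent might be expressed roughly like the sketch below. The option names (semantics, maxAttempts) are my assumption, so treat this as pseudocode and check the SDK docs for the real signature; the only part grounded in the section above is that StepSemantics.AtMostOncePerRetry exists and is paired with zero retries.

// Assumed option shape; verify against @aws/durable-execution-sdk-js before relying on it.
import { StepSemantics } from '@aws/durable-execution-sdk-js';

await context.step(
  `charge-${order.id}`,
  async () => chargeCard(order, { idempotencyKey: order.id }),
  {
    semantics: StepSemantics.AtMostOncePerRetry, // never re-run the step body
    maxAttempts: 1,                              // zero retries: a failure surfaces as an error
  }
);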
Putting it all together
Once you make ordering explicit, keep non-deterministic values under control, and think carefully about retry boundaries, the handler becomes much easier to reason about.
Here are two durable-safe ways to structure it, depending on how independent your work items are and how much concurrency you want.
Durable-safe handler (step by step)
import { withDurableExecution, DurableContext } from '@aws/durable-execution-sdk-js';

export const handler = withDurableExecution(
  async (event: any, context: DurableContext) => {
    const orders = [...event.orders].sort(
      (a, b) => a.priority - b.priority || a.id.localeCompare(b.id)
    );

    for (const order of orders) {
      await context.step(`validate-${order.id}`, async () => {
        if (!order.id) throw new Error("Missing order id");
      });

      await context.step(`charge-${order.id}`, async () => {
        return chargeCard(order, { idempotencyKey: order.id });
      });

      await context.step(`notify-${order.id}`, async () => {
        return sendConfirmation(order, { idempotencyKey: order.id });
      });
    }

    return { processed: orders.length };
  }
);
Durable-safe handler using context.map()
context.map() changes the shape of the problem a bit. Each item becomes its own durable unit of work.
That matters because:
- Failures are isolated to a single item
- Completed items don’t get re-run because something else failed
- Concurrency becomes a first-class knob (maxConcurrency)
The trade-offs are real too:
- Large lists can emit a lot of steps quickly
- Strict sequencing is harder to express
import { withDurableExecution, DurableContext } from '@aws/durable-execution-sdk-js';

export const handler = withDurableExecution(
  async (event: any, context: DurableContext) => {
    const orders = [...event.orders].sort(
      (a, b) => a.priority - b.priority || a.id.localeCompare(b.id)
    );

    const mapResult = await context.map(
      "process-orders",
      orders,
      async (ctx: DurableContext, order: any) => {
        await ctx.step(`validate-${order.id}`, async () => {
          if (!order.id) throw new Error("Missing order id");
        });

        await ctx.step(`charge-${order.id}`, async () => {
          return chargeCard(order, { idempotencyKey: order.id });
        });

        await ctx.step(`notify-${order.id}`, async () => {
          return sendConfirmation(order, { idempotencyKey: order.id });
        });

        return { orderId: order.id, status: "ok" };
      },
      { maxConcurrency: 5 }
    );

    const results = mapResult.getResults();
    return { processed: results.length, results };
  }
);
Final takeaway
Durable execution encourages you to slow down just a bit and be explicit about ordering, retries, idempotency, and where work can safely be repeated.
That’s exactly why this question was worth asking—and why the conversation around it was worth having.