Photo by Heather McKean on Unsplash
Define typesafe workflows running as AWS Lambda Durable Functions using XState
Code is available in this repository
Background
One of the most exciting announcements at the last re:Invent, at least for serverless enthusiasts, was the introduction of Lambda Durable Functions.
At first glance, it appeared to me that Durable Functions might be a nice option for relatively simple workflows. If you need to define a complex workflow, you would probably lean towards AWS Step Functions, which allow creating full-blown state machines.
Problem
I am somewhat biased when it comes to type safety in programming. I really appreciate the help from a compiler, and I feel much safer when illegal states are unrepresentable.
That is one of the reasons I like to use AWS Step Functions. If the flow is created as a state machine, all possible transitions are strictly defined. Your business logic will never end up in a state that is not described in the state machine.
However, AWS Step Functions have their flaws. The main one, from my perspective, is that the state machine definition is written in Amazon States Language (ASL) in combination with JSONPath or JSONata. I can't expect any type safety there. In other words, if I need to implement any non-trivial change in the flow, I need to run it to see if it even works.
AWS Durable Functions introduced new possibilities. Now the flow can be defined using TypeScript, which should provide much more confidence compared to JSON-based ASL.
Goal
I would like to end up with a typed state machine definition that I can run on AWS Durable Functions while benefiting from native features such as observability and retries.
Experiment
In my side projects, I have been using the XState library. It lets you define state machines and use the actor model to describe the business logic strictly. XState is usually mentioned in the context of the frontend, but the model is universal, so it can be used on the backend too.
If you want to learn more about the ideas behind XState, please check the official documentation.
Implementation
I create a simple flow that performs two operations and returns a message combined from their outputs.
As a starting point, I use a template of the Durable Function provided by SAM.
State Machine
XState is an amazing implementation of statecharts and the actor model. In my example, I barely scratch the surface of the available possibilities.
I define states and transitions. I also have actors to perform async operations. The idea is to wrap these operations in a durable context, so they can be checkpointed and retried by the Durable Functions runtime.
To achieve this, I pass the DurableContext object into the state machine's context so I can use it in my actors.
Actors
In the world of state machines, actors are entities that perform some actions. I am aware that this might look a bit confusing if you haven't used XState before. But really, the idea is not that complicated.
I define functions that hold the actors' logic. In my case, they simply return a string. I also pass the DurableContext (from the Durable Functions SDK) as an argument, so I can use it to create checkpoints:
const getSuccessMessage = async (
  durableContext: DurableContext,
): Promise<string> => {
  // Wrapping the work in a named step lets the runtime checkpoint its result.
  return durableContext.step("success_step", async () => {
    durableContext.logger.info("[machine] Executing success step");
    return Promise.resolve("success message from first step");
  });
};
For the step with retries, the function logic is the same:
const getMaybeMessage = async (
  durableContext: DurableContext,
): Promise<string> => {
  return durableContext.step(
    "maybeStep",
    async () => {
      durableContext.logger.info("[machine] Generating message");
      // Succeeds roughly 40% of the time; otherwise the step rejects.
      if (Math.random() < 0.4) {
        durableContext.logger.info("[machine] Generating success message");
        return Promise.resolve("success message from maybe step");
      } else {
        durableContext.logger.info("[machine] Generating failure message");
        return Promise.reject(new Error("Execution failed"));
      }
    },
    {
      // Exponential backoff: stop once attemptCount exceeds 5.
      retryStrategy: (error, attemptCount) => {
        if (attemptCount > 5) {
          return { shouldRetry: false };
        }
        const delay = Math.pow(2, attemptCount - 1);
        return { shouldRetry: true, delay: { seconds: delay } };
      },
    },
  );
};
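The magic happens when the step function returns a rejected promise. In that case, the attempt fails and the runtime retries the step according to the retryStrategy, which doubles the delay after each failed attempt and gives up once attemptCount exceeds 5. This is exactly what we want.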
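To make the backoff concrete, here is a minimal, dependency-free sketch that replays the same retryStrategy logic and lists the delays it would produce. The RetryDecision type and the schedule helper are hypothetical names introduced only for illustration, and the sketch assumes attemptCount starts at 1, which I have not verified against the SDK.

```typescript
// Hypothetical local type mirroring the object returned by retryStrategy above;
// it is NOT imported from the Durable Functions SDK.
type RetryDecision =
  | { shouldRetry: false }
  | { shouldRetry: true; delay: { seconds: number } };

// Same decision logic as the retryStrategy in getMaybeMessage.
const backoff = (_error: unknown, attemptCount: number): RetryDecision => {
  if (attemptCount > 5) {
    return { shouldRetry: false };
  }
  return { shouldRetry: true, delay: { seconds: Math.pow(2, attemptCount - 1) } };
};

// Hypothetical helper: collects delays until the strategy says "stop".
// Assumption: attemptCount is 1 for the first failed attempt.
const schedule = (maxAttempts: number): number[] => {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const decision = backoff(new Error("Execution failed"), attempt);
    if (!decision.shouldRetry) break;
    delays.push(decision.delay.seconds);
  }
  return delays;
};

console.log(schedule(10)); // [ 1, 2, 4, 8, 16 ]
```

Under that assumption, the delays grow as 1, 2, 4, 8, and 16 seconds before the strategy stops retrying.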
To turn functions into actors, I use the fromPromise helper.
const getMaybeMessageActorLogic = fromPromise(
  async ({ input }: { input: { durableContext: DurableContext } }) =>
    getMaybeMessage(input.durableContext),
);
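Because the actor functions only touch step and logger, they can be exercised locally with an in-memory stand-in for the durable context. The sketch below is an assumption-heavy illustration, not the SDK's testing story: StepContext and createFakeContext are hypothetical names, and the fake neither checkpoints nor retries anything.

```typescript
// Hypothetical minimal shape of what the actor function needs;
// the real DurableContext from the Durable Functions SDK has a larger API.
interface StepContext {
  step<T>(name: string, fn: () => Promise<T>): Promise<T>;
  logger: { info(message: string): void };
}

// Same logic as getSuccessMessage above, typed against the minimal shape.
const getSuccessMessage = (ctx: StepContext): Promise<string> =>
  ctx.step("success_step", async () => {
    ctx.logger.info("[machine] Executing success step");
    return "success message from first step";
  });

// In-memory fake: records step names and log lines instead of checkpointing.
const createFakeContext = () => {
  const steps: string[] = [];
  const logs: string[] = [];
  const ctx: StepContext = {
    step: async (name, fn) => {
      steps.push(name);
      return fn();
    },
    logger: {
      info: (message) => {
        logs.push(message);
      },
    },
  };
  return { ctx, steps, logs };
};

const fake = createFakeContext();
const result = getSuccessMessage(fake.ctx);

result.then((message) => {
  console.log(message); // "success message from first step"
  console.log(fake.steps); // [ "success_step" ]
});
```

Typing the function against a narrow interface is a design choice of this sketch; the functions in the article take the full DurableContext.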
State Machine
Now let's define the flow. One of the best features of XState is that it works well with TypeScript, which gives us strongly typed state machines. I use the setup function to define the types expected in the machine:
export const simpleMachine = (durableContext: DurableContext) => {
  return setup({
    // Actor logic that states are allowed to invoke by name.
    actors: {
      getSuccessMessageActorLogic,
      getMaybeMessageActorLogic,
      startFromInitialStateLogic,
    },
    // Type-only declarations for the machine's context and output.
    types: {
      context: {} as {
        durableContext: DurableContext;
        message: string;
      },
      output: {} as { message: string },
    },
  }).createMachine({
    // machine implementation
  });
};
The setup looks inconspicuous, but for me it is a game-changer. It provides strong types in the state machine definition.
The state machine definition looks exactly how you would expect:
export const simpleMachine = (durableContext: DurableContext) => {
  return setup({
    // setup definition
  }).createMachine({
    initial: "initial",
    context: {
      durableContext: durableContext,
      message: "",
    },
    output: ({ context }) => ({ message: context.message }),
    states: {
      initial: {
        invoke: {
          src: "startFromInitialStateLogic",
          input: ({ context }) => ({ durableContext: context.durableContext }),
          onDone: {
            target: "success_step",
          },
        },
      },
      success_step: {
        invoke: {
          src: "getSuccessMessageActorLogic",
          input: ({ context }) => ({ durableContext: context.durableContext }),
          onDone: {
            target: "maybe_step",
            actions: assign({
              message: ({ event, context }) => event.output,
            }),
          },
        },
      },
      maybe_step: {
        invoke: {
          src: "getMaybeMessageActorLogic",
          input: ({ context }) => ({ durableContext: context.durableContext }),
          onDone: {
            target: "end",
            actions: assign({
              message: ({ event, context }) =>
                event.output + " " + context.message,
            }),
          },
        },
      },
      end: {
        type: "final",
      },
    },
  });
};
In my simple case, the flow is linear and straightforward. I don't even use explicit events to move through states.
Type safety
Because the types are defined in the setup stage, we can't, for example, reference an actor by the wrong name.
If we try, the compiler reports an error:
An actor's input and returned output are strongly typed. The same goes for the context, state machine output, and events, including their payload.
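The mechanism behind this kind of check can be illustrated with plain TypeScript. The sketch below is not XState's actual typing, only a simplified, dependency-free analogy: the actors registry and invokeActor are hypothetical names, and the registry plays the role that setup({ actors }) plays in the article.

```typescript
// Hypothetical actor registry standing in for setup({ actors }).
const actors = {
  getSuccessMessageActorLogic: async () => "success message from first step",
  getMaybeMessageActorLogic: async () => "success message from maybe step",
};

// Only the registered keys are valid actor names.
type ActorName = keyof typeof actors;

const invokeActor = (src: ActorName): Promise<string> => actors[src]();

// Compiles: the name is a key of `actors`.
const ok = invokeActor("getSuccessMessageActorLogic");

// Would not compile without the directive: the name contains a typo.
// The function is never called, because at runtime the lookup would fail.
const withTypo = () =>
  // @ts-expect-error "getSucessMessageActorLogic" is not assignable to ActorName
  invokeActor("getSucessMessageActorLogic");
void withTypo;

ok.then((message) => console.log(message));
```

Deriving the set of legal names from the registry itself means that renaming or removing an actor surfaces every stale reference at compile time.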
Visual representation
The thing I really like about AWS Step Functions is that the state machine is presented in the console as a graph. The cool thing about XState is that you can have the same experience.
You need to go to https://stately.ai/, the site of the company behind XState, and open the visual editor. The experience is somewhat similar to AWS Step Functions. We can start creating workflows in the editor or paste the TypeScript code to have the flow rendered.
The visual editor is a proprietary tool and requires creating an account. For public projects, it is free.
Run on Durable Functions
In my case, I am using SAM, so I simply run sam build && sam deploy in the terminal.
Let's test the function in the console.
By checking the execution history, I can confirm that all steps were properly checkpointed. One of them failed and was retried. On the timeline, I can see that the second retry was performed according to the backoff strategy.
The list of events
Summary
The result of my experiment is more than promising. XState provides a powerful framework for using state machines and the actor model. It helps define even very complex flows, with multiple nested machines acting as actors. It works smoothly with TypeScript, providing type safety for state machines.
AWS Durable Functions come with a handy SDK that makes it easy to use checkpoints and retries.
The described approach might be useful if you need a safe way to define complex workflows that are easy to test locally. I will definitely explore this direction further.




