At Hash Rekayasa Teknologi, we've been developing and using MocoBaaS, a Backend-as-a-Service solution.
One of the features for implementing business logic is Custom Script.
This feature has served us well for many use cases.
However, there are some use cases that consist of multiple steps. They can be implemented by "chaining" multiple scripts, one script triggering another. While this can get the job done, it's hard to keep track of the steps that were executed.
Imagine we have a use case like Marketplace Order:
DISCLAIMER: It may not represent a real world use case.
- Create Order
- Confirm Payment
- Confirm Delivery
- Confirm Completed
It can be done by defining this flow:
- Script:
create-order
- Triggered By: HTTP source
- Triggers:
create-order-success
event
- Script:
confirm-payment
- Triggered By: Event source
- Triggers:
confirm-payment-success
event
- Script:
confirm-delivery
- Triggered By: Event source
- Triggers:
confirm-delivery-success
event
- Script:
confirm-completed
- Triggered By: Event source
With the flow above, the scripts were executed as is. There is no centralized mechanism of tracking the executed steps, whether they were executed properly or not.
Serverless Workflow to the rescue
Among workflow languages out there, we choose Serverless Workflow. It's a vendor-neutral, open-source and community-driven workflow ecosystem.
The workflow definition can be written in JSON or YAML format.
And then there are SDKs available in various programming languanges, like Java, Go, TypeScript, .NET, Python.
The Marketplace Order use case above can be defined like this:
id: marketplaceorder
version: "1.0"
specVersion: "0.7"
name: Marketplace Order Workflow
description: Create and process orders on the marketplace.
start: CreateOrder
functions:
- name: createOrderFunction
operation: mocobaas://marketplace-order#create-order
- name: confirmPaymentFunction
operation: mocobaas://marketplace-order#confirm-payment
- name: confirmDeliveryFunction
operation: mocobaas://marketplace-order#confirm-delivery
- name: confirmCompletedFunction
operation: mocobaas://marketplace-order#confirm-completed
states:
- name: CreateOrder
type: operation
actions:
- functionRef: createOrderFunction
transition: ConfirmPayment
- name: ConfirmPayment
type: operation
actions:
- functionRef: confirmPaymentFunction
transition: ConfirmDelivery
- name: ConfirmDelivery
type: operation
actions:
- functionRef: confirmDeliveryFunction
transition: ConfirmCompleted
- name: ConfirmCompleted
type: operation
actions:
- functionRef: confirmCompletedFunction
end: true
And this is the diagram visualization:
If you are new to Serverless Workflow, or workflow in general, you may have so many questions about that π
I recommend you to watch this presentation:
And then read the official Serverless Workflow examples and specification:
- Version 0.7: examples, specification.
- Version 0.8: examples, specification.
Let me continue the story...
What we need to build is a runtime implementation that execute workflows based on the definitions.
Golang has become an important part of our stack at Hash Rekayasa Teknologi. So we simply choose the Go SDK for Serverless Workflow. Although I didn't try other SDKs, I'm sure there shouldn't be much difference to what I'm using here.
The most important question with the SDK: What it does and doesn't?
It does:
- Parse workflow JSON and YAML definitions.
- One workflow definition has a hierarchical structure. Each definition from top level to sub-levels will be represented as a Model, such as Workflow, State, Action, Function, Retry.
It doesn't:
- There is no Workflow Instance representation. For the execution, you have to define the unique identifier yourself.
- The duration values in ISO 8601 duration format are not parsed.
- The workflow expressions in jq format are not parsed.
With those limitations, there doesn't seem to be much we can do with the SDK. Just parse the workflow definition and use the hierarchical structure as a guide for executions.
package sw
import (
"errors"
"os"
"path/filepath"
"github.com/google/uuid"
"github.com/serverlessworkflow/sdk-go/v2/model"
"github.com/serverlessworkflow/sdk-go/v2/parser"
)
type StartWorkflowResult struct {
InstanceID string `json:"instanceId"`
}
var workflows map[string]*model.Workflow
func LoadWorkflows() error {
const definitionsDir = "definitions"
dirEntries, err := os.ReadDir(definitionsDir)
if err != nil {
return err
}
workflows = make(map[string]*model.Workflow)
for _, entry := range dirEntries {
name := entry.Name()
path := filepath.Join(definitionsDir, name)
wf, err := parser.FromFile(path)
if err != nil {
return err
}
workflows[name] = wf
}
return nil
}
func StartWorkflow(name string, input map[string]interface{}) (*StartWorkflowResult, error) {
wf, ok := workflows[name]
if !ok {
return nil, errors.New("Workflow not found: " + name)
}
instanceID := uuid.NewString()
// Start a new instance.
// Parameters: instanceID, wf, input
return &StartWorkflowResult{instanceID}, nil
}
Here we store the Workflow models in a map, so the LoadWorkflows()
function only needs to be called once.
And then the StartWorkflow()
function will be called in every execution.
Take notes for the implemented features
We may not implement all the features in the specification. One thing we can do is document them. Each feature will have status:
- implemented according to the spec π’π’
- implemented, but not according to the spec or using own standard π’π΄
- not/not yet implemented π΄
I took notes on a spreadsheet. You can see it here.
I use my native language, Bahasa Indonesia.
And it's not complete. I take note of a definition only when I start implementing it.
Let's see one example, the Function Definition:
- As we know, service call is defined here.
- The workflow runtime is written in Go, while the scripts are written in JavaScript (Node.js).
- MocoBaaS already has an internal RPC mechanism, so we want to use "custom" type.
- In spec v0.8, there's "custom" type. But as of this writing, the Go SDK only supports spec v0.7.
As you can see, we tried to stick to the spec as far as we could. But sometimes we have to use our own standards.
Executing workflow
The Marketplace Order Workflow has a linear flow, from create order to confirm completed. This is the directory structure containing workflow definition and scripts:
.
βββ marketplace-order
βββ definition.sw.yaml
βββ scripts
βββ confirm-completed.js
βββ confirm-delivery.js
βββ confirm-payment.js
βββ create-order.js
The end result will be a JSON like this:
{
"createOrder": true,
"confirmPayment": true,
"confirmDelivery": true,
"confirmCompleted": true
}
When the workflow is executed, starting with create-order.js
, data is a new object:
module.exports = async (ctx) => {
return {
data: { createOrder: true },
};
};
Next, confirm-payment.js
extends the data from previous state:
module.exports = async (ctx) => {
return {
data: { ...ctx.data, confirmPayment: true },
};
};
And so on.
Tracking workflow execution
As written in the spec:
Depending on their workflow definition, workflow instances can be short-lived or can execute for days, weeks, or years.
There is no recommendation on how to store the tracking information. Any database can be used.
We need to handle these requirements:
- One instance can have more than one states.
- The state's data input is typically the previous state's data output.
- If the state is the workflow starting state, its data input is the workflow data input.
- When workflow execution ends, the data output of the last executed state becomes the workflow data output.
For example, we have two tables:
- instances
- instance_states
The Marketplace Order Workflow execution can be stored like this:
Retrying actions
If a state returning an error, we can leave it as the final result or define a retry policy.
For example, we have a Chance of Success Workflow.
Directory structure:
.
βββ chance-of-success
βββ definition.sw.yaml
βββ scripts
βββ chance.js
chance.js
will randomize a boolean. If true, returns data. If false, returns error:
const chance = require("chance").Chance();
module.exports = async (ctx) => {
const isTrue = chance.bool({ likelihood: ctx.data.likelihood });
if (!isTrue) {
return {
error: { message: "failed" },
};
}
return {
data: { message: "success" },
};
};
And the workflow definition contains a retry definition:
id: chanceofsuccess
version: "1.0"
specVersion: "0.7"
name: Chance of Success Workflow
description: Try your chance of success. Retry if failed.
start: TakeAChance
functions:
- name: chanceFunction
operation: mocobaas://chance-of-success#chance
retries:
- name: chanceRetryStrategy
delay: PT10S
maxAttempts: 3
states:
- name: TakeAChance
type: operation
actions:
- functionRef: chanceFunction
retryRef: chanceRetryStrategy
end: true
With that retry definition, the runtime will perform this mechanism:
- The maximum attempts is 3 times.
- There is 10 second delay between retries.
- If we get data before maxAttempts, there will be no more retries.
- If maxAttempts is reached, there will be no more retries, regardless of the result.
Before we can use the delay duration, it needs to be parsed. For example, I use sosodev/duration and it works well.
Diagram visualization
Generating a diagram visualization from the workflow definition is really helpful, especially when you have complex workflows.
One way is that you can use the Web Editor in the official website. It can generate diagram from JSON or YAML, but the linter in the text editor will always expect JSON.
For VS Code users, there's an official extension, But as of this writing, it is outdated, only supports spec v0.6.
A better alternative is to use an extension from Red Hat. It supports spec v0.8. It also works well with spec v0.7. The only requirement is you must name the definition files to *.sw.json
, *.sw.yaml
or *.sw.yml
.
Caveat:
Looks like those tools use the same generator, as they produce the same diagram visualization. I noticed that they can only visualize the flow, but don't include other details, such as functions or retries.
Closing Thoughts
Workflow is quite a huge feature. And as you can see, Serverless Workflow offers a great flexibility between standard and customization. But if you need more training wheels in using a workflow system, there may be better solutions out there.
We haven't implemented most of the Serverless Workflow features yet.
For example, the workflow expressions I mentioned above. Using a library like itchyny/gojq looks promising, although I haven't tried it.
But at least this little effort is enough for a minimal functioning system.
Well, hope you enjoyed this article and found it useful π
Top comments (0)