I've been somewhat obsessed with creating workflow engines for the better part of a decade. The idea of constructing a 'mega' machine from an army ...
Welcome to the party! I made Laravel Workflow, which is built entirely on top of queued jobs. A workflow will queue and run multiple times as it runs activities and makes progress. There is no dedicated coordinator, broker, etc.
github.com/laravel-workflow/larave...
I'm so excited to see how workflows are gaining popularity. Every new competitor is more validation.
This is interesting
I was building something like this myself; great to know about this repo!
Interesting... Will check it out. Interested to see how it compares to github.com/serverlessworkflow/synapse
Why not show a sample workflow definition and DSL?
That's a good callout. I've updated the article to include an example. Thanks!
This looks cool. As a long-time user of Nextflow and CWL, and a big fan of Go, I have a few thoughts on it. I am currently a very heavy proponent of Nextflow; even if you are building your own system, you might be interested in spending time trying it out and getting a feel for how it works, in order to inform your own design choices. There are some really relevant points to take from that system.
Tork looks pretty cool and I will be sure to keep an eye on it. Go has a lot of features that were sorely missed in the workflow-engine scene of circa 2014, so it's exciting to see what kind of innovations might be possible with it. Ultimately, many of the most important features of these engines come from their robustness on diverse infrastructure and their ease of use for complex workflows (hundreds of tasks running with full concurrency + parallel execution + async handling).
Thanks Steve! I really appreciate the feedback and recommendation.
Some thoughts:
My main reason for choosing YAML for Tork was that YAML comes with almost no baggage (whitespace controversy aside 🙂). Nothing in YAML means anything unless you give it a meaning. This is unlike, say, Python (or any other high-level language, not trying to pick on Python), where there's a significant learning curve. Moreover, many constructs of high-level languages don't make sense in a distributed environment because the language simply wasn't designed for that. Even commonplace things such as loops and if statements (not to mention 3rd-party libraries) don't translate directly to a distributed environment, so you end up having to use a subset of the language, and it just ends up being a sugar coating over your APIs. YAML comes "naked", is familiar to many, is a no-brainer to learn even for non-programmers, and has tons of support in terms of parsing, schema validation, etc. Constructs like conditions and parallel branches are declared as plain data that the engine interprets, as in the sketch below. To mitigate whitespace and typo-related issues, I'm working on adding autocomplete support to the pipeline editor in the Tork Web UI.
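A simplified sketch (see the docs for the exact syntax; the images and expression are just placeholders):

```yaml
tasks:
  # a condition is just an expression attached to a task,
  # not a language-level if statement
  - name: only runs in prod
    if: "{{ inputs.env == 'prod' }}"
    image: alpine:3.18
    run: echo running in prod
  # parallelism is declared as data, which leaves the engine
  # free to fan the subtasks out across the cluster
  - name: fan out
    parallel:
      tasks:
        - image: alpine:3.18
          run: echo task one
        - image: alpine:3.18
          run: echo task two
```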
The runtime in Tork is essentially Docker (with plans to add support for WASM and Podman in the future if there's demand for it). I was actually trying to get out of the world where each type of task has to be defined in Go code, which would require recompiling the engine, creating custom forks, and having knowledge of Go programming, all of which felt too high-friction and would benefit fewer people. Using Docker as the runtime lets users write just about anything they want using one of Docker's many images and removes the need to recompile the engine. My intention is for Tork to only provide the necessary infrastructure to execute tasks in a distributed manner, and not to worry about WHAT you are running in those tasks.
Example:
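(A simplified sketch; see the docs for the exact task syntax. The images and scripts are just placeholders.)

```yaml
name: contrived example
tasks:
  # each task runs inside whatever Docker image you pick,
  # so the scripting language is entirely up to you
  - name: say hello from Python
    image: python:3.12-alpine
    run: python -c 'print("hello from Python")'
  - name: say hello from Node
    image: node:20-alpine
    run: node -e 'console.log("hello from Node")'
```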
This is a highly contrived example. But the point is that the user can write their scripts in whatever language they're comfortable with rather than being limited to Go.
If a task needs to persist its state, you can use a post task to upload the state after the main task executes, e.g.:
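Roughly like this (simplified; see the docs for the exact syntax, and note that the command, volume target, and upload endpoint here are placeholders):

```yaml
- name: process the data
  image: alpine:3.18
  # placeholder command that produces some state
  run: compute-state > /data/state.json
  mounts:
    # a shared volume so the post task can see the main task's output
    - type: volume
      target: /data
  post:
    # runs after the main task completes, against the same volume
    - name: upload the state
      image: alpine:3.18
      run: wget --post-file=/data/state.json https://example.com/state
```

Hope this helps!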
Is Apache Airflow suitable for this kind of workflow engine solution?
I think there's definitely an overlap in goals between Airflow and Tork; both are designed as general-purpose workflow engines. But the way they go about it is quite different. For example, workflows in Airflow are written in Python and require redeployment along with Airflow when they change. In Tork, on the other hand, workflows are written in plain YAML and do not require redeployment of the engine. Both approaches have pros and cons. I think at the end of the day both engines are suitable for many use cases, so it really boils down to personal preference.
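For instance, submitting a new or updated workflow to a running Tork instance is just an HTTP call with the YAML as the request body (a sketch; the endpoint and port assume the defaults from the README):

```
curl -s -X POST \
  -H "Content-type: text/yaml" \
  --data-binary @my-job.yaml \
  http://localhost:8000/jobs
```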
Why not use a message queue like RabbitMQ, or in bigger organisations, Kafka?
The default broker implementation is in-memory, which is suitable for experimentation.
But Tork does support RabbitMQ for a distributed setup:
github.com/runabol/tork#running-in...