TL;DR: After months exploring multi-agent orchestration with OpenClaw and Lobster, I hit a wall: no existing tool offered a simple declarative spec + runtime-agnostic execution + first-class control flow. So I designed duckflux, a minimal YAML-based workflow DSL with loops, conditionals, parallelism, and events built in. The spec is now at v0.7, and the TypeScript runtime ships as a CLI (`quack`) and an embeddable library (`@duckflux/core`), with pluggable event hub backends (in-memory, NATS, Redis) and built-in execution tracing. Full docs at duckflux.openvibes.tech.
## Table of Contents
- Previously, on this series
- The gap that remained
- What is duckflux
- Alternatives considered
- The spec at a glance
- The TypeScript runtime
- What's next
## Previously, on this series
This article is the third in a series about building deterministic multi-agent development pipelines. If you're joining now, here's the short version.
In the first article, I documented two months of trial and error trying to build a code -> review -> test pipeline with autonomous AI agents. The core thesis: LLMs are unreliable routers -- they forget steps, miscount iterations, and skip transitions. Orchestration must be deterministic and implemented in code, not delegated to inference. After five failed attempts (Ralph Orchestrator, OpenClaw sub-agents, a custom event bus, skill-driven self-orchestration, and plugin hooks), I found Lobster, OpenClaw's built-in workflow engine. It was close, but lacked native loop support. I contributed a pull request adding sub-workflow steps with loops.
In the second article, I zoomed out. The problem wasn't just orchestration, it was multi-agents x multi-projects x multi-providers x multi-channels. I compiled a dataset of agent configuration formats across providers, proposed the Monoswarm pattern (a monorepo layout for managing agent swarms), and identified the still-missing piece: an orchestration layer that ties agent events to workflow transitions across projects.
Both articles ended with the same conclusion: we need a proper workflow DSL.
## The gap that remained
Lobster was the closest thing to what I needed, but it was designed for linear pipelines with approval gates. My pull request added loops, but the deeper issues remained:
- No conditional branching (`if`/`then`/`else`).
- No parallel execution of multiple agents.
- No event system for inter-agent coordination.
- No typed expressions; conditions were shell commands returning exit codes.
- Tied to OpenClaw's runtime, not portable to other environments.
I looked at the broader landscape:
| Tool | Where it falls short |
|---|---|
| Argo Workflows | Turing-complete YAML disguised as config. A conditional loop requires template recursion, manual iteration counters, and string-interpolated type casting. |
| GitHub Actions | No conditional loops. Workarounds require unrolling or recursive reusable workflows. |
| Temporal / Inngest | Code-first (Go/TS/Python SDKs). The code IS the spec. No declarative layer. |
| Airflow / Prefect | DAGs are acyclic by definition, conditional loops are architecturally impossible. |
| n8n / Make | Visual-first, JSON-heavy specs. Loop constructs require JavaScript function nodes. Specs are unreadable as text. |
| Lobster | Linear pipelines with approval gates. No native loops, no parallelism, no conditionals. |
The gap was clear: no existing tool combines a simple declarative spec + runtime-agnostic execution + first-class control flow (loops, conditionals, parallelism) + events.
So I built one.
## What is duckflux
duckflux is a minimal, deterministic, runtime-agnostic DSL for orchestrating workflows through declarative YAML. The spec is at v0.7, with a complete TypeScript runtime and a documentation site at duckflux.openvibes.tech.
The design principles are deliberate:
- Readable in 5 seconds -- any developer understands the flow by glancing at the YAML.
- Minimal by default -- features are only added when absolutely necessary.
- Convention over configuration -- sensible defaults everywhere.
- Runtime-agnostic -- the DSL defines WHAT happens and in WHAT ORDER. The runtime decides HOW.
- String by default -- like stdin/stdout, the universal interface, every participant receives and returns strings unless a schema is explicitly defined.
- Reuse proven standards -- expressions use Google CEL (used in Kubernetes, Firebase, Envoy), schemas use JSON Schema, format is YAML.
The simplest possible workflow:
```yaml
flow:
  - type: exec
    run: echo "Hello, duckflux!"
```
That's it. One flow, one step. No boilerplate, no mandatory fields beyond what's needed.
A more realistic example: an agentic coding pipeline where a planner breaks work into tasks, then a loop fetches each task, a coder implements it, and a reviewer checks it:
```yaml
id: agentic-coding-pipeline
name: Agentic Coding Pipeline
version: "0.7"

defaults:
  timeout: 10m
  cwd: ./repo

inputs:
  goal:
    type: string
    required: true
    description: "High-level description of what needs to be built"
  taskQueueUrl:
    type: string
    required: true
  maxRounds:
    type: integer
    default: 3
    minimum: 1
    maximum: 10

participants:
  planner:
    type: exec
    run: >
      claude -p
      "Break the following goal into discrete coding tasks.
      Return a JSON array of {id, description} objects.
      Goal: " + workflow.inputs.goal
    timeout: 5m
    output:
      type: array
      items:
        type: object
      required: true
  fetchTask:
    type: http
    url: workflow.inputs.taskQueueUrl + "/next"
    method: GET
    headers:
      Accept: application/json
  coder:
    type: exec
    run: >
      claude -p
      "Implement the following task in the current repository.
      Task: " + fetchTask.output.description
    timeout: 15m
    onError: retry
    retry:
      max: 2
      backoff: 10s
  reviewer:
    type: exec
    run: >
      claude -p
      "Review the changes for the following task. Return a JSON
      object with 'approved' (boolean) and 'feedback' (string).
      Task: " + fetchTask.output.description
    timeout: 10m
    output:
      approved:
        type: boolean
        required: true
      feedback:
        type: string

flow:
  - planner
  - loop:
      max: workflow.inputs.maxRounds
      steps:
        - fetchTask
        - coder:
            input:
              task: fetchTask.output.description
        - reviewer:
            input:
              task: fetchTask.output.description

output:
  approved: reviewer.output.approved
  feedback: reviewer.output.feedback
  rounds: loop.iteration
```
Compare this to the same scenario in Argo Workflows (~40 lines of template recursion), GitHub Actions (~50+ lines with unrolled iterations), or Temporal (~35 lines of Go code that requires compilation and a server).
## Alternatives considered
Before landing on a custom YAML format, I evaluated two other approaches:
Extending Argo Workflows. Argo's YAML is expressive, but its power came from 6+ years of incremental feature additions. A conditional loop in Argo requires template recursion, manual iteration counters, and string-interpolated type casting: 13+ lines for what should be 6. The complexity is the feature, not a bug -- and that's the problem.
Mermaid as executable spec. Mermaid sequence diagrams already have `loop`, `par`, and `alt` constructs. The DX for reading and writing is excellent, and diagrams render natively in GitHub. However, extending Mermaid for real workflow concerns (retry policies, timeouts, error handling, typed variables) requires hacking `Note` blocks for config and `$var` for expressions, creating a custom parser as proprietary as a new YAML format, just disguised as something familiar.
Custom minimal YAML (chosen). A new format, intentionally constrained, inspired by Mermaid's visual clarity but with the extensibility and tooling ecosystem of YAML. The tradeoff: a new DSL to learn, but one designed to be readable in 5 seconds and writable in 5 minutes.
## The spec at a glance
The full spec is at github.com/duckflux/spec, with complete documentation at duckflux.openvibes.tech. Here's a walkthrough of the key features.
### Participants

Participants are the atomic unit of work. Each has a `type` that determines its behavior:
| Type | Description |
|---|---|
| `exec` | Shell command |
| `http` | HTTP request |
| `mcp` | MCP server tool call |
| `workflow` | Sub-workflow (composition) |
| `emit` | Fire an event to the event hub |
Participants can be defined in three ways: in a reusable `participants` block, as named inline steps (with `as`), or as anonymous inline steps (with no name at all):
```yaml
# Reusable (in participants block)
participants:
  build:
    type: exec
    run: npm run build

flow:
  # Reference a reusable participant
  - build

  # Named inline (one-off, but addressable by name)
  - as: notify
    type: http
    url: https://hooks.slack.com/services/...
    method: POST

  # Anonymous inline (output accessible only via the I/O chain)
  - type: exec
    run: echo "done"
```
### Implicit I/O chain
One of the most impactful features added since v0.2: the output of each step is automatically passed as input to the next step, forming a chain analogous to Unix pipes.
```yaml
flow:
  - type: exec
    run: curl -s https://api.example.com/data
  - type: exec
    run: jq '.items[] | .name'
  - type: exec
    run: wc -l
```
Each step receives the previous step's output on stdin. No explicit input mapping needed for linear pipelines. When a participant also has an explicit input mapping, the runtime merges the chained value with the explicit mapping.
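The chaining rule itself is tiny. As a rough mental model, here's a sketch in plain TypeScript -- an illustration only, not the actual @duckflux/core implementation:

```typescript
// Toy model of the implicit I/O chain: like a Unix pipe, each step's
// output becomes the next step's input. (Illustrative, not the runtime.)
type Step = (input: string) => string;

function runChain(steps: Step[], initial = ""): string {
  let value = initial;
  for (const step of steps) {
    value = step(value); // previous output -> next input
  }
  return value;
}
```

The point of the model: linear pipelines need zero wiring, because the chain is the default.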
### Control flow

**Loops** -- repeat until a CEL condition is true or a maximum number of iterations is reached:
```yaml
- loop:
    until: reviewer.output.approved == true
    max: 3
    steps:
      - coder
      - reviewer
```
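Semantically, this is a bounded while. A minimal sketch -- note that evaluating `until` after each pass (rather than before) is my assumption here, not something the spec excerpt above pins down:

```typescript
// Toy model of loop semantics: run the body until the `until` predicate
// holds after an iteration, or `max` iterations have run. (Assumption:
// the condition is checked after each pass, not before the first one.)
function runLoop(
  body: (iteration: number) => void,
  until: () => boolean,
  max: number,
): number {
  let iteration = 0;
  while (iteration < max) {
    body(iteration);
    iteration++;
    if (until()) break; // condition re-evaluated after every pass
  }
  return iteration; // what loop.iteration would report
}
```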
**Parallel** -- run steps concurrently:

```yaml
- parallel:
    - as: lint
      type: exec
      run: npm run lint
    - as: test
      type: exec
      run: npm test
```
**Conditionals** -- branch based on CEL expressions:

```yaml
- if:
    condition: tests.status == "success"
    then:
      - deploy
    else:
      - rollback
```
**Guards** -- skip a single step conditionally:

```yaml
- deploy:
    when: reviewer.output.approved == true
```
**Wait** -- pause for an event, a timeout, or a polling condition:

```yaml
# Wait for an external event
- wait:
    event: "approval.received"
    match: event.requestId == submitForApproval.output.id
    timeout: 24h

# Sleep
- wait:
    timeout: 30s

# Poll until a condition is true
- wait:
    until: now >= timestamp("2024-04-01T09:00:00Z")
    poll: 1m
    timeout: 48h
```
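The polling variant can be modeled as a clock that advances by `poll` until `until` holds or `timeout` elapses. A toy sketch with simulated time (not real timers, and not the runtime's actual implementation):

```typescript
// Toy model of a polling wait: re-evaluate `until` every `pollMs` of
// simulated time; give up once `timeoutMs` has elapsed.
function pollUntil(
  until: (elapsedMs: number) => boolean,
  pollMs: number,
  timeoutMs: number,
): { satisfied: boolean; elapsedMs: number } {
  let elapsed = 0;
  while (elapsed <= timeoutMs) {
    if (until(elapsed)) return { satisfied: true, elapsedMs: elapsed };
    elapsed += pollMs; // advance the simulated clock by one poll interval
  }
  return { satisfied: false, elapsedMs: timeoutMs }; // timed out
}
```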
**Set** -- write values into a shared execution context without producing output:

```yaml
- set:
    token: workflow.inputs.api_token
    region: env.AWS_REGION

- as: fetchData
  type: http
  url: "'https://api.example.com/data'"
  headers:
    Authorization: "'Bearer ' + execution.context.token"
```

`set` is transparent to the I/O chain: the chain passes through unchanged.
### Exec input passing semantics

How input reaches an `exec` subprocess depends on its type:

- Map input -> environment variables. When the resolved input is an object, each key-value pair is injected as an environment variable. The `run` command references them via shell interpolation (`${KEY}`).
- String input -> stdin. When the resolved input is a string, it's passed via stdin, enabling Unix pipe-style chaining.
```yaml
# Map input: keys become environment variables
- as: deploy
  type: exec
  run: ./deploy.sh --branch="${BRANCH}" --env="${TARGET_ENV}"
  input:
    BRANCH: workflow.inputs.branch
    TARGET_ENV: execution.context.environment
```

```yaml
# String input: passed via stdin
flow:
  - type: exec
    run: echo '{"name": "World"}'
  - type: exec
    run: jq -r '.name'
```
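A runtime implementing these semantics essentially dispatches on the resolved input's type. A hedged sketch -- the function name and shapes are mine, not the duckflux codebase:

```typescript
// Toy model of exec input semantics: an object becomes environment
// variables, anything else becomes stdin. (Illustrative, not the runtime.)
interface ExecInvocation {
  env: Record<string, string>; // extra variables for the subprocess
  stdin: string;               // data piped to the subprocess
}

function prepareExecInput(input: unknown): ExecInvocation {
  if (input !== null && typeof input === "object") {
    const env: Record<string, string> = {};
    for (const [key, value] of Object.entries(input)) {
      env[key] = String(value); // each key-value pair -> one env var
    }
    return { env, stdin: "" };
  }
  return { env: {}, stdin: String(input ?? "") }; // string -> stdin
}
```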
### Expressions with Google CEL
All conditions, input mappings, and output mappings use Google CEL. CEL is non-Turing-complete, sandboxed (no I/O, no side effects), type-checked at parse time, and has a familiar C/JS/Python-like syntax:
```yaml
- if:
    condition: reviewer.output.approved == false && loop.iteration < 3
```
The runtime ships with the full CEL standard library: `has`, `size`, `matches`, `contains`, `startsWith`, `endsWith`, `timestamp`, `duration`, `filter`, `map`, `exists`, `all`, and more.
CEL was chosen over JavaScript eval (security surface, runtime dependency), custom mini-DSLs (implementation burden), and JSONPath/JMESPath (poor logic support).
### Variable namespaces

Since v0.3, `input` and `output` are participant-scoped: inside a participant, `input` means "my input" and `output` means "my output". Workflow-level I/O lives under `workflow.inputs.*` and `workflow.output`.
Key runtime variables:
| Namespace | Description |
|---|---|
| `workflow.inputs.*` | Workflow input parameters |
| `workflow.output` | Workflow final result |
| `<step>.output` | A step's output (auto-parsed if JSON) |
| `<step>.status` | `success`, `failure`, or `skipped` |
| `execution.context.*` | Shared read/write scratchpad (set via `set`) |
| `env.*` | Environment variables (read-only) |
| `loop.iteration` | Current loop iteration index |
| `input` | Current participant's resolved input |
### Events

`emit` publishes events, `wait` subscribes. Events propagate both internally (within the workflow) and externally via the event hub:
```yaml
- as: notifyProgress
  type: emit
  event: "task.progress"
  payload:
    taskId: workflow.inputs.taskId
    status: coder.output.status
  ack: true  # block until delivery confirmed
```
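Conceptually, the in-memory hub pairs each waiter with a match predicate and resolves it on the first matching emit. A toy pub/sub sketch -- my own illustration, not the @duckflux/core API; the returned delivery count stands in for the `ack` confirmation:

```typescript
// Toy in-memory event hub: `wait` registers a predicate, `emit` delivers
// the payload to every waiter whose predicate matches and removes them.
type Payload = Record<string, unknown>;
type Waiter = { match: (p: Payload) => boolean; resolve: (p: Payload) => void };

class TinyHub {
  private waiters = new Map<string, Waiter[]>();

  wait(event: string, match: (p: Payload) => boolean = () => true): Promise<Payload> {
    return new Promise(resolve => {
      const list = this.waiters.get(event) ?? [];
      list.push({ match, resolve });
      this.waiters.set(event, list);
    });
  }

  emit(event: string, payload: Payload): number {
    const list = this.waiters.get(event) ?? [];
    const hit = list.filter(w => w.match(payload));
    this.waiters.set(event, list.filter(w => !w.match(payload)));
    hit.forEach(w => w.resolve(payload));
    return hit.length; // delivery count: the basis for an ack
  }
}
```

The NATS and Redis backends would replace this class behind the same interface; the workflow YAML never changes.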
### Error handling

Configurable per participant, per flow-step invocation, or globally via `defaults`, with four strategies:
```yaml
# Global defaults
defaults:
  onError: retry
  retry:
    max: 2
    backoff: 5s

participants:
  coder:
    type: exec
    run: ./code.sh
    onError: retry   # retry with exponential backoff
    retry:
      max: 3
      backoff: 2s
      factor: 2      # exponential: 2s, 4s, 8s
  deploy:
    type: exec
    run: ./deploy.sh
    onError: notify  # redirect to a fallback participant
```
Error strategy resolution chain: flow override > participant > defaults > fail.
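Both the resolution chain and the backoff schedule are easy to pin down precisely. A sketch under my reading of the spec (the names and shapes are illustrative, not the runtime's internals):

```typescript
// Toy model of the error strategy resolution chain
// (flow override > participant > defaults > fail) and of
// exponential backoff delay computation.
interface RetryPolicy { max: number; backoff: number; factor?: number }
interface ErrorConfig { onError?: string; retry?: RetryPolicy }

function resolveStrategy(
  flowOverride: ErrorConfig | undefined,
  participant: ErrorConfig | undefined,
  defaults: ErrorConfig | undefined,
): string {
  // First defined onError wins; "fail" is the ultimate fallback.
  return flowOverride?.onError ?? participant?.onError ?? defaults?.onError ?? "fail";
}

function backoffDelays({ max, backoff, factor = 1 }: RetryPolicy): number[] {
  // Attempt i (0-based) waits backoff * factor^i, e.g. 2s, 4s, 8s.
  return Array.from({ length: max }, (_, i) => backoff * factor ** i);
}
```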
### Inputs and outputs

Everything is a string by default, like stdin/stdout. Schema is opt-in via JSON Schema (written in YAML):
```yaml
inputs:
  repoUrl:
    type: string
    format: uri
    required: true
  branch:
    type: string
    default: "main"

output:
  approved: reviewer.output.approved
  score: reviewer.output.score
```
Input mapping supports flow-level overrides that merge with the participant's base input (instead of replacing it), so you never have to repeat shared configuration on every call:
```yaml
participants:
  fetch_page:
    type: exec
    input:
      NOTION_TOKEN: execution.context.token  # base input, always present
    run: curl -sS "https://api.notion.com/v1/pages/$(cat)" -H "Authorization: Bearer ${NOTION_TOKEN}"

flow:
  - fetch_page:
      input:
        PAGE_ID: workflow.inputs.story_id  # merged with base input
```
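The merge behavior amounts to a shallow spread where the flow-level override wins on key conflicts -- at least, that's the assumption this sketch encodes (illustrative, not the runtime's code):

```typescript
// Toy model of input merging: the flow-level override is merged over the
// participant's base input instead of replacing it.
type InputMap = Record<string, unknown>;

function mergeInput(base: InputMap = {}, flowOverride: InputMap = {}): InputMap {
  // Shallow merge (assumed): override keys win, base keys survive.
  return { ...base, ...flowOverride };
}
```

So `fetch_page` above always sees `NOTION_TOKEN`, and each call site only supplies what varies.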
### JSON Schema for editor support
A JSON Schema ships with the spec, giving you autocomplete and validation in VS Code for free:
```json
{
  "yaml.schemas": {
    "./duckflux.schema.json": "*.duck.yaml"
  }
}
```
Workflow files use the `.duck.yaml` convention (e.g., `deploy.duck.yaml`, `review-loop.duck.yaml`).
## The TypeScript runtime
The original plan was a Go runner, chosen for its native CEL implementation (cel-go) and single-binary distribution. After prototyping, I switched to TypeScript: a Go binary can't load plugins distributed as npm packages, and npm packages are the core extensibility primitive for duckflux plugins. The runtime targets Bun and ships as both a CLI tool and an embeddable library.
### Packages
| Package | Description |
|---|---|
| `duckflux` | CLI tool (`quack run`, `quack lint`, `quack validate`) |
| `@duckflux/core` | Engine, parser, CEL evaluator, event hub (in-memory) |
| `@duckflux/hub-nats` | Optional NATS JetStream event hub backend |
| `@duckflux/hub-redis` | Optional Redis Streams event hub backend |
### Installation
```sh
# Universal installer (auto-detects apt, brew, bun, npm; falls back to standalone binary)
curl -fsSL https://duckflux.github.io/apt-repo/install.sh | bash

# Or via Homebrew
brew install duckflux/tap/quack

# Or via npm/bun
npm install -g duckflux   # or: bun add -g duckflux

# Or run without installing
npx duckflux run workflow.yaml
```
Standalone binaries (no Node.js or Bun required) are also available for macOS, Linux, and Windows on the GitHub Releases page.
### CLI usage
```sh
# Run a workflow
quack run deploy.duck.yaml --input branch=main --input env=staging

# Run from stdin
echo '{"branch": "main"}' | quack run deploy.duck.yaml

# Validate (schema + semantics)
quack lint deploy.duck.yaml

# Validate with inputs
quack validate deploy.duck.yaml --input branch=main

# Start the web server UI for visual workflow observation
quack server --trace-dir ./traces

# Version
quack version
```
### Library usage

Drop `@duckflux/core` into any TypeScript project and run workflows in-process:
```ts
import { executeWorkflow } from "@duckflux/core/engine";
import { parseWorkflowFile } from "@duckflux/core/parser";

const workflow = await parseWorkflowFile("./pipeline.yaml");
const result = await executeWorkflow(workflow, { env: "production" });

console.log(result.output); // structured output
console.log(result.steps);  // per-step results, timings, errors
```
No subprocess, no serialization overhead, full TypeScript types.
### Event hub backends
Async workflows that emit and wait on events work out of the box with the built-in in-memory hub. Scale up to NATS or Redis when you need cross-process delivery:
| Backend | Package | Cross-process | Use case |
|---|---|---|---|
| In-memory | built-in | No | Development, testing, single-process |
| NATS JetStream | `@duckflux/hub-nats` | Yes | Distributed, multi-process |
| Redis Streams | `@duckflux/hub-redis` | Yes | Distributed with persistence |
```sh
quack run workflow.yaml --event-backend nats --nats-url nats://localhost:4222
quack run workflow.yaml --event-backend redis --redis-addr localhost:6379
```
### Execution tracing
Every run can produce a structured trace, written incrementally as each step completes. Choose the format that fits your workflow:
```sh
# Trace to JSON (default)
quack run workflow.yaml --trace-dir ./traces

# Trace to SQLite (queryable with any SQL client)
quack run workflow.yaml --trace-dir ./traces --trace-format sqlite
```
Each trace captures every step (participants and control-flow constructs alike) with timing, inputs, outputs, errors, and retry counts.
### Spec v0.7 feature coverage
The runtime implements the complete duckflux v0.7 spec:
- Participant types: `exec`, `http`, `emit`, `workflow` (+ `mcp` stub)
- Control flow: `loop`, `parallel`, `if`/`else`, `when` guards, `set`, `wait`
- I/O chaining: step output flows automatically as input to the next step
- Expressions: full CEL standard library (`has`, `size`, `matches`, `timestamp`, `duration`, and more)
- Error strategies: `fail`, `skip`, `retry` (exponential backoff), redirect to fallback participant
- Input semantics: map input -> env vars, string input -> stdin
- Input merge: flow override merges with participant base input instead of replacing it
- Timeouts: per-step, per-participant, or global via `defaults`
- Output schema validation: validate step and workflow output against JSON Schema definitions
- Circular sub-workflow detection: prevents infinite recursion in nested workflows
## What's next
### Tooling and ecosystem
The documentation site at duckflux.openvibes.tech covers everything from getting started to the full library API. A browser-based visual editor for building workflows is planned.
### On the roadmap
Features deliberately deferred from v0.7, to be prioritized based on real-world demand:
- DAG mode -- explicit step dependencies (`depends: [stepA, stepB]`) for complex graphs
- Durability / resume -- workflow survives a runtime crash and resumes from where it stopped
- Matrix / fan-out -- combinatorial execution (e.g., tests across 3 Node versions x 2 OS)
- Persistent mode -- workflow running as a daemon, reacting to events continuously
- Caching between runs -- reuse outputs from idempotent steps across executions
### The thesis, revisited
The journey from Protoagent to Lobster to duckflux converged on one insight: LLMs should do what they're good at (writing code, analyzing code, making decisions), and code should do what code is good at (sequencing, counting, routing, retrying).
duckflux is the code side of that equation. A deterministic orchestration layer where the flow is explicit, the execution is predictable, and the spec is readable by both humans and machines.
Links:
- duckflux docs -- Full documentation site
- duckflux spec -- DSL specification (v0.7)
- duckflux on npm -- TypeScript runtime
- Article 1 -- Building a deterministic pipeline with Lobster
- Article 2 -- Multi-agents x multi-projects x multi-providers x multi-channels