Sibling Rivalry? How to Make Kestra Tasks Talk to Each Other

#kestra #orchestration #automation

I'm putting together a video on sibling tasks in Kestra and thought I would build this idea out loud with you in a blog first.

Let's jump in!

What are sibling tasks?

Sibling tasks are tasks that have the same parent task. For the rest of this blog, we'll refer to sibling tasks within the context of a dynamic or looping construct.

So in this case, we can be more specific and say sibling tasks are the tasks defined within the tasks list of a loop. If we use ForEach as an example, we could have something that looks like this:

- id: my_loop
  type: io.kestra.plugin.core.flow.ForEach
  values: ["alpha", "beta"]
  tasks:
    - id: first_sibling
      type: io.kestra.plugin.core.debug.Return
      format: "Data for {{ taskrun.value }}"

    - id: second_sibling
      type: io.kestra.plugin.core.log.Log
      # Accessing the sibling's output for the current iteration
      message: "The first sibling said: {{ outputs.first_sibling[taskrun.value].value }}"

The ForEach, or my_loop, is the parent, with the siblings being first_sibling and second_sibling.

In this example, the log message in second_sibling contains data passed from first_sibling through the expression syntax. You will see this in the docs as "sibling task lookup". Pay attention to how the syntax differs between sequential tasks and dynamic tasks such as loops.

Execution & `TaskRun` context

You may be familiar with the concept of context or scope and I think this is where things can start to get confusing.

A task run is a single execution of an individual task within a larger workflow execution. When you are within a ForEach's task list, you'll want to use taskrun.value to get the value associated with that run or iteration.

This might be a bit different from how you've done looping in programming languages, where you are scoped within the brackets of the loop. There, you don't need to think about which iteration you are in, just the work happening within the loop.

Think of it like a "You Are Here" marker on a map. If you were hiking on a trail and came across a trail map with no marker to tell you where you are, it's not very helpful. You know you are on a trail and you could probably make a guess of where you are based on landmarks or time spent on the trail, but you could also be really wrong. And then get really lost.

Bringing it back to programming and workflows, using a key to indicate which run or iteration you mean becomes hugely important when you are running tasks concurrently. The context needs a hint about which iteration you are interested in. I know I'm looking for data in the loop, but which run?

Outputs

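Outputs are how you pass data between tasks or flows. Standard tasks usually reference outputs via {{ outputs.task_id.value }}. This gives you the value of a specific task's output. Here, value is the output attribute that the Return task exposes; other tasks may name their outputs differently.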
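To make the standard syntax concrete, here is a minimal sketch of two sequential tasks with no loop involved. The flow id, namespace, and task ids (standard_outputs, company.team, produce, consume) are placeholders I made up for illustration.

```yaml
id: standard_outputs        # hypothetical flow id
namespace: company.team     # hypothetical namespace

tasks:
  - id: produce
    type: io.kestra.plugin.core.debug.Return
    format: "hello"

  - id: consume
    type: io.kestra.plugin.core.log.Log
    # No key needed: produce ran once, outside of any loop
    message: "produce said: {{ outputs.produce.value }}"
```

Because produce only runs once, there is a single task run to look up, so the expression needs nothing between the task id and the attribute.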

In this case, we want to pass data between sibling tasks, which may look something like this: {{ outputs.first_sibling[taskrun.value].value }}. This gives you the value from one specific run or iteration of the first_sibling task. Being this specific helps prevent conflicts when iterations run concurrently or tasks fire in parallel.

Keep in mind that we can't assume ForEach child tasks will run in parallel. That only happens if we set the loop's concurrency or use a parallel task.
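As a rough sketch of what setting loop concurrency could look like, here is the earlier loop with the ForEach concurrencyLimit property added. The flow id and namespace are placeholders, and the limit of 2 is just an example value. My understanding is that the default runs one iteration at a time, so treat the comments as assumptions and check the ForEach docs for your version.

```yaml
id: concurrent_loop         # hypothetical flow id
namespace: company.team     # hypothetical namespace

tasks:
  - id: my_loop
    type: io.kestra.plugin.core.flow.ForEach
    values: ["alpha", "beta"]
    # Assumption: allows up to 2 iterations to run at the same time
    concurrencyLimit: 2
    tasks:
      - id: first_sibling
        type: io.kestra.plugin.core.debug.Return
        format: "Data for {{ taskrun.value }}"

      - id: second_sibling
        type: io.kestra.plugin.core.log.Log
        # The key keeps each iteration reading its own sibling's output
        message: "The first sibling said: {{ outputs.first_sibling[taskrun.value].value }}"
```

With more than one iteration in flight, the [taskrun.value] key is what stops the "alpha" run from picking up the "beta" run's output.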

I highly recommend reading our best practices page on this topic.

Outputs vs. execution & TaskRun context

But let's say we want to access a sibling's output from outside of the loop. We can do that too, modifying our example from above:

- id: my_loop
  type: io.kestra.plugin.core.flow.ForEach
  values: ["alpha", "beta"]
  tasks:
    - id: first_sibling
      type: io.kestra.plugin.core.debug.Return
      format: "Data for {{ taskrun.value }}"

    - id: second_sibling
      type: io.kestra.plugin.core.log.Log
      # Accessing the sibling's output for the current iteration
      message: "The first sibling said: {{ outputs.first_sibling[taskrun.value].value }}"

- id: outside
  type: io.kestra.plugin.core.log.Log
  # Accessing the data for the `alpha` run specifically
  message: "{{ outputs.first_sibling['alpha'].value }}"

This would give us the "alpha" iteration's output from the first_sibling task as a value.

Where it gets confusing

This tends to get confusing when you start using a mix of outputs and taskrun. When the output comes from a task outside of the loop, you use the standard syntax. When it comes from a task inside the loop, you need to give Kestra the key (or value) associated with the particular iteration.

Additionally, your error messages can seem non-intuitive. You will typically get an error saying the value isn't found when you forget to use a key.

So let's look again at this example:

- id: outside
  type: io.kestra.plugin.core.log.Log
  message: "{{ outputs.first_sibling['alpha'].value }}" # correct

But if we ran this without the key, the value of the iteration we want to capture, we'll get an error.

- id: outside
  type: io.kestra.plugin.core.log.Log
  message: "{{ outputs.first_sibling.value }}" # will error

Unable to find `value` used in the expression `{{ outputs.first_sibling.value }}` at line 1

When to use double quotes

Double quotes around your expressions give a clear indicator that this value is a string. Without them, you may get that unintuitive error that the value in the expression cannot be found.
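Here is a small sketch of the two forms side by side, using a hypothetical task id (quoted). The unquoted version is commented out on purpose: in YAML, a value that starts with { is read as the start of a flow mapping rather than a string. Exactly which error you end up seeing may vary, so take the comments as a guide rather than a guarantee.

```yaml
- id: quoted
  type: io.kestra.plugin.core.log.Log
  # Quoted: YAML reads the whole expression as one string
  message: "{{ outputs.first_sibling['alpha'].value }}"

# Unquoted (left commented out): the leading `{` is not treated as a string
# - id: unquoted
#   type: io.kestra.plugin.core.log.Log
#   message: {{ outputs.first_sibling['alpha'].value }}
```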

When I'm debugging my workflows, I usually try to make sure I have a key for my iteration and I've wrapped my expression in double quotes before I do any more extensive investigation. That usually solves it for me.

Missing quotes should show up as a linting warning with a nice yellow squiggle, but it depends on the task properties and types you are working with.

When to use `ForEach` and `ForEachItem`

A bigger topic is the difference between ForEach and ForEachItem, which we have a great best practices page on.

The TL;DR: use ForEachItem when you want subflows and you are doing a lot of work on a lot of items.

Let me know in the comments if you want a blog or video (or both!) on this topic.

Wrapping up

So that covers sibling tasks and outputs and expressions, oh my! Hopefully this adds some supplemental detail to our docs on this topic.

As always, let me know if there is something you want me to explain in a little more detail! I'll get a video related to this topic out next week too.