Crismo Team

Posted on Apr 8 • Originally published at processcamp.io

Nilla Care: Signals, Errors, Escalation & Termination in BPMN

#bpmn #processmodeling #tutorial #learning

Where Safety Is the Product

Nilla Care is a small hospital where risk management isn't a department -
it's a culture. Energy blackouts, equipment failures, medication errors - any of these
could endanger patients and staff. To keep risks in check, Nilla Care maintains a set of
controls: fire extinguishers in the right locations, backup generators that actually start,
medication protocols that nurses actually follow.

But having controls isn't enough. You have to test them. Regularly. Rigorously.
This quarter, Nilla Care needs to test 10 controls. The process for doing so will teach
us four advanced BPMN events: signals, errors,
escalations, and termination.

  ## The Signal Event: Broadcasting to Anyone Listening

Before we step inside the hospital, let's understand signals with a simpler example.
Imagine a tofu company wins a major reference customer. They want to announce it - not to
a specific person, but to the world. A blog post, a logo on the website, maybe a webinar.

In BPMN, this undirected broadcast is a signal event. Unlike a message event
(which is sent to a specific recipient), a signal event broadcasts information
that isn't addressed to anyone in particular. Think of it like a radio station - it
transmits, and anyone tuned to the right frequency picks it up.

On the sending side, a signal end event (or throwing intermediate signal)
broadcasts the information. On the receiving side, a signal start event (or
catching intermediate signal) listens and triggers a process when the signal arrives.
The sender doesn't know - or care - who's listening. That's what makes it different
from a message.
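To make the contrast concrete, here is a minimal Python sketch of a directed message versus an undirected signal. It is a toy model, not how any BPMN engine is implemented; the `EventBus` class, the signal name `reference_customer_won`, and the process names are all made up for illustration.

```python
from collections import defaultdict


class EventBus:
    """Toy bus contrasting a directed message with an undirected signal."""

    def __init__(self):
        self.inboxes = defaultdict(list)    # recipient name -> received payloads
        self.listeners = defaultdict(list)  # signal name -> subscribed recipients

    def subscribe(self, signal_name, recipient):
        self.listeners[signal_name].append(recipient)

    def send_message(self, recipient, payload):
        # Message: the sender names exactly one recipient.
        self.inboxes[recipient].append(payload)

    def broadcast_signal(self, signal_name, payload):
        # Signal: the sender names no one; every current subscriber receives it.
        for recipient in self.listeners[signal_name]:
            self.inboxes[recipient].append(payload)
        return len(self.listeners[signal_name])


bus = EventBus()
bus.subscribe("reference_customer_won", "blog_process")
bus.subscribe("reference_customer_won", "webinar_process")

reached = bus.broadcast_signal("reference_customer_won", "New reference customer!")
bus.send_message("sales_lead", "Please call the customer")

print(reached)
print(dict(bus.inboxes))
```

Note that `broadcast_signal` never mentions a recipient: adding a third listener changes nothing on the sending side, which is the property that separates a signal from a message.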

Signals are versatile. They can be used as start events, intermediate events (both catching
and throwing), boundary events (interrupting and non-interrupting), and end events. They
can even trigger event sub-processes. Of all the advanced events, signals have the widest
range of uses.

  ## The Error Event: When Something Breaks

An error event represents a technical failure - something went wrong that
shouldn't have. In BPMN, errors are always exceptional and always interrupting. There's
no "non-interrupting error" because when a system fails, you can't just keep going as if
nothing happened.

Error events have three forms:

  - **Error end event** - placed inside a sub-process to signal that something
      failed. It terminates the sub-process immediately.

  - **Attached error event** - a catching boundary event on a sub-process. When the
      error end event fires inside, this boundary event catches it and routes the token
      to an alternative "error handling" path.

  - **Error start event** - used only to trigger an interrupting event sub-process.
      BPMN assumes errors are exceptions, so they only start event sub-processes, never
      normal processes.
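If you come from programming, the closest analogy is throw and catch. The sketch below is only that analogy, written in Python: `BpmnError`, `document_results`, and `run_with_boundary_event` are hypothetical names, with the raised exception standing in for the error end event and the `try`/`except` standing in for the attached error event.

```python
class BpmnError(Exception):
    """Stands in for an error end event thrown inside a sub-process."""


def document_results(system_up):
    # Sub-process body: reaching the "error end event" aborts it immediately.
    if not system_up:
        raise BpmnError("documentation system unavailable")
    return "results documented"


def run_with_boundary_event(system_up):
    # The try/except plays the role of the attached (boundary) error event:
    # it catches the error and routes the flow to an error-handling path.
    try:
        return document_results(system_up)
    except BpmnError as err:
        return f"error path: {err}"


print(run_with_boundary_event(True))
print(run_with_boundary_event(False))
```

Just as an exception cannot be raised while the function keeps running, an error event cannot be non-interrupting: the normal path ends the moment the error is thrown.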

## The Escalation Event: Bumping It Up

Not every problem is a catastrophic failure. Sometimes things just need to be
bumped up to a higher authority. A delivery is five days late - that's not
an error (the system didn't crash), but it's a problem that someone more senior should
know about.

The escalation event handles this. Unlike errors, escalations can be
non-interrupting. The original work can continue while the escalation triggers
a parallel path. The late delivery keeps being tracked, and the customer gets
notified about the delay.

Escalation events are especially useful for communication between parent processes and
sub-processes. A throwing escalation event inside a sub-process can trigger a catching
escalation event in the parent scope - connecting the two levels without stopping either.
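A rough way to picture a non-interrupting escalation in code is a callback that the sub-process invokes and then moves past. This Python sketch assumes a hypothetical five-day threshold and made-up function names; the callback also runs inline here rather than truly in parallel, so it only illustrates that the throwing side is not stopped.

```python
def track_delivery(days_late, on_escalation):
    """Sub-process: keeps running after it throws an escalation."""
    log = ["tracking started"]
    if days_late >= 5:  # assumed threshold, for illustration only
        # Throwing escalation: hand the issue upward, then carry on.
        on_escalation(f"delivery is {days_late} days late")
    log.append("tracking continues")
    return log


escalations = []


def notify_customer(reason):
    # Catching escalation in the parent scope (non-interrupting).
    escalations.append(f"customer notified: {reason}")


log = track_delivery(5, notify_customer)
print(log)
print(escalations)
```

Compare this with the error analogy: an exception unwinds the function, while the escalation callback returns and the tracking work carries on.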

  ## The Terminate Event: The Nuclear Option

The terminate event is the most drastic end event in BPMN. When a token
reaches a terminate end event, it doesn't just end its own path - it kills every token
in the current process instance, including tokens in sub-processes. Everything stops.
Immediately.

The terminate event exists only as an end event - which makes sense. Its entire purpose is to end
everything, right now. You use it when a situation is so severe that no other work in
the process should continue. A fraudulent transaction detected? Terminate. Kill the
token that's still waiting at the parallel gateway, kill the token in the verification
sub-process, kill everything.
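The difference between a plain end event and a terminate end event can be sketched with a toy token set. This is an illustrative Python model under simplifying assumptions (one flat set of tokens, hypothetical class and token names), not an engine implementation.

```python
class ProcessInstance:
    """Toy process instance holding a set of active tokens."""

    def __init__(self, token_locations):
        self.tokens = set(token_locations)
        self.terminated = False

    def normal_end(self, token):
        # A plain end event consumes only the token that reached it.
        self.tokens.discard(token)

    def terminate_end(self, token):
        # A terminate end event removes every token in the instance.
        self.tokens.clear()
        self.terminated = True


locations = {"fraud check", "parallel gateway", "verification sub-process"}

a = ProcessInstance(locations)
a.normal_end("fraud check")
remaining_after_normal = sorted(a.tokens)

b = ProcessInstance(locations)
b.terminate_end("fraud check")
remaining_after_terminate = sorted(b.tokens)

print(remaining_after_normal)
print(remaining_after_terminate)
```

With the plain end event, two tokens are still alive and the instance keeps running; with the terminate end event, nothing is left.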

Use it sparingly. In most cases, you can achieve the same result with cancelling event
sub-processes or attached interrupting events. But when you need a clean, unmistakable
"stop everything" - the terminate event is your tool.

  ## Nilla Care's Control Testing Process

Now let's see all four events in action. The process begins with defining the scope and
parameters for the quarterly control assessment. Last year's control report is pulled
in as a reference - controls that failed before get extra scrutiny this time.

The actual testing lives inside a sequential multi-instance sub-process.
Each of the 10 controls goes through the same flow, one by one. For each control, two
tests run in sequence:

  - **Assess control design** - is the control designed correctly for the risk
      it's supposed to mitigate? A fire extinguisher rated for electrical fires in a kitchen
      with grease fires would fail this test.

  - **Check operational performance** - does the control actually work? Is the fire
      extinguisher charged, accessible, and in the right location?

If both tests pass, the control is marked as passed, and the sub-process moves to the
next instance. Controls 1 through 6 sail through without issues.

  ## Control 7: The Escalation

Control number 7 doesn't pass. One or both tests reveal a problem. Now the team
needs to define remediation measures - specific actions to fix the
control so it actually works.

After documenting the remediation plan, a throwing escalation event fires.
This triggers a non-interrupting event sub-process that implements the remediation
in parallel: the fix is applied, then documented.

Meanwhile, the main sub-process flow continues - compiling a dedicated control testing
report for this specific control. Two paths running simultaneously: one fixing the
problem, one documenting the test results. Neither blocks the other.

When both paths complete - no more active tokens - the sub-process instance for
control 7 finishes, and the sequential multi-instance moves to control 8.

  ## Control 8: The Deadline

Control 8 has a different problem. During the design assessment, it turns out that everyone
qualified to perform the test is on sick leave or vacation. The task stalls.

Each control test has a deadline. When that deadline passes, a cancelling timer
event sub-process fires - note the solid (not dashed) border on the start event.
This is interrupting: the token in the design assessment task gets killed.

The team provides a justification for the missed deadline and flags the control for the
next testing round. The current instance of the sub-process completes - but only this
instance. The sequential multi-instance continues with control 9.

This is a crucial distinction: an interrupting event sub-process cancels
the current instance of the multi-instance sub-process, not the entire
multi-instance. The loop continues.

  ## Controls 9 and 10: The System Failure

Control 9's design assessment passes. But during the operational performance check, the
IT system used to document testing results crashes completely. This triggers an
**attached error event** on the sub-process boundary.

Here's where it gets critical. Unlike the event sub-process from control 8 (which only
cancelled the current instance), an attached boundary event cancels the entire
multi-instance sub-process - all remaining instances, not just the current one.
Control 10 never gets tested through this path.

This makes perfect sense: the IT system is down. It's not a problem with one control -
it's a systemic failure. Every remaining control would face the same issue.

The token exits the sub-process and takes the alternative path: manual control
testing for the remaining controls (9 and 10). When that's done, the central
compliance team compiles the global control report, aggregating results
from all 10 controls - the six that passed, the one that needed remediation, the one
that missed its deadline, and the two that required manual testing.
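The scoping difference between controls 8 and 9 maps neatly onto where an exception is caught relative to a loop. The Python sketch below replays the quarter under that analogy. Everything in it is hypothetical scaffolding (the `SCENARIO` table, the exception classes, the status labels); it is not generated from the BPMN model.

```python
class DeadlineMissed(Exception):
    """Stands in for the interrupting timer event sub-process (per instance)."""


class SystemFailure(Exception):
    """Stands in for the error caught by the boundary event (whole loop)."""


# Hypothetical outcomes mirroring the story; unlisted controls simply pass.
SCENARIO = {7: "fails", 8: "deadline", 9: "crash"}


def assess_control(number):
    outcome = SCENARIO.get(number, "passes")
    if outcome == "deadline":
        raise DeadlineMissed(number)
    if outcome == "crash":
        raise SystemFailure(number)
    return "remediated" if outcome == "fails" else "passed"


def run_quarterly_testing(controls):
    report = {}
    try:
        for number in controls:  # sequential multi-instance
            try:
                report[number] = assess_control(number)
            except DeadlineMissed:
                # Caught inside the instance: only this control is cancelled.
                report[number] = "deadline missed"
    except SystemFailure as failure:
        # Caught on the boundary: the loop itself is abandoned, and the
        # failing control plus all later ones go to manual testing.
        first_untested = failure.args[0]
        for number in controls:
            if number >= first_untested:
                report[number] = "manual"
    return report


report = run_quarterly_testing(range(1, 11))
for number, status in report.items():
    print(number, status)
```

The inner `except` sits inside the loop body, so the loop moves on to the next control; the outer `except` wraps the loop, so catching there ends the iteration for good. That is the same distinction as event sub-process versus boundary event on a multi-instance sub-process.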

  ## Choosing the Right Event

Nilla Care's process demonstrates when to use each event type:

  - **Signal** - broadcast information without a specific recipient. Use when
      the sender doesn't know (or care) who's listening.

  - **Error** - a critical failure that must stop the current scope. Always
      interrupting. Use for system crashes, data corruption, or anything that makes
      continuing impossible.

  - **Escalation** - a problem that needs attention from a higher level, but doesn't
      necessarily stop the current work. Can be interrupting or non-interrupting. Use for
      delays, quality issues, or anything that requires management awareness.

  - **Terminate** - kill every token in the process. Use as a last resort when the
      entire process instance must stop immediately.

The key insight: BPMN gives you a spectrum of severity. Escalation is a yellow flag.
Error is a red flag that stops the sub-process. Terminate is pulling the emergency brake
on the entire train.

  > **Spectrum of severity:** Signal = broadcast (anyone listening).
  > Escalation = yellow flag (bump it up). Error = red flag (stop this scope).
  > Terminate = emergency brake (kill everything).

This is part of the Learn BPMN series on ProcessCamp - 11 real-world scenarios to master process modeling. Try modeling this yourself in Crismo - free, no signup needed.
