<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Arkadiusz Przychocki</title>
    <description>The latest articles on DEV Community by Arkadiusz Przychocki (@arkstack).</description>
    <link>https://dev.to/arkstack</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2387123%2Fa38712e5-6d78-4172-ac2d-e4792429f880.jpg</url>
      <title>DEV Community: Arkadiusz Przychocki</title>
      <link>https://dev.to/arkstack</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/arkstack"/>
    <language>en</language>
    <item>
      <title>Where StructuredTaskScope Ends: Building the Flow Layer in Exeris</title>
      <dc:creator>Arkadiusz Przychocki</dc:creator>
      <pubDate>Thu, 07 May 2026 09:59:58 +0000</pubDate>
      <link>https://dev.to/arkstack/where-structuredtaskscope-ends-building-the-flow-layer-in-exeris-29e6</link>
      <guid>https://dev.to/arkstack/where-structuredtaskscope-ends-building-the-flow-layer-in-exeris-29e6</guid>
      <description>&lt;p&gt;This is the fifth article in the Exeris Kernel series.&lt;/p&gt;

&lt;p&gt;The series has been building one coherent architectural picture:&lt;br&gt;
&lt;a href="https://blog.arkstack.dev/en/blog/why-i-banned-threadlocal-from-the-exeris-kernel" rel="noopener noreferrer"&gt;article one&lt;/a&gt; replaced &lt;code&gt;ThreadLocal&lt;/code&gt;&lt;br&gt;
with &lt;code&gt;ScopedValue&lt;/code&gt; on the context propagation path;&lt;br&gt;
&lt;a href="https://blog.arkstack.dev/en/blog/reevaluating-1990s-oop-in-java" rel="noopener noreferrer"&gt;article two&lt;/a&gt; argued that some layers do not benefit&lt;br&gt;
from runtime polymorphism;&lt;br&gt;
&lt;a href="https://blog.arkstack.dev/en/blog/structured-task-scope-beyond-toy-examples" rel="noopener noreferrer"&gt;article three&lt;/a&gt; showed where&lt;br&gt;
&lt;code&gt;StructuredTaskScope&lt;/code&gt; genuinely earns its keep;&lt;br&gt;
&lt;a href="https://blog.arkstack.dev/en/blog/your-tls-stack-is-lying-about-zero-copy" rel="noopener noreferrer"&gt;article four&lt;/a&gt; pushed TLS off the heap entirely.&lt;/p&gt;

&lt;p&gt;This one is about where STS stops being the right tool — and what had to be built instead.&lt;/p&gt;


&lt;h2&gt;The Constraint I Kept Hitting&lt;/h2&gt;

&lt;p&gt;The pressure came from a specific place: I was building the orchestration layer for&lt;br&gt;
Exeris Kernel — the part that drives multi-step business processes with compensation&lt;br&gt;
and external event wiring.&lt;/p&gt;

&lt;p&gt;The obvious starting point was an existing enterprise saga framework. I considered&lt;br&gt;
Camunda, Axon, and several others in that category, and rejected all of them. The&lt;br&gt;
flexibility the kernel's other constraints demanded — zero-allocation hot paths,&lt;br&gt;
ScopedValue propagation, off-heap state — was not something I could retrofit onto a&lt;br&gt;
framework whose orchestration model was already opinionated. Each also meant a separate&lt;br&gt;
operational process — Axon Server, Camunda engine — with its own JVM, memory footprint,&lt;br&gt;
and lifecycle. I only quantified the cost-efficiency dimension properly later, but&lt;br&gt;
the operational overhead was visible from day one.&lt;/p&gt;

&lt;p&gt;So I went one level lower. At first, &lt;code&gt;StructuredTaskScope&lt;/code&gt; seemed like a natural fit.&lt;br&gt;
You fork work, you join it, you get structured lifecycle guarantees. Project Loom made&lt;br&gt;
this cheap and clean.&lt;/p&gt;

&lt;p&gt;Then I hit a step that needed to wait for a payment confirmation.&lt;/p&gt;

&lt;p&gt;Not for a few milliseconds. Not until a downstream service responds synchronously.&lt;br&gt;
The step needed to yield execution, persist its state, and resume when an external&lt;br&gt;
event arrived — possibly hours later, possibly after a JVM restart.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;StructuredTaskScope&lt;/code&gt; has no model for this. It is structured in &lt;em&gt;space&lt;/em&gt; — all forked&lt;br&gt;
tasks are bounded to a scope that lives on a single call stack. It is not structured&lt;br&gt;
in &lt;em&gt;time&lt;/em&gt;. The moment your execution boundary needs to span time rather than scope,&lt;br&gt;
you have left &lt;code&gt;StructuredTaskScope&lt;/code&gt; territory entirely.&lt;/p&gt;

&lt;p&gt;That is the constraint that decided the architecture.&lt;/p&gt;


&lt;h2&gt;What STS Still Does Inside Flow&lt;/h2&gt;

&lt;p&gt;Before going further: &lt;code&gt;StructuredTaskScope&lt;/code&gt; is not replaced. It is still used in&lt;br&gt;
several places throughout the kernel — and not always for what most introductions&lt;br&gt;
to STS show.&lt;/p&gt;

&lt;p&gt;The first use is the obvious one: inside a single Flow step, an action needs&lt;br&gt;
to call two independent services and merge the results before returning&lt;br&gt;
&lt;code&gt;FlowOutcome.CONTINUE&lt;/code&gt;. The execution is bounded to that step's Virtual Thread.&lt;br&gt;
STS owns the fan-out; Flow owns the step boundary. Standard structured concurrency.&lt;/p&gt;
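&lt;p&gt;The shape of that in-step fan-out can be sketched as follows — a hedged example with a virtual-thread executor standing in for the still-preview &lt;code&gt;StructuredTaskScope&lt;/code&gt; API, and hypothetical service names. The structure is the point: fork two independent calls, join both before the step returns, merge, &lt;code&gt;CONTINUE&lt;/code&gt;.&lt;/p&gt;

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the in-step fan-out. A virtual-thread executor stands in for
// StructuredTaskScope (preview at the time of writing); the service names
// and the Outcome stand-in are hypothetical.
public class InStepFanOutSketch {
    enum Outcome { CONTINUE }

    static String fetchPricing()   { return "pricing"; }
    static String fetchInventory() { return "inventory"; }

    static Outcome runStep() {
        try (ExecutorService vt = Executors.newVirtualThreadPerTaskExecutor()) {
            // Fork two independent calls on virtual threads.
            Future<String> pricing   = vt.submit(InStepFanOutSketch::fetchPricing);
            Future<String> inventory = vt.submit(InStepFanOutSketch::fetchInventory);
            // Join both before the step returns: the fan-out never outlives the step.
            System.out.println(pricing.get() + "+" + inventory.get());
            return Outcome.CONTINUE;
        } catch (Exception e) {
            throw new RuntimeException(e); // a real step would map this to a failure outcome
        }
    }

    public static void main(String[] args) {
        System.out.println(runStep()); // CONTINUE
    }
}
```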

&lt;p&gt;The second is at L3, inside the Events subsystem. &lt;code&gt;InMemoryEventBus.publishAndAwait&lt;/code&gt;&lt;br&gt;
opens a scope, forks one Virtual Thread per registered handler, joins, and returns&lt;br&gt;
once every handler has finished. &lt;code&gt;CommunityEventLoop.dispatchBatch&lt;/code&gt; does the same&lt;br&gt;
shape on a different cadence: per drained batch of events, fork one VT per registered&lt;br&gt;
batch processor, join when all are done. Both are clean ad-hoc fan-outs with a&lt;br&gt;
deterministic join point — exactly the case STS was designed for.&lt;/p&gt;

&lt;p&gt;The third is the most subtle. &lt;code&gt;OutboxOrchestrator.ownerLoop&lt;/code&gt; opens a scope and forks&lt;br&gt;
exactly one task — its long-lived poll-and-flush loop. Why use STS for a single fork?&lt;br&gt;
Because Java 26 enforces an owner-thread rule: &lt;code&gt;open()&lt;/code&gt;, &lt;code&gt;fork()&lt;/code&gt;, &lt;code&gt;join()&lt;/code&gt;, and&lt;br&gt;
&lt;code&gt;close()&lt;/code&gt; must all happen on the same thread. The orchestrator spawns a dedicated&lt;br&gt;
owner Virtual Thread specifically to satisfy that constraint, so the lifecycle scope&lt;br&gt;
of the loop is explicit and structured rather than implicit and tracked across threads.&lt;/p&gt;

&lt;p&gt;The point where Flow takes over is the &lt;em&gt;inter-step&lt;/em&gt; boundary — the moment a step&lt;br&gt;
returns something other than "I am done, continue." Below that boundary, STS&lt;br&gt;
remains the right tool, in all three patterns above. Flow picks up where the&lt;br&gt;
call stack ends.&lt;/p&gt;


&lt;h2&gt;The Execution Model&lt;/h2&gt;

&lt;p&gt;The execution model came from somewhere I didn't expect. Having ruled out the&lt;br&gt;
enterprise saga frameworks and dropped one level lower into STS, I still needed&lt;br&gt;
to decide what the saga state machine itself should look like. The first analogy&lt;br&gt;
that came to mind was the TLS state machine I'd just been working on (described&lt;br&gt;
in &lt;a href="https://blog.arkstack.dev/en/blog/your-tls-stack-is-lying-about-zero-copy" rel="noopener noreferrer"&gt;article four&lt;/a&gt;). TLS is one&lt;br&gt;
of the most rigorously specified state machines in widely deployed software —&lt;br&gt;
states explicit, transitions deterministic, I/O bounded to state boundaries.&lt;br&gt;
Saga orchestration, structurally, has the same shape. I tried imitating the TLS&lt;br&gt;
pattern. It fit, and it stayed.&lt;/p&gt;

&lt;p&gt;Each Flow instance runs on its own Virtual Thread, launched by &lt;code&gt;CoreFlowRuntime&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// From CoreFlowRuntime.launch()&lt;/span&gt;
&lt;span class="nc"&gt;Thread&lt;/span&gt; &lt;span class="n"&gt;thread&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofVirtual&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"exeris-flow-"&lt;/span&gt;
          &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;instanceIdMost&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
          &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sc"&gt;'-'&lt;/span&gt;
          &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;instanceIdLeast&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;unstarted&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;runInstance&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startStep&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;span class="n"&gt;runningThreads&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;thread&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That Virtual Thread executes steps in a loop until one of three things happens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// From CoreFlowRuntime.applyOutcome()&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;switch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="no"&gt;CONTINUE&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;applyContinueOutcome&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stepIndex&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="no"&gt;COMPLETE&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;applyCompleteOutcome&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;yield&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="no"&gt;PARK&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;applyParkOutcome&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stepIndex&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;yield&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;CONTINUE&lt;/code&gt; — advance to the next step, stay on the same Virtual Thread.&lt;br&gt;
&lt;code&gt;COMPLETE&lt;/code&gt; — short-circuit directly to terminal state, bypassing remaining steps.&lt;br&gt;
&lt;code&gt;PARK&lt;/code&gt; — this is the boundary. The Virtual Thread exits. State is persisted. Execution&lt;br&gt;
is suspended until an explicit &lt;code&gt;wake()&lt;/code&gt; call.&lt;/p&gt;


    &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.arkstack.dev%2Fblog%2Fwhere-structured-task-scope-ends%2Ffig1_flowstate_transitions.png" alt="Figure 1: FlowState transitions and the Virtual Thread boundary." width="" height=""&gt;


&lt;p&gt;That &lt;code&gt;PARK&lt;/code&gt; path is exactly what has no equivalent in &lt;code&gt;StructuredTaskScope&lt;/code&gt;. When&lt;br&gt;
&lt;code&gt;applyParkOutcome&lt;/code&gt; runs, it serializes the current &lt;code&gt;FlowSnapshot&lt;/code&gt; — step index,&lt;br&gt;
compensation stack, state, timeout — and registers the instance in the parked map.&lt;br&gt;
A &lt;code&gt;wake()&lt;/code&gt; call later resolves that snapshot, either from in-memory or from the&lt;br&gt;
&lt;code&gt;FlowSnapshotStore&lt;/code&gt;, and launches a &lt;em&gt;new&lt;/em&gt; Virtual Thread that resumes from&lt;br&gt;
&lt;code&gt;currentStep + 1&lt;/code&gt;.&lt;/p&gt;
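&lt;p&gt;A minimal sketch of that boundary, with assumed internals — the real &lt;code&gt;applyParkOutcome&lt;/code&gt; also persists the snapshot through &lt;code&gt;FlowSnapshotStore&lt;/code&gt; so it survives a JVM restart; here the snapshot only lives in a map:&lt;/p&gt;

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntConsumer;

// Minimal sketch of the park/wake boundary. Names follow the article;
// the internals are assumptions for illustration.
public class ParkWakeSketch {
    record Snapshot(long instanceId, int currentStep) {}

    static final Map<Long, Snapshot> parked = new ConcurrentHashMap<>();

    // PARK: the step's Virtual Thread records where it stopped, then exits.
    static void park(long instanceId, int stepIndex) {
        parked.put(instanceId, new Snapshot(instanceId, stepIndex));
    }

    // wake(): resolve the snapshot and resume on a *new* Virtual Thread from
    // currentStep + 1. Returns null if the instance is no longer parked, so a
    // duplicate wake is an idempotent no-op.
    static Thread wake(long instanceId, IntConsumer resumeFromStep) {
        Snapshot s = parked.remove(instanceId);
        if (s == null) return null;
        return Thread.ofVirtual()
                .name("exeris-flow-" + instanceId)
                .start(() -> resumeFromStep.accept(s.currentStep() + 1));
    }

    public static void main(String[] args) throws InterruptedException {
        park(42L, 1);                                  // step 1 returned PARK
        int[] resumedAt = new int[1];
        wake(42L, step -> resumedAt[0] = step).join(); // hours later, or after restart
        System.out.println("resumed at step " + resumedAt[0]); // resumed at step 2
        System.out.println(wake(42L, step -> {}));             // null — duplicate wake
    }
}
```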


&lt;h2&gt;Defining a Flow&lt;/h2&gt;

&lt;p&gt;Before a flow can run, it must be compiled into an execution plan. The builder API&lt;br&gt;
is intentionally explicit — you declare steps, their compensation actions, and&lt;br&gt;
any non-linear transitions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;FlowExecutionPlan&lt;/span&gt; &lt;span class="n"&gt;orderFulfillment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;plans&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newDefinition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"order-fulfillment"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;step&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"validate-order"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
          &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
              &lt;span class="n"&gt;validateOrder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
              &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;FlowOutcome&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;CONTINUE&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
          &lt;span class="o"&gt;},&lt;/span&gt;
          &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* no rollback needed for validation */&lt;/span&gt; &lt;span class="o"&gt;})&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;step&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"charge-payment"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
          &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
              &lt;span class="n"&gt;initiatePayment&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
              &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;FlowOutcome&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;PARK&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// wait for async confirmation&lt;/span&gt;
          &lt;span class="o"&gt;},&lt;/span&gt;
          &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;refundPayment&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;step&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ship-order"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
          &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
              &lt;span class="n"&gt;dispatchShipment&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
              &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;FlowOutcome&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;COMPLETE&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
          &lt;span class="o"&gt;},&lt;/span&gt;
          &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;plans&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;compile&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;orderFulfillment&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;compile()&lt;/code&gt; validates the step graph, builds the adjacency matrix and the &lt;code&gt;nextStep&lt;/code&gt;&lt;br&gt;
index array, and registers the plan in the &lt;code&gt;planCatalog&lt;/code&gt;. After this point, the engine&lt;br&gt;
can schedule instances of this definition.&lt;/p&gt;
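&lt;p&gt;For the linear plan above, the &lt;code&gt;nextStep&lt;/code&gt; index array can be pictured as follows — a hedged sketch of the linear case only; the real compiled structure also covers non-linear transitions via the adjacency matrix:&lt;/p&gt;

```java
// Hypothetical sketch of the nextStep index array a linear plan compiles to.
public class NextStepSketch {
    // For a linear plan of n steps: nextStep[i] = i + 1, and -1 marks
    // "no next step" — the sentinel the runtime's loop treats as terminal.
    static int[] compileLinear(int stepCount) {
        int[] next = new int[stepCount];
        for (int i = 0; i < stepCount; i++) {
            next[i] = (i == stepCount - 1) ? -1 : i + 1;
        }
        return next;
    }

    public static void main(String[] args) {
        // validate-order -> charge-payment -> ship-order
        int[] next = compileLinear(3);
        System.out.println(java.util.Arrays.toString(next)); // [1, 2, -1]
    }
}
```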

&lt;p&gt;The trade-off here is deliberate: the definition is closed-world at compile time.&lt;br&gt;
Adding or reordering steps after in-flight instances exist is a schema migration problem&lt;br&gt;
— and the engine does not solve it silently. If an in-flight Saga was parked on a step&lt;br&gt;
that no longer exists after a redeployment, &lt;code&gt;EX-FLOW-7002&lt;/code&gt; with &lt;code&gt;SCHEMA_MISMATCH&lt;/code&gt; is&lt;br&gt;
thrown on wake. The safe pattern is blue/green with Saga drain before switching traffic.&lt;br&gt;
I kept this constraint explicit rather than hiding it behind version-transparent routing,&lt;br&gt;
because transparent versioning that does not actually guarantee correctness is worse than&lt;br&gt;
visible friction.&lt;/p&gt;
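&lt;p&gt;The wake-time check reduces to a comparison between what was parked and what is now deployed. A hedged sketch — the exception type here is a plain stand-in for the engine's real &lt;code&gt;EX-FLOW-7002&lt;/code&gt; error, and the field names are assumptions:&lt;/p&gt;

```java
// Sketch of the schema check on wake. The real engine raises EX-FLOW-7002
// with SCHEMA_MISMATCH; IllegalStateException stands in for it here.
public class SchemaCheckSketch {
    record Snapshot(int stepIndex, long planSchemaVersion) {}

    static void checkOnWake(Snapshot snap, long deployedSchemaVersion, int deployedStepCount) {
        // Parked against a different plan version, or parked on a step that
        // no longer exists after redeployment: refuse to resume silently.
        if (snap.planSchemaVersion() != deployedSchemaVersion
                || snap.stepIndex() >= deployedStepCount) {
            throw new IllegalStateException("EX-FLOW-7002 SCHEMA_MISMATCH");
        }
    }

    public static void main(String[] args) {
        checkOnWake(new Snapshot(1, 3L), 3L, 3); // same schema: resumes
        try {
            checkOnWake(new Snapshot(2, 3L), 4L, 2); // redeployed: parked step gone
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```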


&lt;h2&gt;When the Wake Comes from Outside&lt;/h2&gt;

&lt;p&gt;Park/wake with a direct &lt;code&gt;scheduler.wake()&lt;/code&gt; call covers the case where you control both&lt;br&gt;
sides. The harder case is choreography: the flow should resume when an event arrives&lt;br&gt;
from an external system — a payment gateway, a warehouse service, a user action.&lt;/p&gt;

&lt;p&gt;This is where &lt;code&gt;FlowChoreographyBridge&lt;/code&gt; and the sealed &lt;code&gt;ChoreographyDecision&lt;/code&gt;&lt;br&gt;
type come in. You register a mapper that translates incoming event descriptors&lt;br&gt;
into routing decisions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;registerChoreographyMapper&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;descriptor&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// EventDescriptor carries the flow instance ID in streamIdHigh/streamIdLow.&lt;/span&gt;
        &lt;span class="c1"&gt;// The convention: the event producer encodes the target instance UUID&lt;/span&gt;
        &lt;span class="c1"&gt;// in those fields when publishing a choreography trigger.&lt;/span&gt;
        &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;instanceMost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;descriptor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;streamIdHigh&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;instanceLeast&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;descriptor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;streamIdLow&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;instanceMost&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;instanceLeast&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ChoreographyDecision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ignore&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ChoreographyDecision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;wake&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;instanceMost&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;instanceLeast&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;},&lt;/span&gt;
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"payment.confirmed"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;eventBus&lt;/span&gt;
&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The bridge implementation uses Java 21 pattern matching on the sealed hierarchy to&lt;br&gt;
dispatch without instanceof chains:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// From FlowChoreographyBridge.handle()&lt;/span&gt;
&lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nc"&gt;ChoreographyDecision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Wake&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;most&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;least&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;scheduler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;lookupParked&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;most&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;least&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;ifPresent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;scheduler:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;wake&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="c1"&gt;// Stale or duplicate wake event: idempotent no-op if instance no longer parked&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nc"&gt;ChoreographyDecision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Start&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;FlowExecutionPlan&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;most&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;least&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;scheduler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newContext&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;most&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;least&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nc"&gt;ChoreographyDecision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Ignore&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* intentional no-op */&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ChoreographyDecision&lt;/code&gt; sealed interface gives the compiler exhaustiveness guarantees.&lt;br&gt;
Adding a new decision variant without handling it in the bridge is a compile error,&lt;br&gt;
not a silent runtime miss. This is the same pattern applied to &lt;code&gt;FlowOutcome&lt;/code&gt; in the&lt;br&gt;
step execution loop — sealed types as an architectural fence.&lt;/p&gt;

&lt;p&gt;The mapper itself is a &lt;code&gt;@FunctionalInterface&lt;/code&gt;. It closes over whatever correlation&lt;br&gt;
state you need. The bridge does not prescribe how you resolve correlation IDs —&lt;br&gt;
that is your domain logic.&lt;/p&gt;
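&lt;p&gt;What "closes over correlation state" means in practice can be sketched like this — the sealed hierarchy below is a minimal stand-in for &lt;code&gt;ChoreographyDecision&lt;/code&gt;, and &lt;code&gt;mapperFor&lt;/code&gt; is a hypothetical helper for illustration:&lt;/p&gt;

```java
import java.util.Map;

// Sketch of a choreography mapper closing over correlation state.
// Decision/Wake/Ignore are stand-ins for the engine's sealed type.
public class CorrelationMapperSketch {
    sealed interface Decision permits Wake, Ignore {}
    record Wake(long most, long least) implements Decision {}
    record Ignore() implements Decision {}

    @FunctionalInterface
    interface Mapper { Decision map(String externalPaymentId); }

    // The returned lambda closes over the correlation map. The bridge never
    // sees it — resolving external IDs to instance IDs stays domain logic.
    static Mapper mapperFor(Map<String, long[]> instanceIdByPaymentId) {
        return paymentId -> {
            long[] id = instanceIdByPaymentId.get(paymentId);
            return id == null ? new Ignore() : new Wake(id[0], id[1]);
        };
    }

    public static void main(String[] args) {
        Mapper mapper = mapperFor(Map.of("pay_123", new long[]{11L, 22L}));
        System.out.println(mapper.map("pay_123")); // Wake[most=11, least=22]
        System.out.println(mapper.map("pay_999")); // Ignore[]
    }
}
```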


&lt;h2&gt;Events and Flow: Two Layers, One Deduplication Contract&lt;/h2&gt;

&lt;p&gt;The Events subsystem (L3) and the Flow engine (L4) have an explicit deduplication split&lt;br&gt;
that is worth stating directly, because it is easy to assume one layer handles it all.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;EventBus&lt;/code&gt; delivers at-least-once. There is no built-in deduplication at the bus level&lt;br&gt;
— that is an intentional design decision. Built-in bus-level dedup requires shared state&lt;br&gt;
across all subscribers: either a heap-allocated &lt;code&gt;ConcurrentHashMap&lt;/code&gt; (GC pressure) or a&lt;br&gt;
distributed lock (latency). Neither is acceptable at the performance tier the Events&lt;br&gt;
subsystem targets.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;IdempotencyGuard&lt;/code&gt; at L4 covers the Flow-specific case: step-level dedup per Saga instance.&lt;br&gt;
The split is:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Deduplication&lt;/th&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EventBus (L3)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;None — at-least-once&lt;/td&gt;
&lt;td&gt;Subscriber responsibility&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Flow Engine (L4)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Per step, per instance&lt;/td&gt;
&lt;td&gt;&lt;code&gt;IdempotencyGuard.tryClaimStep()&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Application&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Per state mutation&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;EventDescriptor.eventUuidHigh/Low&lt;/code&gt; as dedup key&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
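&lt;p&gt;The application row of that table can be sketched as subscriber-local dedup keyed on the two event-UUID halves. This is hypothetical subscriber code, not engine code — and the heap-backed set is acceptable here precisely because it is per-subscriber, not the shared bus-level state rejected above:&lt;/p&gt;

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of application-level dedup under at-least-once delivery.
// EventId is a minimal stand-in for EventDescriptor's eventUuidHigh/Low.
public class SubscriberDedupSketch {
    record EventId(long uuidHigh, long uuidLow) {}

    static final Set<EventId> seen = ConcurrentHashMap.newKeySet();

    // Returns true exactly once per event UUID; redeliveries are dropped.
    static boolean firstDelivery(long uuidHigh, long uuidLow) {
        return seen.add(new EventId(uuidHigh, uuidLow));
    }

    public static void main(String[] args) {
        System.out.println(firstDelivery(1L, 2L)); // true — process the event
        System.out.println(firstDelivery(1L, 2L)); // false — duplicate, skip
    }
}
```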

&lt;p&gt;This means a choreography wake event can arrive twice — Outbox retry, broker reconnect,&lt;br&gt;
network duplicate. The bridge calls &lt;code&gt;scheduler.lookupParked()&lt;/code&gt; then &lt;code&gt;scheduler.wake()&lt;/code&gt;.&lt;br&gt;
If the flow is no longer parked (already woken by the first delivery), &lt;code&gt;lookupParked()&lt;/code&gt;&lt;br&gt;
returns empty and the second wake is a silent no-op. If the flow wakes and re-enters&lt;br&gt;
a step it already executed, &lt;code&gt;IdempotencyGuard&lt;/code&gt; skips it and advances. Two safety nets,&lt;br&gt;
neither of which requires coordination across the two layers.&lt;/p&gt;

&lt;p&gt;The integration point between them is &lt;code&gt;FlowProgressPublisher&lt;/code&gt;: when a flow reaches a&lt;br&gt;
terminal state, it optionally publishes a &lt;code&gt;FlowProgress&lt;/code&gt; event to the &lt;code&gt;EventBus&lt;/code&gt;. Only&lt;br&gt;
terminal transitions emit — intermediate state changes are deliberately skipped to avoid&lt;br&gt;
allocation on the hot scheduling path.&lt;/p&gt;


&lt;h2&gt;Idempotency Under Replay&lt;/h2&gt;

&lt;p&gt;Crash-recovery replay and choreography re-wakes create a real deduplication problem:&lt;br&gt;
a step that already executed successfully should not execute again if the engine&lt;br&gt;
restarts mid-flow.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;IdempotencyGuard&lt;/code&gt; handles this at the step level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// From CoreFlowRuntime.runStep() — called inside synchronized(instance.monitor())&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;guard&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;guard&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tryClaimStep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;instanceIdMost&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;instanceIdLeast&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;stepIndex&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Step already claimed — skip and advance&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;nextGuardIndex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;plan&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;nextStep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stepIndex&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nextGuardIndex&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;complete&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;nextGuardIndex&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The default &lt;code&gt;CoreIdempotencyGuard&lt;/code&gt; is heap-backed: a &lt;code&gt;ConcurrentMap&amp;lt;FlowKey, ConcurrentMap&amp;lt;Integer, Boolean&amp;gt;&amp;gt;&lt;/code&gt;.&lt;br&gt;
Nesting the per-step claims under a single outer &lt;code&gt;FlowKey&lt;/code&gt; entry makes &lt;code&gt;releaseInstance()&lt;/code&gt;&lt;br&gt;
an O(1) removal of the entire instance entry rather than O(claimed steps). On &lt;code&gt;complete()&lt;/code&gt; or&lt;br&gt;
&lt;code&gt;FAILED_ROLLEDBACK&lt;/code&gt;, the guard releases the instance claim.&lt;/p&gt;

&lt;p&gt;Custom &lt;code&gt;IdempotencyGuard&lt;/code&gt; implementations — backed by Redis, a database, or&lt;br&gt;
any other shared store — can be bound via &lt;code&gt;KernelProviders.IDEMPOTENCY_GUARD&lt;/code&gt;&lt;br&gt;
before submitting to the engine.&lt;/p&gt;
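&lt;p&gt;As a minimal sketch of those claim mechanics (simplified: a &lt;code&gt;String&lt;/code&gt; key stands in for the real UUID-pair &lt;code&gt;FlowKey&lt;/code&gt;, so this is an illustration of the contract, not the actual &lt;code&gt;CoreIdempotencyGuard&lt;/code&gt;):&lt;/p&gt;

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative heap-backed step guard, modeled on the description above.
// Not the actual CoreIdempotencyGuard: the real guard keys on the
// instance UUID halves plus step index rather than a String.
final class HeapStepGuard {
    private final Map<String, Map<Integer, Boolean>> claims = new ConcurrentHashMap<>();

    /** Returns true exactly once per (instance, step); later calls return false. */
    boolean tryClaimStep(String instanceKey, int stepIndex) {
        Map<Integer, Boolean> steps =
                claims.computeIfAbsent(instanceKey, k -> new ConcurrentHashMap<>());
        return steps.putIfAbsent(stepIndex, Boolean.TRUE) == null;
    }

    /** One outer-map removal drops every step claim for the instance at once. */
    void releaseInstance(String instanceKey) {
        claims.remove(instanceKey);
    }
}
```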

&lt;p&gt;In the upcoming distributed model (ADR-013), idempotency becomes two-layered:&lt;br&gt;
the in-memory heap CAS in &lt;code&gt;CoreIdempotencyGuard&lt;/code&gt;, and the durable CAS on&lt;br&gt;
&lt;code&gt;FlowSnapshot.schemaVersion&lt;/code&gt; in &lt;code&gt;JdbcFlowSnapshotStore&lt;/code&gt;. ADR-013 requires that&lt;br&gt;
these two layers agree on terminal states — if the heap guard reports "already done"&lt;br&gt;
but the durable row claims otherwise, the durable answer wins and heap state is&lt;br&gt;
reconciled on next load. Neither layer can be the sole source of truth in a&lt;br&gt;
distributed deployment.&lt;/p&gt;

&lt;p&gt;This is not universal. It only covers the cases where the execution path is still&lt;br&gt;
under the kernel's control. External side effects — the payment was initiated but&lt;br&gt;
the response was lost — remain the caller's responsibility to make idempotent.&lt;br&gt;
The guard prevents the kernel from re-executing a step it already claimed. It cannot&lt;br&gt;
prevent a downstream service from seeing a duplicate call if the step succeeded before&lt;br&gt;
the crash.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Still Remains True
&lt;/h2&gt;

&lt;p&gt;Compensation is powerful, but it comes with a cost: every step that can be rolled back&lt;br&gt;
must be written as an idempotent inverse operation. If your compensation action has&lt;br&gt;
side effects that cannot be reversed — a notification already sent, an audit record&lt;br&gt;
already written — you are modeling a business invariant that compensation cannot&lt;br&gt;
satisfy. The engine provides the mechanism. The correctness is still yours to own.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;terminalStateCatalog&lt;/code&gt; in &lt;code&gt;CoreFlowRuntime&lt;/code&gt; grows from &lt;code&gt;start()&lt;/code&gt; until &lt;code&gt;close()&lt;/code&gt;.&lt;br&gt;
It is an in-process idempotency fence that prevents re-scheduling already-terminal flows&lt;br&gt;
within a single runtime lifetime. This is acceptable for the current operational model,&lt;br&gt;
but it is a known constraint: very long-lived runtimes with high flow volume will see&lt;br&gt;
this map accumulate. Bounded retention is on the v0.7 roadmap — but any policy&lt;br&gt;
must preserve the fence semantics. A terminal flow must not be re-executable.&lt;/p&gt;
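&lt;p&gt;The fence semantics are easy to state in code. A minimal in-memory model (names are illustrative, not the actual &lt;code&gt;CoreFlowRuntime&lt;/code&gt; field):&lt;/p&gt;

```java
import java.util.concurrent.ConcurrentHashMap;

// Minimal model of the runtime-scoped terminal fence described above:
// once a flow id is recorded as terminal, re-scheduling is rejected for the
// lifetime of this instance. Any bounded-retention policy must keep this
// property: a terminal flow must never become schedulable again.
final class TerminalFence {
    private final ConcurrentHashMap.KeySetView<String, Boolean> terminal =
            ConcurrentHashMap.newKeySet();

    void markTerminal(String flowId) {
        terminal.add(flowId);
    }

    /** Returns false if the flow already reached a terminal state. */
    boolean trySchedule(String flowId) {
        return !terminal.contains(flowId);
    }
}
```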

&lt;p&gt;Cross-restart choreography wake is not yet supported. After a restart, parked flows&lt;br&gt;
survive only as snapshots in &lt;code&gt;FlowSnapshotStore&lt;/code&gt;; &lt;code&gt;lookupParked()&lt;/code&gt; currently resolves&lt;br&gt;
only from the in-memory index. The v0.7 roadmap closes this via &lt;code&gt;JdbcFlowSnapshotStore&lt;/code&gt;&lt;br&gt;
with a parked-enumeration entry point — on startup the engine rehydrates wake routing&lt;br&gt;
from the durable store. Steady-state choreography keeps the in-memory O(1) fast path;&lt;br&gt;
the store probe is fallback-only on miss.&lt;/p&gt;

&lt;p&gt;One constraint from that upcoming JDBC implementation is worth noting here: &lt;code&gt;JdbcFlowSnapshotStore&lt;/code&gt;&lt;br&gt;
is explicitly prohibited from using &lt;code&gt;ThreadLocal&lt;/code&gt; for context propagation. &lt;code&gt;DataSource&lt;/code&gt;&lt;br&gt;
is constructor-injected through &lt;code&gt;BootstrapContext&lt;/code&gt;. The same constraint that &lt;a href="https://blog.arkstack.dev/en/blog/why-i-banned-threadlocal-from-the-exeris-kernel" rel="noopener noreferrer"&gt;banned&lt;br&gt;
&lt;code&gt;ThreadLocal&lt;/code&gt; from the kernel's context propagation layer&lt;/a&gt;&lt;br&gt;
runs all the way through to the distributed saga state layer.&lt;/p&gt;

&lt;p&gt;Panama FFM does not appear in this layer directly. The allocation constraint is&lt;br&gt;
enforced on the hot scheduling path via &lt;code&gt;FlowZeroAllocTck&lt;/code&gt;, but orchestration state&lt;br&gt;
lives on the heap. The off-heap ownership model from the transport layer is not the&lt;br&gt;
right model for a state machine that needs to checkpoint, restore, and evolve&lt;br&gt;
across JVM restarts.&lt;/p&gt;

&lt;p&gt;One place where the zero-allocation philosophy does reach into this layer is&lt;br&gt;
&lt;code&gt;EventDescriptor&lt;/code&gt; — the routing metadata used by both the Events subsystem and&lt;br&gt;
the choreography bridge. Seven primitive fields: two &lt;code&gt;long&lt;/code&gt; pairs for the event&lt;br&gt;
and stream UUIDs, two &lt;code&gt;int&lt;/code&gt; for the ordinal and flags bitmap, one &lt;code&gt;long&lt;/code&gt; for the&lt;br&gt;
timestamp. The wire codec packs these into exactly 64 bytes — one CPU cache line,&lt;br&gt;
with explicit &lt;code&gt;MemoryLayout&lt;/code&gt; padding to fill it. C2 scalarises the record via&lt;br&gt;
Escape Analysis today. When JEP 401 reaches GA, adding the &lt;code&gt;value&lt;/code&gt; modifier&lt;br&gt;
requires zero field changes — object headers disappear for free.&lt;/p&gt;
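&lt;p&gt;The 64-byte arithmetic can be written down directly with the FFM layout API. A sketch (field names are assumptions, not the actual Exeris &lt;code&gt;EventDescriptor&lt;/code&gt; definitions): five &lt;code&gt;long&lt;/code&gt;s and two &lt;code&gt;int&lt;/code&gt;s give 48 payload bytes, and 16 bytes of explicit padding fill the cache line.&lt;/p&gt;

```java
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.ValueLayout;

// Illustrative layout for the 64-byte wire form described above.
// Field names are assumptions, not the actual Exeris definitions.
public class EventDescriptorLayoutSketch {
    static final MemoryLayout WIRE = MemoryLayout.structLayout(
            ValueLayout.JAVA_LONG.withName("eventIdMost"),
            ValueLayout.JAVA_LONG.withName("eventIdLeast"),
            ValueLayout.JAVA_LONG.withName("streamIdMost"),
            ValueLayout.JAVA_LONG.withName("streamIdLeast"),
            ValueLayout.JAVA_INT.withName("ordinal"),
            ValueLayout.JAVA_INT.withName("flags"),
            ValueLayout.JAVA_LONG.withName("timestamp"),
            MemoryLayout.paddingLayout(16) // 48 payload bytes + 16 pad = one cache line
    );

    public static void main(String[] args) {
        System.out.println(WIRE.byteSize()); // 64
    }
}
```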




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;StructuredTaskScope&lt;/code&gt; is structured in space; Flow is structured in time. The&lt;br&gt;
distinction only becomes visible at the inter-step boundary, where &lt;code&gt;FlowOutcome.PARK&lt;/code&gt;&lt;br&gt;
exits the Virtual Thread, persists state, and lets a different Virtual Thread resume&lt;br&gt;
later. No STS abstraction maps to that shape. Choreography handles the case where&lt;br&gt;
the wake comes from outside — sealed &lt;code&gt;ChoreographyDecision&lt;/code&gt; and &lt;code&gt;FlowChoreographyMapper&lt;/code&gt;&lt;br&gt;
make the event-driven path type-safe and compiler-exhaustive, so adding a new decision&lt;br&gt;
variant is a compile error rather than a silent runtime miss.&lt;/p&gt;

&lt;p&gt;The trade-offs stay visible: schema migration requires drain-and-switch, compensation&lt;br&gt;
correctness is the caller's responsibility, and the terminal state catalog is&lt;br&gt;
runtime-scoped. None of these are problems Flow pretends to solve transparently.&lt;/p&gt;

&lt;p&gt;The next concrete step is distributed saga state, decided in ADR-013. The model:&lt;br&gt;
a shared durable &lt;code&gt;FlowSnapshotStore&lt;/code&gt; (reference implementation: &lt;code&gt;JdbcFlowSnapshotStore&lt;/code&gt;&lt;br&gt;
over Postgres in v0.7 Sprint 3) as the source of truth, with Kafka choreography wake&lt;br&gt;
for cross-service saga recovery (Sprint 5/6). Optimistic concurrency via&lt;br&gt;
&lt;code&gt;FlowSnapshot.schemaVersion&lt;/code&gt; — already in the SPI with &lt;code&gt;SCHEMA_VERSION_INITIAL = 1L&lt;/code&gt;&lt;br&gt;
— resolves concurrent advance attempts from two kernel nodes without a coordinator.&lt;/p&gt;
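&lt;p&gt;The contract behind that optimistic CAS fits in a few lines. The durable implementation lives in &lt;code&gt;JdbcFlowSnapshotStore&lt;/code&gt;; the in-memory model below only shows the semantics, with illustrative names: an advance wins only if it observed the current version, and the loser must reload before retrying.&lt;/p&gt;

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory model of the schemaVersion compare-and-set described above.
// Names are illustrative; real durability lives in the JDBC store.
final class VersionedSnapshotStore {
    record Row(long schemaVersion, byte[] payload) {}

    private final ConcurrentHashMap<String, Row> rows = new ConcurrentHashMap<>();

    void create(String flowId, byte[] payload) {
        rows.put(flowId, new Row(1L, payload)); // SCHEMA_VERSION_INITIAL = 1L per the SPI
    }

    /** CAS advance: bumps the version only if expectedVersion still matches. */
    boolean casAdvance(String flowId, long expectedVersion, byte[] payload) {
        Row current = rows.get(flowId);
        if (current == null || current.schemaVersion() != expectedVersion) {
            return false; // another node advanced first; caller must reload
        }
        return rows.replace(flowId, current, new Row(expectedVersion + 1, payload));
    }
}
```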

&lt;p&gt;Three approaches were rejected before landing there. A distributed lock service&lt;br&gt;
(ZooKeeper/etcd) puts blocking coordination on the saga advance path — inconsistent&lt;br&gt;
with the No-Waste-Compute contract. CRDT-based state replication is semantically&lt;br&gt;
unsound for the saga model: the compensation stack and step ordering are not&lt;br&gt;
commutative, so concurrent advances cannot be safely merged at the data-structure level.&lt;br&gt;
A single-leader coordinator with Raft/Paxos reintroduces the centralized stateful&lt;br&gt;
component the kernel deliberately avoids.&lt;/p&gt;

&lt;p&gt;What remains: cross-service choreography wake, parked-instance enumeration on startup,&lt;br&gt;
and &lt;code&gt;schemaVersion&lt;/code&gt; wire-through in &lt;code&gt;RuntimeFlowInstance.toSnapshot()&lt;/code&gt;. The schema&lt;br&gt;
migration constraint (drain-and-switch) is a separate problem that distributed saga&lt;br&gt;
state does not soften — that requires explicit &lt;code&gt;FlowSnapshot&lt;/code&gt; definition versioning,&lt;br&gt;
which is scoped to a later milestone.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Explore the Exeris Kernel — zero-allocation architecture in running code:&lt;/em&gt;&lt;br&gt;
&lt;em&gt;🔗 &lt;a href="https://github.com/exeris-systems/exeris-kernel" rel="noopener noreferrer"&gt;exeris-systems/exeris-kernel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Flow subsystem is in &lt;code&gt;exeris-kernel-core&lt;/code&gt; and &lt;code&gt;exeris-kernel-spi&lt;/code&gt;. The TCK&lt;br&gt;
covering the full lifecycle — submit, park, wake, compensate, crash-recovery — is&lt;br&gt;
in &lt;code&gt;exeris-kernel-tck&lt;/code&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>java</category>
      <category>architecture</category>
      <category>exeris</category>
      <category>jvmperformance</category>
    </item>
    <item>
      <title>Your TLS Stack Is Lying to You About Zero-Copy</title>
      <dc:creator>Arkadiusz Przychocki</dc:creator>
      <pubDate>Fri, 01 May 2026 17:03:13 +0000</pubDate>
      <link>https://dev.to/arkstack/your-tls-stack-is-lying-to-you-about-zero-copy-581n</link>
      <guid>https://dev.to/arkstack/your-tls-stack-is-lying-to-you-about-zero-copy-581n</guid>
      <description>&lt;h2&gt;
  
  
  The "No Waste Compute" Constraint
&lt;/h2&gt;

&lt;p&gt;When I started designing the Exeris Kernel, I set one non-negotiable rule very early: no waste compute. That rule sounds like a performance slogan until it starts killing otherwise normal design decisions.&lt;/p&gt;

&lt;p&gt;I had already banned &lt;code&gt;ThreadLocal&lt;/code&gt;, moved context propagation to Scoped Values, and pushed more of the runtime into explicit off-heap ownership. The idea was simple: if the hot path is supposed to stay outside GC pressure, then memory shape and lifetime cannot be treated as incidental details.&lt;/p&gt;

&lt;p&gt;Then I reached TLS.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Constraint upfront:&lt;/strong&gt; this is not a universal argument against &lt;code&gt;SSLEngine&lt;/code&gt;. For a normal Java service it is still a perfectly reasonable choice — battle-tested and deeply integrated into the ecosystem. This is about a narrower problem: what happens when TLS sits directly on the hot path of a runtime where off-heap ownership, deterministic cleanup, and zero-allocation execution are hard architectural constraints.&lt;/p&gt;

&lt;p&gt;In most Java applications, the TLS layer is just part of the stack. It encrypts bytes, hands them off, and usually gets discussed only when certificates break or latency suddenly becomes visible in production. But in a runtime where every byte on the transport path matters, TLS is not a side concern. It is one of the defining execution boundaries. Every request passes through it. Every response passes through it. If that boundary still speaks in heap-facing contracts, then the rest of the runtime is already adapting to the wrong model.&lt;/p&gt;

&lt;p&gt;The root issue I found with &lt;code&gt;SSLEngine&lt;/code&gt; was not that it is slow in the abstract, old, or even mainly that it allocates. The deeper problem is that &lt;code&gt;SSLEngine&lt;/code&gt; keeps the TLS boundary expressed in terms of JVM-managed buffer objects and heap-visible control flow, while the rest of the runtime is trying very hard to stop doing exactly that.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Impedance Mismatch in Memory Ownership
&lt;/h2&gt;

&lt;p&gt;The failure showed up at the contract level long before any benchmark.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SSLEngine&lt;/code&gt; is shaped around &lt;code&gt;ByteBuffer&lt;/code&gt;. You call &lt;code&gt;wrap(src, dst)&lt;/code&gt; and &lt;code&gt;unwrap(src, dst)&lt;/code&gt;. You get an &lt;code&gt;SSLEngineResult&lt;/code&gt; back. You stay inside a model where the TLS boundary is expressed through JVM-owned API objects, even if some of the underlying storage uses direct memory.&lt;/p&gt;
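&lt;p&gt;The shape of that contract is visible even in the smallest possible client-side example: even with direct buffers, every TLS step is an exchange of JVM-owned objects, and control flow is read back from a heap-allocated &lt;code&gt;SSLEngineResult&lt;/code&gt;.&lt;/p&gt;

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import java.nio.ByteBuffer;

// The JDK contract in miniature: a buffer exchange through JVM-owned
// objects, with state transitions reported via an SSLEngineResult.
public class SslEngineShape {
    static SSLEngineResult firstWrap() throws Exception {
        SSLEngine engine = SSLContext.getDefault().createSSLEngine();
        engine.setUseClientMode(true);
        engine.beginHandshake();

        ByteBuffer appData = ByteBuffer.allocateDirect(1024);
        ByteBuffer netData = ByteBuffer.allocateDirect(
                engine.getSession().getPacketBufferSize());
        // For a client, the first wrap typically emits the ClientHello;
        // the result object carries the status transitions back to us.
        return engine.wrap(appData, netData);
    }

    public static void main(String[] args) throws Exception {
        SSLEngineResult result = firstWrap();
        System.out.println(result.getStatus() + " / " + result.getHandshakeStatus());
    }
}
```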

&lt;p&gt;That matters because I am not trying to reduce heap pressure only statistically in Exeris; I am trying to define explicit ownership all the way through the hot path. What I wanted from the boundary was strict control: the kernel owns the input memory, the kernel owns the output memory, the kernel controls the lifetime, and the kernel can release native state exactly when it decides the work is done.&lt;/p&gt;

&lt;p&gt;What &lt;code&gt;SSLEngine&lt;/code&gt; gives you is different. It relies on buffer exchange through a JDK object contract and state transitions expressed through JVM return objects. Its cleanup is not shaped around the same explicit ownership model as the rest of the kernel.&lt;/p&gt;

&lt;p&gt;In a conventional stack, delayed cleanup is usually acceptable because the whole system already tolerates a lot of deferred work. In an off-heap-first runtime, "cleanup later" is not neutral. It means native TLS state can survive beyond the point where the runtime is logically done with it. Once I noticed this mismatch in ownership semantics clearly, I stopped thinking of &lt;code&gt;SSLEngine&lt;/code&gt; as a component to tune and started seeing it as a boundary that belonged to the wrong architecture.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;img src="https://blog.arkstack.dev/blog/your-tls-stack-is-lying-about-zero-copy/fig1_tls_boundary.png" alt="Figure 1: SSLEngine memory contract vs. Exeris Arena ownership model."&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Netty Question
&lt;/h2&gt;

&lt;p&gt;I looked at Netty's &lt;code&gt;OpenSslEngine&lt;/code&gt; directly before committing to the FFM path. It is genuinely fast and battle-tested — and for many systems it is the right answer. But it operates under a different architectural paradigm.&lt;/p&gt;

&lt;p&gt;Netty solves the off-heap problem through pooled buffers and manual reference counting (&lt;code&gt;retain()&lt;/code&gt; and &lt;code&gt;release()&lt;/code&gt;). That is a powerful model, but it comes with a structural tax: the ownership semantics inevitably leak into application code, and forgetting to release a buffer creates notoriously difficult memory leaks. It is still a model bridging JVM objects and native memory through a heavy framework abstraction.&lt;/p&gt;

&lt;p&gt;With Panama FFM in Exeris, I don't need reference counting. I get deterministic, strict ownership. Memory boundaries are tied to scopes (like &lt;code&gt;Arena&lt;/code&gt;), meaning the lifecycle of the TLS buffer is statically guaranteed by the runtime, not dynamically managed by developers counting references. The boundary is cleaner, and the cost of maintaining it drops.&lt;/p&gt;
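&lt;p&gt;A small sketch of why scope-tied ownership removes the need for reference counting (names are illustrative, not Exeris code): the buffer's lifetime is bounded by the &lt;code&gt;try&lt;/code&gt; block, and any use after close fails deterministically instead of lingering until a GC cycle.&lt;/p&gt;

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Scope-tied ownership: the arena frees native memory exactly at scope exit,
// and use-after-close is rejected with an exception, not a silent leak.
public class ArenaOwnershipSketch {
    public static void main(String[] args) {
        MemorySegment escaped;
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment tlsBuffer = arena.allocate(16 * 1024); // e.g. one TLS record buffer
            tlsBuffer.set(ValueLayout.JAVA_BYTE, 0, (byte) 0x17); // TLS application_data type
            escaped = tlsBuffer;
        } // native memory is freed here, exactly at scope exit

        try {
            escaped.get(ValueLayout.JAVA_BYTE, 0);
        } catch (IllegalStateException expected) {
            System.out.println("use-after-close rejected: " + expected.getMessage());
        }
    }
}
```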




&lt;h2&gt;
  
  
  Explicit State and FFM
&lt;/h2&gt;

&lt;p&gt;To see why this changes the architecture, look at the actual implementation in the Exeris Kernel.&lt;/p&gt;

&lt;p&gt;First, I stopped letting the TLS engine silently manage its own lifecycle. In &lt;code&gt;TlsStateMachine&lt;/code&gt;, the transitions are deterministic and tied to the kernel's execution context, not left to the garbage collector.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Snippet from TlsStateMachine.java (Exeris Kernel)&lt;/span&gt;
&lt;span class="c1"&gt;// State transitions are explicitly modeled and bound to the off-heap lifecycle.&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;advanceState&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TlsEvent&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// I enforce strict state progression before any native call is made.&lt;/span&gt;
    &lt;span class="c1"&gt;// There is no ambiguous "maybe it's closed" state lingering on the heap.&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currentState&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nc"&gt;TlsState&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;HANDSHAKE&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nc"&gt;TlsEvent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;APP_DATA&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;IllegalStateException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Cannot process application data during handshake"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// ... explicit state handling&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Second, I mapped the actual cryptographic operation directly via Panama's FFM in &lt;code&gt;OffHeapTlsEngine&lt;/code&gt;. Notice that I am not wrapping heap arrays. I am passing raw memory segments or delegating file descriptors directly to native OpenSSL functions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Snippet from OffHeapTlsEngine.java (Exeris Kernel)&lt;/span&gt;
&lt;span class="c1"&gt;// Zero-allocation FFM call directly accessing the MemorySegment.&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;writeRaw&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MemorySegment&lt;/span&gt; &lt;span class="n"&gt;sourceSegment&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// 1. I know the memory is off-heap and strictly owned by an Arena.&lt;/span&gt;
    &lt;span class="c1"&gt;// 2. I pass the native pointer directly via FFM downcall.&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;SSL_write&lt;/span&gt;&lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;MemoryAddress&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;sslHandle&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sourceSegment&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;sourceSegment&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;byteSize&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Throwable&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;TlsNativeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"FFM downcall to SSL_write failed"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The trade-off here is explicit: I lose the safety net of &lt;code&gt;ByteBuffer&lt;/code&gt; bounds checking and GC cleanup. In return, I gain absolute control over the data path.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Exploratory Benchmarks Prove
&lt;/h2&gt;

&lt;p&gt;I prefer brutal transparency over carefully curated optimization claims. The native FFM TLS path in Exeris is still taking shape, but the early exploratory JMH results confirm exactly what I expected structurally.&lt;/p&gt;

&lt;p&gt;I tested four distinct architectural models:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;JDK SSLEngine&lt;/strong&gt;: The standard heap-facing boundary (in-memory direct only).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Netty tcnative&lt;/strong&gt;: Off-heap via JNI and reference-counted &lt;code&gt;ByteBuf&lt;/code&gt; (embedded channel pipeline).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exeris FFM (Memory BIO)&lt;/strong&gt;: Native TLS via Panama, where the runtime explicitly owns the memory (in-process).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exeris FFM (FD Owner)&lt;/strong&gt;: The absolute hot path. OpenSSL is bound directly to the socket file descriptor, bypassing intermediate memory buffers entirely (write-loopback).&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Architecture&lt;/th&gt;
&lt;th&gt;Memory Boundary&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;th&gt;Allocation (Per 1KB Record)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JDK SSLEngine&lt;/td&gt;
&lt;td&gt;Heap (&lt;code&gt;ByteBuffer&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~905k ops/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~2,528 B/op&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Netty tcnative&lt;/td&gt;
&lt;td&gt;Off-heap (&lt;code&gt;ByteBuf&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~850k ops/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~560 B/op&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Exeris Memory BIO&lt;/td&gt;
&lt;td&gt;Off-heap (Panama &lt;code&gt;Arena&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~923k ops/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0 B/op&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Exeris FD Owner&lt;/td&gt;
&lt;td&gt;Direct Socket OS boundary&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~365k ops/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0 B/op&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;(Methodology: JMH &lt;code&gt;gc&lt;/code&gt; phase, Oracle JDK 26 GA, ZGC, commit &lt;a href="https://github.com/exeris-systems/exeris-benchmarks/commit/f778683bf1d343d0c6a3a595d2d3b754c44a696c" rel="noopener noreferrer"&gt;&lt;code&gt;f778683&lt;/code&gt;&lt;/a&gt;, 2026-05-01. The Memory BIO profile phase additionally confirmed via JFR: zero &lt;code&gt;jdk.GarbageCollection&lt;/code&gt; events recorded — ZGC never ran a single collection during the entire benchmark run. Full suite in &lt;a href="https://github.com/exeris-systems/exeris-benchmarks" rel="noopener noreferrer"&gt;&lt;code&gt;exeris-benchmarks&lt;/code&gt;&lt;/a&gt;.)&lt;/em&gt;&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;img src="https://blog.arkstack.dev/blog/your-tls-stack-is-lying-about-zero-copy/fig2_fd_owner_path.png" alt="Figure 2: The data path of Memory BIO vs FD Owner directly binding to the socket descriptor."&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;Let’s unpack what these numbers actually mean, because context matters more than raw digits.&lt;/p&gt;

&lt;p&gt;In a pure in-process memory test, the &lt;strong&gt;Exeris Memory BIO&lt;/strong&gt; implementation outpaces both standard &lt;code&gt;SSLEngine&lt;/code&gt; and Netty's &lt;code&gt;tcnative&lt;/code&gt;. The runtime achieves ~923,000 ops/s without paying the structural tax of heap-facing buffer exchanges.&lt;/p&gt;

&lt;p&gt;But the most important architectural metric is the last row: &lt;strong&gt;Exeris FD Owner&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A naive reading would ask why the throughput dropped to ~365,000 ops/s. The answer is that the FD Owner benchmark leaves the synthetic in-process memory arena entirely. It writes directly to the OS loopback interface via socket file descriptors. At this stage, I am no longer benchmarking memory copy operations; I am hitting the limits of the OS network stack and syscalls.&lt;/p&gt;




&lt;h2&gt;
  
  
  The GC Layer and the True Cost of Abstractions
&lt;/h2&gt;

&lt;p&gt;What changed my mind was not the ops/s number. It was what &lt;code&gt;-prof gc&lt;/code&gt; showed underneath it.&lt;/p&gt;

&lt;p&gt;To process a standard 1024-byte payload, &lt;code&gt;SSLEngine&lt;/code&gt; allocates over &lt;strong&gt;2.5 kilobytes&lt;/strong&gt; of garbage (&lt;code&gt;gc.alloc.rate.norm&lt;/code&gt;). The TLS layer generates more heap waste than the data it encrypts. I had already pushed the rest of the hot path off-heap — and the GC profiler was telling me the TLS boundary was quietly undoing that work on every record.&lt;/p&gt;

&lt;p&gt;By contrast, the Exeris FFM paths drop the normalized allocation rate to &lt;strong&gt;strict zero&lt;/strong&gt;. (The profiler registers &lt;code&gt;0.015 B/op&lt;/code&gt; with zero actual GC counts, which is standard JMH measurement noise for absolute zero.)&lt;/p&gt;

&lt;p&gt;This is the core definition of "No Waste Compute." By eliminating the intermediate buffer tier completely, the kernel fundamentally changes the garbage collector's job: ZGC is no longer forced to clean up after the cryptography layer at all.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;img src="https://blog.arkstack.dev/blog/your-tls-stack-is-lying-about-zero-copy/fig3_gc_allocation_rate.png" alt="Figure 3: Allocation rate (garbage generated) per 1KB payload across different TLS architectures."&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Where SSLEngine Still Wins
&lt;/h2&gt;

&lt;p&gt;A few things remain true even after this architectural shift.&lt;/p&gt;

&lt;p&gt;First, &lt;code&gt;SSLEngine&lt;/code&gt; is still the right answer for the vast majority of systems. If I were building a normal Spring Boot application, a Netty service, or anything where the goal was strong operational simplicity with conventional JVM trade-offs, I would not force a native TLS path into the design.&lt;/p&gt;

&lt;p&gt;Second, direct buffers and pooling still matter. This is not an article pretending the entire existing Java ecosystem is naive.&lt;/p&gt;

&lt;p&gt;Finally, Panama FFM and native TLS do not remove complexity—they relocate it. You get absolute control, but you also inherit absolute responsibility for lifecycle, correctness, and failure modes. This is an architectural decision for a highly specialized kernel, not a generic industry recommendation.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Changed, and What I Gave Up
&lt;/h2&gt;

&lt;p&gt;A lot of JVM performance work still assumes the heap is the center of the system and the goal is simply to make it hurt less. That is a valid way to design software, but it is not the design I wanted for Exeris.&lt;/p&gt;

&lt;p&gt;Once the runtime moved toward explicit off-heap ownership, &lt;code&gt;SSLEngine&lt;/code&gt; stopped looking like a harmless standard abstraction and started looking like the one boundary that could quietly drag the whole transport path back into the wrong model.&lt;/p&gt;

&lt;p&gt;I dropped it because for this specific runtime, it speaks the wrong language. If the hot path is supposed to be off-heap and deterministic by design, then TLS has to speak that language too.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;The FFM native TLS implementation and the explicit ownership model are built entirely off-heap in the Exeris Kernel. If you want to verify the numbers, run the code, or explore zero-allocation architecture:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Review the metrics: &lt;a href="https://github.com/exeris-systems/exeris-benchmarks" rel="noopener noreferrer"&gt;exeris-systems/exeris-benchmarks&lt;/a&gt;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Inspect the Kernel and leave a star: &lt;a href="https://github.com/exeris-systems/exeris-kernel" rel="noopener noreferrer"&gt;exeris-systems/exeris-kernel&lt;/a&gt;&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>java</category>
      <category>architecture</category>
      <category>exeris</category>
      <category>jvmperformance</category>
    </item>
    <item>
      <title>StructuredTaskScope beyond toy examples: dependency-aware kernel bootstrap in modern Java</title>
      <dc:creator>Arkadiusz Przychocki</dc:creator>
      <pubDate>Tue, 07 Apr 2026 10:24:29 +0000</pubDate>
      <link>https://dev.to/arkstack/structuredtaskscope-beyond-toy-examples-dependency-aware-kernel-bootstrap-in-modern-java-57j0</link>
      <guid>https://dev.to/arkstack/structuredtaskscope-beyond-toy-examples-dependency-aware-kernel-bootstrap-in-modern-java-57j0</guid>
      <description>&lt;p&gt;I did not start this because I wanted to write an article about &lt;code&gt;StructuredTaskScope&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I got there from a more annoying direction: bootstrap had stopped being a startup script.&lt;/p&gt;

&lt;p&gt;Once the kernel had a real subsystem graph — config, memory, persistence, graph, events, flow, transport — the old mental model broke down. The question was no longer &lt;em&gt;"how do I start modules?"&lt;/em&gt; It became &lt;em&gt;"what is actually allowed to start now, what must already be ready, and what happens if one piece fails halfway through?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That is a different problem from request fan-out.&lt;/p&gt;

&lt;p&gt;This article is a follow-up to my earlier piece on DOP, &lt;code&gt;ScopedValue&lt;/code&gt;, and Loom. There, I used &lt;code&gt;StructuredTaskScope&lt;/code&gt; as a clean example of native fail-fast execution. Here I want to show the more useful case: what happened once I tried to fit it into a real lifecycle model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Constraint upfront:&lt;/strong&gt; this only makes sense when the execution path is still under my control. If you are building a plugin surface or a highly open extension model, parts of this break down quickly.&lt;/p&gt;




&lt;h2&gt;
  
  
  Bootstrap stopped being linear
&lt;/h2&gt;

&lt;p&gt;A lot of startup code still assumes the system is basically a list:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;build some objects&lt;/li&gt;
&lt;li&gt;call &lt;code&gt;start()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;maybe wait a bit&lt;/li&gt;
&lt;li&gt;hope shutdown is the reverse&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That works until the dependency graph becomes real.&lt;/p&gt;

&lt;p&gt;In Exeris, bootstrap is constrained by subsystem relationships, not by the order I happen to like in a &lt;code&gt;main()&lt;/code&gt; method. Some subsystems are foundational. Some are optional. Some can start only after several others are already running. Some failures can degrade. Some cannot.&lt;/p&gt;

&lt;p&gt;At that point, startup becomes a graph problem whether you admit it or not.&lt;/p&gt;
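&lt;p&gt;To make the graph framing concrete — this is an illustrative sketch, not the Exeris code, and the &lt;code&gt;Node&lt;/code&gt; shape here is hypothetical — a deterministic topological pass over declared dependencies is all it takes to make &lt;em&gt;"what may start now"&lt;/em&gt; explicit. Kahn's algorithm over insertion-ordered collections breaks ties by declaration order, never by timing:&lt;/p&gt;

```java
import java.util.*;

// Hypothetical descriptor: each subsystem declares its hard dependencies by name.
record Node(String name, List<String> deps) {}

class TopoOrder {
    // Kahn's algorithm; insertion-ordered maps keep the result deterministic.
    static List<String> order(List<Node> nodes) {
        Map<String, Integer> indegree = new LinkedHashMap<>();
        nodes.forEach(n -> indegree.put(n.name(), n.deps().size()));

        Deque<String> ready = new ArrayDeque<>();
        indegree.forEach((name, d) -> { if (d == 0) ready.add(name); });

        List<String> out = new ArrayList<>();
        while (!ready.isEmpty()) {
            String current = ready.poll();
            out.add(current);
            // Finishing `current` may unblock subsystems that depended on it.
            for (Node n : nodes) {
                if (n.deps().contains(current)
                        && indegree.merge(n.name(), -1, Integer::sum) == 0) {
                    ready.add(n.name());
                }
            }
        }
        if (out.size() != nodes.size()) {
            throw new IllegalStateException("cycle among pending subsystems");
        }
        return out;
    }
}
```

&lt;p&gt;The output is a total order that respects the graph. Anything that looks like a parallel startup round later is just a slice of this order whose members do not depend on each other.&lt;/p&gt;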

&lt;p&gt;What I kept from the old model was determinism.&lt;br&gt;
What I dropped was the idea that everything meaningful should happen inside one generic &lt;em&gt;"start all modules"&lt;/em&gt; phase.&lt;/p&gt;

&lt;p&gt;The shape of the graph matters more than the urge to parallelize it.&lt;/p&gt;


    &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.arkstack.dev%2Fblog%2Fstructured-task-scope-beyond-toy-examples%2Ffig1_boot_diag.png" alt="Figure 1: Dependency-aware kernel bootstrap graph in Exeris. The point is that concurrency is legal only where the graph permits it." width="800" height="1784"&gt;


&lt;p&gt;&lt;em&gt;Figure 1: Dependency-aware kernel bootstrap graph in Exeris. The point is not that several subsystems exist. The point is that concurrency is legal only where the graph permits it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Once the graph was explicit, &lt;em&gt;"just parallelize bootstrap"&lt;/em&gt; stopped being a serious answer. The graph already tells you where concurrency is allowed and where it is simply too early.&lt;/p&gt;

&lt;p&gt;The bootstrap docs in Exeris describe the same thing from the subsystem side: L0 remains foundational, higher layers can move only after the substrate is ready, and shutdown keeps that structure in reverse.&lt;/p&gt;


&lt;h2&gt;The split that actually mattered&lt;/h2&gt;

&lt;p&gt;The design choice that mattered most was not &lt;em&gt;using&lt;/em&gt; &lt;code&gt;StructuredTaskScope&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It was deciding &lt;strong&gt;where not to use it&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;At first, the obvious temptation was to parallelize more of bootstrap. If the JVM gives you virtual threads and structured concurrency, it is very easy to start looking for places to apply them.&lt;/p&gt;

&lt;p&gt;I ended up doing less than that.&lt;/p&gt;

&lt;p&gt;I kept &lt;code&gt;initialize()&lt;/code&gt; sequential and topological.&lt;br&gt;
I only allowed structured parallelism in &lt;code&gt;start()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That was not ideological. It was practical.&lt;/p&gt;

&lt;p&gt;Initialization is where the orchestrator builds structure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;provider bindings&lt;/li&gt;
&lt;li&gt;health registration&lt;/li&gt;
&lt;li&gt;active subsystem ordering&lt;/li&gt;
&lt;li&gt;dependency-safe lifecycle state&lt;/li&gt;
&lt;li&gt;bootstrap telemetry hooks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That phase wants determinism more than it wants speed. I did not want graph construction, provider composition, and lifecycle execution to collapse into one concurrent blur.&lt;/p&gt;

&lt;p&gt;Startup is different. Once the graph is already resolved and the active set is known, concurrency becomes useful — but only if it stays inside the same lifecycle boundaries the graph already established.&lt;/p&gt;

&lt;p&gt;That led to a much simpler rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;initialization stays ordered, startup may become parallel.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This was the point where the old model stopped making sense for me. I was no longer trying to make bootstrap &lt;em&gt;faster&lt;/em&gt; in the abstract. I was trying to keep lifecycle ownership readable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;BootstrapPhase&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;BootstrapPhase&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;values&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Subsystem&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;forPhase&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;orderedSubsystems&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;phase&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toList&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forPhase&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phase&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nc"&gt;BootstrapPhase&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;FOUNDATION&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;startSequential&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forPhase&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;profileName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startedNames&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;startParallel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forPhase&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;profileName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startedNames&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice, &lt;code&gt;FOUNDATION&lt;/code&gt; stays sequential on purpose. That includes the parts of bootstrap that decide whether the rest of the kernel can even be interpreted correctly: configuration roots, base runtime substrate, exception boundaries, and core providers.&lt;/p&gt;

&lt;p&gt;I could have parallelized more of that. I did not.&lt;/p&gt;

&lt;p&gt;The trade-off is deliberate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I give up some startup parallelism early&lt;/li&gt;
&lt;li&gt;in exchange for a cleaner substrate&lt;/li&gt;
&lt;li&gt;and less ambiguity when the higher layers begin to move&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not universal. If your startup graph is shallow and your layers are genuinely independent, you can be more aggressive. In my case, the real cost of a bad foundation was not a few extra milliseconds. It was a fuzzier lifecycle model and harder-to-classify failures later.&lt;/p&gt;




&lt;h2&gt;ScopedValue still mattered at the boundary&lt;/h2&gt;

&lt;p&gt;This article is about &lt;code&gt;StructuredTaskScope&lt;/code&gt;, but I ended up reusing the same lesson from the previous piece: context propagation only stays clean if the boundary is explicit.&lt;/p&gt;

&lt;p&gt;In Exeris, bootstrap resolves configuration once, then binds it at the kernel boundary before the rest of the lifecycle begins. Everything spawned under that boundary inherits the same immutable context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;ScopedValue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;KernelProviders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;CURRENT_CONFIG&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;call&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;runBootInsideScope&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;orchestrator&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;configRegistry&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;configWatcher&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernelMain&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="o"&gt;});&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SubsystemCircularDependencyException&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SubsystemOrchestrator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BootstrapException&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;BootstrapException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Subsystem bootstrap failed: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That choice mattered more than another layer of constructor wiring would have.&lt;/p&gt;

&lt;p&gt;I did not want every subsystem, handler, or virtual thread to receive config through argument threading just because bootstrap needed lifecycle scope. I also did not want to fall back to &lt;code&gt;ThreadLocal&lt;/code&gt; and reintroduce the same inheritance and mutability problems I had already rejected elsewhere.&lt;/p&gt;

&lt;p&gt;So the boundary stayed strict:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;config is resolved once&lt;/li&gt;
&lt;li&gt;bound once&lt;/li&gt;
&lt;li&gt;inherited downward&lt;/li&gt;
&lt;li&gt;and torn down when boot exits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That kept the lifecycle model cleaner. It also meant that when I later opened structured startup rounds, they inherited the same immutable runtime context without extra ceremony.&lt;/p&gt;




&lt;h2&gt;The useful part was not STS itself&lt;/h2&gt;

&lt;p&gt;The useful part was computing a &lt;strong&gt;safe round&lt;/strong&gt; before opening a scope.&lt;/p&gt;

&lt;p&gt;I do not want to smooth this into a generic explanation, because it is really the center of the design.&lt;/p&gt;

&lt;p&gt;The orchestrator does not just fork all pending subsystems for a phase and wait.&lt;/p&gt;

&lt;p&gt;It first computes which subsystems are actually safe to start &lt;strong&gt;now&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pendingNames&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pending&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;Subsystem:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;collect&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;util&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Collectors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toCollection&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;LinkedHashSet:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Subsystem&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ready&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pending&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subsystem&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;dependenciesReadyForRound&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subsystem&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pendingNames&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startedNames&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toList&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ready&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;BootstrapException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"Phase "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" cannot make progress: unresolved dependencies among pending subsystems "&lt;/span&gt;
            &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;pendingNames&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That changed the role of &lt;code&gt;StructuredTaskScope&lt;/code&gt; completely.&lt;/p&gt;

&lt;p&gt;It was no longer responsible for discovering order.&lt;br&gt;
It was responsible for executing one dependency-safe round inside an order the orchestrator had already made explicit.&lt;/p&gt;

&lt;p&gt;That is why I keep saying this is a graph problem first and a concurrency problem second.&lt;/p&gt;


    &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.arkstack.dev%2Fblog%2Fstructured-task-scope-beyond-toy-examples%2Ffig2_ready_round.png" alt="Figure 2: Dependency-safe startup round. StructuredTaskScope is opened only after the orchestrator computes a ready set from the graph." width="800" height="361"&gt;


&lt;p&gt;&lt;em&gt;Figure 2: Dependency-safe startup round. The orchestrator computes eligibility first, then gives &lt;code&gt;StructuredTaskScope&lt;/code&gt; a bounded unit of work to own.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This was the point where the old &lt;em&gt;"just launch it and coordinate later"&lt;/em&gt; model stopped making sense. I did not want startup order to become an emergent property of timing, future composition, or whichever task completed first.&lt;/p&gt;

&lt;p&gt;I wanted concurrency to appear &lt;strong&gt;after&lt;/strong&gt; dependency eligibility had already been established.&lt;/p&gt;
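&lt;p&gt;The eligibility helper itself is not shown above, so here is a plausible sketch of what &lt;code&gt;dependenciesReadyForRound&lt;/code&gt; has to decide — the &lt;code&gt;Subsystem&lt;/code&gt; record and its fields are hypothetical stand-ins, not the real Exeris types. A subsystem may join the round only if none of its dependencies is still pending:&lt;/p&gt;

```java
import java.util.*;

class ReadyCheck {
    // Hypothetical stand-in for the Subsystem type the orchestrator iterates over.
    record Subsystem(String name, Set<String> dependencies) {}

    // A subsystem is eligible only if no dependency is still pending.
    // Dependencies in startedNames are satisfied; anything neither pending
    // nor started was resolved away earlier (e.g. removed under DEGRADE).
    static boolean dependenciesReadyForRound(Subsystem subsystem,
                                             Set<String> pendingNames,
                                             Set<String> startedNames) {
        for (String dep : subsystem.dependencies()) {
            if (pendingNames.contains(dep) && !startedNames.contains(dep)) {
                return false; // a prerequisite has not started yet
            }
        }
        return true;
    }
}
```

&lt;p&gt;Repeating this check per round either makes progress or leaves an empty ready set — which is exactly the deadlock that the &lt;code&gt;BootstrapException&lt;/code&gt; in the previous snippet reports instead of hanging.&lt;/p&gt;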


&lt;h2&gt;This is where StructuredTaskScope actually earned its place&lt;/h2&gt;

&lt;p&gt;Once the ready set exists, the role of &lt;code&gt;StructuredTaskScope&lt;/code&gt; becomes very narrow and very clean.&lt;/p&gt;

&lt;p&gt;It owns one startup round.&lt;/p&gt;

&lt;p&gt;That is it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StructuredTaskScope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;open&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StructuredTaskScope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Subtask&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;tasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ready&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StructuredTaskScope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Subtask&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;subsystem&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fork&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                        &lt;span class="n"&gt;doStart&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subsystem&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
                    &lt;span class="o"&gt;}))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toList&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;join&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Throwable&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;failures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nc"&gt;StructuredTaskScope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Subtask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;State&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;FAILED&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;StructuredTaskScope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Subtask&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;exception&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toList&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;failures&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Throwable&lt;/span&gt; &lt;span class="n"&gt;first&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;failures&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getFirst&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;BootstrapException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;failures&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" subsystem(s) failed in phase "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt;
                &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;". First failure: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the part I actually like.&lt;/p&gt;

&lt;p&gt;Not because it is clever. Mostly because it is boring in the right way.&lt;/p&gt;

&lt;p&gt;The round has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an owner&lt;/li&gt;
&lt;li&gt;explicit lifetime&lt;/li&gt;
&lt;li&gt;explicit completion&lt;/li&gt;
&lt;li&gt;explicit failure collection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No task belongs to some vague executor that outlives the lifecycle moment that created it. No background startup work escapes into &lt;em&gt;"maybe still running"&lt;/em&gt; territory. The concurrency boundary finally matches the lifecycle boundary.&lt;/p&gt;

&lt;p&gt;That was the point.&lt;/p&gt;

&lt;p&gt;And that is also why I think &lt;code&gt;StructuredTaskScope&lt;/code&gt; is more interesting here than in the usual &lt;em&gt;"fetch two things in parallel"&lt;/em&gt; examples. Those examples prove the API works. This kind of orchestrator is where it starts to fit the shape of the system.&lt;/p&gt;




&lt;h2&gt;I could have done this with futures. I did not want to.&lt;/h2&gt;

&lt;p&gt;There is nothing impossible about building this with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ExecutorService&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;CompletableFuture&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;latches&lt;/li&gt;
&lt;li&gt;custom worker tracking&lt;/li&gt;
&lt;li&gt;hand-rolled failure aggregation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the goal was just &lt;em&gt;"run multiple startup actions in parallel,"&lt;/em&gt; all of those would work.&lt;/p&gt;

&lt;p&gt;But the real problem was never just parallelism.&lt;/p&gt;

&lt;p&gt;What I actually cared about was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;who owns this work&lt;/li&gt;
&lt;li&gt;when exactly this round ends&lt;/li&gt;
&lt;li&gt;what belongs to this phase and what does not&lt;/li&gt;
&lt;li&gt;how failure is surfaced&lt;/li&gt;
&lt;li&gt;how shutdown reasoning stays clean&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bootstrap is one of the worst places to tolerate vague concurrency ownership. When startup fails, I do not want to guess whether some task is still alive in the background or whether a future chain has already detached from the lifecycle moment that spawned it.&lt;/p&gt;
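&lt;p&gt;A tiny contrast makes the detachment problem concrete. This is not Exeris code — just a minimal sketch of the future-based shape I wanted to avoid, with a latch standing in for a slow subsystem start. The work is owned by the common pool, so the method that "started" it can return while the work is still alive:&lt;/p&gt;

```java
import java.util.concurrent.*;

class DetachedBootDemo {
    // Stand-in for a slow subsystem start.
    static final CountDownLatch slowStart = new CountDownLatch(1);

    static CompletableFuture<Void> bootOne() {
        // Ownership leaks here: the task runs on the common pool, whose
        // lifetime has nothing to do with this method or any boot phase.
        return CompletableFuture.runAsync(() -> {
            try {
                slowStart.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }
}
```

&lt;p&gt;After &lt;code&gt;bootOne()&lt;/code&gt; returns, the startup work is still running somewhere in the background, and nothing structural forces the caller to join it. &lt;code&gt;StructuredTaskScope&lt;/code&gt; makes that state unrepresentable: the scope cannot be exited while a subtask is still alive.&lt;/p&gt;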

&lt;p&gt;That is the real difference here.&lt;/p&gt;

&lt;p&gt;The point is not that &lt;code&gt;StructuredTaskScope&lt;/code&gt; can run tasks. The point is that it gives this round a proper boundary.&lt;/p&gt;

&lt;p&gt;I would still use more conventional concurrency tools when the lifecycle is shallower, the ownership model is already loose, or the surrounding architecture does not benefit from such a strict boundary. This is not a universal replacement story.&lt;/p&gt;




&lt;h2&gt;The failure policy mattered at least as much as the concurrency primitive&lt;/h2&gt;

&lt;p&gt;I do not think this model would feel coherent without an explicit failure policy.&lt;/p&gt;

&lt;p&gt;Exeris supports both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;FAIL_FAST&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DEGRADE&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But not symmetrically.&lt;/p&gt;

&lt;p&gt;Foundational subsystems are still mandatory. They do not get to degrade just because higher layers can. That boundary matters.&lt;/p&gt;

&lt;p&gt;Inside the orchestrator, that asymmetry is explicit. Optional subsystems may be removed under &lt;code&gt;DEGRADE&lt;/code&gt;, but a mandatory failure still aborts boot.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;isMandatory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
        &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subsystem&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;phase&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nc"&gt;BootstrapPhase&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;FOUNDATION&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;subsystem&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isOptional&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;failurePolicy&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nc"&gt;FailurePolicy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DEGRADE&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;isMandatory&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;removeSubsystemAndTransitiveDependents&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subsystem&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;healthMonitor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;markKernelState&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;KernelHealthMonitor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;KernelState&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;FAILED&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;BootstrapException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"Subsystem '"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;subsystem&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"' failed: "&lt;/span&gt;
            &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;failure&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;failure&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I kept that asymmetry because not all failures mean the same thing. An optional higher-level capability failing to start can be survivable. A foundation-layer failure usually means the system no longer has a sane substrate to run on.&lt;/p&gt;

&lt;p&gt;This was another place where I resisted smoothing the model into something more uniform. Uniformity would have looked cleaner on paper, but it would have made the lifecycle semantics less truthful.&lt;/p&gt;

&lt;p&gt;So the useful question was never:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;can these tasks run in parallel?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It was:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;what does it mean for the kernel if this one fails right now?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That question forces the architecture to stay honest.&lt;/p&gt;




&lt;h2&gt;Startup only makes sense if shutdown keeps the same shape&lt;/h2&gt;

&lt;p&gt;One thing I did not want to lose was lifecycle symmetry.&lt;/p&gt;

&lt;p&gt;A graph-shaped startup model should not collapse into an improvised shutdown path. If startup order is derived from dependency structure, shutdown should preserve that structure in reverse.&lt;/p&gt;

&lt;p&gt;In practice, that means I care about reverse topological shutdown just as much as startup rounds.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Subsystem&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;reversed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;orderedSubsystems&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;Collections&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reversed&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Subsystem&lt;/span&gt; &lt;span class="n"&gt;subsystem&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;reversed&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subsystem&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isRunning&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;subsystem&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stop&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That sounds obvious, but it matters more once concurrency enters the picture. Structured startup is easier to trust when the rest of the lifecycle still behaves like one coherent model instead of a collection of unrelated hooks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy4y6o5lujfaff6w8z3fd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy4y6o5lujfaff6w8z3fd.png" alt="Figure 3: Lifecycle symmetry in Exeris. Startup order and shutdown order are two sides of the same dependency model." width="800" height="2914"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 3: Lifecycle symmetry in Exeris. Startup derives capability from dependency order; shutdown preserves that order in reverse so the lifecycle remains one coherent model.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I would not make reverse shutdown the headline of the article, but I would not treat it as a footnote either. It is part of the same argument: structured concurrency helps most when the surrounding lifecycle is already structured.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I measured, and what I did not claim
&lt;/h2&gt;

&lt;p&gt;This article is architectural first, but I do not want to leave it floating at the level of &lt;em&gt;"this feels cleaner."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The evidence I care about here is not generic throughput. It is lifecycle evidence.&lt;/p&gt;

&lt;p&gt;For this model, the useful signals are things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sequential startup vs phase-grouped structured startup&lt;/li&gt;
&lt;li&gt;total cold boot duration until boot-ready&lt;/li&gt;
&lt;li&gt;per-phase startup timing&lt;/li&gt;
&lt;li&gt;round timing in parallel phases&lt;/li&gt;
&lt;li&gt;repeated cold-start variance&lt;/li&gt;
&lt;li&gt;degraded boot timing when optional subsystems are removed&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;jdk.VirtualThreadPinned&lt;/code&gt; events during startup&lt;/li&gt;
&lt;li&gt;final active subsystem count recorded at boot-ready&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is why I treat JFR as part of the architecture here, not just as a performance tool. If the startup model is real, it should leave a readable lifecycle trace behind it.&lt;/p&gt;
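&lt;p&gt;To make that concrete, here is a minimal sketch of watching for pinning during startup with JFR event streaming. It uses only the standard &lt;code&gt;jdk.jfr.consumer&lt;/code&gt; API; the class and method names (&lt;code&gt;StartupPinningWatch&lt;/code&gt;, &lt;code&gt;countPinnedDuring&lt;/code&gt;) are illustrative, not the Exeris API, and the startup work is a stand-in.&lt;/p&gt;

```java
import java.util.concurrent.atomic.AtomicLong;
import jdk.jfr.consumer.RecordingStream;

public class StartupPinningWatch {

    /** Runs the given startup work while counting jdk.VirtualThreadPinned events. */
    static long countPinnedDuring(Runnable startup) {
        AtomicLong pinned = new AtomicLong();
        try (RecordingStream rs = new RecordingStream()) {
            // Surface any virtual-thread pinning that happens while subsystems start.
            rs.enable("jdk.VirtualThreadPinned").withStackTrace();
            rs.onEvent("jdk.VirtualThreadPinned", e -> pinned.incrementAndGet());
            rs.startAsync();
            startup.run();      // stand-in for the real bootstrap sequence
            rs.stop();          // flush events observed so far before returning
        }
        return pinned.get();
    }

    public static void main(String[] args) {
        long events = countPinnedDuring(() -> { /* subsystem work */ });
        System.out.println("VirtualThreadPinned events during startup: " + events);
    }
}
```

&lt;p&gt;&lt;em&gt;A clean boot should report zero pinning events; any non-zero count points at a subsystem holding a monitor across a blocking call.&lt;/em&gt;&lt;/p&gt;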

&lt;p&gt;The bootstrap documentation in Exeris already treats startup telemetry as part of the contract: boot-ready, shutdown completion, dependency-cycle detection, and lifecycle state are all first-class signals rather than incidental logs.&lt;/p&gt;
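&lt;p&gt;Dependency-cycle detection falls out of the same traversal that computes startup rounds. A minimal Kahn-style sketch (the names are illustrative, not the Exeris code): each round is the set of subsystems whose dependencies are already satisfied, and if a pass makes no progress, the remaining nodes form a cycle.&lt;/p&gt;

```java
import java.util.*;

public class StartupRounds {

    /**
     * Groups nodes into startup rounds: round N contains every node whose
     * dependencies all sit in earlier rounds. Throws if a cycle remains.
     */
    static List<List<String>> rounds(Map<String, Set<String>> deps) {
        Map<String, Set<String>> remaining = new HashMap<>();
        deps.forEach((k, v) -> remaining.put(k, new HashSet<>(v)));

        List<List<String>> result = new ArrayList<>();
        while (!remaining.isEmpty()) {
            List<String> round = remaining.entrySet().stream()
                    .filter(e -> e.getValue().isEmpty())
                    .map(Map.Entry::getKey)
                    .sorted() // deterministic order within a round
                    .toList();
            if (round.isEmpty()) {
                // No node is startable: whatever is left depends on itself transitively.
                throw new IllegalStateException("Dependency cycle among: " + remaining.keySet());
            }
            round.forEach(remaining::remove);
            remaining.values().forEach(d -> round.forEach(d::remove));
            result.add(round);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> deps = Map.of(
                "config", Set.of(),
                "metrics", Set.of("config"),
                "http", Set.of("config", "metrics"));
        System.out.println(rounds(deps));
        // → [[config], [metrics], [http]]
    }
}
```

&lt;p&gt;&lt;em&gt;Each returned round is exactly the unit that can be forked inside one bounded scope; the cycle failure surfaces before any subsystem runs.&lt;/em&gt;&lt;/p&gt;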

&lt;p&gt;I also have some early exploratory startup measurements, although I am deliberately treating them as supporting evidence rather than a headline claim.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Exeris community h1&lt;/th&gt;
&lt;th&gt;Quarkus JVM VT tuned&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Startup → health-ready&lt;/td&gt;
&lt;td&gt;1132 ms&lt;/td&gt;
&lt;td&gt;2182 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Startup → first request&lt;/td&gt;
&lt;td&gt;1205 ms&lt;/td&gt;
&lt;td&gt;2432 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Health-ready → first request&lt;/td&gt;
&lt;td&gt;73 ms&lt;/td&gt;
&lt;td&gt;250 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These measurements were taken on dev hardware and remain sensitive to local runtime conditions, including machine state and GUI / no-GUI setup. So I am not using them to claim broad startup superiority yet.&lt;/p&gt;

&lt;p&gt;What they do show already is narrower, but still useful: this bootstrap model is measurable in operational terms. It is not just architecturally cleaner on paper.&lt;/p&gt;

&lt;p&gt;The smaller &lt;code&gt;health-ready → first-request&lt;/code&gt; gap is especially interesting to me, because it suggests the readiness signal sits close to real serving capacity: the runtime is not just reporting healthy early, it can do useful work almost immediately afterwards.&lt;/p&gt;

&lt;p&gt;I have also validated the same runtime under a constrained exploratory profile with zero request errors, but that belongs to a different discussion than this article. The point here is narrower: the lifecycle model is observable and survives contact with measurement.&lt;/p&gt;

&lt;p&gt;I am also deliberately keeping the claim scope narrow. Early measurements are useful, but they are still sensitive to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;classloading state&lt;/li&gt;
&lt;li&gt;JIT state&lt;/li&gt;
&lt;li&gt;machine noise&lt;/li&gt;
&lt;li&gt;native load conditions&lt;/li&gt;
&lt;li&gt;startup environment shape&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I would rather say:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;this model is measurable and operationally inspectable&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;than jump too quickly to:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;this model is definitively faster.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That comes later, if the data actually holds.&lt;/p&gt;




&lt;h2&gt;
  
  
  What this does not solve
&lt;/h2&gt;

&lt;p&gt;This model does not fix bad subsystem boundaries.&lt;/p&gt;

&lt;p&gt;It does not fix circular graphs.&lt;br&gt;
It does not fix startup work that should not exist in the first place.&lt;br&gt;
It does not mean every subsystem should suddenly become parallel.&lt;br&gt;
And it definitely does not generalize to every runtime architecture.&lt;/p&gt;

&lt;p&gt;I would still use more conventional patterns when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the graph is shallow&lt;/li&gt;
&lt;li&gt;the lifecycle is simpler&lt;/li&gt;
&lt;li&gt;subsystem ownership is fuzzy by design&lt;/li&gt;
&lt;li&gt;plugin-style openness matters more than deterministic startup shape&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a universal recipe. It applies when the execution path is still under my control and the lifecycle itself is part of the architecture.&lt;/p&gt;

&lt;p&gt;That boundary matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I kept, what I dropped
&lt;/h2&gt;

&lt;p&gt;I think this is the part that usually gets lost when articles get too polished.&lt;/p&gt;

&lt;p&gt;I did not end up with a universal &lt;em&gt;"use STS for bootstrap"&lt;/em&gt; rule.&lt;/p&gt;

&lt;p&gt;What I kept:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;topological lifecycle order&lt;/li&gt;
&lt;li&gt;deterministic initialization&lt;/li&gt;
&lt;li&gt;explicit phase boundaries&lt;/li&gt;
&lt;li&gt;explicit failure policy&lt;/li&gt;
&lt;li&gt;reverse shutdown symmetry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What I dropped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the idea that &lt;code&gt;initialize()&lt;/code&gt; and &lt;code&gt;start()&lt;/code&gt; should be the same phase&lt;/li&gt;
&lt;li&gt;the idea that startup parallelism should be maximal&lt;/li&gt;
&lt;li&gt;the idea that concurrency should appear before dependency safety is known&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I also considered using &lt;code&gt;StructuredTaskScope&lt;/code&gt; in other places where it looked fashionable on paper. In at least one case, it simply did not buy me anything meaningful, so I left it out.&lt;/p&gt;

&lt;p&gt;That contrast was useful. It made the bootstrap use case clearer. &lt;code&gt;StructuredTaskScope&lt;/code&gt; was not valuable because it was new. It was valuable because this part of the system already had a natural owner, a natural boundary, and a natural failure model.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;A few things became clearer to me while building this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bootstrap became a graph problem before it became a concurrency problem.&lt;/strong&gt;&lt;br&gt;
That changed which part of the design actually needed structure.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;StructuredTaskScope&lt;/code&gt; helped only after order was already explicit.&lt;/strong&gt;&lt;br&gt;
The useful move was not &lt;em&gt;"fork everything"&lt;/em&gt; but &lt;em&gt;"compute a safe round, then run it inside a bounded scope."&lt;/em&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The trade-off is intentional.&lt;/strong&gt;&lt;br&gt;
I kept &lt;code&gt;initialize()&lt;/code&gt; and &lt;code&gt;FOUNDATION&lt;/code&gt; sequential on purpose. I gave up some parallelism to keep lifecycle ownership and failure semantics easier to reason about.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What this unlocks next in Exeris is not just cleaner startup code. It gives me a more inspectable lifecycle model for later work around health, telemetry, subsystem isolation, and eventually more demanding cold-start contracts.&lt;/p&gt;

&lt;p&gt;If you want to see what this looks like outside a toy example, the bootstrap and lifecycle code is in the Exeris Kernel repository.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Explore the Exeris Kernel — zero-allocation architecture in running code:&lt;br&gt;
🔗 &lt;a href="https://github.com/exeris-systems/exeris-kernel" rel="noopener noreferrer"&gt;exeris-systems/exeris-kernel&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>java</category>
      <category>projectloom</category>
      <category>structuredconcurrency</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Reevaluating 1990s OOP in Java: DOP, Scoped Values, and Loom in 2026</title>
      <dc:creator>Arkadiusz Przychocki</dc:creator>
      <pubDate>Sat, 21 Mar 2026 09:17:18 +0000</pubDate>
      <link>https://dev.to/arkstack/reevaluating-1990s-oop-in-java-dop-scoped-values-and-loom-in-2026-3p4g</link>
      <guid>https://dev.to/arkstack/reevaluating-1990s-oop-in-java-dop-scoped-values-and-loom-in-2026-3p4g</guid>
      <description>&lt;p&gt;For decades, the Gang of Four (GoF) design patterns were the standard for object-oriented programming. If you had conditional behavior, you built a Factory. If you had interchangeable algorithms, you built a Strategy.&lt;/p&gt;

&lt;p&gt;While these patterns remain foundational, applying them blindly in modern Java (21 through 26) often introduces an unnecessary &lt;strong&gt;Abstraction Tax&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When engineering the &lt;strong&gt;Exeris Kernel&lt;/strong&gt;, my goal was "No Waste Compute." This didn't mean chasing micro-optimizations, but rather rethinking control flow, context propagation, and concurrency using modern JVM primitives.&lt;/p&gt;

&lt;p&gt;Here is a pragmatic look at where the JVM is heading, and how Data-Oriented Programming (DOP) combined with Project Loom changes the architectural calculus for closed-domain systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Enterprise Reality Check &amp;amp; Disclaimers
&lt;/h2&gt;

&lt;p&gt;Before diving in, let's ground this in reality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The Migration Path:&lt;/strong&gt; You don't rewrite systems overnight.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Java 21 gives you Records, Sealed Interfaces, Pattern Matching, and Virtual Threads (GA). You can adopt DOP today.&lt;/li&gt;
&lt;li&gt;Java 25 brings &lt;code&gt;ScopedValue&lt;/code&gt; to GA, fixing context propagation.&lt;/li&gt;
&lt;li&gt;Java 26+ further stabilizes &lt;code&gt;StructuredTaskScope&lt;/code&gt; (STS). &lt;strong&gt;Disclaimer:&lt;/strong&gt; As of JDK 26, STS is in its 6th preview (JEP 525). Running it in production requires &lt;code&gt;--enable-preview&lt;/code&gt; and organizational buy-in. It is highly stable, but it is &lt;em&gt;not&lt;/em&gt; a final standard yet.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. The "Closed World" Constraint:&lt;/strong&gt; The DOP approaches shown below are designed for &lt;strong&gt;Closed-World domains&lt;/strong&gt; (e.g., the core business logic of a specific microservice). If you are building an Open-World system (a plugin architecture, an extensible framework, or an SPI), you &lt;em&gt;must&lt;/em&gt; respect the Open/Closed Principle. In those cases, &lt;code&gt;sealed&lt;/code&gt; interfaces are the wrong tool, and traditional polymorphism with Service Registries remains the correct choice.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: The Legacy Approach (Understanding the Tax)
&lt;/h2&gt;

&lt;p&gt;Historically, to process different payment methods, we relied on polymorphic Strategy classes managed by a Factory. If we needed to pass context (like a Transaction ID), we relied on &lt;code&gt;ThreadLocal&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PaymentService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;PaymentStrategy&lt;/span&gt; &lt;span class="n"&gt;strategy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PaymentFactory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;strategy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;process&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's be clear: &lt;strong&gt;allocating a single 24-byte Strategy object per request will not kill your heap.&lt;/strong&gt; The real "Abstraction Tax" at scale is compounded indirection: cognitive overhead, polymorphic dispatch costs on extreme hot paths, deep object-graph complexity, and, crucially, the severe memory overhead of copying &lt;code&gt;InheritableThreadLocal&lt;/code&gt; maps when spawning thousands of Virtual Threads.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: Data-Oriented Programming
&lt;/h2&gt;

&lt;p&gt;In modern Java, we can separate &lt;em&gt;data&lt;/em&gt; from &lt;em&gt;behavior&lt;/em&gt;. Instead of a polymorphic Factory returning Strategy objects, we use &lt;strong&gt;Sealed Interfaces&lt;/strong&gt; and &lt;strong&gt;Records&lt;/strong&gt; to model our domain, and &lt;strong&gt;Pattern Matching&lt;/strong&gt; for dispatch.&lt;/p&gt;

&lt;p&gt;In DOP, data should be valid by construction: a record enforces its invariants in a Compact Constructor, so any instance that exists is an instance that is valid.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.math.BigDecimal&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// 1. A closed hierarchy. The compiler ensures exhaustiveness.&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;sealed&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;PaymentMethod&lt;/span&gt; &lt;span class="n"&gt;permits&lt;/span&gt; &lt;span class="nc"&gt;CreditCard&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Blik&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;

&lt;span class="c1"&gt;// 2. Compact Constructors prevent invalid state&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;CreditCard&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;cardNumber&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;PaymentMethod&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;CreditCard&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cardNumber&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;cardNumber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;matches&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\\d{16}"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;IllegalArgumentException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Invalid card format"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getLastFourDigits&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cardNumber&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;substring&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;Blik&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;PaymentMethod&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Blik&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;matches&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\\d{6}"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;IllegalArgumentException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Invalid BLIK code"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For closed-domain dispatch, this eliminates the Factory entirely. You get compile-time exhaustiveness, guaranteed data validity, and more predictable control flow.&lt;/p&gt;
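&lt;p&gt;The dispatch side then collapses into a single exhaustive &lt;code&gt;switch&lt;/code&gt;. A runnable sketch on JDK 21+ (the hierarchy is re-declared here in trimmed form, without validation, so the example stands alone; &lt;code&gt;PaymentDispatch&lt;/code&gt; is an illustrative name):&lt;/p&gt;

```java
// Trimmed re-declaration of the hierarchy so this sketch compiles on its own.
sealed interface PaymentMethod permits CreditCard, Blik {}
record CreditCard(String cardNumber) implements PaymentMethod {}
record Blik(String code) implements PaymentMethod {}

public class PaymentDispatch {

    // No factory, no registry: the compiler rejects a missing case,
    // because the sealed hierarchy makes the switch exhaustive.
    static String process(PaymentMethod method) {
        return switch (method) {
            case CreditCard cc -> "card ending " + cc.cardNumber().substring(12);
            case Blik b        -> "BLIK code " + b.code();
        };
    }

    public static void main(String[] args) {
        System.out.println(process(new CreditCard("4111111111111111")));
        // → card ending 1111
        System.out.println(process(new Blik("123456")));
        // → BLIK code 123456
    }
}
```

&lt;p&gt;&lt;em&gt;Adding a new &lt;code&gt;PaymentMethod&lt;/code&gt; turns every non-exhaustive switch into a compile error, which is exactly the closed-world guarantee the Factory never gave you.&lt;/em&gt;&lt;/p&gt;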

&lt;h3&gt;
  
  
  The Memory Footprint Shift
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxk8zhrdpzpzi68tpiggf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxk8zhrdpzpzi68tpiggf.png" alt=" " width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3: Scoped Values (Context Without the Leak)
&lt;/h2&gt;

&lt;p&gt;If nested logic needs a Transaction ID for logging, &lt;code&gt;InheritableThreadLocal&lt;/code&gt; becomes a massive bottleneck. Copying state across millions of Virtual Threads destroys the lightweight nature of Project Loom.&lt;/p&gt;

&lt;p&gt;In JDK 25, &lt;strong&gt;Scoped Values&lt;/strong&gt; are GA. &lt;code&gt;ScopedValue&lt;/code&gt; provides immutable, downward-only data flow. It drastically reduces inheritance overhead and automatically vanishes when the execution scope exits.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PaymentContext&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;ScopedValue&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="no"&gt;TX_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ScopedValue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newInstance&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Binding the context&lt;/span&gt;
&lt;span class="nc"&gt;ScopedValue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;PaymentContext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TX_ID&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"TX-9981"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Processing ID: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nc"&gt;PaymentContext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TX_ID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="o"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;(Caveat: Because ScopedValues are strictly downward, patterns like updating MDC context deep in the call stack require architectural adjustments).&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4: Execution (Pure DOP + Structured Concurrency)
&lt;/h2&gt;

&lt;p&gt;We have fixed data modeling (DOP) and state propagation (&lt;code&gt;ScopedValue&lt;/code&gt;). Now we need execution.&lt;/p&gt;

&lt;p&gt;If we execute a payment while simultaneously calling a fraud service, and the fraud service fails, the payment must abort instantly (Fail-Fast).&lt;/p&gt;

&lt;p&gt;In 2026, many still use libraries like Resilience4j for this. &lt;strong&gt;To be clear: &lt;code&gt;StructuredTaskScope&lt;/code&gt; does not replace retries or circuit breakers (which belong in your Service Mesh/Envoy layer).&lt;/strong&gt; However, STS natively replaces application-layer &lt;em&gt;bulkheads and timeouts&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Here is the implementation using JDK 26 Preview API. We fork virtual threads and pass validated data records directly to a function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.math.BigDecimal&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.UUID&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.concurrent.StructuredTaskScope&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.concurrent.StructuredTaskScope.Subtask&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;PaymentRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;PaymentMethod&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PaymentOrchestrator&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;handle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;PaymentRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;txId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;UUID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;randomUUID&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;substring&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;ScopedValue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;PaymentContext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TX_ID&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;txId&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;executeConcurrentWorkflow&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;});&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;executeConcurrentWorkflow&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;PaymentRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Enforces a strict concurrency model with ownership constraints&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StructuredTaskScope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;open&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

            &lt;span class="nc"&gt;Subtask&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;paymentTask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fork&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;executePayment&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
            &lt;span class="nc"&gt;Subtask&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Boolean&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;fraudTask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fork&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;performFraudCheck&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

            &lt;span class="c1"&gt;// Automatically interrupts sibling tasks if one fails&lt;/span&gt;
            &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;join&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Payment: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;paymentTask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Fraud Check: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fraudTask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="s"&gt;"Clear"&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Suspicious"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;StructuredTaskScope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;FailedException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Transaction aborted: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCause&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;InterruptedException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentThread&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;interrupt&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;executePayment&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;PaymentMethod&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;InterruptedException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[Tx: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nc"&gt;PaymentContext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TX_ID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"] Executing..."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Simulate I/O&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;switch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nc"&gt;CreditCard&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;"Processing card ending in "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getLastFourDigits&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nc"&gt;Blik&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;       &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;"Authorizing BLIK code: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;};&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;performFraudCheck&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;PaymentRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;InterruptedException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Native Fail-Fast Architecture
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2wtgvsvc40jji80c1xqs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2wtgvsvc40jji80c1xqs.png" alt=" " width="800" height="434"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pragmatic Verdict
&lt;/h2&gt;

&lt;p&gt;What replaces traditional GoF patterns in closed-domain systems is not a new pattern, but a shift to native JVM primitives: &lt;strong&gt;data (Records), context (ScopedValue), and execution (StructuredTaskScope).&lt;/strong&gt; By combining these primitives, we achieve:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reduced Indirection:&lt;/strong&gt; We pass strictly validated records to functions running on virtual threads, bypassing proxy layers and improving predictability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Safety:&lt;/strong&gt; Compact Constructors ensure invalid state never enters the pipeline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native Fail-Fast:&lt;/strong&gt; &lt;code&gt;StructuredTaskScope&lt;/code&gt; natively handles cancellation propagation across thread boundaries.&lt;/li&gt;
&lt;/ol&gt;
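
&lt;p&gt;As a minimal sketch of the second point (a hypothetical record, not the Exeris API): a compact constructor runs before the record's components are assigned, so no invalid instance can ever be observed downstream:&lt;/p&gt;

```java
import java.math.BigDecimal;

// Hypothetical example: fail at the boundary, not deep in the pipeline.
public record PaymentAmount(BigDecimal value) {
    // Compact constructor: validation executes before field assignment.
    public PaymentAmount {
        if (value == null) {
            throw new IllegalArgumentException("amount is required");
        }
        if (value.signum() != 1) {
            throw new IllegalArgumentException("amount must be positive");
        }
    }
}
```

&lt;p&gt;Any code holding a &lt;code&gt;PaymentAmount&lt;/code&gt; can therefore rely on a non-null, positive value without re-validating it.&lt;/p&gt;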

&lt;p&gt;System design in 2026 isn't about entirely removing OOP or chasing zero-allocation myths. It is about understanding which infrastructure problems are now solved natively by the JVM and your Service Mesh—and pragmatically removing the workarounds we used to rely on.&lt;/p&gt;

&lt;p&gt;If you want to see how this model scales beyond simple request handling into a durable, off-heap runtime, take a look at the &lt;a href="https://github.com/exeris-systems/exeris-kernel" rel="noopener noreferrer"&gt;Exeris Kernel repository on GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>java</category>
      <category>architecture</category>
      <category>projectloom</category>
      <category>performance</category>
    </item>
    <item>
      <title>Why I Banned ThreadLocal from the Exeris Kernel (And What Replaced It)</title>
      <dc:creator>Arkadiusz Przychocki</dc:creator>
      <pubDate>Fri, 06 Mar 2026 18:43:10 +0000</pubDate>
      <link>https://dev.to/arkstack/why-i-banned-threadlocal-from-the-exeris-kernel-and-what-replaced-it-5aa</link>
      <guid>https://dev.to/arkstack/why-i-banned-threadlocal-from-the-exeris-kernel-and-what-replaced-it-5aa</guid>
      <description>&lt;p&gt;When I started designing the &lt;strong&gt;Exeris Kernel&lt;/strong&gt; — a next-generation, zero-copy runtime built for Java 26+ — I established one non-negotiable architectural law: &lt;strong&gt;"No Waste Compute."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In a system designed to handle extreme density by mapping exactly one Virtual Thread to every network stream (1-VT-per-Stream), every byte of memory and every CPU cycle must be intentional.&lt;/p&gt;

&lt;p&gt;But very quickly, I hit a &lt;strong&gt;legacy wall&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In the standard Enterprise Java ecosystem, when you need to pass a &lt;code&gt;SecurityContext&lt;/code&gt;, a &lt;code&gt;TenantId&lt;/code&gt;, or a &lt;code&gt;TransactionID&lt;/code&gt; down to the database layer without polluting dozens of method signatures, you reach for a trusted tool: &lt;code&gt;ThreadLocal&lt;/code&gt;. For over two decades, &lt;code&gt;ThreadLocal&lt;/code&gt; was the backbone of Java framework magic. But in the era of Project Loom (JEP 444) and Structured Concurrency, this old friend becomes a &lt;strong&gt;performance serial killer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here is why I enforced a strict, kernel-wide ban on &lt;code&gt;ThreadLocal&lt;/code&gt; in Exeris, and how adopting &lt;strong&gt;JEP 506 (Scoped Values)&lt;/strong&gt; completely changed the game for high-performance architecture.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Forensic Analysis: The 3 Sins of ThreadLocal
&lt;/h2&gt;

&lt;p&gt;Treating Virtual Threads like OS threads discards most of their scalability advantages — especially around context propagation and allocation behavior. When you combine &lt;code&gt;ThreadLocal&lt;/code&gt; with a highly concurrent, thread-per-request architecture, you introduce three critical flaws:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The Spaghetti State (Unconstrained Mutability)
&lt;/h3&gt;

&lt;p&gt;Any code deep in the call stack that can read a &lt;code&gt;ThreadLocal&lt;/code&gt; can also call &lt;code&gt;.set()&lt;/code&gt; on it. If a nested library mutates the &lt;code&gt;SecurityContext&lt;/code&gt; mid-flight, tracking down who changed it and when is a debugging nightmare. Data flow becomes completely unpredictable.&lt;/p&gt;
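
&lt;p&gt;The failure mode takes only a few lines to reproduce with the plain JDK (hypothetical names; raw types keep the snippet short). A nested call silently rebinds the caller's context:&lt;/p&gt;

```java
// Hypothetical sketch of the "spaghetti state" problem.
public class SpaghettiState {
    static final ThreadLocal CONTEXT = new ThreadLocal();

    static String handleRequest() {
        CONTEXT.set("tenant-A");       // the gateway binds the context...
        deeplyNestedLibraryCall();     // ...then calls code it does not control
        return (String) CONTEXT.get(); // who owns this value now?
    }

    static void deeplyNestedLibraryCall() {
        CONTEXT.set("tenant-B"); // nothing stops a callee from rebinding it
    }

    public static void main(String[] args) {
        System.out.println(handleRequest()); // prints tenant-B, not tenant-A
    }
}
```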

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.arkstack.dev%2Fblog%2Fscopedvalue%2Ffig1_spaghetti_state.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.arkstack.dev%2Fblog%2Fscopedvalue%2Ffig1_spaghetti_state.png" alt="Figure 1: The uncontrolled mutability of ThreadLocal versus the strict, read-only data flow guarantees of a lexically bounded ScopedValue." width="800" height="298"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 1: The uncontrolled mutability of ThreadLocal versus the strict, read-only data flow guarantees of a lexically bounded ScopedValue.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The Memory Leak Trap (Unbounded Lifetime)
&lt;/h3&gt;

&lt;p&gt;A &lt;code&gt;ThreadLocal&lt;/code&gt; survives until the thread dies or someone explicitly calls &lt;code&gt;.remove()&lt;/code&gt;. In legacy thread pools, forgetting to clean up means a security context bleeds into the next user's request.&lt;/p&gt;
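
&lt;p&gt;The trap is reproducible with a single-thread pool standing in for a legacy worker (hypothetical names): the first "request" forgets &lt;code&gt;.remove()&lt;/code&gt;, and the second request, scheduled onto the same pooled thread, observes the first user's identity:&lt;/p&gt;

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: context bleed across requests on a reused pool thread.
public class LeakTrap {
    static final ThreadLocal CURRENT_USER = new ThreadLocal();

    public static String demo() {
        ExecutorService pool = Executors.newFixedThreadPool(1); // one reusable thread
        try {
            pool.submit(() -> CURRENT_USER.set("alice")).get(); // request 1: no .remove()
            return (String) pool.submit(() -> CURRENT_USER.get()).get(); // request 2
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Request 2 sees: " + demo()); // "alice" leaked across requests
    }
}
```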

&lt;h3&gt;
  
  
  3. The Inheritance Tax (The RAM Killer)
&lt;/h3&gt;

&lt;p&gt;This is the fatal blow. To share context with child threads, frameworks use &lt;code&gt;InheritableThreadLocal&lt;/code&gt;. When a parent thread creates a child, the JVM must &lt;strong&gt;eagerly clone&lt;/strong&gt; the parent's &lt;code&gt;ThreadLocalMap&lt;/code&gt;. This typically allocates between &lt;strong&gt;32 and 128 bytes&lt;/strong&gt; per entry on the heap, depending on the load factor and key distribution.&lt;/p&gt;

&lt;p&gt;Now, imagine a single HTTP request where your logic forks &lt;strong&gt;50 concurrent sub-tasks&lt;/strong&gt; (Virtual Threads) to fetch data. You just triggered &lt;strong&gt;50 expensive map allocations&lt;/strong&gt;. Multiply that by 10,000 concurrent requests, and your Garbage Collector stalls your application just to clean up useless context clones. This becomes a &lt;strong&gt;pure GC tax with no business value&lt;/strong&gt;.&lt;/p&gt;
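
&lt;p&gt;The eager copy is directly observable with plain &lt;code&gt;InheritableThreadLocal&lt;/code&gt; (hypothetical names): the snapshot of the parent's map is taken when the child &lt;code&gt;Thread&lt;/code&gt; object is constructed, so a later parent-side mutation never reaches the child:&lt;/p&gt;

```java
// Hypothetical sketch: the snapshot semantics behind the "inheritance tax".
public class InheritanceTax {
    static final InheritableThreadLocal CTX = new InheritableThreadLocal();

    public static String demo() {
        CTX.set("txn-42");
        final String[] seenByChild = new String[1];
        // The parent's inheritable map is cloned HERE, at Thread construction.
        Thread child = new Thread(() -> seenByChild[0] = (String) CTX.get());
        CTX.set("txn-43"); // too late: invisible to the already-created child
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seenByChild[0]; // captured by the eager copy at construction time
    }

    public static void main(String[] args) {
        System.out.println("Child inherited: " + demo());
    }
}
```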

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.arkstack.dev%2Fblog%2Fscopedvalue%2Ffig2_inheritance_tax.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.arkstack.dev%2Fblog%2Fscopedvalue%2Ffig2_inheritance_tax.png" alt="Figure 2: The O(N) memory copy penalty of InheritableThreadLocal compared to the O(1) constant-time pointer inheritance introduced in JEP 506." width="800" height="234"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 2: The O(N) memory copy penalty of InheritableThreadLocal compared to the O(1) constant-time pointer inheritance introduced in JEP 506.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Missing Link: Structured Concurrency Incompatibility
&lt;/h2&gt;

&lt;p&gt;Beyond performance, &lt;code&gt;ThreadLocal&lt;/code&gt; is fundamentally incompatible with Structured Concurrency. &lt;code&gt;StructuredTaskScope&lt;/code&gt; relies on deterministic, tree-like execution where child tasks are strictly bound to the lifetime of their parent. &lt;code&gt;ThreadLocal&lt;/code&gt;, which is bound to no scope and remains fully mutable at any level of that tree, breaks the model entirely.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You cannot build a reliable, fail-fast concurrent tree if any leaf node can secretly mutate the global state of the branch.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Exhibit A: The Zero-Waste Solution (JEP 506)
&lt;/h2&gt;

&lt;p&gt;To survive millions of Virtual Threads, we need a mechanism that is &lt;strong&gt;immutable&lt;/strong&gt;, &lt;strong&gt;temporally bounded&lt;/strong&gt;, and &lt;strong&gt;virtually free to inherit&lt;/strong&gt;. Enter &lt;strong&gt;Scoped Values&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of a globally mutable variable, a &lt;code&gt;ScopedValue&lt;/code&gt; defines a &lt;strong&gt;Dynamic Scope&lt;/strong&gt;. It binds a value to a specific block of code (and all methods called within it). Once the block finishes, the binding vanishes.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Scoreboard
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;ThreadLocal&lt;/th&gt;
&lt;th&gt;ScopedValue&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Immutability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Mutable (Anyone can overwrite)&lt;/td&gt;
&lt;td&gt;Immutable (Read-only for callees)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lifetime&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Unbounded (Requires manual cleanup)&lt;/td&gt;
&lt;td&gt;Lexically bounded (tied to the &lt;code&gt;.run()&lt;/code&gt; block)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Inheritance Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;O(N) memory copy&lt;/td&gt;
&lt;td&gt;O(1) constant-time inheritance with negligible allocation cost&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Exhibit B: "Show, Don't Tell" — The Exeris Implementation
&lt;/h2&gt;

&lt;p&gt;In the Exeris Kernel, context propagation is strictly separated. The Security module authenticates, and the Persistence module applies Row-Level Security. They never talk directly. They communicate purely through an &lt;strong&gt;"Invisible Wall"&lt;/strong&gt; using &lt;code&gt;ScopedValue&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.arkstack.dev%2Fblog%2Fscopedvalue%2Ffig3_context_scope.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.arkstack.dev%2Fblog%2Fscopedvalue%2Ffig3_context_scope.png" alt="Figure 3: Context propagation in the Exeris Kernel. Security and Persistence modules remain completely decoupled, sharing identity strictly through an immutable dynamic scope." width="302" height="620"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 3: Context propagation in the Exeris Kernel. Security and Persistence modules remain completely decoupled, sharing identity strictly through an immutable dynamic scope.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here is how identity is injected at the gateway. Notice the complete absence of &lt;code&gt;.set()&lt;/code&gt; methods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 1. Decode token directly from off-heap memory (Zero-Alloc)&lt;/span&gt;
&lt;span class="nc"&gt;AuthenticationResult&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;securityProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;authenticate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokenBuffer&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// 2. Open a lexically bounded, immutable Dynamic Scope&lt;/span&gt;
&lt;span class="c1"&gt;// Note: Chained .where() calls create efficient nested scopes.&lt;/span&gt;
&lt;span class="nc"&gt;ScopedValue&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;KernelProviders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;PRINCIPAL_CONTEXT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;principal&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;KernelProviders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;STORAGE_CONTEXT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;   &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;storage&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Inside this block, the context is safe.&lt;/span&gt;
        &lt;span class="c1"&gt;// It will be inherited by any Virtual Thread spawned via StructuredTaskScope.&lt;/span&gt;
        &lt;span class="n"&gt;dispatchRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// 3. Scope closes automatically. No .remove() needed. Zero leaks.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Later, deep in the Persistence module, the &lt;code&gt;TransactionOrchestrator&lt;/code&gt; needs to know the Tenant ID to append it to the SQL query. It simply queries the active scope:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TransactionOrchestrator&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;StorageContext&lt;/span&gt; &lt;span class="nf"&gt;resolveStorageContext&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Zero ThreadLocal, fully Virtual-Thread safe (JEP 506)&lt;/span&gt;
        &lt;span class="c1"&gt;// isBound() is an O(1) check&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;KernelProviders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;STORAGE_CONTEXT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isBound&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;KernelProviders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;STORAGE_CONTEXT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="c1"&gt;// Fallback to system context without allocating objects&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ImmutableStorageContext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// ... transaction execution logic&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because &lt;code&gt;ScopedValue&lt;/code&gt; is immutable, the &lt;code&gt;TransactionOrchestrator&lt;/code&gt; is &lt;strong&gt;guaranteed by lexical scoping and immutability&lt;/strong&gt; that the &lt;code&gt;StorageContext&lt;/code&gt; it reads is exactly the one set by the gateway, untampered by any interceptor along the way.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Paradigm Shift
&lt;/h2&gt;

&lt;p&gt;By ripping &lt;code&gt;ThreadLocal&lt;/code&gt; out of the kernel, we eliminated an entire category of memory leaks and GC pressure. When a system spawns 1,000,000 Virtual Threads, the difference between &lt;em&gt;"copying a map 1 million times"&lt;/em&gt; and &lt;em&gt;"sharing a pointer in constant time"&lt;/em&gt; is the difference between a crashed server and a stable infrastructure.&lt;/p&gt;

&lt;p&gt;Java 26 is not just &lt;em&gt;"Java 8 with &lt;code&gt;var&lt;/code&gt;"&lt;/em&gt;. Features like Project Loom, Panama (FFM), and Scoped Values require a &lt;strong&gt;fundamental shift&lt;/strong&gt; in how we architect systems. If we keep building frameworks using patterns from 2014, we will never unlock the true performance of modern hardware.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Would you be willing to refactor your application to drop &lt;code&gt;ThreadLocal&lt;/code&gt; and embrace &lt;code&gt;ScopedValue&lt;/code&gt;?&lt;/strong&gt; Let me know in the comments.&lt;/p&gt;




&lt;h2&gt;
  
  
  Explore the Exeris Kernel
&lt;/h2&gt;

&lt;p&gt;The zero-allocation architecture described in this article isn't just theory — it's running code. Exeris is an open-core, post-container cloud kernel built for extreme density. If you're tired of GC pauses and want to see how native I/O, Panama FFM, and Virtual Thread orchestration look in practice, explore the Exeris Kernel:&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;GitHub Repository:&lt;/strong&gt; &lt;a href="https://github.com/exeris-systems/exeris-kernel" rel="noopener noreferrer"&gt;exeris-systems/exeris-kernel&lt;/a&gt;&lt;/p&gt;

</description>
      <category>java</category>
      <category>virtualthreads</category>
      <category>projectloom</category>
      <category>exeris</category>
    </item>
    <item>
      <title>Welcome to Arkstack — JVM Performance, Off-heap Memory &amp; Low-Latency Architecture</title>
      <dc:creator>Arkadiusz Przychocki</dc:creator>
      <pubDate>Fri, 06 Mar 2026 18:36:07 +0000</pubDate>
      <link>https://dev.to/arkstack/welcome-to-arkstack-jvm-performance-off-heap-memory-low-latency-architecture-2ege</link>
      <guid>https://dev.to/arkstack/welcome-to-arkstack-jvm-performance-off-heap-memory-low-latency-architecture-2ege</guid>
      <description>&lt;p&gt;Welcome to &lt;strong&gt;Arkstack.dev&lt;/strong&gt; — a minimalistic, developer-focused blog with zero unnecessary fluff.&lt;/p&gt;

&lt;p&gt;This is the digital home of &lt;strong&gt;Arkadiusz Przychocki&lt;/strong&gt;, Lead Cloud Architect &amp;amp; Full-Stack Engineer. If you care about clean systems, modern JVM engineering, and cloud-native architecture done right, you're in the right place.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Expect
&lt;/h2&gt;

&lt;p&gt;This blog covers the intersection of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cloud Architecture&lt;/strong&gt; — designing resilient, cost-efficient systems on AWS, GCP, and Azure. Infrastructure as code, distributed systems, and the hard lessons learned in production.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Modern Java&lt;/strong&gt; — the language is evolving faster than ever. Virtual threads (Project Loom), sealed classes, pattern matching, and the growing ecosystem around GraalVM native image compilation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero-Waste Engineering&lt;/strong&gt; — every abstraction has a cost. We care deeply about what actually runs on the metal.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Exeris Kernel
&lt;/h2&gt;

&lt;p&gt;One of the flagship projects I'm working on is &lt;strong&gt;Exeris Kernel&lt;/strong&gt; — a zero-allocation runtime designed for ultra-low-latency workloads. The core philosophy:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Every allocation is a liability. Every system call is a negotiation. We opt out of both wherever possible.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Exeris Kernel is built on top of the JVM's Foreign Function &amp;amp; Memory API, leveraging off-heap memory management to achieve predictable, GC-pause-free execution. It's designed for financial systems, real-time data pipelines, and anywhere nanoseconds matter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Example: off-heap buffer allocation in Exeris Kernel&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;arena&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Arena&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofConfined&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;segment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;arena&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;allocate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ValueLayout&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;JAVA_LONG&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;segment&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setAtIndex&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ValueLayout&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;JAVA_LONG&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nanoTime&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Calendar vs. Competence
&lt;/h2&gt;

&lt;p&gt;I don't measure my career in calendar years. I measure it in the depth of the stack I've had to dismantle.&lt;/p&gt;

&lt;p&gt;While many spend a decade inside the comfort zone of a framework, I spent the last 6 months fighting the JVM Garbage Collector and manual memory management — off-heap arenas, direct &lt;code&gt;MemorySegment&lt;/code&gt; access via Project Panama, and the hard lessons of building a zero-allocation runtime from scratch.&lt;/p&gt;

&lt;p&gt;That's where real engineering begins. Not in CRUD scaffolding. In the moments when the GC log is your only clue, and latency spikes are measured in microseconds, not seconds.&lt;/p&gt;

&lt;p&gt;This blog is the audit trail of that journey.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Philosophy
&lt;/h2&gt;

&lt;p&gt;The philosophy of this blog mirrors the philosophy of Exeris: &lt;strong&gt;No Waste&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No unnecessary client-side JavaScript on this site&lt;/li&gt;
&lt;li&gt;No bloated frameworks where a pure function suffices&lt;/li&gt;
&lt;li&gt;No ceremony when simplicity delivers the same result&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Built with &lt;a href="https://astro.build" rel="noopener noreferrer"&gt;Astro&lt;/a&gt; — the framework that ships zero JS by default.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stay Tuned
&lt;/h2&gt;

&lt;p&gt;Upcoming content includes deep dives into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GraalVM native image compilation for Spring Boot services&lt;/li&gt;
&lt;li&gt;Designing multi-region active-active architectures&lt;/li&gt;
&lt;li&gt;The Exeris Kernel internals: lock-free queues and off-heap buffers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's build things that last.&lt;/p&gt;

</description>
      <category>cloud</category>
      <category>java</category>
      <category>architecture</category>
      <category>exeris</category>
    </item>
  </channel>
</rss>
