<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: mgd43b</title>
    <description>The latest articles on DEV Community by mgd43b (@mgd43b).</description>
    <link>https://dev.to/mgd43b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3817656%2Faab70ef8-8405-452d-aec5-004f74c0316b.png</url>
      <title>DEV Community: mgd43b</title>
      <link>https://dev.to/mgd43b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mgd43b"/>
    <language>en</language>
    <item>
      <title>Error Handling in Agent Systems: Exception Hierarchies, Partial Results, and Exit Reasons</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Sun, 31 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/error-handling-in-agent-systems-exception-hierarchies-partial-results-and-exit-reasons-1fm6</link>
      <guid>https://dev.to/agentensemble/error-handling-in-agent-systems-exception-hierarchies-partial-results-and-exit-reasons-1fm6</guid>
      <description>&lt;p&gt;Agent systems fail in ways that traditional software does not. An LLM might return an unparseable response. A tool call might time out. An agent might enter an infinite ReAct loop. A human reviewer might walk away from an approval gate. A task might succeed but produce output that a downstream task cannot use.&lt;/p&gt;

&lt;p&gt;The interesting problem is not preventing these failures -- some are inherent to non-deterministic systems. The interesting problem is giving operators enough information to handle them gracefully: what failed, what succeeded before the failure, and what the system's terminal state actually is.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Exception Hierarchy
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt; uses a hierarchy of unchecked exceptions rooted at &lt;code&gt;AgentEnsembleException&lt;/code&gt;. Every exception the framework throws extends this base, so you can catch everything with a single catch block or handle specific cases individually.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AgentEnsembleException (base)
  ValidationException             -- invalid configuration at build/run time
  TaskExecutionException          -- a task failed during execution
  AgentExecutionException         -- an LLM call failed
  MaxIterationsExceededException  -- agent exceeded its tool-call limit
  PromptTemplateException         -- unresolved template variables
  ToolExecutionException          -- a tool call failed
  ConstraintViolationException    -- required workers were not called
  GuardrailViolationException     -- a guardrail blocked execution
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The hierarchy matters because different failure types require different responses. A &lt;code&gt;ValidationException&lt;/code&gt; means your configuration is wrong -- no LLM was ever called, and the fix is in the code. A &lt;code&gt;TaskExecutionException&lt;/code&gt; means the pipeline started but a task failed -- partial results may be available. A &lt;code&gt;MaxIterationsExceededException&lt;/code&gt; means an agent got stuck in a tool-calling loop -- the fix might be fewer tools or a higher iteration limit.&lt;/p&gt;
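
&lt;p&gt;As a minimal sketch of that type-based dispatch, the example below maps each failure type to a response. The exception classes are local stand-ins that mirror names from the listing above so the snippet compiles on its own; they are not the framework's classes, and &lt;code&gt;FailureTriage&lt;/code&gt;, &lt;code&gt;triage&lt;/code&gt;, and the returned strings are hypothetical names chosen for illustration.&lt;/p&gt;

```java
// Stand-in types mirroring the names listed above, NOT the framework's own classes.
class AgentEnsembleException extends RuntimeException {
    AgentEnsembleException(String message) { super(message); }
}
class ValidationException extends AgentEnsembleException {
    ValidationException(String message) { super(message); }
}
class TaskExecutionException extends AgentEnsembleException {
    TaskExecutionException(String message) { super(message); }
}
class MaxIterationsExceededException extends AgentEnsembleException {
    MaxIterationsExceededException(String message) { super(message); }
}

final class FailureTriage {
    // Hypothetical helper: runs the work and maps each failure type to a response.
    static String triage(Runnable work) {
        try {
            work.run();
            return "ok";
        } catch (ValidationException e) {
            return "fix-configuration";       // no LLM was called
        } catch (TaskExecutionException e) {
            return "salvage-partial-results"; // pipeline started, then a task failed
        } catch (MaxIterationsExceededException e) {
            return "tune-tools-or-limit";     // agent looped on tool calls
        } catch (AgentEnsembleException e) {
            return "escalate";                // any other framework failure
        }
    }
}
```

&lt;p&gt;The specific types are caught before the base type, so the final branch only sees failures that no earlier branch claimed.&lt;/p&gt;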

&lt;h2&gt;
  
  
  Partial Results on Failure
&lt;/h2&gt;

&lt;p&gt;When a multi-task pipeline fails partway through, the work that already finished is not discarded. &lt;code&gt;TaskExecutionException&lt;/code&gt; carries a list of &lt;code&gt;TaskOutput&lt;/code&gt; objects, one for each task that completed before the failure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;EnsembleOutput&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;saveResults&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TaskExecutionException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Save whatever was completed before the failure&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TaskOutput&lt;/span&gt; &lt;span class="n"&gt;partial&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCompletedTaskOutputs&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;savePartialResult&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;alertOnFailure&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTaskDescription&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getAgentRole&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is operationally significant. In a five-task pipeline where task four fails, you still have the outputs of tasks one through three. You can save them, display them to a user, or use them to resume the pipeline from where it left off.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exit Reasons
&lt;/h2&gt;

&lt;p&gt;Not every non-completion is an error. &lt;code&gt;EnsembleOutput.getExitReason()&lt;/code&gt; distinguishes between four terminal states:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Exit Reason&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;COMPLETED&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All tasks ran to completion normally&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;USER_EXIT_EARLY&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A human reviewer chose to stop the pipeline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;TIMEOUT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A review gate timeout expired&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ERROR&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;An unrecoverable exception terminated the pipeline&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;EnsembleOutput&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getExitReason&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nl"&gt;COMPLETED:&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"All done: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getRaw&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nl"&gt;USER_EXIT_EARLY:&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"User stopped after "&lt;/span&gt;
            &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;completedTasks&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" task(s)"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nl"&gt;TIMEOUT:&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Review gate timed out"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nl"&gt;ERROR:&lt;/span&gt;
        &lt;span class="c1"&gt;// Typically handled via exception&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The distinction between &lt;code&gt;USER_EXIT_EARLY&lt;/code&gt; and &lt;code&gt;TIMEOUT&lt;/code&gt; matters for operational dashboards. A user exit is intentional -- the pipeline did its job and the human made a decision. A timeout might indicate a process problem (for example, no reviewer was available) and may need escalation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Specific Exception Types
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ValidationException
&lt;/h3&gt;

&lt;p&gt;Thrown before any LLM calls when the ensemble or its components are configured incorrectly. Common causes include missing required fields, tasks referencing unregistered agents, circular context dependencies, or invalid iteration limits.&lt;/p&gt;

&lt;p&gt;This exception is your build-time safety net. If you see it, the fix is always in the configuration code.&lt;/p&gt;

&lt;h3&gt;
  
  
  AgentExecutionException
&lt;/h3&gt;

&lt;p&gt;Thrown when the LLM call itself fails -- network errors, API errors, rate limiting, timeouts. Contains the agent role and task description so you can route the failure to the right team.&lt;/p&gt;

&lt;h3&gt;
  
  
  MaxIterationsExceededException
&lt;/h3&gt;

&lt;p&gt;Thrown when an agent exceeds its &lt;code&gt;maxIterations&lt;/code&gt; limit during the ReAct loop. Contains both the configured limit and the actual iteration count.&lt;/p&gt;

&lt;p&gt;This is often a sign that the agent has too many tools and is cycling between them without making progress. The fix is usually to reduce the tool set, make tool descriptions more specific, or increase the iteration limit if the task genuinely requires many tool calls.&lt;/p&gt;

&lt;h3&gt;
  
  
  PromptTemplateException
&lt;/h3&gt;

&lt;p&gt;Thrown when a task description contains &lt;code&gt;{variable}&lt;/code&gt; placeholders that were not resolved. The exception lists the missing variable names, making it straightforward to fix.&lt;/p&gt;

&lt;h3&gt;
  
  
  GuardrailViolationException
&lt;/h3&gt;

&lt;p&gt;Thrown when an input or output guardrail blocks execution. Contains the guardrail type (INPUT or OUTPUT), the violation message, the task description, and the agent role. This integrates with the guardrail system covered in the previous post.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Retry Question
&lt;/h2&gt;

&lt;p&gt;AgentEnsemble does not include built-in retry logic. This is a deliberate design choice.&lt;/p&gt;

&lt;p&gt;The reasoning is that retry policies are highly context-dependent. A rate-limited API call might benefit from exponential backoff. A malformed LLM response might benefit from a retry with the same prompt. A task that failed because the model cannot perform the requested work should not be retried at all.&lt;/p&gt;

&lt;p&gt;For transient failures, implement retry at the call site:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;attempts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="nc"&gt;EnsembleOutput&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attempts&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;AgentExecutionException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;attempts&lt;/span&gt;&lt;span class="o"&gt;++;&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attempts&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
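        &lt;span class="c1"&gt;// Linear backoff: 1s, then 2s. Thread.sleep throws InterruptedException, which the enclosing method must declare or handle.&lt;/span&gt;
        &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000L&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;attempts&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;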
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For production use, consider integrating a resilience library such as Resilience4j, which provides circuit breakers, rate limiters, and retry policies that compose well with the exception hierarchy.&lt;/p&gt;
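
&lt;p&gt;If a library is more than you need, a capped exponential delay is a common step up from the hand-written loop earlier in this section. The sketch below assumes nothing about AgentEnsemble itself: &lt;code&gt;TransientFailure&lt;/code&gt; stands in for whichever exception you treat as retryable (for example &lt;code&gt;AgentExecutionException&lt;/code&gt;), and &lt;code&gt;Attempt&lt;/code&gt; and &lt;code&gt;BackoffRetry&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```java
// Stand-in for a retryable failure such as AgentExecutionException (not the framework's class).
class TransientFailure extends RuntimeException {
    TransientFailure(String message) { super(message); }
}

// Hypothetical single-method interface for the work being retried.
interface Attempt {
    String call();
}

final class BackoffRetry {
    // Delay before the given retry (1-based): base, 2x base, 4x base, ... capped at capMillis.
    static long delayMillis(int retry, long baseMillis, long capMillis) {
        int steps = Math.max(retry, 1);
        long delay = Math.min(capMillis, baseMillis);
        for (int i = 1; i != steps; i++) {
            delay = Math.min(capMillis, delay * 2);
        }
        return delay;
    }

    // Calls the work up to maxAttempts times, sleeping between transient failures.
    static String run(Attempt work, int maxAttempts, long baseMillis, long capMillis) {
        int limit = Math.max(maxAttempts, 1);
        int attempt = 1;
        while (true) {
            try {
                return work.call();
            } catch (TransientFailure e) {
                if (attempt == limit) {
                    throw e; // out of attempts: surface the last failure
                }
                try {
                    Thread.sleep(delayMillis(attempt, baseMillis, capMillis));
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt(); // keep the interrupt visible to callers
                    throw e;                            // stop retrying
                }
                attempt++;
            }
        }
    }
}
```

&lt;p&gt;With a base of 100 ms and a cap of 1000 ms, the successive waits are 100, 200, 400, 800, and then 1000 ms. Only the stand-in exception type is retried; anything else propagates immediately, which matches the point that some failures should not be retried at all.&lt;/p&gt;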

&lt;h2&gt;
  
  
  The Operational Model
&lt;/h2&gt;

&lt;p&gt;The error handling design reflects a particular view of how agent systems should be operated: failures are expected, partial results are valuable, and the framework should give you structured information rather than opaque error strings.&lt;/p&gt;

&lt;p&gt;The exception hierarchy makes it possible to build monitoring and alerting that distinguishes between configuration errors (fix the code), transient failures (retry or escalate), agent loops (tune the workflow), and intentional stops (human decision). The partial result preservation makes it possible to build resumable pipelines. The exit reasons make it possible to build dashboards that accurately represent pipeline outcomes.&lt;/p&gt;

&lt;p&gt;None of this prevents failures. It gives you the handles to respond to them systematically.&lt;/p&gt;




&lt;p&gt;The full error handling guide is in the &lt;a href="https://agentensemble.net/guides/error-handling/" rel="noopener noreferrer"&gt;AgentEnsemble documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I'd be interested in whether you have found the exception hierarchy granularity to be sufficient, or whether there are failure modes in your agent systems that do not map cleanly to these categories.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Scoped Memory for Agent Systems: Cross-Run Persistence Without Global State</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Fri, 29 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/scoped-memory-for-agent-systems-cross-run-persistence-without-global-state-2fpf</link>
      <guid>https://dev.to/agentensemble/scoped-memory-for-agent-systems-cross-run-persistence-without-global-state-2fpf</guid>
      <description>&lt;p&gt;Most agent frameworks treat each run as stateless. The agent starts fresh, does its work, and the output is consumed by whatever called it. If you run the same workflow again next week, the agent has no memory of what it produced last time.&lt;/p&gt;

&lt;p&gt;For some use cases that is fine. For others -- recurring research tasks, iterative drafting, accumulated domain knowledge -- you want the agent to remember what it learned in previous runs and build on it.&lt;/p&gt;

&lt;p&gt;The question is how to add cross-run memory without introducing global shared state that makes the system hard to reason about.&lt;/p&gt;

&lt;h2&gt;
  
  
  Named Scopes as the Isolation Mechanism
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt; uses named memory scopes. Each task declares which scopes it reads from and writes to. A task can only see memory from scopes it explicitly declares.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;MemoryStore&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;inMemory&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;Task&lt;/span&gt; &lt;span class="n"&gt;researchTask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Research current AI trends"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;expectedOutput&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"A research report"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ai-research"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;researchTask&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;memoryStore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the run, the task's output is stored in the &lt;code&gt;"ai-research"&lt;/code&gt; scope. On a second run with the same store, the agent's prompt automatically includes entries from the first run under a &lt;code&gt;## Memory: ai-research&lt;/code&gt; section.&lt;/p&gt;

&lt;p&gt;The scope name is the isolation boundary. Task A storing into &lt;code&gt;"research"&lt;/code&gt; and task B declaring only &lt;code&gt;"drafts"&lt;/code&gt; means task B never sees task A's output. This is not a security mechanism -- it is an attention mechanism. It controls what context an agent receives, keeping prompts focused on relevant history rather than everything that ever happened.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works at the Prompt Level
&lt;/h2&gt;

&lt;p&gt;The mechanics are straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;At task startup, the framework retrieves entries from every declared scope and injects them into the agent's prompt.&lt;/li&gt;
&lt;li&gt;At task completion, the framework stores the task output into every declared scope.&lt;/li&gt;
&lt;li&gt;Because entries persist in the &lt;code&gt;MemoryStore&lt;/code&gt; across runs, agents in later runs automatically see outputs from earlier runs.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The prompt injection looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## Memory: ai-project
The following information from scope "ai-project" may be relevant:

---
Research findings from previous run: AI is accelerating in healthcare...
---

## Task
Analyse the research findings
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is no magic retrieval. The framework puts the memory content into the prompt, and the LLM uses it (or ignores it) during reasoning.&lt;/p&gt;
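
&lt;p&gt;The assembly step can be sketched in a few lines of plain Java. This illustrates the layout shown above and is not the framework's implementation: &lt;code&gt;MemoryPromptSketch&lt;/code&gt; and its methods are hypothetical, and the separator placement for multiple entries is an assumption extrapolated from the single-entry example.&lt;/p&gt;

```java
final class MemoryPromptSketch {
    // Renders the stored entries for one scope in the section layout shown above.
    static String renderScope(String scope, String[] entries) {
        StringBuilder sb = new StringBuilder();
        sb.append("## Memory: ").append(scope).append('\n');
        sb.append("The following information from scope \"").append(scope)
          .append("\" may be relevant:\n\n");
        for (String entry : entries) {
            sb.append("---\n").append(entry).append('\n');
        }
        sb.append("---\n");
        return sb.toString();
    }

    // Prepends the memory section to the task text, skipping it when the scope is empty.
    static String buildPrompt(String scope, String[] entries, String taskDescription) {
        String memory = entries.length == 0 ? "" : renderScope(scope, entries) + "\n";
        return memory + "## Task\n" + taskDescription + "\n";
    }
}
```

&lt;p&gt;Because the memory is plain text in the prompt, every entry costs tokens on every run, which is why bounding the size of a scope matters.&lt;/p&gt;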

&lt;h2&gt;
  
  
  Pluggable Storage
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;MemoryStore&lt;/code&gt; has two built-in implementations:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In-memory&lt;/strong&gt; stores entries in insertion order per scope. Retrieval returns the most recent entries without semantic search. Suitable for development, testing, and single-JVM runs. Entries do not survive JVM restarts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;MemoryStore&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;inMemory&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Embedding-based&lt;/strong&gt; stores entries via an embedding model and retrieves them via semantic similarity search. The backing &lt;code&gt;EmbeddingStore&lt;/code&gt; controls durability -- Chroma, Qdrant, Pinecone, pgvector, or any LangChain4j-compatible store.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;EmbeddingModel&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAiEmbeddingModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"OPENAI_API_KEY"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;modelName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text-embedding-3-small"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;EmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embeddingStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChromaEmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;baseUrl&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"http://localhost:8000"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;collectionName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"agentensemble-memory"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;MemoryStore&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embeddings&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The design tradeoff is explicit. The in-memory store is fast and simple but loses data on restart and does not do semantic retrieval. The embedding-based store is semantically aware and as durable as its backing &lt;code&gt;EmbeddingStore&lt;/code&gt;, but it requires an embedding model and a vector store. You choose based on your operational requirements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Eviction Policies
&lt;/h2&gt;

&lt;p&gt;Unbounded memory is a prompt-size problem. Every stored entry adds tokens to the next run's prompt. Scopes support optional eviction to keep sizes bounded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Retain only the 5 most recent entries&lt;/span&gt;
&lt;span class="nc"&gt;MemoryScope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"research"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;keepLastEntries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Retain only entries from the past 7 days&lt;/span&gt;
&lt;span class="nc"&gt;MemoryScope&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"research"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;keepEntriesWithin&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofDays&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Eviction is applied after each task stores its output. For embedding-based stores, eviction is a no-op since most embedding stores do not support deletion of individual entries.&lt;/p&gt;
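
&lt;p&gt;For the in-memory case, the two policies amount to trimming an ordered list. A rough sketch, assuming entries are kept oldest to newest; &lt;code&gt;EvictionSketch&lt;/code&gt; and its methods are hypothetical and do not reflect how &lt;code&gt;MemoryScope&lt;/code&gt; is implemented.&lt;/p&gt;

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;

final class EvictionSketch {
    // Keeps only the `max` most recent entries; `entries` is ordered oldest to newest.
    static String[] keepLast(String[] entries, int max) {
        int from = Math.max(0, entries.length - Math.max(0, max));
        return Arrays.copyOfRange(entries, from, entries.length);
    }

    // Counts how many stored timestamps fall before the start of the window ending at `now`.
    static int expiredCount(Instant[] storedAt, Duration window, Instant now) {
        Instant cutoff = now.minus(window);
        int expired = 0;
        for (Instant t : storedAt) {
            if (t.isBefore(cutoff)) {
                expired++;
            }
        }
        return expired;
    }
}
```

&lt;p&gt;Running a trim like this right after each write keeps the scope bounded before the next task reads from it.&lt;/p&gt;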

&lt;h2&gt;
  
  
  MemoryTool: Agent-Driven Memory Access
&lt;/h2&gt;

&lt;p&gt;In addition to the automatic scope-based mechanism, agents can interact with memory directly during their ReAct loop using &lt;code&gt;MemoryTool&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Agent&lt;/span&gt; &lt;span class="n"&gt;researcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Researcher"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;goal&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Research and remember important facts"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MemoryTool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"research"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;MemoryTool&lt;/code&gt; provides two tool methods the LLM can call: &lt;code&gt;storeMemory(key, value)&lt;/code&gt; to store an arbitrary fact, and &lt;code&gt;retrieveMemory(query)&lt;/code&gt; to retrieve relevant memories by query.&lt;/p&gt;

&lt;p&gt;When the same &lt;code&gt;MemoryStore&lt;/code&gt; instance is used for both &lt;code&gt;MemoryTool&lt;/code&gt; and &lt;code&gt;Ensemble.builder().memoryStore(...)&lt;/code&gt;, explicit tool access and automatic scope-based access share the same backing store. This means an agent can both receive automatic context from previous runs and actively query or store additional facts during execution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multiple Tasks Sharing a Scope
&lt;/h2&gt;

&lt;p&gt;Multiple tasks can declare the same scope name. Each task writes its output to the scope after it completes, so later tasks in a sequential workflow see earlier tasks' outputs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Task&lt;/span&gt; &lt;span class="n"&gt;research&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Research AI trends"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ai-project"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;Task&lt;/span&gt; &lt;span class="n"&gt;analysis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Analyse the research findings"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ai-project"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;research&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;analysis&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;memoryStore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is within-run memory sharing. The analysis task sees the research task's output because they share the &lt;code&gt;"ai-project"&lt;/code&gt; scope. On the next run, both tasks see outputs from the previous run's research and analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Design Principle
&lt;/h2&gt;

&lt;p&gt;The key design decision is that memory is opt-in and scoped, not global and automatic. An agent does not remember everything by default. Each task explicitly declares what it wants to remember and what it wants to recall.&lt;/p&gt;

&lt;p&gt;This makes the system easier to reason about. You can look at a task definition and know exactly what memory context it will receive. You can test a task with a pre-populated store and verify that it uses the memory correctly. You can clear a scope without affecting other scopes.&lt;/p&gt;

&lt;p&gt;The tradeoff is that you have to think about memory design upfront. Which tasks share scopes? How many entries should be retained? Should you use semantic search or recency-based retrieval? These are design decisions that the framework surfaces explicitly rather than hiding behind defaults.&lt;/p&gt;




&lt;p&gt;The full memory guide is in the &lt;a href="https://agentensemble.net/guides/memory/" rel="noopener noreferrer"&gt;AgentEnsemble documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I'd be interested in how you handle the prompt-size tension -- whether bounded eviction is sufficient, or whether you have needed more sophisticated retrieval strategies for production memory systems.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Tool Pipelines: Eliminating LLM Round-Trips for Deterministic Tool Chains</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Wed, 27 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/tool-pipelines-eliminating-llm-round-trips-for-deterministic-tool-chains-4p4m</link>
      <guid>https://dev.to/agentensemble/tool-pipelines-eliminating-llm-round-trips-for-deterministic-tool-chains-4p4m</guid>
      <description>&lt;p&gt;In a standard ReAct loop, every tool call requires an LLM round-trip. The agent calls a search tool, receives results, reasons about them, calls a filter tool, receives filtered output, reasons again, calls a format tool, and so on. Each step costs tokens, adds latency, and requires the LLM to make a decision that is often trivial -- the next step in the chain is predetermined.&lt;/p&gt;

&lt;p&gt;For deterministic data transformation chains, the LLM adds no reasoning value between steps. It just passes the output of one tool as input to the next. The interesting question is whether you can collapse that chain into a single tool call.&lt;/p&gt;

&lt;h2&gt;
  
  
  The ToolPipeline Abstraction
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt; provides &lt;code&gt;ToolPipeline&lt;/code&gt;, which chains multiple tools into a single compound tool. The LLM calls it once; all steps execute sequentially without LLM round-trips between them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Standard ReAct loop (3 LLM round-trips for tool mediation):
LLM -&amp;gt; search_tool -&amp;gt; LLM -&amp;gt; filter_tool -&amp;gt; LLM -&amp;gt; format_tool -&amp;gt; LLM

// With ToolPipeline (0 extra round-trips):
LLM -&amp;gt; search_then_filter_then_format -&amp;gt; LLM
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The simplest way to create one:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;ToolPipeline&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ToolPipeline&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;WebSearchTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;JsonParserTool&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
    &lt;span class="nc"&gt;FileWriteTool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outputPath&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// name: "web_search_then_json_parser_then_file_write"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Register it on a task like any other tool:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Research AI trends and save the top result to disk"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;expectedOutput&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Confirmation that the result was saved"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pipeline&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Data Flow and Adapters
&lt;/h2&gt;

&lt;p&gt;By default, &lt;code&gt;ToolResult.getOutput()&lt;/code&gt; from step N is passed as the input to step N+1. This works when tool outputs are directly consumable by the next tool.&lt;/p&gt;

&lt;p&gt;When you need to reshape data between steps, attach an adapter:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;ToolPipeline&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ToolPipeline&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"extract_and_calculate"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Extract a numeric field from JSON and apply a formula"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;step&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JsonParserTool&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;adapter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getOutput&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" * 1.1"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;step&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;CalculatorTool&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The adapter transforms the &lt;code&gt;JsonParserTool&lt;/code&gt; output (e.g., &lt;code&gt;"149.99"&lt;/code&gt;) into a calculator expression (&lt;code&gt;"149.99 * 1.1"&lt;/code&gt;) before passing it to &lt;code&gt;CalculatorTool&lt;/code&gt;. Adapters have full access to &lt;code&gt;ToolResult&lt;/code&gt;, including &lt;code&gt;getStructuredOutput()&lt;/code&gt; for typed payloads.&lt;/p&gt;

&lt;p&gt;This is the key design decision: adapters are plain Java functions, not LLM calls. They handle the deterministic reshaping that the LLM would otherwise do at full inference cost.&lt;/p&gt;
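
&lt;p&gt;To make the data flow concrete, here is a minimal, framework-free sketch of the idea. The &lt;code&gt;Step&lt;/code&gt; interface, &lt;code&gt;runAll&lt;/code&gt;, and the toy parser and calculator are hypothetical stand-ins invented for illustration, not AgentEnsemble classes. Where the real adapters receive a &lt;code&gt;ToolResult&lt;/code&gt;, this sketch passes bare strings.&lt;/p&gt;

```java
import java.math.BigDecimal;

// Illustrative only: these types are hypothetical, not the AgentEnsemble API.
public class PipelineSketch {

    /** One link in the chain: text in, text out. Adapters have the same shape. */
    interface Step {
        String run(String input);
    }

    /** Feeds the output of step N into step N+1, with no LLM in between. */
    static String runAll(String input, Step... steps) {
        String current = input;
        for (Step step : steps) {
            current = step.run(current);
        }
        return current;
    }

    /** Toy parser: pulls the value out of a one-field JSON object such as {"price": 149.99}. */
    static String parseSingleField(String json) {
        return json.substring(json.indexOf(':') + 1, json.lastIndexOf('}')).trim();
    }

    /** Toy calculator: evaluates expressions of the form "a * b" exactly. */
    static String multiply(String expression) {
        String[] parts = expression.split("\\*");
        BigDecimal left = new BigDecimal(parts[0].trim());
        BigDecimal right = new BigDecimal(parts[1].trim());
        return left.multiply(right).toPlainString();
    }

    static String extractAndCalculate(String json) {
        return runAll(
            json,
            PipelineSketch::parseSingleField,   // step 1
            output -> output + " * 1.1",        // adapter: plain Java, no inference cost
            PipelineSketch::multiply);          // step 2
    }

    public static void main(String[] args) {
        System.out.println(extractAndCalculate("{\"price\": 149.99}")); // prints 164.989
    }
}
```

&lt;p&gt;Because the adapter has the same shape as a step, the runner needs no special case for it. In this sketch the whole chain is itself just a function from text to text, which is also why one chain can be dropped into another as a single step.&lt;/p&gt;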

&lt;h2&gt;
  
  
  Error Strategies
&lt;/h2&gt;

&lt;p&gt;Pipelines support two error strategies:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FAIL_FAST&lt;/strong&gt; (default) stops the pipeline on the first failed step and returns that failure to the LLM immediately. Subsequent steps are never executed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CONTINUE_ON_FAILURE&lt;/strong&gt; continues executing subsequent steps even when an intermediate step fails. The failed step's error message is forwarded as input to the next step.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;ToolPipeline&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ToolPipeline&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"resilient_pipeline"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Continues even when a step fails"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;step&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stepA&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;step&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stepB&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;step&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stepC&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;errorStrategy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;PipelineErrorStrategy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;CONTINUE_ON_FAILURE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The choice between them depends on whether downstream steps can recover from upstream failures. For a search-then-save pipeline, FAIL_FAST makes sense -- there is nothing to save if the search failed. For a multi-source aggregation, CONTINUE_ON_FAILURE lets the pipeline produce partial results.&lt;/p&gt;
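
&lt;p&gt;The difference between the two strategies is easiest to see in a small simulation. The sketch below is an assumption-laden model, not the library's implementation: &lt;code&gt;Strategy&lt;/code&gt;, &lt;code&gt;Result&lt;/code&gt;, &lt;code&gt;Step&lt;/code&gt;, and &lt;code&gt;runAll&lt;/code&gt; are hypothetical names, and returning the last step's result under the continue strategy is a choice made here for illustration.&lt;/p&gt;

```java
// Illustrative only: these types are hypothetical, not the AgentEnsemble API.
public class ErrorStrategySketch {

    enum Strategy { FAIL_FAST, CONTINUE_ON_FAILURE }

    /** Outcome of one step: a success flag plus output text (or an error message). */
    record Result(boolean ok, String text) {}

    interface Step {
        Result run(String input);
    }

    static Result runAll(Strategy strategy, String input, Step... steps) {
        Result last = new Result(true, input);
        for (Step step : steps) {
            last = step.run(last.text());
            if (!last.ok()) {
                if (strategy == Strategy.FAIL_FAST) {
                    return last; // stop here; later steps never run
                }
                // CONTINUE_ON_FAILURE: the error text becomes the next step's input
            }
        }
        return last;
    }

    // A step that always fails, and a step that wraps whatever it receives.
    static final Step FAILING_FETCH = input -> new Result(false, "error: source unavailable");
    static final Step SUMMARIZE = input -> new Result(true, "summary of [" + input + "]");

    public static void main(String[] args) {
        System.out.println(runAll(Strategy.FAIL_FAST, "query", FAILING_FETCH, SUMMARIZE));
        System.out.println(runAll(Strategy.CONTINUE_ON_FAILURE, "query", FAILING_FETCH, SUMMARIZE));
    }
}
```

&lt;p&gt;With the fail-fast strategy the caller gets the fetch error back untouched. With the continue strategy the summarize step still runs, but its input is the error message, so downstream steps need to be written with that possibility in mind.&lt;/p&gt;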

&lt;h2&gt;
  
  
  Approval Gates Within Pipelines
&lt;/h2&gt;

&lt;p&gt;Steps inside a pipeline that require human approval will pause mid-pipeline, exactly as if they were standalone tools. The pipeline propagates the ensemble's &lt;code&gt;ReviewHandler&lt;/code&gt; to all nested steps automatically.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;ToolPipeline&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ToolPipeline&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;JsonParserTool&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
    &lt;span class="nc"&gt;FileWriteTool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outputPath&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requireApproval&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means you can build pipelines that include a human checkpoint before a destructive operation (like writing to disk or calling an external API) without losing the token savings for the deterministic steps before the checkpoint.&lt;/p&gt;

&lt;h2&gt;
  
  
  Nesting and Composition
&lt;/h2&gt;

&lt;p&gt;A &lt;code&gt;ToolPipeline&lt;/code&gt; implements &lt;code&gt;AgentTool&lt;/code&gt;, so it can be used as a step inside another pipeline:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;ToolPipeline&lt;/span&gt; &lt;span class="n"&gt;inner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ToolPipeline&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"step_a"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"desc"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;toolA&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;toolB&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;ToolPipeline&lt;/span&gt; &lt;span class="n"&gt;outer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ToolPipeline&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"outer"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"desc"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inner&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;toolC&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lets you build reusable pipeline fragments and compose them into larger chains. Each pipeline records its own aggregate metrics (timing, success/failure counts) in addition to the per-step metrics from individual tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use Pipelines vs. Separate Tools
&lt;/h2&gt;

&lt;p&gt;The decision boundary is whether the LLM needs to reason between steps.&lt;/p&gt;

&lt;p&gt;Use &lt;code&gt;ToolPipeline&lt;/code&gt; when steps are deterministic and order-locked -- the LLM should not skip or reorder them, and the data transformations between steps are mechanical. The full chain appears as one operation to the LLM.&lt;/p&gt;

&lt;p&gt;Use separate tools when the LLM needs to decide which tool to call next based on intermediate results, or when intermediate results are useful for the LLM to see and reason about.&lt;/p&gt;

&lt;p&gt;In practice, this means pipelines work well for data retrieval and transformation chains (search, parse, filter, write), while separate tools work better for exploratory workflows where the agent needs to adapt its approach based on what it finds.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Broader Pattern
&lt;/h2&gt;

&lt;p&gt;ToolPipeline is one instance of a broader design principle in AgentEnsemble: when something is deterministic, do not pay LLM inference costs for it. This same principle appears in deterministic-only orchestration (tasks that never call an LLM), typed tool inputs (schema validation without LLM intervention), and phase-level workflow grouping (execution order declared in code, not negotiated by the LLM).&lt;/p&gt;

&lt;p&gt;The common thread is that the framework should handle mechanical work mechanically, and reserve LLM inference for decisions that actually require reasoning.&lt;/p&gt;




&lt;p&gt;The full tool pipeline guide is in the &lt;a href="https://agentensemble.net/guides/tool-pipeline/" rel="noopener noreferrer"&gt;AgentEnsemble documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Curious whether you have seen tool chains where the boundary between "deterministic" and "needs reasoning" is ambiguous, and how you would draw that line.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Guardrails for Agent Output: Pluggable Validation Before and After LLM Calls</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Mon, 25 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/guardrails-for-agent-output-pluggable-validation-before-and-after-llm-calls-59di</link>
      <guid>https://dev.to/agentensemble/guardrails-for-agent-output-pluggable-validation-before-and-after-llm-calls-59di</guid>
      <description>&lt;p&gt;One of the harder problems in agent systems is constraining output quality without turning every prompt into a wall of instructions. You can ask the LLM to stay under 3000 characters, or to always include a conclusion section, or to never mention competitor products. But prompt-based constraints are probabilistic. The LLM might follow them. It might not.&lt;/p&gt;

&lt;p&gt;Guardrails are the deterministic layer. They run as Java code before and after the LLM call, and they enforce rules that prompts cannot guarantee.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Model
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt; implements guardrails as two functional interfaces: &lt;code&gt;InputGuardrail&lt;/code&gt; and &lt;code&gt;OutputGuardrail&lt;/code&gt;. Both return a &lt;code&gt;GuardrailResult&lt;/code&gt; -- either success or failure with a reason.&lt;/p&gt;

&lt;p&gt;Input guardrails run before the LLM is contacted. If any fails, execution stops immediately and the agent's LLM is never called. Output guardrails run after the agent produces a response (and after structured output parsing, if configured).&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;InputGuardrail&lt;/span&gt; &lt;span class="n"&gt;piiGuardrail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;desc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;taskDescription&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toLowerCase&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;contains&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ssn"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;desc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;contains&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"credit card"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;GuardrailResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;failure&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"Task description may contain personally identifiable information"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;GuardrailResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="o"&gt;};&lt;/span&gt;

&lt;span class="nc"&gt;OutputGuardrail&lt;/span&gt; &lt;span class="n"&gt;lengthGuardrail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;rawResponse&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;GuardrailResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;failure&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"Response is "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;rawResponse&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" chars, exceeds limit of 3000"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;GuardrailResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="o"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both are configured per-task:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Write an executive summary"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;expectedOutput&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"A concise summary"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;inputGuardrails&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;piiGuardrail&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;outputGuardrails&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lengthGuardrail&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why Functional Interfaces
&lt;/h2&gt;

&lt;p&gt;The choice to make guardrails functional interfaces rather than annotation-based or configuration-driven has a few practical consequences.&lt;/p&gt;

&lt;p&gt;First, guardrails are composable. You can build them from lambdas, combine them, or wrap them in utility methods. A guardrail that checks for PII can be reused across every task in the ensemble without any framework-specific wiring.&lt;/p&gt;

&lt;p&gt;Second, they are testable in isolation. A guardrail is a pure function from input to result. You can unit test it without standing up an ensemble or mocking an LLM.&lt;/p&gt;

&lt;p&gt;Third, they are stateless by default. Since guardrails may run concurrently (in parallel workflows), stateless lambdas are inherently thread-safe. If you need stateful validation, thread safety is your responsibility.&lt;/p&gt;
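
&lt;p&gt;The composability and testability points can be illustrated with a small sketch. &lt;code&gt;Check&lt;/code&gt; and &lt;code&gt;TextGuardrail&lt;/code&gt; below are hypothetical stand-ins for the framework's result and guardrail types, and &lt;code&gt;allOf&lt;/code&gt;, &lt;code&gt;maxLength&lt;/code&gt;, and &lt;code&gt;mustContain&lt;/code&gt; are illustrative helpers rather than AgentEnsemble methods. The same pattern should carry over to any single-method guardrail interface.&lt;/p&gt;

```java
// Illustrative only: these types are hypothetical, not the AgentEnsemble API.
public class GuardrailComposition {

    /** Stand-in for a guardrail result: pass, or fail with a reason. */
    record Check(boolean passed, String reason) {
        static Check success() { return new Check(true, ""); }
        static Check failure(String reason) { return new Check(false, reason); }
    }

    /** Stand-in for an output guardrail that validates raw response text. */
    interface TextGuardrail {
        Check validate(String text);
    }

    /** Combines guardrails into one; the first failure is returned as-is. */
    static TextGuardrail allOf(TextGuardrail... rails) {
        return text -> {
            for (TextGuardrail rail : rails) {
                Check check = rail.validate(text);
                if (!check.passed()) {
                    return check;
                }
            }
            return Check.success();
        };
    }

    /** Stateless, so safe to share across tasks and threads. */
    static TextGuardrail maxLength(int limit) {
        return text -> text.length() > limit
            ? Check.failure("Response is " + text.length() + " chars, exceeds limit of " + limit)
            : Check.success();
    }

    static TextGuardrail mustContain(String phrase) {
        return text -> text.contains(phrase)
            ? Check.success()
            : Check.failure("Missing required phrase: " + phrase);
    }

    public static void main(String[] args) {
        TextGuardrail combined = allOf(maxLength(3000), mustContain("Conclusion"));
        // prints true
        System.out.println(combined.validate("Findings... Conclusion: ship it.").passed());
        // prints Missing required phrase: Conclusion
        System.out.println(combined.validate("Findings only.").reason());
    }
}
```

&lt;p&gt;Everything in &lt;code&gt;main&lt;/code&gt; runs without an ensemble or a mocked LLM, which is the practical payoff of keeping guardrails as plain functions.&lt;/p&gt;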

&lt;h2&gt;
  
  
  What Input Guardrails See
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;GuardrailInput&lt;/code&gt; record carries everything you need to make a pre-execution decision:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;taskDescription()&lt;/code&gt; -- the task description text&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;expectedOutput()&lt;/code&gt; -- the expected output specification&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;contextOutputs()&lt;/code&gt; -- outputs from prior context tasks (immutable)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;agentRole()&lt;/code&gt; -- the role of the agent about to execute&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means you can write guardrails that check not just the current task, but the outputs of upstream tasks. For example, a guardrail that rejects a writing task if the research task upstream produced no findings:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;InputGuardrail&lt;/span&gt; &lt;span class="n"&gt;requireResearch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;hasResearch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;contextOutputs&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;anyMatch&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getRaw&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;hasResearch&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;GuardrailResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;failure&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"No substantive research output found"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;GuardrailResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="o"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Output Guardrails and Typed Output
&lt;/h2&gt;

&lt;p&gt;When a task uses &lt;code&gt;outputType&lt;/code&gt; for structured output, the execution order is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Input guardrails run (before LLM)&lt;/li&gt;
&lt;li&gt;LLM executes and produces raw text&lt;/li&gt;
&lt;li&gt;Structured output parsing (JSON extraction + deserialization)&lt;/li&gt;
&lt;li&gt;Output guardrails run (with both &lt;code&gt;rawResponse()&lt;/code&gt; and &lt;code&gt;parsedOutput()&lt;/code&gt; available)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This means output guardrails can inspect the typed Java object directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;ResearchReport&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;findings&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;conclusion&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;

&lt;span class="nc"&gt;OutputGuardrail&lt;/span&gt; &lt;span class="n"&gt;findingsGuardrail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parsedOutput&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nc"&gt;ResearchReport&lt;/span&gt; &lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findings&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findings&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;GuardrailResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;failure&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"Report must include at least one finding"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;GuardrailResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="o"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where guardrails and typed outputs reinforce each other. The type system gives you a parsed object; the guardrail gives you a place to enforce business rules on that object.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multiple Guardrails and Evaluation Order
&lt;/h2&gt;

&lt;p&gt;A task can carry several guardrails of each kind, and they are evaluated in the order they appear in the list. The first failure stops evaluation -- subsequent guardrails are not called.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Write an article"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;expectedOutput&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"An article"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;inputGuardrails&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;piiGuardrail&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;roleGuardrail&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;domainGuardrail&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;outputGuardrails&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lengthGuardrail&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conclusionGuardrail&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to collect all failures rather than short-circuit, compose them into a single guardrail:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;InputGuardrail&lt;/span&gt; &lt;span class="n"&gt;compositeGuardrail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;failures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;InputGuardrail&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;piiGuardrail&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;roleGuardrail&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;GuardrailResult&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;validate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isSuccess&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="n"&gt;failures&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;failures&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nc"&gt;GuardrailResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;GuardrailResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;failure&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;join&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"; "&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;failures&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;span class="o"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Exception Propagation
&lt;/h2&gt;

&lt;p&gt;When a guardrail fails, &lt;code&gt;GuardrailViolationException&lt;/code&gt; is thrown. It propagates through the workflow executor and is wrapped in &lt;code&gt;TaskExecutionException&lt;/code&gt;, following the same pattern as other task failures.&lt;/p&gt;

&lt;p&gt;The exception carries structured information -- guardrail type (INPUT or OUTPUT), violation message, task description, and agent role -- so you can route failures to metrics or alerting without parsing error strings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TaskExecutionException&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCause&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nc"&gt;GuardrailViolationException&lt;/span&gt; &lt;span class="n"&gt;gve&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;increment&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"guardrail.violation."&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;gve&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getGuardrailType&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;warn&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Guardrail blocked task '{}': {}"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;gve&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTaskDescription&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;gve&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getViolationMessage&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Tradeoff
&lt;/h2&gt;

&lt;p&gt;Guardrails are deterministic checks, not semantic analysis. A length limit is easy to enforce. A toxicity check is harder -- you would need to call an external classifier inside the guardrail, which adds latency and its own failure modes.&lt;/p&gt;

&lt;p&gt;The design intentionally keeps guardrails as simple synchronous functions. If you need async validation, external API calls, or retry logic, you implement that inside the guardrail function and block until it has a result to return. The framework does not impose an opinion on how complex your validation should be.&lt;/p&gt;
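
&lt;p&gt;As a minimal sketch of what that can look like, the example below wraps a slow classifier call in a time budget and fails closed when the call does not return in time. The &lt;code&gt;Guardrail&lt;/code&gt; interface, the &lt;code&gt;Result&lt;/code&gt; record, and the &lt;code&gt;classifyToxicity&lt;/code&gt; method are hypothetical stand-ins defined here so the example is self-contained; they are not AgentEnsemble types, and the 0.8 threshold and 200 ms budget are arbitrary values chosen for illustration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutGuardrailSketch {

    // Hypothetical stand-in for a guardrail result (not an AgentEnsemble type).
    record Result(boolean success, String message) {
        static Result ok() { return new Result(true, ""); }
        static Result fail(String message) { return new Result(false, message); }
    }

    // Hypothetical stand-in for a guardrail: a plain synchronous function.
    interface Guardrail {
        Result validate(String text);
    }

    // Hypothetical classifier that simulates a remote call returning a score in [0, 1].
    static double classifyToxicity(String text) throws InterruptedException {
        Thread.sleep(text.contains("slow") ? 2_000 : 10);
        return text.contains("hate") ? 0.95 : 0.05;
    }

    // Runs the classifier asynchronously, then blocks with a hard time budget.
    static Guardrail toxicityGuardrail(long timeoutMillis) {
        return text -&amp;gt; {
            CompletableFuture&amp;lt;Double&amp;gt; score = CompletableFuture.supplyAsync(() -&amp;gt; {
                try {
                    return classifyToxicity(text);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
            });
            try {
                double value = score.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return value &amp;gt; 0.8
                        ? Result.fail("Toxicity score too high: " + value)
                        : Result.ok();
            } catch (TimeoutException e) {
                // Fail closed: an unanswered check counts as a violation.
                return Result.fail("Toxicity check timed out");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Result.fail("Toxicity check interrupted");
            } catch (ExecutionException e) {
                return Result.fail("Toxicity check failed: " + e.getCause());
            }
        };
    }

    public static void main(String[] args) {
        Guardrail guardrail = toxicityGuardrail(200);
        System.out.println(guardrail.validate("a calm paragraph"));
        System.out.println(guardrail.validate("hate speech"));
        System.out.println(guardrail.validate("a slow request"));
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Failing closed on a timeout is a policy choice, not a requirement. A guardrail on a non-critical path might reasonably log the timeout and return success instead.&lt;/p&gt;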

&lt;p&gt;This means guardrails are most useful for structural and policy checks -- length limits, required sections, PII filters, role-based access, schema validation on typed outputs. For semantic quality checks, the phase review and task reflection mechanisms (covered in earlier posts) are a better fit.&lt;/p&gt;




&lt;p&gt;The full guardrails guide is in the &lt;a href="https://agentensemble.net/guides/guardrails/" rel="noopener noreferrer"&gt;AgentEnsemble documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I'd be interested in whether the input/output split feels like the right abstraction, or whether you have seen validation needs that do not fit cleanly into either category.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Wiring Agent Ensembles into Spring Boot, Micronaut, and Quarkus</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Sat, 23 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/wiring-agent-ensembles-into-spring-boot-micronaut-and-quarkus-5539</link>
      <guid>https://dev.to/agentensemble/wiring-agent-ensembles-into-spring-boot-micronaut-and-quarkus-5539</guid>
      <description>&lt;p&gt;One question that comes up early when evaluating an agent orchestration library is how it fits into an existing backend stack. If your services run on Spring Boot, Micronaut, or Quarkus, you want agents to live inside the same dependency injection container, use the same configuration system, and expose metrics through the same actuator endpoints.&lt;/p&gt;

&lt;p&gt;The interesting design decision in &lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt; is that it has no framework dependencies at all. It is a plain Java 21+ library with a builder API. Framework integration is just a matter of wrapping those builder calls in whatever DI mechanism your framework uses. Nothing in the library changes.&lt;/p&gt;

&lt;p&gt;This keeps the core small and testable, but it also means the integration patterns are worth spelling out explicitly.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Builder API as the Integration Surface
&lt;/h2&gt;

&lt;p&gt;Every AgentEnsemble component -- agents, tasks, ensembles, memory stores, listeners -- is created through builders. The library never scans for annotations, never registers beans automatically, and never assumes a particular lifecycle model.&lt;/p&gt;

&lt;p&gt;This is deliberate. The builder API is the integration surface. In a DI container, you turn builder calls into bean definitions. In a plain &lt;code&gt;main()&lt;/code&gt; method, you call the same builders directly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Agent&lt;/span&gt; &lt;span class="n"&gt;researcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Research Analyst"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;goal&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Find accurate, up-to-date information"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;backstory&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"You are a meticulous researcher."&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That same code works identically inside a Spring &lt;code&gt;@Configuration&lt;/code&gt;, a Micronaut &lt;code&gt;@Factory&lt;/code&gt;, a Quarkus CDI producer, or a static main method.&lt;/p&gt;

&lt;h2&gt;
  
  
  Spring Boot
&lt;/h2&gt;

&lt;p&gt;Spring Boot is the most common case. The LangChain4j Spring Boot starters handle &lt;code&gt;ChatLanguageModel&lt;/code&gt; bean creation from &lt;code&gt;application.properties&lt;/code&gt; automatically -- AgentEnsemble does not duplicate that responsibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dependencies
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="nf"&gt;dependencies&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;implementation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"net.agentensemble:agentensemble-core:2.10.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;implementation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"dev.langchain4j:langchain4j-spring-boot-starter:1.11.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;implementation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"dev.langchain4j:langchain4j-open-ai-spring-boot-starter:1.11.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;// Optional: metrics via Spring Boot Actuator&lt;/span&gt;
    &lt;span class="nf"&gt;implementation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"net.agentensemble:agentensemble-metrics-micrometer:2.10.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Configuration Class
&lt;/h3&gt;

&lt;p&gt;Spring injects the &lt;code&gt;ChatLanguageModel&lt;/code&gt; bean (created by the LangChain4j starter) and any &lt;code&gt;EnsembleListener&lt;/code&gt; beans you have declared elsewhere.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Configuration&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AgentEnsembleConfig&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@Bean&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt; &lt;span class="nf"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Research Analyst"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;goal&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Find accurate, up-to-date information on the given topic"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;backstory&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"You are a meticulous researcher with a talent for "&lt;/span&gt;
                        &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"finding relevant information quickly."&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@Bean&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="nf"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;ChatLanguageModel&lt;/span&gt; &lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;Agent&lt;/span&gt; &lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;EnsembleListener&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;listeners&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;Optional&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ToolMetrics&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;toolMetrics&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

        &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Builder&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;agents&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;listeners&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;forEach&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;builder:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;listener&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;toolMetrics&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ifPresent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;builder:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;toolMetrics&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The pattern here is standard Spring: declare beans, let Spring wire them. Any &lt;code&gt;@Component&lt;/code&gt; implementing &lt;code&gt;EnsembleListener&lt;/code&gt; is automatically collected via the &lt;code&gt;List&amp;lt;EnsembleListener&amp;gt;&lt;/code&gt; injection.&lt;/p&gt;

&lt;h3&gt;
  
  
  Metrics via Actuator
&lt;/h3&gt;

&lt;p&gt;If you use Micrometer with Spring Boot Actuator, declare a &lt;code&gt;ToolMetrics&lt;/code&gt; bean and agent metrics appear at &lt;code&gt;/actuator/metrics&lt;/code&gt; automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Bean&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ToolMetrics&lt;/span&gt; &lt;span class="nf"&gt;toolMetrics&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MeterRegistry&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;MicrometerToolMetrics&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Using the Ensemble
&lt;/h3&gt;

&lt;p&gt;Inject the &lt;code&gt;Ensemble&lt;/code&gt; bean wherever you need it. Build tasks at the call site where you have the runtime inputs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ResearchService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="n"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt; &lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;ResearchService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="n"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt; &lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ensemble&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;researcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;research&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Task&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Research and summarise: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;expectedOutput&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"A concise summary with key findings"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;finalOutput&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Micronaut
&lt;/h2&gt;

&lt;p&gt;The Micronaut example does not rely on a framework-specific LangChain4j integration; it creates the &lt;code&gt;ChatLanguageModel&lt;/code&gt; bean directly from configuration properties. The rest of the pattern is the same -- a &lt;code&gt;@Factory&lt;/code&gt; class with &lt;code&gt;@Singleton&lt;/code&gt; methods.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Factory&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AgentEnsembleFactory&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@Singleton&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ChatLanguageModel&lt;/span&gt; &lt;span class="nf"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nd"&gt;@Value&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"${agentensemble.openai.api-key}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nd"&gt;@Value&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"${agentensemble.openai.model-name}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;modelName&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;OpenAiChatModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;modelName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;modelName&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@Singleton&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="nf"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;ChatLanguageModel&lt;/span&gt; &lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;Agent&lt;/span&gt; &lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;EnsembleListener&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;listeners&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Builder&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;agents&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;listeners&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;forEach&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;builder:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;listener&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Micronaut injects every &lt;code&gt;EnsembleListener&lt;/code&gt; bean through the &lt;code&gt;List&amp;lt;EnsembleListener&amp;gt;&lt;/code&gt; parameter. Micrometer metrics need no extra glue either: Micronaut's Micrometer module supplies the &lt;code&gt;MeterRegistry&lt;/code&gt; bean.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quarkus
&lt;/h2&gt;

&lt;p&gt;Quarkus has its own &lt;code&gt;quarkus-langchain4j&lt;/code&gt; extension with a different programming model. The example below uses the standard LangChain4j library directly with Quarkus CDI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@ApplicationScoped&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AgentEnsembleProducer&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@ConfigProperty&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"agentensemble.openai.api-key"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@Produces&lt;/span&gt; &lt;span class="nd"&gt;@ApplicationScoped&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ChatLanguageModel&lt;/span&gt; &lt;span class="nf"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;OpenAiChatModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;modelName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gpt-4o"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@Produces&lt;/span&gt; &lt;span class="nd"&gt;@ApplicationScoped&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="nf"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;ChatLanguageModel&lt;/span&gt; &lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;Agent&lt;/span&gt; &lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;Instance&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;EnsembleListener&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;listeners&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Builder&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;agents&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;listeners&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;forEach&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;builder:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;listener&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



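&lt;p&gt;The main difference from the Micronaut version is &lt;code&gt;Instance&amp;lt;EnsembleListener&amp;gt;&lt;/code&gt; in place of &lt;code&gt;List&amp;lt;EnsembleListener&amp;gt;&lt;/code&gt;. &lt;code&gt;Instance&lt;/code&gt; is CDI's lazy, programmatic lookup type, and because it is iterable, &lt;code&gt;forEach&lt;/code&gt; visits every matching listener bean.&lt;/p&gt;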

&lt;h2&gt;
  
  
  The Design Tradeoff
&lt;/h2&gt;

&lt;p&gt;The choice to keep AgentEnsemble framework-agnostic means there is no auto-configuration, no classpath scanning, and no starter module that wires everything with a single dependency. You write the configuration class yourself.&lt;/p&gt;

&lt;p&gt;The upside is that the integration is completely transparent. There is no hidden magic, no classpath-sensitive behavior, and no risk of version conflicts between the library's framework assumptions and your application's framework version. The builder API is the same everywhere, so moving between frameworks (or running without one) requires changing only the DI wiring.&lt;/p&gt;

&lt;p&gt;For teams that already have a preferred framework and know how to write configuration classes, this is usually the right tradeoff. The wiring code is small, readable, and lives in one place.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Crosses the DI Boundary
&lt;/h2&gt;

&lt;p&gt;A few integration points are worth calling out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Listeners&lt;/strong&gt; integrate naturally as DI beans. Declare any &lt;code&gt;EnsembleListener&lt;/code&gt; implementation as a bean, and the ensemble configuration collects them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt; components (&lt;code&gt;MemoryStore&lt;/code&gt;, &lt;code&gt;EnsembleMemory&lt;/code&gt;) are created via builders and passed to the ensemble. In a DI framework, declare them as beans.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools&lt;/strong&gt; are configured per-agent. Declare tool instances as beans and inject them into agent factory methods.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metrics&lt;/strong&gt; via &lt;code&gt;MicrometerToolMetrics&lt;/code&gt; plug into whatever &lt;code&gt;MeterRegistry&lt;/code&gt; your framework provides.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The general rule: if a component is created via a builder, it can be a bean. If it is passed to the ensemble builder, it can be injected.&lt;/p&gt;




&lt;p&gt;The framework integration guide and full code examples are in the &lt;a href="https://agentensemble.net/guides/framework-integration/" rel="noopener noreferrer"&gt;AgentEnsemble documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I'd be interested in whether this level of framework-agnosticism feels right, or whether starter modules that auto-configure common setups would be more useful for your team.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>spring</category>
    </item>
    <item>
      <title>Operating Agent Networks: Visual Topology, Drill-Down, and Runtime Visibility</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Thu, 21 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/operating-agent-networks-visual-topology-drill-down-and-runtime-visibility-4gao</link>
      <guid>https://dev.to/agentensemble/operating-agent-networks-visual-topology-drill-down-and-runtime-visibility-4gao</guid>
      <description>&lt;p&gt;Building an agent network is one problem. Operating it is a different one. When you have ten ensembles communicating over WebSockets, sharing capabilities via discovery, routing requests across federation boundaries, and managing capacity with priority queues, you need to see what is happening.&lt;/p&gt;

&lt;p&gt;The operational question is not "does the code work?" but "what is the system doing right now?" Which ensembles are healthy? Which are overloaded? What capabilities are available? Where are requests being routed? What changed in the last hour?&lt;/p&gt;

&lt;h2&gt;
  
  
  The visibility gap
&lt;/h2&gt;

&lt;p&gt;Individual ensemble dashboards (the live execution view covered in an earlier post) show what one ensemble is doing: its current task, agent iterations, tool calls, and trace. But they do not show the network -- how ensembles relate to each other, where requests flow, and where bottlenecks form.&lt;/p&gt;

&lt;p&gt;The gap is between per-ensemble observability and network-level observability. Both are needed. The per-ensemble view tells you why a specific task took 30 seconds. The network view tells you why the kitchen has a queue depth of 15 while the maintenance team is idle.&lt;/p&gt;

&lt;h2&gt;
  
  
  The network dashboard
&lt;/h2&gt;

&lt;p&gt;AgentEnsemble's network dashboard provides a topology view of the entire ensemble network. Navigate to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:5173/network?ensembles=kitchen:ws://localhost:7329/ws,maintenance:ws://localhost:7330/ws
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ensembles&lt;/code&gt; query parameter accepts a comma-separated list of &lt;code&gt;name:wsUrl&lt;/code&gt; pairs. Each ensemble gets its own independent WebSocket connection -- no aggregating proxy or central coordinator needed.&lt;/p&gt;
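
&lt;p&gt;As a rough illustration of that format (this is not the dashboard's own parsing code, which is not shown in this post), the sketch below splits an &lt;code&gt;ensembles&lt;/code&gt; value into name/URL pairs. The &lt;code&gt;EnsembleParam&lt;/code&gt; class and &lt;code&gt;EnsembleRef&lt;/code&gt; record are hypothetical names introduced for the example. The one subtlety is that WebSocket URLs contain colons themselves, so each entry is split at its first colon only:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;public class EnsembleParam {

    // Hypothetical holder for one "name:wsUrl" entry.
    record EnsembleRef(String name, String wsUrl) {}

    // Splits "a:ws://h1/ws,b:ws://h2/ws" into one EnsembleRef per entry.
    static EnsembleRef[] parse(String param) {
        String[] entries = param.split(",");
        EnsembleRef[] refs = new EnsembleRef[entries.length];
        int i = 0;
        for (String entry : entries) {
            // Split at the first colon only; the URL has colons of its own.
            String[] pair = entry.trim().split(":", 2);
            if (pair.length != 2 || pair[0].isEmpty() || pair[1].isEmpty()) {
                throw new IllegalArgumentException("Expected name:wsUrl but got: " + entry);
            }
            refs[i++] = new EnsembleRef(pair[0], pair[1]);
        }
        return refs;
    }

    public static void main(String[] args) {
        EnsembleRef[] refs = parse(
            "kitchen:ws://localhost:7329/ws,maintenance:ws://localhost:7330/ws");
        for (EnsembleRef ref : refs) {
            System.out.println(ref.name() + " at " + ref.wsUrl());
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Rejecting malformed entries up front, rather than skipping them silently, makes a mistyped URL show up as an error instead of a missing node in the graph.&lt;/p&gt;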

&lt;h3&gt;
  
  
  Topology graph
&lt;/h3&gt;

&lt;p&gt;Ensembles are displayed as nodes in an interactive graph powered by React Flow. Each node shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ensemble name&lt;/li&gt;
&lt;li&gt;Lifecycle state (green = READY, yellow = STARTING, red = STOPPED)&lt;/li&gt;
&lt;li&gt;Active task count and queue depth&lt;/li&gt;
&lt;li&gt;Task progress bar&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Connections between ensembles (shared tasks and tools) are displayed as animated edges. The topology is derived from capability discovery -- if the kitchen shares a &lt;code&gt;prepare-meal&lt;/code&gt; task and room service uses it, the edge appears automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ensemble detail sidebar
&lt;/h3&gt;

&lt;p&gt;Click an ensemble node to open the sidebar panel showing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lifecycle state and connection status&lt;/li&gt;
&lt;li&gt;Metrics for active tasks, queue depth, and completed tasks&lt;/li&gt;
&lt;li&gt;Shared capabilities (tasks and tools with tags)&lt;/li&gt;
&lt;li&gt;WebSocket URL&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Drill-down to live execution
&lt;/h3&gt;

&lt;p&gt;Click "Drill Down" in the sidebar to navigate to the live execution dashboard for that specific ensemble. This reuses the existing per-ensemble dashboard infrastructure -- the same trace view, agent iteration timeline, and tool call details.&lt;/p&gt;

&lt;p&gt;The flow is: network topology (high-level) -&amp;gt; ensemble detail (mid-level) -&amp;gt; live execution trace (low-level). Each level answers different questions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dynamic ensemble addition
&lt;/h3&gt;

&lt;p&gt;Click "Add Ensemble" in the header to connect to a new ensemble by entering its name and WebSocket URL. The dashboard is not static -- you can add ensembles as you discover them or as new ones come online.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;The network dashboard opens independent WebSocket connections to each ensemble. There is no central aggregator. Each ensemble already exposes a WebSocket endpoint for the live dashboard; the network dashboard reuses those same endpoints.&lt;/p&gt;

&lt;p&gt;Status polling (every 5 seconds) fetches &lt;code&gt;/api/status&lt;/code&gt; from each ensemble's HTTP endpoint for queue depth and lifecycle state. The existing &lt;code&gt;HelloMessage&lt;/code&gt; with &lt;code&gt;snapshotTrace&lt;/code&gt; provides late-join support -- when a new connection opens, it receives the current state immediately rather than waiting for the next update.&lt;/p&gt;

&lt;p&gt;This architecture means the dashboard works with any combination of ensembles, including ensembles in different namespaces or clusters, as long as each ensemble's WebSocket URL and HTTP status endpoint are reachable from the dashboard. It also means the dashboard has no state of its own: refreshing the page reconnects to all ensembles and rebuilds the view.&lt;/p&gt;

&lt;h2&gt;
  
  
  Audit trail
&lt;/h2&gt;

&lt;p&gt;Beyond the real-time dashboard, the audit trail provides a historical record of network events:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Work requests received, completed, and failed&lt;/li&gt;
&lt;li&gt;Capacity changes (profile applications, scaling events)&lt;/li&gt;
&lt;li&gt;Discovery events (capabilities registered and deregistered)&lt;/li&gt;
&lt;li&gt;Federation events (cross-realm routing decisions)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The audit trail is append-only and backed by the same transport infrastructure as the rest of the network. In development, it is an in-memory log. In production, it can be backed by Kafka for durability and external consumption.&lt;/p&gt;

&lt;p&gt;The audit trail answers questions that the real-time dashboard cannot: "When did the kitchen start receiving requests from the airport realm?" "How many requests failed between 2am and 4am?" "When was the weekend profile last applied?"&lt;/p&gt;
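
&lt;p&gt;To make the second of those questions concrete, here is a minimal sketch of a time-window count over an in-memory event log. The &lt;code&gt;AuditEvent&lt;/code&gt; record, its fields, and the &lt;code&gt;"request_failed"&lt;/code&gt; type string are assumptions made for illustration; they are not AgentEnsemble's actual audit API:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;import java.time.Instant;

public class AuditQuery {

    // Hypothetical event shape: what happened, where, and when.
    record AuditEvent(String type, String ensemble, Instant at) {}

    // Counts events of the given type inside the half-open window [from, to).
    static int countBetween(AuditEvent[] log, String type, Instant from, Instant to) {
        int count = 0;
        for (AuditEvent event : log) {
            if (!event.type().equals(type)) continue;  // different kind of event
            if (event.at().isBefore(from)) continue;   // before the window starts
            if (!event.at().isBefore(to)) continue;    // at or after the window end
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        AuditEvent[] log = {
            new AuditEvent("request_failed", "kitchen", Instant.parse("2026-03-28T02:15:00Z")),
            new AuditEvent("request_completed", "kitchen", Instant.parse("2026-03-28T02:40:00Z")),
            new AuditEvent("request_failed", "kitchen", Instant.parse("2026-03-28T05:05:00Z")),
        };
        int failed = countBetween(log, "request_failed",
            Instant.parse("2026-03-28T02:00:00Z"), Instant.parse("2026-03-28T04:00:00Z"));
        System.out.println("Failed between 02:00 and 04:00 UTC: " + failed);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the log is append-only, a query like this never races with updates to existing entries; it only has to decide how far to read.&lt;/p&gt;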

&lt;h2&gt;
  
  
  Operational profiles in the dashboard
&lt;/h2&gt;

&lt;p&gt;When an operational profile is applied, the dashboard receives a &lt;code&gt;ProfileAppliedMessage&lt;/code&gt; and updates the topology to reflect the new capacity targets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"profile_applied"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"profileName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sporting-event-weekend"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"capacities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"front-desk"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"maxConcurrent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"dormant"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"kitchen"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"maxConcurrent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"dormant"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"appliedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-03-28T14:30:00Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The dashboard can display the active profile, show which ensembles have adjusted their capacity, and highlight ensembles that have not yet reached their target capacity.&lt;/p&gt;
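
&lt;p&gt;One way to picture the "not yet at target" check is the sketch below. The &lt;code&gt;Capacity&lt;/code&gt; record mirrors one entry of the &lt;code&gt;capacities&lt;/code&gt; object in the message above; the observed replica count passed in is an assumption, since this post does not say how the dashboard learns it, and the treatment of dormant ensembles is likewise a guess:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;public class ProfileTargets {

    // Mirrors one entry of the "capacities" object in the profile_applied message.
    record Capacity(int replicas, int maxConcurrent, boolean dormant) {}

    // Replicas still missing before the target is met (0 when at or above target).
    static int replicaShortfall(int observedReplicas, Capacity target) {
        if (target.dormant()) {
            return 0; // assumption: a dormant ensemble is not expected to run replicas
        }
        return Math.max(0, target.replicas() - observedReplicas);
    }

    public static void main(String[] args) {
        Capacity kitchenTarget = new Capacity(3, 100, false);
        int shortfall = replicaShortfall(1, kitchenTarget);
        System.out.println("kitchen is " + shortfall + " replica(s) short of target");
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A node with a non-zero shortfall is a candidate for highlighting until its replicas catch up with the profile.&lt;/p&gt;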

&lt;h2&gt;
  
  
  What the dashboard does not do
&lt;/h2&gt;

&lt;p&gt;The dashboard is read-only. It does not send commands to ensembles, apply profiles, or adjust capacity. It observes and displays.&lt;/p&gt;

&lt;p&gt;This is deliberate. Operational actions (scaling, profile application, ensemble restarts) should go through the directive system, the profile scheduler, or your deployment pipeline. The dashboard provides the visibility to make those decisions, not the mechanism to execute them.&lt;/p&gt;

&lt;p&gt;The exception is the "Add Ensemble" feature, which adds a display connection -- it does not modify the ensemble's configuration or behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tradeoffs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;No central state.&lt;/strong&gt; The dashboard has no persistent state. If you close it and reopen, it reconnects and rebuilds. This simplifies the architecture but means there is no historical view in the dashboard itself -- that is what the audit trail is for.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;WebSocket reachability.&lt;/strong&gt; The dashboard needs WebSocket access to every ensemble. In a Kubernetes deployment, this may require ingress configuration or port-forwarding. In development, ensembles run on localhost and are directly reachable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Polling interval.&lt;/strong&gt; Status is polled every 5 seconds. Events that happen between polls (a brief spike in queue depth, a transient connection failure) may not be visible. For visibility finer than the polling interval, supplement the dashboard with metrics (Micrometer, OpenTelemetry) exported to a time-series database.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No alerting.&lt;/strong&gt; The dashboard shows what is happening now; it does not trigger alerts when thresholds are crossed. Alerting should be handled by your monitoring stack (Grafana, Prometheus, PagerDuty) using the metrics that ensembles already export.&lt;/p&gt;

&lt;h2&gt;
  
  
  The design principle
&lt;/h2&gt;

&lt;p&gt;The useful insight is that operating an agent network requires three levels of visibility:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Network level&lt;/strong&gt; -- topology, connections, capacity distribution, routing patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ensemble level&lt;/strong&gt; -- queue depth, active tasks, shared capabilities, health&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution level&lt;/strong&gt; -- individual task traces, agent iterations, tool calls&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each level answers different questions. The network dashboard provides levels 1 and 2, with drill-down to level 3 via the existing live execution dashboard. The audit trail provides the historical dimension that complements the real-time view.&lt;/p&gt;

&lt;p&gt;This layered observability is not unique to agent systems -- it mirrors the service mesh / individual service / request trace pattern in microservices. What is specific to agent systems is the non-determinism: you cannot predict how long a task will take, how many iterations an agent will need, or whether a request will be delegated to another ensemble. The dashboard helps operators reason about a system that is inherently unpredictable.&lt;/p&gt;




&lt;p&gt;The network dashboard is part of &lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href="https://agentensemble.net/guides/network-dashboard/" rel="noopener noreferrer"&gt;network dashboard guide&lt;/a&gt; covers setup, and the &lt;a href="https://agentensemble.net/guides/audit-trail/" rel="noopener noreferrer"&gt;audit trail guide&lt;/a&gt; covers the historical event log.&lt;/p&gt;

&lt;p&gt;I'd be interested in what operational tools others are using for multi-agent systems -- and whether the topology-first approach matches how you think about agent network operations.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Testing Distributed Agent Systems: Stubs, Recordings, and Isolation</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Tue, 19 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/testing-distributed-agent-systems-stubs-recordings-and-isolation-3023</link>
      <guid>https://dev.to/agentensemble/testing-distributed-agent-systems-stubs-recordings-and-isolation-3023</guid>
      <description>&lt;p&gt;Testing a single agent ensemble is already harder than testing most software: the output is non-deterministic, the execution path depends on LLM responses, and the number of iterations is unpredictable.&lt;/p&gt;

&lt;p&gt;Testing a &lt;em&gt;network&lt;/em&gt; of agent ensembles adds distributed system concerns on top of that: WebSocket connections between services, shared state across ensembles, capability discovery, and cross-ensemble delegation. If your tests require all of these to be running, your test suite becomes an integration environment rather than a test suite.&lt;/p&gt;

&lt;p&gt;The question is how to test an ensemble's &lt;em&gt;network behavior&lt;/em&gt; without requiring the rest of the network to be running.&lt;/p&gt;

&lt;h2&gt;
  
  
  The testing problem
&lt;/h2&gt;

&lt;p&gt;An ensemble that delegates work to other ensembles via &lt;code&gt;NetworkTask&lt;/code&gt; or &lt;code&gt;NetworkTool&lt;/code&gt; has external dependencies. In production, those dependencies are real ensembles running on real infrastructure. In tests, you need control over what those dependencies return.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Production code: room service delegates to kitchen&lt;/span&gt;
&lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="n"&gt;roomService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatLanguageModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Handle room service request"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;NetworkTask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"prepare-meal"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"ws://kitchen:7329/ws"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you test this ensemble without a running kitchen, the &lt;code&gt;NetworkTask&lt;/code&gt; fails to connect. If you run a real kitchen, your test depends on the kitchen's LLM, its prompt, its tools -- all of which are outside your control and non-deterministic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stubs for predictable behavior
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;NetworkTask.stub()&lt;/code&gt; and &lt;code&gt;NetworkTool.stub()&lt;/code&gt; create stand-ins that return canned responses without connecting to any real ensemble:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;StubNetworkTask&lt;/span&gt; &lt;span class="n"&gt;mealStub&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NetworkTask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stub&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"prepare-meal"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Meal prepared: wagyu steak, medium-rare. Estimated 25 minutes."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="nc"&gt;StubNetworkTool&lt;/span&gt; &lt;span class="n"&gt;inventoryStub&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NetworkTool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stub&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"check-inventory"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"3 portions available"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="n"&gt;roomService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatLanguageModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Handle room service request"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mealStub&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inventoryStub&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;EnsembleOutput&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;roomService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The stub replaces the real network call with a predetermined response. The ensemble under test processes the stub's response exactly as it would process a response from a real kitchen ensemble.&lt;/p&gt;

&lt;p&gt;This gives you deterministic network behavior while letting the ensemble's own LLM interactions remain non-deterministic. You are testing how the ensemble &lt;em&gt;uses&lt;/em&gt; the network response, not whether the network itself works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recordings for assertion
&lt;/h2&gt;

&lt;p&gt;Sometimes you need to verify not just the output but what the ensemble &lt;em&gt;sent&lt;/em&gt; to its dependencies. &lt;code&gt;NetworkTask.recording()&lt;/code&gt; captures every request for later assertion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;RecordingNetworkTask&lt;/span&gt; &lt;span class="n"&gt;recorder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NetworkTask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;recording&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"prepare-meal"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="n"&gt;roomService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatLanguageModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Handle room service request for wagyu steak"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recorder&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;roomService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Assert what was sent to the kitchen&lt;/span&gt;
&lt;span class="n"&gt;assertThat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recorder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;callCount&lt;/span&gt;&lt;span class="o"&gt;()).&lt;/span&gt;&lt;span class="na"&gt;isEqualTo&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;assertThat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recorder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;lastRequest&lt;/span&gt;&lt;span class="o"&gt;()).&lt;/span&gt;&lt;span class="na"&gt;contains&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"wagyu"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;assertThat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recorder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;()).&lt;/span&gt;&lt;span class="na"&gt;hasSize&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Recordings combine the predictability of stubs (they return a configurable response, defaulting to &lt;code&gt;"recorded"&lt;/code&gt;) with the observability of a mock. You can verify that the ensemble called the right capability with the right parameters.&lt;/p&gt;

&lt;p&gt;This is particularly useful for testing delegation logic: does the ensemble delegate to the right capability? Does it include the right context in the request? Does it handle the response correctly?&lt;/p&gt;

&lt;h3&gt;
  
  
  Custom responses for recordings
&lt;/h3&gt;

&lt;p&gt;By default, recordings return &lt;code&gt;"recorded"&lt;/code&gt;. You can provide a custom response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;RecordingNetworkTask&lt;/span&gt; &lt;span class="n"&gt;recorder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NetworkTask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;recording&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"prepare-meal"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Meal prepared in 25 minutes"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Tool naming consistency
&lt;/h2&gt;

&lt;p&gt;Both stubs and recordings use the same naming convention as real network tools: &lt;code&gt;"ensemble.capability"&lt;/code&gt;. A stub for the kitchen's &lt;code&gt;prepare-meal&lt;/code&gt; capability is named &lt;code&gt;"kitchen.prepare-meal"&lt;/code&gt; -- the same name the agent sees when using a real &lt;code&gt;NetworkTask&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This means the agent's tool selection logic works identically in tests and production. The agent does not know whether &lt;code&gt;kitchen.prepare-meal&lt;/code&gt; is backed by a real WebSocket connection, a stub, or a recording.&lt;/p&gt;

&lt;h2&gt;
  
  
  Thread safety
&lt;/h2&gt;

&lt;p&gt;All test doubles are thread-safe. &lt;code&gt;RecordingNetworkTask&lt;/code&gt; and &lt;code&gt;RecordingNetworkTool&lt;/code&gt; use &lt;code&gt;CopyOnWriteArrayList&lt;/code&gt; internally, so concurrent calls from parallel tool execution are safely recorded.&lt;/p&gt;

&lt;p&gt;This matters because agent ensembles with multiple concurrent tasks may invoke network tools from different threads simultaneously. The test doubles handle this correctly without external synchronization.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integration test patterns
&lt;/h2&gt;

&lt;p&gt;Stubs and recordings handle unit-level testing: verifying one ensemble's behavior in isolation. For integration testing -- verifying that two ensembles work together correctly -- you can run both ensembles in the same process with in-process transport:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// In-process: no WebSocket connections needed&lt;/span&gt;
&lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="n"&gt;kitchen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatLanguageModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Manage kitchen operations"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;shareTask&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"prepare-meal"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mealTask&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="n"&gt;roomService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatLanguageModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Handle room service request"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;NetworkTask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"prepare-meal"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Both ensembles run in-process with shared registry&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In-process transport eliminates network concerns (connection timeouts, port conflicts, container management) while preserving real cross-ensemble communication. The ensembles interact through in-memory queues and registries, so the behavior is the same as production except for the transport layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing patterns summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What to test&lt;/th&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Ensemble uses network response correctly&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NetworkTask.stub()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Canned response, deterministic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ensemble sends correct request to network&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NetworkTask.recording()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Capture and assert requests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Two ensembles work together&lt;/td&gt;
&lt;td&gt;In-process transport&lt;/td&gt;
&lt;td&gt;Real interaction, no network&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;End-to-end with real infrastructure&lt;/td&gt;
&lt;td&gt;WebSocket transport&lt;/td&gt;
&lt;td&gt;Full integration test&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each level adds realism and cost. Start with stubs for fast, focused tests. Use recordings when you need to verify outbound requests. Use in-process transport for integration tests. Reserve full WebSocket tests for the deployment verification layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tradeoffs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Stubs hide integration problems.&lt;/strong&gt; A stub always returns the same response, regardless of what the ensemble sends. If the ensemble sends a malformed request that a real kitchen would reject, the stub does not catch that. Integration tests with in-process transport or WebSocket transport are needed to verify the contract between ensembles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LLM non-determinism leaks through.&lt;/strong&gt; Even with stubbed network dependencies, the ensemble's own LLM calls are non-deterministic. The same test may pass or fail depending on the model's response. For fully deterministic tests, you need to stub the LLM as well, for example with LangChain4j's test doubles. A local model with temperature 0 narrows the variation but does not by itself guarantee identical output on every run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recordings only capture what was sent.&lt;/strong&gt; They do not verify that the request would be accepted by the real provider. Schema validation or contract testing would be needed to verify compatibility.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In-process tests share a JVM.&lt;/strong&gt; Two ensembles running in the same process share class loaders, thread pools, and memory. Tests can therefore hit resource contention that would never occur in production. Conversely, isolation problems that only appear when the ensembles run as separate processes are not caught.&lt;/p&gt;

&lt;h2&gt;
  
  
  The design principle
&lt;/h2&gt;

&lt;p&gt;The useful insight is that network behavior and business logic are separable concerns. An ensemble's decision to delegate to the kitchen, and how it processes the kitchen's response, is business logic. The WebSocket connection, serialization, and transport are infrastructure.&lt;/p&gt;

&lt;p&gt;Test doubles let you test the business logic without the infrastructure. In-process transport lets you test the interaction without the network. Full integration tests verify everything works together.&lt;/p&gt;

&lt;p&gt;This layered approach is standard practice for distributed systems. What makes it notable in the agent context is that the business logic is already non-deterministic (LLM-driven), so isolating the network layer from the LLM layer is particularly valuable for test stability.&lt;/p&gt;




&lt;p&gt;Network testing tools are part of &lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href="https://agentensemble.net/guides/network-testing/" rel="noopener noreferrer"&gt;network testing guide&lt;/a&gt; covers the full API including stubs, recordings, and in-process transport setup.&lt;/p&gt;

&lt;p&gt;I'd be interested in how others approach testing multi-agent systems -- especially how you handle the double non-determinism of LLM behavior plus network behavior.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Capacity Management in Agent Networks: Rate Limiting, Priority Queues, and Backpressure</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Sun, 17 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/capacity-management-in-agent-networks-rate-limiting-priority-queues-and-backpressure-33ce</link>
      <guid>https://dev.to/agentensemble/capacity-management-in-agent-networks-rate-limiting-priority-queues-and-backpressure-33ce</guid>
      <description>&lt;p&gt;Agent ensembles that run as long-lived services on a network will, at some point, receive more work than they can handle. The question is what happens next.&lt;/p&gt;

&lt;p&gt;Without capacity management, the answer is usually one of three things: unbounded queue growth that ends in an out-of-memory error, arbitrary request dropping, or cascading failures in which an overloaded ensemble backs up its callers. None of these is acceptable for a system that runs in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  The capacity problem in agent networks
&lt;/h2&gt;

&lt;p&gt;Agent workloads have properties that make capacity management harder than in traditional request/response systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Variable execution time.&lt;/strong&gt; A simple analysis task might take 5 seconds. A complex coding task might take 5 minutes. You cannot predict queue drain rate from request count alone.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Variable cost.&lt;/strong&gt; Each agent iteration consumes LLM tokens. An overloaded system does not just slow down -- it burns money faster.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-deterministic behavior.&lt;/strong&gt; An agent might complete in 3 iterations or 30. Capacity planning based on averages can be wildly wrong for individual requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fan-out amplification.&lt;/strong&gt; One incoming request to a coordinator might fan out to 5 different ensembles. Overload at one point amplifies across the network.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These properties mean that simple concurrency limits are necessary but not sufficient. You also need priority ordering, starvation prevention, and the ability to adjust capacity proactively.&lt;/p&gt;

&lt;h2&gt;
  
  
  Rate limiting
&lt;/h2&gt;

&lt;p&gt;The first line of defense is limiting how many tasks an ensemble processes concurrently:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="n"&gt;kitchen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatLanguageModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Manage kitchen operations"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxConcurrent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the concurrency limit is reached, new requests wait in a queue. The queue has a configurable maximum depth. When the queue is also full, the ensemble rejects new requests with a backpressure signal.&lt;/p&gt;

&lt;p&gt;The backpressure signal propagates upstream: the calling ensemble receives a rejection and can retry, route to an alternative provider (via discovery), or report the failure to its own caller.&lt;/p&gt;

&lt;p&gt;This is straightforward but effective. The ensemble protects itself from overload, and the backpressure signal gives callers information they need to make routing decisions.&lt;/p&gt;
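&lt;p&gt;The run, queue, or reject decision described above can be sketched in a few lines. This is a minimal illustration, not AgentEnsemble's implementation: the &lt;code&gt;AdmissionSketch&lt;/code&gt; class, its &lt;code&gt;Decision&lt;/code&gt; enum, and the &lt;code&gt;maxQueueDepth&lt;/code&gt; parameter name are all hypothetical, and it only tracks counters rather than holding real work items.&lt;/p&gt;

```java
// Hypothetical sketch: admission control with a concurrency limit and a bounded queue.
// None of these names come from the AgentEnsemble API.
public class AdmissionSketch {
    enum Decision { RUN, QUEUED, REJECTED }

    private final int maxConcurrent;
    private final int maxQueueDepth;
    private int running;
    private int queued;

    AdmissionSketch(int maxConcurrent, int maxQueueDepth) {
        this.maxConcurrent = maxConcurrent;
        this.maxQueueDepth = maxQueueDepth;
    }

    // Decide what happens to one incoming request.
    synchronized Decision admit() {
        if (maxConcurrent > running) {
            running++;
            return Decision.RUN;
        }
        if (maxQueueDepth > queued) {
            queued++;
            return Decision.QUEUED;
        }
        // Both the worker slots and the queue are full: signal backpressure.
        return Decision.REJECTED;
    }

    // Called when a running request finishes.
    synchronized void complete() {
        if (queued > 0) {
            // A queued request takes over the freed slot, so running stays the same.
            queued--;
        } else if (running > 0) {
            running--;
        }
    }

    public static void main(String[] args) {
        AdmissionSketch kitchen = new AdmissionSketch(2, 1);
        System.out.println(kitchen.admit()); // RUN
        System.out.println(kitchen.admit()); // RUN
        System.out.println(kitchen.admit()); // QUEUED
        System.out.println(kitchen.admit()); // REJECTED
        kitchen.complete();                  // frees a slot; the queued request starts
        System.out.println(kitchen.admit()); // QUEUED
    }
}
```

&lt;p&gt;The caller that receives &lt;code&gt;REJECTED&lt;/code&gt; is the one that decides whether to retry, reroute, or fail, which is what makes the rejection a useful signal rather than a silent drop.&lt;/p&gt;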

&lt;h2&gt;
  
  
  Priority queues with aging
&lt;/h2&gt;

&lt;p&gt;Not all requests are equally urgent. A VIP guest's meal request should be processed before a routine inventory check. Priority queues handle this, but naive priority queues have a starvation problem: low-priority requests may never be processed if high-priority requests keep arriving.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;PriorityRequestQueue&lt;/code&gt; adds aging to prevent starvation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;PriorityRequestQueue&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PriorityRequestQueue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requestQueue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;baseQueue&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;levels&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;agingInterval&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofMinutes&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Enqueue with priority&lt;/span&gt;
&lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;enqueue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vipRequest&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;       &lt;span class="c1"&gt;// highest priority&lt;/span&gt;
&lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;enqueue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normalRequest&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;    &lt;span class="c1"&gt;// normal&lt;/span&gt;
&lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;enqueue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batchRequest&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;     &lt;span class="c1"&gt;// lowest priority&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Aging works by promoting requests that have waited longer than the aging interval: each full interval spent waiting raises a request by one level. A batch request (priority 2) that has been waiting for 10 minutes (two aging intervals of 5 minutes) has been promoted twice, to priority 0. From then on it competes at the highest level, so a steady stream of new high-priority requests can no longer keep it waiting indefinitely.&lt;/p&gt;

&lt;p&gt;This guarantees that every request is eventually processed, while still giving meaningful priority to urgent work in the common case.&lt;/p&gt;
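&lt;p&gt;The aging arithmetic can be made concrete with a short sketch. It assumes one promotion per full aging interval and a floor at priority 0, which matches the example above, but the real &lt;code&gt;PriorityRequestQueue&lt;/code&gt; may compute promotions differently (for instance with a periodic sweep). The &lt;code&gt;AgingSketch&lt;/code&gt; class and &lt;code&gt;effectivePriority&lt;/code&gt; method are hypothetical names.&lt;/p&gt;

```java
import java.time.Duration;

// Hypothetical sketch of priority aging; not the PriorityRequestQueue implementation.
public class AgingSketch {

    // Lower numbers mean higher priority; 0 is the highest level.
    static int effectivePriority(int basePriority, Duration waited, Duration agingInterval) {
        // One promotion for every full aging interval the request has waited.
        long promotions = waited.toMillis() / agingInterval.toMillis();
        // Never promote past the highest level.
        return (int) Math.max(0L, basePriority - promotions);
    }

    public static void main(String[] args) {
        Duration interval = Duration.ofMinutes(5);
        // Batch request that has waited two full intervals: 2 - 2 = 0
        System.out.println(effectivePriority(2, Duration.ofMinutes(10), interval)); // 0
        // Batch request that has not yet waited a full interval: unchanged
        System.out.println(effectivePriority(2, Duration.ofMinutes(4), interval));  // 2
        // Normal request that has waited far longer than needed: clamped at 0
        System.out.println(effectivePriority(1, Duration.ofMinutes(30), interval)); // 0
    }
}
```

&lt;p&gt;With 3 levels and a 5-minute interval, the longest a request can sit below the top level under this model is 10 minutes, which is what bounds the starvation window.&lt;/p&gt;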

&lt;h2&gt;
  
  
  Operational profiles
&lt;/h2&gt;

&lt;p&gt;Rate limits and priorities handle reactive capacity management -- responding to load as it arrives. Operational profiles handle proactive capacity management -- adjusting capacity in anticipation of known load changes.&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;NetworkProfile&lt;/code&gt; bundles per-ensemble capacity targets and shared memory pre-load directives into a deployable unit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;NetworkProfile&lt;/span&gt; &lt;span class="n"&gt;weekendProfile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NetworkProfile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"sporting-event-weekend"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"front-desk"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Capacity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;maxConcurrent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Capacity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;maxConcurrent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;preload&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"inventory"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Extra beer and ice stocked"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;ProfileApplier&lt;/span&gt; &lt;span class="n"&gt;applier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ProfileApplier&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sharedMemoryRegistry&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;broadcaster&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;applier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;apply&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;weekendProfile&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a profile is applied, two things happen:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Pre-load directives&lt;/strong&gt; seed shared memory scopes with context (e.g., alerting the kitchen about extra stock)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A &lt;code&gt;ProfileAppliedMessage&lt;/code&gt; is broadcast&lt;/strong&gt; to all ensembles with the new capacity targets&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The broadcast message includes replica counts and concurrency limits. Consumers of this message (a Kubernetes operator, a scaling script, or the ensembles themselves) can use the targets to trigger actual scaling.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scheduled profiles
&lt;/h3&gt;

&lt;p&gt;Profiles can be applied on a schedule:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;ProfileScheduler&lt;/span&gt; &lt;span class="n"&gt;scheduler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ProfileScheduler&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;applier&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Apply weekend profile every 7 days&lt;/span&gt;
&lt;span class="n"&gt;scheduler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;weekendProfile&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofHours&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;    &lt;span class="c1"&gt;// initial delay&lt;/span&gt;
    &lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofDays&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;    &lt;span class="c1"&gt;// interval&lt;/span&gt;

&lt;span class="c1"&gt;// One-shot: return to normal after the weekend&lt;/span&gt;
&lt;span class="n"&gt;scheduler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;scheduleOnce&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normalProfile&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofDays&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Directive-driven profiles
&lt;/h3&gt;

&lt;p&gt;Profiles can also be applied via the directive system, enabling external triggers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;NetworkProfileDirectiveHandler&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;NetworkProfileDirectiveHandler&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;applier&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;profiles&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;directiveDispatcher&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;registerHandler&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"APPLY_PROFILE"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An external system (monitoring, a human operator, a scheduler) sends an &lt;code&gt;APPLY_PROFILE&lt;/code&gt; directive with the profile name, and the network adjusts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scheduled tasks
&lt;/h2&gt;

&lt;p&gt;Long-running ensembles often need to perform recurring work: inventory checks, health monitoring, report generation. The &lt;code&gt;ScheduledTask&lt;/code&gt; API makes this a first-class concern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="n"&gt;kitchen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatLanguageModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Manage kitchen operations"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;scheduledTask&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ScheduledTask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"inventory-check"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Check current inventory levels and generate report"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Schedule&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;every&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofHours&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;broadcastTo&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hotel.inventory"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;scheduledTask&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ScheduledTask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"equipment-check"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Verify all kitchen equipment is operational"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Schedule&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;every&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofHours&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Scheduled tasks run alongside the ensemble's normal work processing. They use the same concurrency limits -- a scheduled task counts against &lt;code&gt;maxConcurrent&lt;/code&gt; the same as an incoming request. This prevents scheduled tasks from starving request processing.&lt;/p&gt;
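
&lt;p&gt;A minimal sketch of that shared limit, assuming a plain &lt;code&gt;java.util.concurrent.Semaphore&lt;/code&gt; stands in for whatever AgentEnsemble uses internally. &lt;code&gt;SharedLimit&lt;/code&gt; and its method names are hypothetical and exist only for illustration:&lt;/p&gt;

```java
import java.util.concurrent.Semaphore;

// Hypothetical illustration: one permit pool shared by scheduled tasks
// and incoming requests, so neither kind of work can exceed the limit alone.
final class SharedLimit {
    private final Semaphore permits;

    SharedLimit(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    // Called for scheduled tasks and incoming requests alike.
    // Returns false when every slot is already in use.
    boolean tryStart() {
        return permits.tryAcquire();
    }

    // Called when a unit of work finishes, freeing its slot.
    void finish() {
        permits.release();
    }

    int freeSlots() {
        return permits.availablePermits();
    }
}
```

&lt;p&gt;With a limit of 2, one running scheduled task plus one running request leaves no free slot; a third unit of work has to wait until one of them finishes.&lt;/p&gt;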

&lt;p&gt;The optional &lt;code&gt;broadcastTo&lt;/code&gt; sends the task result to a named topic, where other ensembles can consume it as context.&lt;/p&gt;

&lt;h2&gt;
  
  
  Audit trail
&lt;/h2&gt;

&lt;p&gt;For operational visibility, the audit trail captures significant events across the network:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Work request received/completed/failed&lt;/li&gt;
&lt;li&gt;Capacity changes (profile applied, scaling events)&lt;/li&gt;
&lt;li&gt;Discovery events (capability registered/deregistered)&lt;/li&gt;
&lt;li&gt;Federation events (cross-realm routing)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The audit trail is append-only and can be backed by the same transport infrastructure as the rest of the network (in-memory for development, Kafka for production).&lt;/p&gt;

&lt;h2&gt;
  
  
  Tradeoffs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Rate limits are blunt.&lt;/strong&gt; A concurrency limit of 10 treats all tasks equally. A quick 5-second task and a 5-minute coding task both count as one. For workloads with highly variable task duration, adaptive concurrency (adjusting limits based on observed completion times) would be more effective, but adds complexity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Profile-based scaling is manual.&lt;/strong&gt; Operational profiles define target capacities, but the actual scaling (adjusting K8s replicas, for instance) must be performed by an external system. The profile broadcast is a signal, not an actuator.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Priority is caller-declared.&lt;/strong&gt; The caller assigns priority when enqueuing a request. There is no built-in mechanism to enforce priority policies or prevent callers from marking everything as high priority. Priority policies need to be enforced by convention or middleware.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Aging is time-based.&lt;/strong&gt; Requests age based on wall-clock time, not queue depth or system load. Under sustained high load, aging may promote low-priority requests sooner than desired. Under low load, aging is irrelevant because everything is processed quickly anyway.&lt;/p&gt;
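
&lt;p&gt;To make the wall-clock dependence concrete, here is a rough sketch of one possible aging rule. The class, the "lower number is more urgent" convention, and the one-level-per-interval promotion are all assumptions for illustration, not AgentEnsemble's actual formula:&lt;/p&gt;

```java
import java.time.Duration;

// Hypothetical aging rule: a request is promoted one priority level for
// every full agingInterval it has waited. Lower number = more urgent.
final class Aging {
    static int effectivePriority(int basePriority, Duration waited, Duration agingInterval) {
        if (agingInterval.isZero() || agingInterval.isNegative()) {
            throw new IllegalArgumentException("agingInterval must be positive");
        }
        // Only elapsed time matters here; queue depth and load are not inputs.
        long promotions = waited.toMillis() / agingInterval.toMillis();
        return (int) Math.max(0L, basePriority - promotions);
    }
}
```

&lt;p&gt;Because the only input is elapsed time, a backlog that makes every request wait five minutes promotes all of them equally, which is the behavior the tradeoff above describes.&lt;/p&gt;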

&lt;h2&gt;
  
  
  The design principle
&lt;/h2&gt;

&lt;p&gt;Capacity management for agent networks needs three layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reactive protection&lt;/strong&gt; -- rate limits and backpressure to prevent overload in real time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Priority ordering&lt;/strong&gt; -- ensuring urgent work is processed first, with aging to prevent starvation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proactive adjustment&lt;/strong&gt; -- operational profiles to scale capacity before anticipated load changes&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each layer addresses a different time horizon: seconds (rate limits), minutes (priority queues), and hours/days (operational profiles). Together, they give operators the tools to keep an agent network running under variable load, with routine capacity changes driven by profiles instead of ad hoc intervention. The scaling action itself still has to be carried out by an external system that consumes the profile signal.&lt;/p&gt;




&lt;p&gt;Capacity management is part of &lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href="https://agentensemble.net/guides/rate-limiting/" rel="noopener noreferrer"&gt;rate limiting guide&lt;/a&gt;, &lt;a href="https://agentensemble.net/guides/operational-profiles/" rel="noopener noreferrer"&gt;operational profiles guide&lt;/a&gt;, and &lt;a href="https://agentensemble.net/guides/scheduled-tasks/" rel="noopener noreferrer"&gt;scheduled tasks guide&lt;/a&gt; cover the full APIs.&lt;/p&gt;

&lt;p&gt;I'd be interested in how others handle capacity management in agent networks -- especially whether adaptive concurrency limits (based on observed task duration) have been useful in practice.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Shared Memory Across Agent Ensembles: Consistency Models for Distributed State</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Fri, 15 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/shared-memory-across-agent-ensembles-consistency-models-for-distributed-state-kjo</link>
      <guid>https://dev.to/agentensemble/shared-memory-across-agent-ensembles-consistency-models-for-distributed-state-kjo</guid>
      <description>&lt;p&gt;When agent ensembles operate as independent services on a network, they occasionally need to share state. The kitchen ensemble needs to know current inventory levels. The front desk needs to know room assignments. The maintenance team needs to know which equipment is out of service.&lt;/p&gt;

&lt;p&gt;The question is not whether to share state -- it is how to share it without creating the coordination problems that shared mutable state always creates in distributed systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  The consistency spectrum
&lt;/h2&gt;

&lt;p&gt;Not all shared state needs the same consistency guarantees. Some observations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inventory notes&lt;/strong&gt; ("extra beer stocked for the weekend") are advisory. If two ensembles read slightly different versions, the consequence is minor. Eventual consistency is fine.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Room assignments&lt;/strong&gt; are exclusive. Two ensembles should not assign the same room to different guests. This needs stronger coordination -- distributed locks or optimistic locking.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configuration preferences&lt;/strong&gt; ("kitchen closes at 11pm") are rarely updated and widely read. Eventual consistency with version tracking works well.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Forcing a single consistency model on all shared state is either too expensive (locking everything) or too weak (eventual consistency for exclusive resources). The useful design is per-scope consistency selection.&lt;/p&gt;

&lt;h2&gt;
  
  
  SharedMemory with configurable consistency
&lt;/h2&gt;

&lt;p&gt;AgentEnsemble v3.0.0 introduces &lt;code&gt;SharedMemory&lt;/code&gt; -- a wrapper around the existing &lt;code&gt;MemoryStore&lt;/code&gt; that adds consistency-aware read/write semantics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;SharedMemory&lt;/span&gt; &lt;span class="n"&gt;sharedMemory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SharedMemory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;store&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;inMemory&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;consistency&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Consistency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;EVENTUAL&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Store an entry&lt;/span&gt;
&lt;span class="n"&gt;sharedMemory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;store&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"inventory"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;MemoryEntry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Wagyu beef: 12 portions remaining"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;storedAt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Instant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;now&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

&lt;span class="c1"&gt;// Retrieve entries&lt;/span&gt;
&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;MemoryEntry&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sharedMemory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;retrieve&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"inventory"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"beef"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;Consistency&lt;/code&gt; enum controls the coordination behavior:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Behavior&lt;/th&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;EVENTUAL&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Last-write-wins, no coordination&lt;/td&gt;
&lt;td&gt;Context, preferences, notes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;OPTIMISTIC&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Version-checked writes, retry on conflict&lt;/td&gt;
&lt;td&gt;Counters, shared documents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;LOCKED&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Distributed lock before each read/write&lt;/td&gt;
&lt;td&gt;Room assignments, exclusive resources&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The consistency model is set per &lt;code&gt;SharedMemory&lt;/code&gt; instance, which means different scopes can use different models:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Advisory inventory notes -- eventual consistency&lt;/span&gt;
&lt;span class="nc"&gt;SharedMemory&lt;/span&gt; &lt;span class="n"&gt;inventory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SharedMemory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;store&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;inMemory&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;consistency&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Consistency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;EVENTUAL&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Room assignments -- distributed lock&lt;/span&gt;
&lt;span class="nc"&gt;SharedMemory&lt;/span&gt; &lt;span class="n"&gt;rooms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SharedMemory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;store&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;inMemory&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;consistency&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Consistency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;LOCKED&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How the consistency models work
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Eventual consistency
&lt;/h3&gt;

&lt;p&gt;The simplest model. Writes go directly to the backing store with no coordination. Reads return the latest local value. If two ensembles write to the same key concurrently, the last write wins.&lt;/p&gt;

&lt;p&gt;This is appropriate for state that is informational rather than authoritative -- agent context, preferences, notes, status updates. The cost of a stale read is low.&lt;/p&gt;

&lt;h3&gt;
  
  
  Optimistic locking
&lt;/h3&gt;

&lt;p&gt;Each entry carries a version number. Writes include the expected version. If the actual version differs (because another ensemble wrote since the last read), the write fails with a &lt;code&gt;ConcurrentModificationException&lt;/code&gt;; the caller then re-reads the entry and retries against the new version.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Read with version&lt;/span&gt;
&lt;span class="nc"&gt;VersionedEntry&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sharedMemory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readVersioned&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"inventory"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"beef-count"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Modify and write back with expected version&lt;/span&gt;
&lt;span class="n"&gt;sharedMemory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;writeVersioned&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"inventory"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"beef-count"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;withContent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"11 portions"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is appropriate for state that is updated frequently by multiple ensembles but where conflicts are rare. The optimistic assumption is that most writes will not conflict, so the overhead of locking is avoided in the common case.&lt;/p&gt;
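
&lt;p&gt;The read, modify, write-back cycle needs a retry loop around it. The sketch below shows the general shape under stated assumptions: &lt;code&gt;VersionedCell&lt;/code&gt;, &lt;code&gt;Snapshot&lt;/code&gt;, and &lt;code&gt;OptimisticRetry&lt;/code&gt; are hypothetical stand-ins written for this example, not the &lt;code&gt;SharedMemory&lt;/code&gt; API itself:&lt;/p&gt;

```java
import java.util.ConcurrentModificationException;

// Immutable view of a value together with the version it was read at.
final class Snapshot {
    final String content;
    final long version;

    Snapshot(String content, long version) {
        this.content = content;
        this.version = version;
    }
}

// Hypothetical single-entry store with version-checked writes.
final class VersionedCell {
    private String content;
    private long version;

    VersionedCell(String initial) {
        this.content = initial;
    }

    synchronized Snapshot read() {
        return new Snapshot(content, version);
    }

    // Succeeds only if nobody has written since expectedVersion was read.
    synchronized void write(String newContent, long expectedVersion) {
        if (version != expectedVersion) {
            throw new ConcurrentModificationException(
                "expected v" + expectedVersion + " but found v" + version);
        }
        content = newContent;
        version++;
    }
}

final class OptimisticRetry {
    // Decrements a numeric entry, re-reading and retrying on conflict.
    // Attempts are bounded so a hot key cannot spin forever.
    static int decrement(VersionedCell cell, int maxAttempts) {
        for (int attempt = 1; maxAttempts >= attempt; attempt++) {
            Snapshot current = cell.read();
            int next = Integer.parseInt(current.content) - 1;
            try {
                cell.write(Integer.toString(next), current.version);
                return next;
            } catch (ConcurrentModificationException conflict) {
                // Another writer got in first; the next iteration re-reads.
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }
}
```

&lt;p&gt;The important detail is that each retry starts from a fresh read. Reusing the stale value with a newer version number would silently overwrite the other writer's update.&lt;/p&gt;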

&lt;h3&gt;
  
  
  Distributed locking
&lt;/h3&gt;

&lt;p&gt;Each read or write acquires a distributed lock on the scope. Only one ensemble can read or write the scope at a time. This provides the strongest consistency guarantee, at the highest coordination cost.&lt;/p&gt;

&lt;p&gt;This is appropriate for exclusive resources -- room assignments, equipment reservations, anything where concurrent access would cause correctness problems.&lt;/p&gt;

&lt;p&gt;The lock implementation depends on the transport backing. With in-process transport, it is a &lt;code&gt;ReentrantLock&lt;/code&gt;. With Kafka or Redis, it is a distributed lock (Redlock or similar).&lt;/p&gt;
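
&lt;p&gt;For the in-process case, the idea can be sketched with a single &lt;code&gt;ReentrantLock&lt;/code&gt; guarding a scope. &lt;code&gt;RoomScope&lt;/code&gt; is a hypothetical class for illustration. It also assumes the check and the assignment happen under one lock acquisition; whether the library exposes a compound operation like this is not stated here:&lt;/p&gt;

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-process scope: one lock guards all room assignments.
final class RoomScope {
    private final ReentrantLock scopeLock = new ReentrantLock();
    private final String[] guestByRoom;

    RoomScope(int roomCount) {
        this.guestByRoom = new String[roomCount];
    }

    // Check-and-assign as one critical section, so two callers can
    // never both see the room as free. Returns false if already taken.
    boolean assign(int room, String guest) {
        scopeLock.lock();
        try {
            if (guestByRoom[room] != null) {
                return false;
            }
            guestByRoom[room] = guest;
            return true;
        } finally {
            scopeLock.unlock();
        }
    }

    // Reads take the same lock, matching the "lock on every read or write" model.
    String occupant(int room) {
        scopeLock.lock();
        try {
            return guestByRoom[room];
        } finally {
            scopeLock.unlock();
        }
    }
}
```

&lt;p&gt;If the check and the write each took the lock separately, a second caller could slip in between them and assign the same room, so the critical section has to cover both steps.&lt;/p&gt;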

&lt;h2&gt;
  
  
  Shared memory in the network configuration
&lt;/h2&gt;

&lt;p&gt;Shared memory scopes are declared at the network level and automatically available to all ensembles in the network:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;NetworkConfig&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NetworkConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"ws://kitchen:7329/ws"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"front-desk"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"ws://front-desk:7329/ws"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sharedMemory&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"inventory"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;SharedMemory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;store&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;inMemory&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;consistency&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Consistency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;EVENTUAL&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sharedMemory&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"rooms"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;SharedMemory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;store&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;inMemory&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;consistency&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Consistency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;LOCKED&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each scope has a name that ensembles use to access it. The kitchen writes to "inventory"; the front desk reads from it. The consistency model is transparent to the application code -- the &lt;code&gt;SharedMemory&lt;/code&gt; API is the same regardless of the consistency level.&lt;/p&gt;

&lt;h2&gt;
  
  
  When shared memory helps
&lt;/h2&gt;

&lt;p&gt;Shared memory is useful when ensembles need to communicate state that does not fit the request/response pattern. Some examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inventory tracking&lt;/strong&gt; -- the kitchen updates inventory as it uses ingredients; room service reads inventory before making promises to guests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shift context&lt;/strong&gt; -- the front desk shares current occupancy, VIP arrivals, and special events; other ensembles use this context to adjust their behavior&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Equipment status&lt;/strong&gt; -- maintenance updates the status of equipment; the kitchen checks before starting tasks that require specific equipment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In each case, the state is written by one ensemble and read by others. The communication is asynchronous and does not require a direct request/response exchange.&lt;/p&gt;

&lt;h2&gt;
  
  
  When shared memory hurts
&lt;/h2&gt;

&lt;p&gt;Shared memory is a distributed state coordination mechanism. It has the same problems as every distributed state coordination mechanism:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consistency vs. availability tradeoff.&lt;/strong&gt; Locked consistency blocks whenever the lock cannot be acquired, for example when the lock holder is down or the network is partitioned. Eventual consistency never blocks but may return stale data. There is no free lunch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debugging is harder.&lt;/strong&gt; When an ensemble reads stale data, the bug manifests as incorrect behavior, not as an error message. Tracing why the kitchen thought there were 12 portions when there are actually 8 requires understanding the write/read timeline across multiple ensembles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scope proliferation.&lt;/strong&gt; Without discipline, shared memory scopes multiply. Each scope is a coordination point that adds operational complexity. Prefer a small number of well-defined scopes over many narrow ones.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In-memory backing is not durable.&lt;/strong&gt; The default &lt;code&gt;MemoryStore.inMemory()&lt;/code&gt; loses data on process restart. For production, back it with a persistent store (Redis, a database, or a custom &lt;code&gt;MemoryStore&lt;/code&gt; implementation).&lt;/p&gt;

&lt;h2&gt;
  
  
  The design principle
&lt;/h2&gt;

&lt;p&gt;The useful insight is that shared state in agent networks is not monolithic. Different categories of state need different consistency guarantees, and forcing a single model is either too expensive or too weak.&lt;/p&gt;

&lt;p&gt;Per-scope consistency selection lets you use eventual consistency for advisory state (low coordination cost, high availability), optimistic locking for frequently updated counters (low cost in the common case, retry on conflict), and distributed locks for exclusive resources (high coordination cost, strong guarantees).&lt;/p&gt;

&lt;p&gt;The consistency model is a property of the data, not a property of the system. Choose it based on what happens when two ensembles access the same state concurrently.&lt;/p&gt;




&lt;p&gt;Shared memory is part of &lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href="https://agentensemble.net/guides/shared-memory/" rel="noopener noreferrer"&gt;shared memory guide&lt;/a&gt; covers the full API including consistency models and network configuration.&lt;/p&gt;

&lt;p&gt;I'd be interested in how others handle shared state across agent systems -- whether you use explicit shared memory, pass context through task results, or avoid shared state entirely.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Federation for Agent Networks: Cross-Namespace Capability Sharing via Realms</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Wed, 13 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/federation-for-agent-networks-cross-namespace-capability-sharing-via-realms-1adi</link>
      <guid>https://dev.to/agentensemble/federation-for-agent-networks-cross-namespace-capability-sharing-via-realms-1adi</guid>
      <description>&lt;p&gt;Discovery lets ensembles find capabilities within a network. But in a real deployment, not every ensemble lives in the same namespace or even the same cluster. A hotel chain might run separate ensemble networks at each property, each in its own Kubernetes namespace, but want them to share spare capacity when one property is overloaded.&lt;/p&gt;

&lt;p&gt;This is the federation problem: how do you extend capability discovery across trust and network boundaries without collapsing everything into one flat namespace?&lt;/p&gt;

&lt;h2&gt;
  
  
  Realms as trust boundaries
&lt;/h2&gt;

&lt;p&gt;AgentEnsemble v3.0.0 introduces &lt;strong&gt;realms&lt;/strong&gt; as the organizational unit for federation. A realm is a namespace-level discovery and trust boundary -- typically mapping to a Kubernetes namespace in production deployments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;FederationConfig&lt;/span&gt; &lt;span class="n"&gt;federation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FederationConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;localRealm&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hotel-downtown"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;federationName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hotel Chain"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;realm&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hotel-airport"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"hotel-airport-ns"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;realm&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hotel-beach"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"hotel-beach-ns"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Within a realm, ensembles discover each other freely. Cross-realm discovery requires explicit opt-in: an ensemble must advertise its capacity as &lt;strong&gt;shareable&lt;/strong&gt; for other realms to use it.&lt;/p&gt;

&lt;p&gt;This is a deliberate design choice. Discovery within a namespace is expected and low-risk. Discovery across namespaces is a capacity-sharing decision that should be explicit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Capacity advertisement
&lt;/h2&gt;

&lt;p&gt;Ensembles periodically broadcast their current load and availability using capacity updates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"capacity_update"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ensemble"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kitchen"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"realm"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hotel-downtown"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"available"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"currentLoad"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.35&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxConcurrent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"shareable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;shareable&lt;/code&gt; flag is the federation gate. When &lt;code&gt;true&lt;/code&gt;, the ensemble's spare capacity is available to ensembles in other realms. When &lt;code&gt;false&lt;/code&gt;, the ensemble only serves local requests.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;CapacityAdvertiser&lt;/code&gt; handles the periodic broadcasting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;CapacityAdvertiser&lt;/span&gt; &lt;span class="n"&gt;advertiser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;CapacityAdvertiser&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"hotel-downtown"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;computeCurrentLoad&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
    &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;                              &lt;span class="c1"&gt;// max concurrent&lt;/span&gt;
    &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;                             &lt;span class="c1"&gt;// shareable to other realms&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;broadcast&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

&lt;span class="n"&gt;advertiser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofSeconds&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Status is derived automatically from load: &lt;code&gt;"available"&lt;/code&gt; when load is below 1.0, &lt;code&gt;"busy"&lt;/code&gt; once load reaches 1.0 (at capacity).&lt;/p&gt;
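&lt;p&gt;As a rough illustration of that rule, the sketch below maps a load ratio to a status string. The class name, the &lt;code&gt;load&lt;/code&gt; helper, and the definition of load as in-flight requests divided by max concurrent are assumptions for the example, not the library's implementation.&lt;/p&gt;

```java
// Hypothetical helper, not the library's code: maps a load ratio to a status string.
final class CapacityStatus {
    private CapacityStatus() {}

    // Assumption: load = in-flight requests / max concurrent, so 1.0 means "at capacity".
    static double load(int inFlight, int maxConcurrent) {
        return (double) inFlight / maxConcurrent;
    }

    // Below 1.0 the ensemble is available; at 1.0 or above it is busy.
    static String statusFor(double load) {
        return load >= 1.0 ? "busy" : "available";
    }

    public static void main(String[] args) {
        System.out.println(statusFor(load(42, 100)));  // available
        System.out.println(statusFor(load(100, 100))); // busy
    }
}
```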

&lt;h2&gt;
  
  
  The routing hierarchy
&lt;/h2&gt;

&lt;p&gt;When an ensemble needs a capability, the &lt;code&gt;FederationRegistry&lt;/code&gt; routes the request using a three-level hierarchy:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Priority&lt;/th&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;th&gt;Condition&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1 (highest)&lt;/td&gt;
&lt;td&gt;Local realm&lt;/td&gt;
&lt;td&gt;Provider is in the same realm&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Same realm (unregistered)&lt;/td&gt;
&lt;td&gt;Provider has no realm info (assumed local)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3 (lowest)&lt;/td&gt;
&lt;td&gt;Cross-realm&lt;/td&gt;
&lt;td&gt;Provider is in a different realm and &lt;code&gt;shareable = true&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Within each level, the least-loaded provider is preferred.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;FederationRegistry&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;FederationRegistry&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;capabilityRegistry&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Find the best provider using the routing hierarchy&lt;/span&gt;
&lt;span class="nc"&gt;Optional&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findProvider&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"prepare-meal"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"hotel-downtown"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The routing hierarchy encodes a simple operational principle: prefer local providers (lower latency, same failure domain), and fall back to cross-realm providers only when local capacity is insufficient.&lt;/p&gt;
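&lt;p&gt;One way to picture the hierarchy is as a two-key sort: priority level first, load second. The sketch below is a hypothetical stand-in, not the &lt;code&gt;FederationRegistry&lt;/code&gt; source. The &lt;code&gt;Provider&lt;/code&gt; record, the &lt;code&gt;pick&lt;/code&gt; method, its null return for "no provider", and the choice to skip providers whose load has reached 1.0 are all assumptions made for illustration.&lt;/p&gt;

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch of local-first, least-loaded selection.
final class RoutingSketch {
    // realm == null models a provider that registered without realm info.
    record Provider(String name, String realm, double load, boolean shareable) {}

    static final int NOT_ROUTABLE = Integer.MAX_VALUE;

    // Maps a provider to its priority level from the table (1 is best).
    static int level(Provider p, String callerRealm) {
        if (p.load() >= 1.0) return NOT_ROUTABLE;    // assumption: skip providers at capacity
        if (p.realm() == null) return 2;             // no realm info: assumed local
        if (p.realm().equals(callerRealm)) return 1; // same realm as the caller
        return p.shareable() ? 3 : NOT_ROUTABLE;     // cross-realm only when shareable
    }

    // Returns the chosen provider name, or null when nothing is routable.
    static String pick(Provider[] providers, String callerRealm) {
        return Arrays.stream(providers)
            .filter(p -> level(p, callerRealm) != NOT_ROUTABLE)
            .min(Comparator.comparingInt((Provider p) -> level(p, callerRealm))
                .thenComparingDouble(Provider::load))
            .map(Provider::name)
            .orElse(null);
    }
}
```

&lt;p&gt;Under these assumptions, a caller in &lt;code&gt;hotel-downtown&lt;/code&gt; picks a local kitchen at load 0.9 over an airport kitchen at load 0.2. Once the local kitchen reaches 1.0, the request overflows to the airport kitchen, provided that kitchen is shareable.&lt;/p&gt;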

&lt;h2&gt;
  
  
  Why this matters for agent systems
&lt;/h2&gt;

&lt;p&gt;Federation solves a specific operational problem: agent networks that need to scale across deployment boundaries without becoming a single monolithic system.&lt;/p&gt;

&lt;p&gt;Consider a hotel chain with three properties. Each property runs its own ensemble network (kitchen, front desk, maintenance, room service). Each network is self-contained and operates independently. But during peak events -- a conference at one property, a holiday weekend at another -- one property's kitchen may be overwhelmed while another has spare capacity.&lt;/p&gt;

&lt;p&gt;Without federation, each property is an island. The overloaded kitchen queues requests or drops them. With federation, the overloaded kitchen's requests overflow to a kitchen in another realm that has advertised spare capacity.&lt;/p&gt;

&lt;p&gt;The key constraint is that federation should be additive, not required. Each realm must work independently. Federation adds cross-realm routing as an optimization, not a dependency. If the federation link goes down, each realm continues operating on its own.&lt;/p&gt;

&lt;h2&gt;
  
  
  Network configuration
&lt;/h2&gt;

&lt;p&gt;Enable federation at the network level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;NetworkConfig&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NetworkConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"ws://kitchen:7329/ws"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ensemble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"maintenance"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"ws://maintenance:7329/ws"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;federationConfig&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;FederationConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;localRealm&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hotel-downtown"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;federationName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hotel Chain"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;realm&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hotel-airport"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"hotel-airport-ns"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;realm&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hotel-beach"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"hotel-beach-ns"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;FederationConfig&lt;/code&gt; is optional. Without it, the network operates in single-realm mode -- standard discovery without cross-realm routing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tradeoffs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Cross-realm latency.&lt;/strong&gt; Requests routed to a different realm incur network latency that local requests do not. For agent workloads where task execution takes seconds or minutes, the routing overhead is negligible. For latency-sensitive workflows, the local-first routing hierarchy mitigates this -- cross-realm routing only happens when local capacity is insufficient.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Capacity staleness.&lt;/strong&gt; Capacity updates are periodic (default: every 10 seconds), so between updates routing decisions are based on data that can be up to one interval old. An ensemble that was available 5 seconds ago might be at capacity now. The consequence is that requests are occasionally routed to busy providers, which queue them rather than rejecting them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trust is implicit.&lt;/strong&gt; Realm membership is declared in configuration, not enforced cryptographically. Any ensemble that claims to be in a realm is trusted. For environments where this matters, the transport layer needs authentication -- which is outside the scope of the federation layer itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Realm topology is static.&lt;/strong&gt; The set of realms in a federation is declared at configuration time. You cannot add new realms at runtime without reconfiguring existing ensembles. For dynamic environments where namespaces are created and destroyed frequently, this requires a configuration management strategy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shareable is binary.&lt;/strong&gt; An ensemble either shares all its spare capacity or none. There is no per-realm or per-capability sharing control. If you need more granular sharing policies, you would need to implement them in the capacity advertisement logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  The design principle
&lt;/h2&gt;

&lt;p&gt;The useful insight is that federation is a capacity-sharing problem, not a networking problem. The networking (WebSockets, Kafka) already works across boundaries. What federation adds is a policy layer: who can use whose spare capacity, and in what order.&lt;/p&gt;

&lt;p&gt;Realms provide the organizational unit. Capacity advertisement provides the data. The routing hierarchy provides the policy. Together, they turn a collection of independent agent networks into a cooperative federation that shares spare capacity while maintaining operational independence.&lt;/p&gt;




&lt;p&gt;Federation is part of &lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href="https://agentensemble.net/guides/federation/" rel="noopener noreferrer"&gt;federation guide&lt;/a&gt; covers the full API including capacity advertisement and realm configuration.&lt;/p&gt;

&lt;p&gt;I'd be interested in whether others are dealing with multi-namespace agent deployments, and how they handle cross-boundary capability sharing.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Dynamic Discovery in Agent Networks: From Hardcoded Routes to Capability Catalogs</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Mon, 11 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/dynamic-discovery-in-agent-networks-from-hardcoded-routes-to-capability-catalogs-1mh</link>
      <guid>https://dev.to/agentensemble/dynamic-discovery-in-agent-networks-from-hardcoded-routes-to-capability-catalogs-1mh</guid>
      <description>&lt;p&gt;The simplest way to connect two agent ensembles is a direct reference: ensemble A knows ensemble B's address and calls it. This works when you have two or three ensembles with stable relationships.&lt;/p&gt;

&lt;p&gt;It stops working when you have ten ensembles, or when ensembles come and go, or when the same capability is provided by multiple ensembles and you want the caller to use whichever one is available. At that point, you need discovery -- a way for ensembles to find capabilities without knowing in advance who provides them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The static wiring problem
&lt;/h2&gt;

&lt;p&gt;In a statically wired agent network, every cross-ensemble call requires knowing the provider's identity and address:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Static: caller knows exactly who to call&lt;/span&gt;
&lt;span class="nc"&gt;NetworkTask&lt;/span&gt; &lt;span class="n"&gt;mealTask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NetworkTask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"prepare-meal"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"ws://kitchen:7329/ws"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates coupling. If the kitchen ensemble moves to a different port, every caller needs updating. If you add a second kitchen for capacity, callers need load-balancing logic. If the kitchen goes down, callers need fallback logic.&lt;/p&gt;

&lt;p&gt;The fundamental issue is that callers should care about &lt;em&gt;what&lt;/em&gt; they need (a meal preparation capability), not &lt;em&gt;who&lt;/em&gt; provides it or &lt;em&gt;where&lt;/em&gt; it runs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Capability advertisement with tags
&lt;/h2&gt;

&lt;p&gt;AgentEnsemble v3.0.0 introduces capability discovery. Ensembles advertise their shared tasks and tools with optional tags, and other ensembles discover providers at runtime.&lt;/p&gt;

&lt;h3&gt;
  
  
  Advertising capabilities
&lt;/h3&gt;

&lt;p&gt;When building an ensemble, declare what it shares with the network:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="n"&gt;kitchen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatLanguageModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Manage kitchen operations"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;shareTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"check-inventory"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inventoryTool&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"food"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"stock"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;shareTask&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"prepare-meal"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mealTask&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"food"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"cooking"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;kitchen&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7329&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;shareTool&lt;/code&gt; and &lt;code&gt;shareTask&lt;/code&gt; methods register capabilities in the network's capability registry. The trailing string arguments are tags -- metadata that classifies the capability for filtered discovery.&lt;/p&gt;

&lt;h3&gt;
  
  
  Discovering capabilities
&lt;/h3&gt;

&lt;p&gt;Another ensemble can discover providers without knowing their identity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Discover by capability name&lt;/span&gt;
&lt;span class="nc"&gt;NetworkTool&lt;/span&gt; &lt;span class="n"&gt;inventoryCheck&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NetworkTool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;discover&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"check-inventory"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Discover by tag&lt;/span&gt;
&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;CapabilityInfo&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;foodCapabilities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByTag&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"food"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The registry resolves the capability name to a provider that currently offers it. If multiple providers offer the same capability, the registry has to choose one; the current implementation picks the least-loaded provider when capacity information is available.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tag-based catalogs
&lt;/h2&gt;

&lt;p&gt;Tags turn the capability registry into a searchable catalog. Rather than querying for specific capability names, you can query for categories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Find all capabilities tagged with "food"&lt;/span&gt;
&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;CapabilityInfo&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;food&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByTag&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"food"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Find all capabilities tagged with both "food" and "stock"&lt;/span&gt;
&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;CapabilityInfo&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;stockChecks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByTags&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"food"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"stock"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each &lt;code&gt;CapabilityInfo&lt;/code&gt; includes the capability name, type (task or tool), provider ensemble name, and tags:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;CapabilityInfo&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;food&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;           &lt;span class="c1"&gt;// "check-inventory"&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;           &lt;span class="c1"&gt;// "TOOL"&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;   &lt;span class="c1"&gt;// "kitchen"&lt;/span&gt;
&lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;      &lt;span class="c1"&gt;// ["food", "stock"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is useful for building dynamic agent systems where an orchestrating ensemble does not know in advance what capabilities are available. It can discover capabilities at runtime, filter by category, and wire them into its workflow dynamically.&lt;/p&gt;
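&lt;p&gt;To make the tag semantics concrete, here is a minimal in-memory sketch of an all-tags-must-match filter. The &lt;code&gt;Capability&lt;/code&gt; record and the static &lt;code&gt;findByTags&lt;/code&gt; helper are hypothetical stand-ins for &lt;code&gt;CapabilityInfo&lt;/code&gt; and the registry; the real types may be shaped differently.&lt;/p&gt;

```java
import java.util.Arrays;

// Hypothetical in-memory catalog; not the AgentEnsemble registry.
final class TagCatalogSketch {
    record Capability(String name, String type, String provider, String... tags) {
        // True when this capability carries every tag in 'wanted'.
        boolean hasAllTags(String... wanted) {
            return Arrays.asList(tags).containsAll(Arrays.asList(wanted));
        }
    }

    // Keeps only capabilities that carry every requested tag (AND semantics).
    static Capability[] findByTags(Capability[] catalog, String... wanted) {
        return Arrays.stream(catalog)
            .filter(c -> c.hasAllTags(wanted))
            .toArray(Capability[]::new);
    }
}
```

&lt;p&gt;With the kitchen's two capabilities in the catalog, querying for &lt;code&gt;"food"&lt;/code&gt; matches both, while querying for &lt;code&gt;"food"&lt;/code&gt; and &lt;code&gt;"stock"&lt;/code&gt; together matches only &lt;code&gt;check-inventory&lt;/code&gt;.&lt;/p&gt;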

&lt;h2&gt;
  
  
  The registry abstraction
&lt;/h2&gt;

&lt;p&gt;The capability registry is part of the transport SPI, which means it has pluggable implementations:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;th&gt;Backing&lt;/th&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;In-memory&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ConcurrentHashMap&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Development, testing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WebSocket-broadcast&lt;/td&gt;
&lt;td&gt;Network messages&lt;/td&gt;
&lt;td&gt;Multi-process, simple mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kafka-backed&lt;/td&gt;
&lt;td&gt;Kafka topics&lt;/td&gt;
&lt;td&gt;Production, durable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In development, capabilities are registered and discovered within a single process or across WebSocket connections. In production, the registry can be backed by Kafka for durability and horizontal scaling.&lt;/p&gt;

&lt;p&gt;The application code that registers and discovers capabilities does not change between implementations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dynamic vs. static wiring
&lt;/h2&gt;

&lt;p&gt;The choice between static and dynamic wiring is not binary. A practical network often uses both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static wiring&lt;/strong&gt; for well-known, stable relationships (the front desk always calls the kitchen)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic discovery&lt;/strong&gt; for capabilities that may be provided by different ensembles depending on deployment, capacity, or availability
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Static: known relationship&lt;/span&gt;
&lt;span class="nc"&gt;NetworkTask&lt;/span&gt; &lt;span class="n"&gt;knownMealTask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NetworkTask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"prepare-meal"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"ws://kitchen:7329/ws"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Dynamic: discover at runtime&lt;/span&gt;
&lt;span class="nc"&gt;NetworkTool&lt;/span&gt; &lt;span class="n"&gt;discovered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NetworkTool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;discover&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"check-inventory"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The two approaches coexist. Statically wired tasks bypass the registry entirely. Dynamically discovered tasks and tools use the registry for resolution. The agent using the task or tool does not know which approach was used to create it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Capability lifecycle
&lt;/h2&gt;

&lt;p&gt;Capabilities have a lifecycle that mirrors the ensemble lifecycle:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Registration&lt;/strong&gt; -- when the ensemble starts and calls &lt;code&gt;shareTask&lt;/code&gt; or &lt;code&gt;shareTool&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discovery&lt;/strong&gt; -- when other ensembles query the registry&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deregistration&lt;/strong&gt; -- when the ensemble stops or the capability is removed&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With the in-memory registry, there is no explicit deregistration: the entries disappear when the ensemble process exits. With WebSocket transport, the ensemble broadcasts a deregistration message on shutdown. With Kafka, a tombstone record is produced.&lt;/p&gt;

&lt;p&gt;The lifecycle matters for production systems. A stale registry entry (pointing to an ensemble that no longer exists) causes request failures. The registry needs to handle stale entries, either through explicit deregistration, heartbeat-based expiry, or health-check-based cleanup.&lt;/p&gt;
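&lt;p&gt;Heartbeat-based expiry, for example, can be sketched as a time-to-live check on each entry. Everything here (the &lt;code&gt;Entry&lt;/code&gt; record, the &lt;code&gt;lastHeartbeat&lt;/code&gt; field, the TTL parameter) is hypothetical and only illustrates the idea; it does not describe how AgentEnsemble cleans up its registry.&lt;/p&gt;

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;

// Hypothetical heartbeat-based expiry; names are illustrative only.
final class HeartbeatExpirySketch {
    record Entry(String capability, String provider, Instant lastHeartbeat) {}

    // An entry is stale once its last heartbeat is older than the time-to-live.
    static boolean isStale(Entry e, Instant now, Duration ttl) {
        return Duration.between(e.lastHeartbeat(), now).compareTo(ttl) > 0;
    }

    // Returns the entries that should remain discoverable.
    static Entry[] liveEntries(Entry[] entries, Instant now, Duration ttl) {
        return Arrays.stream(entries)
            .filter(e -> !isStale(e, now, ttl))
            .toArray(Entry[]::new);
    }
}
```

&lt;p&gt;The TTL should be a small multiple of the heartbeat interval so that a single missed heartbeat does not evict a healthy provider.&lt;/p&gt;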

&lt;h2&gt;
  
  
  Tradeoffs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Discovery adds a lookup step.&lt;/strong&gt; Every dynamically discovered capability requires a registry query. In practice, the result is cached -- the first lookup queries the registry, and subsequent uses of the same capability reuse the resolved provider. Only the initial resolution adds latency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tag semantics are convention-based.&lt;/strong&gt; There is no schema for tags. If one ensemble tags a capability as "food" and another tags it as "cuisine", they will not discover each other. Tag conventions need to be agreed upon across teams.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multiple providers create ambiguity.&lt;/strong&gt; When two ensembles offer the same capability, the registry needs a selection strategy. The current implementation supports least-loaded selection (when capacity information is available), but more sophisticated strategies (affinity, cost-based, latency-based) would need to be built.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Registry availability is a dependency.&lt;/strong&gt; If the registry is unavailable, dynamic discovery fails. Static wiring works regardless of registry state. For critical paths, consider falling back to static wiring when discovery is unavailable.&lt;/p&gt;
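&lt;p&gt;A fallback along those lines might look like the following sketch. The &lt;code&gt;Resolver&lt;/code&gt; interface is a hypothetical stand-in for a registry lookup that returns a provider address; it is not an AgentEnsemble type, and the addresses are placeholders.&lt;/p&gt;

```java
// Hypothetical fallback wrapper; Resolver stands in for a registry lookup.
final class DiscoveryFallbackSketch {
    interface Resolver {
        String resolve(String capability) throws Exception;
    }

    // Tries dynamic discovery first and falls back to a statically configured address.
    static String resolveAddress(Resolver discovery, String capability, String staticAddress) {
        try {
            String discovered = discovery.resolve(capability);
            // A null result is treated as "no provider found".
            return discovered != null ? discovered : staticAddress;
        } catch (Exception e) {
            // Registry unreachable or lookup failed: use the static route.
            return staticAddress;
        }
    }
}
```

&lt;p&gt;The caller gets an address either way, so the critical path keeps working while the registry is down, at the cost of losing load-aware selection for that call.&lt;/p&gt;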

&lt;h2&gt;
  
  
  The design principle
&lt;/h2&gt;

&lt;p&gt;The useful abstraction is separating &lt;em&gt;what&lt;/em&gt; from &lt;em&gt;who&lt;/em&gt;. An ensemble that needs a meal preparation capability should express that need ("I need prepare-meal") without specifying the provider ("specifically from the kitchen ensemble at ws://kitchen:7329/ws").&lt;/p&gt;

&lt;p&gt;This separation enables the network to evolve. New providers can come online. Existing providers can be replaced. Capacity can be redistributed. The callers do not need to change.&lt;/p&gt;

&lt;p&gt;Discovery is the mechanism. Tags make it searchable. The transport SPI makes it portable across deployment environments.&lt;/p&gt;




&lt;p&gt;Capability discovery is part of &lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href="https://agentensemble.net/guides/discovery/" rel="noopener noreferrer"&gt;discovery guide&lt;/a&gt; covers the full API including tag-based filtering.&lt;/p&gt;

&lt;p&gt;I'd be interested in how others handle capability discovery in multi-agent systems -- whether you use service registries, hardcoded routes, or something else entirely.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Durable Transport for Agent Networks: Moving from In-Process Queues to Kafka</title>
      <dc:creator>mgd43b</dc:creator>
      <pubDate>Sat, 09 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/agentensemble/durable-transport-for-agent-networks-moving-from-in-process-queues-to-kafka-43bi</link>
      <guid>https://dev.to/agentensemble/durable-transport-for-agent-networks-moving-from-in-process-queues-to-kafka-43bi</guid>
      <description>&lt;p&gt;In-process queues are fine for development. They are fast, deterministic, and require zero infrastructure. But they have a property that becomes a liability in production: when the process dies, the queue contents disappear.&lt;/p&gt;

&lt;p&gt;For agent networks that run as long-lived services -- handling work requests over hours or days -- losing queued requests on restart is not acceptable. The transport layer needs durability, and that means moving from in-process data structures to something that survives process failures.&lt;/p&gt;

&lt;h2&gt;
  
  
  What durability means for agent networks
&lt;/h2&gt;

&lt;p&gt;An agent ensemble network has three communication patterns that need durable backing:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Work request delivery&lt;/strong&gt; -- a request from one ensemble to another should not be lost if the receiving ensemble is temporarily unavailable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Response routing&lt;/strong&gt; -- when an ensemble completes a request, the response needs to reach the original caller even if the caller restarted&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capability advertisement&lt;/strong&gt; -- shared tasks and tools should remain discoverable across process restarts&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each of these has different durability requirements. Work requests are the most critical -- a lost request means lost work. Response routing needs correlation (matching responses to requests). Capability advertisement needs eventual consistency but not strict durability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kafka as the transport backing
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;agentensemble-transport-kafka&lt;/code&gt; module implements the transport SPIs against Apache Kafka. All components share a single configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;KafkaTransportConfig&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;KafkaTransportConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;bootstrapServers&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kafka:9092"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;consumerGroupId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen-ensemble"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;topicPrefix&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"agentensemble."&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Request queues
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;KafkaRequestQueue&lt;/code&gt; produces work requests to a Kafka topic and consumes them with manual offset commits:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;KafkaRequestQueue&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;KafkaRequestQueue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ensembleName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Enqueue a work request (produces to Kafka)&lt;/span&gt;
&lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;enqueue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workRequest&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Poll for requests (consumes from Kafka)&lt;/span&gt;
&lt;span class="nc"&gt;Optional&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;WorkRequest&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;poll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofSeconds&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The topic name is derived from the topic prefix and the ensemble name: &lt;code&gt;agentensemble.kitchen.requests&lt;/code&gt;. With manual offset commits, a request is acknowledged only after the ensemble has finished processing it. If the ensemble crashes mid-processing, the offset is never committed and the request is redelivered on restart.&lt;/p&gt;
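
&lt;p&gt;To make the commit-after-processing behavior concrete, here is a minimal in-memory sketch. It does not use the Kafka client or the &lt;code&gt;KafkaRequestQueue&lt;/code&gt; API: the class &lt;code&gt;CommitAfterProcessingSketch&lt;/code&gt;, its array-backed log, and the &lt;code&gt;crashOffset&lt;/code&gt; parameter are all hypothetical stand-ins, used only to show why a crash before the commit leads to redelivery.&lt;/p&gt;

```java
final class CommitAfterProcessingSketch {
    private final String[] log;       // stand-in for one topic partition
    private int committedOffset = 0;  // where consumption resumes after a restart

    CommitAfterProcessingSketch(String... log) {
        this.log = log;
    }

    // Handles records starting at the committed offset. If crashOffset is hit,
    // the simulated process dies after handling that record but before
    // committing it. Returns the handled records, space-separated.
    String runUntilCrashOrEnd(int crashOffset) {
        StringBuilder handled = new StringBuilder();
        for (int offset = committedOffset; offset != log.length; offset++) {
            handled.append(log[offset]).append(' ');  // process the request
            if (offset == crashOffset) {
                break;                                // crash: the commit never happens
            }
            committedOffset = offset + 1;             // manual commit, only after processing
        }
        return handled.toString().trim();
    }

    public static void main(String[] args) {
        CommitAfterProcessingSketch consumer =
                new CommitAfterProcessingSketch("req-1", "req-2", "req-3");
        System.out.println(consumer.runUntilCrashOrEnd(1));   // crashes before committing req-2
        System.out.println(consumer.runUntilCrashOrEnd(-1));  // restart: req-2 is handled again
    }
}
```

&lt;p&gt;In this sketch the second run starts at the last committed offset, so the request that was in flight during the crash is handled a second time rather than lost.&lt;/p&gt;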

&lt;h3&gt;
  
  
  Delivery registry
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;KafkaDeliveryRegistry&lt;/code&gt; tracks pending deliveries and routes responses back to callers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;KafkaDeliveryRegistry&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;KafkaDeliveryRegistry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Register a pending delivery (before sending request)&lt;/span&gt;
&lt;span class="nc"&gt;CompletableFuture&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;requestId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Complete the delivery when response arrives&lt;/span&gt;
&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;complete&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;requestId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;responsePayload&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Caller awaits the result&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TimeUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SECONDS&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The registry uses a Kafka topic for durability: pending deliveries are produced as records, and completions are produced as tombstones (records with a null value). On restart, the registry rebuilds its state by replaying the topic from the beginning.&lt;/p&gt;
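
&lt;p&gt;The replay step can be sketched without Kafka at all. The following is an illustration under assumptions, not the registry's actual implementation: &lt;code&gt;RegistryReplaySketch&lt;/code&gt; and &lt;code&gt;rebuildPending&lt;/code&gt; are hypothetical names, and the log is modeled as an array of &lt;code&gt;{requestId, value}&lt;/code&gt; pairs in which a null value plays the role of a tombstone.&lt;/p&gt;

```java
import java.util.concurrent.ConcurrentHashMap;

final class RegistryReplaySketch {
    // Each log entry is {requestId, value}. A null value is a tombstone,
    // meaning the delivery for that request ID was completed.
    static String[] rebuildPending(String[][] log) {
        var pending = ConcurrentHashMap.newKeySet();
        for (String[] entry : log) {
            if (entry[1] == null) {
                pending.remove(entry[0]);  // completion: no longer pending
            } else {
                pending.add(entry[0]);     // registration: now pending
            }
        }
        // Sorted only to make the result deterministic for display.
        return pending.stream().map(Object::toString).sorted().toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[][] log = {
            {"req-1", "pending"},
            {"req-2", "pending"},
            {"req-1", null},       // req-1 was completed before the restart
            {"req-3", "pending"},
        };
        System.out.println(String.join(",", rebuildPending(log))); // req-2,req-3
    }
}
```

&lt;p&gt;Replaying from the beginning means the rebuild cost grows with the length of the topic, which is one reason retention and cleanup settings matter for this kind of topic.&lt;/p&gt;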

&lt;h3&gt;
  
  
  Priority queues with aging
&lt;/h3&gt;

&lt;p&gt;For workloads where some requests are more urgent than others, the &lt;code&gt;PriorityRequestQueue&lt;/code&gt; adds priority levels with aging:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;PriorityRequestQueue&lt;/span&gt; &lt;span class="n"&gt;priorityQueue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PriorityRequestQueue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requestQueue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;levels&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;                    &lt;span class="c1"&gt;// 3 priority levels (0 = highest)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;agingInterval&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofMinutes&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Enqueue with priority&lt;/span&gt;
&lt;span class="n"&gt;priorityQueue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;enqueue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urgentRequest&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;// highest priority&lt;/span&gt;
&lt;span class="n"&gt;priorityQueue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;enqueue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normalRequest&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;// normal priority&lt;/span&gt;
&lt;span class="n"&gt;priorityQueue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;enqueue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batchRequest&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;    &lt;span class="c1"&gt;// lowest priority&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Aging prevents starvation: each time a request has waited a full aging interval, it is promoted to the next higher priority level. With the 5-minute interval above, a batch request enqueued at level 2 that has waited 10 minutes (two aging intervals) has been promoted twice and now sits at level 0, the highest priority.&lt;/p&gt;

&lt;p&gt;This is implemented as a layer on top of any &lt;code&gt;RequestQueue&lt;/code&gt; implementation, so it works with both in-process and Kafka-backed queues.&lt;/p&gt;
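
&lt;p&gt;The aging arithmetic is small enough to write down. This is a sketch of the rule as described here, not the &lt;code&gt;PriorityRequestQueue&lt;/code&gt; source: &lt;code&gt;AgingSketch&lt;/code&gt; and &lt;code&gt;effectiveLevel&lt;/code&gt; are hypothetical names, and it assumes one promotion per full aging interval waited, with level 0 as the floor.&lt;/p&gt;

```java
import java.time.Duration;

final class AgingSketch {
    // Effective level after aging: one promotion (level minus one) per full
    // aging interval waited, never going past level 0 (the highest priority).
    static int effectiveLevel(int baseLevel, Duration waited, Duration agingInterval) {
        if (agingInterval.isZero() || agingInterval.isNegative()) {
            throw new IllegalArgumentException("agingInterval must be positive");
        }
        long promotions = waited.toMillis() / agingInterval.toMillis();
        return (int) Math.max(0L, baseLevel - promotions);
    }

    public static void main(String[] args) {
        Duration interval = Duration.ofMinutes(5);
        System.out.println(effectiveLevel(2, Duration.ofMinutes(4), interval));   // 2
        System.out.println(effectiveLevel(2, Duration.ofMinutes(10), interval));  // 0
        System.out.println(effectiveLevel(1, Duration.ofMinutes(30), interval));  // 0
    }
}
```

&lt;p&gt;Computing the effective level from the enqueue time at poll time, rather than rewriting stored priorities on a timer, is one possible design; the real module may do it differently.&lt;/p&gt;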

&lt;h2&gt;
  
  
  What changes operationally
&lt;/h2&gt;

&lt;p&gt;Moving from in-process to Kafka transport changes the operational profile of the ensemble network:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Startup behavior changes.&lt;/strong&gt; With in-process queues, an ensemble starts with an empty queue. With Kafka, it may start with a backlog of unprocessed requests from before the restart. The ensemble needs to handle this gracefully -- processing the backlog before accepting new work, or processing both concurrently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failure modes change.&lt;/strong&gt; An in-process queue fails with its process: if the process dies, the queued requests are gone. Kafka failures are infrastructure-level (broker unavailable, topic not found, authorization errors). The error handling needs to distinguish between transient failures (retry) and permanent failures (alert and skip).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monitoring needs change.&lt;/strong&gt; With in-process queues, queue depth is a simple counter. With Kafka, you need to monitor consumer lag, topic partition health, and broker connectivity. The ensemble's health check needs to include Kafka reachability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ordering semantics change.&lt;/strong&gt; In-process queues provide strict FIFO. Kafka provides per-partition ordering, which means requests may be processed out of order if the topic has multiple partitions. For most agent workloads, this is fine -- requests are independent. But if your workflow depends on ordering, you need single-partition topics or application-level sequencing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The configuration boundary
&lt;/h2&gt;

&lt;p&gt;One design decision worth calling out: the Kafka transport configuration is separate from the ensemble configuration. The ensemble does not know it is using Kafka -- it interacts with the transport SPIs. The Kafka-specific configuration (bootstrap servers, consumer groups, topic prefixes) lives in the infrastructure layer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Infrastructure layer: Kafka-specific setup&lt;/span&gt;
&lt;span class="nc"&gt;KafkaTransportConfig&lt;/span&gt; &lt;span class="n"&gt;kafkaConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;KafkaTransportConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;bootstrapServers&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kafka:9092"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;consumerGroupId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen-ensemble"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;KafkaRequestQueue&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;KafkaRequestQueue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kafkaConfig&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ensembleName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"kitchen"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Application layer: ensemble setup (transport-agnostic)&lt;/span&gt;
&lt;span class="nc"&gt;Ensemble&lt;/span&gt; &lt;span class="n"&gt;kitchen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ensemble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatLanguageModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Manage kitchen operations"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requestQueue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This separation means the same ensemble code works in development (with in-process queues) and production (with Kafka) without changes. The transport choice is an infrastructure decision, not an application decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tradeoffs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Operational complexity.&lt;/strong&gt; Kafka is infrastructure that needs to be provisioned, monitored, and maintained. For small deployments, the operational overhead may not be justified. The in-process transport with periodic state snapshots might be a simpler alternative.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Latency.&lt;/strong&gt; Kafka adds millisecond-scale latency to every request delivery. For agent workloads where task execution takes seconds or minutes, this is negligible. For sub-second workflows, it may not be.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Topic proliferation.&lt;/strong&gt; Each ensemble gets its own request topic. A network of 20 ensembles means 20+ Kafka topics. This is manageable but requires topic lifecycle management (creation, cleanup, retention policies).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exactly-once is hard.&lt;/strong&gt; The current implementation provides at-least-once delivery. A request may be processed twice if the ensemble crashes after completing the work but before committing the offset. For most agent workloads (which are non-deterministic anyway), this is acceptable. For workloads that require exactly-once, additional deduplication logic is needed.&lt;/p&gt;
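
&lt;p&gt;As a rough idea of what that deduplication could look like, the sketch below skips any request ID it has already seen. Everything in it is an assumption for illustration: &lt;code&gt;DedupSketch&lt;/code&gt; and &lt;code&gt;processOnce&lt;/code&gt; are hypothetical names, requests are reduced to their IDs, and the seen-set lives in memory only. A real deployment would need to keep that set in durable storage, otherwise it is lost in the same crash that causes the redelivery.&lt;/p&gt;

```java
import java.util.concurrent.ConcurrentHashMap;

final class DedupSketch {
    // Walks the delivered request IDs in order and counts how many are
    // actually processed; a repeated ID is treated as a redelivery and skipped.
    static int processOnce(String... deliveredRequestIds) {
        var seen = ConcurrentHashMap.newKeySet();  // in-memory only (see note above)
        int processed = 0;
        for (String requestId : deliveredRequestIds) {
            if (seen.add(requestId)) {             // add returns false for a duplicate
                processed++;                       // the real work would run here
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // req-2 is delivered twice, as after a crash between work and commit.
        System.out.println(processOnce("req-1", "req-2", "req-2", "req-3"));  // 3
    }
}
```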

&lt;h2&gt;
  
  
  When to use durable transport
&lt;/h2&gt;

&lt;p&gt;The decision is straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Development and testing:&lt;/strong&gt; in-process transport. Zero setup, fast, deterministic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single-node production:&lt;/strong&gt; in-process transport with periodic state persistence. Simple, no external dependencies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-node production:&lt;/strong&gt; Kafka transport. Durability, horizontal scaling, replay capability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Edge or embedded:&lt;/strong&gt; in-process transport. No infrastructure dependency.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The transport SPI lets you make this decision per-deployment without changing application code.&lt;/p&gt;




&lt;p&gt;The Kafka transport module is part of &lt;a href="https://github.com/AgentEnsemble/agentensemble" rel="noopener noreferrer"&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href="https://agentensemble.net/guides/durable-transport/" rel="noopener noreferrer"&gt;durable transport guide&lt;/a&gt; covers the full configuration and operational details.&lt;/p&gt;

&lt;p&gt;I'd be interested in whether others are using Kafka (or similar) for agent-to-agent communication, and what delivery guarantee level they find sufficient in practice.&lt;/p&gt;

</description>
      <category>java</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
