<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Moses Man</title>
    <description>The latest articles on DEV Community by Moses Man (@mosesman831).</description>
    <link>https://dev.to/mosesman831</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1385970%2Ff08940ab-8bb1-44a6-be2c-5db1a8d75e76.jpeg</url>
      <title>DEV Community: Moses Man</title>
      <link>https://dev.to/mosesman831</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mosesman831"/>
    <language>en</language>
    <item>
      <title>I built multi-model orchestration as a Hermes skill</title>
      <dc:creator>Moses Man</dc:creator>
      <pubDate>Sat, 30 May 2026 00:26:56 +0000</pubDate>
      <link>https://dev.to/mosesman831/i-built-multi-model-orchestration-as-a-hermes-skill-3j6h</link>
      <guid>https://dev.to/mosesman831/i-built-multi-model-orchestration-as-a-hermes-skill-3j6h</guid>
      <description>&lt;p&gt;Most AI pipelines use one model for everything.&lt;/p&gt;

&lt;p&gt;One model plans the task, does the research, writes the answer, and checks its own work. That's like hiring one person to be your strategist, researcher, analyst, and auditor simultaneously. It doesn't scale - and more importantly, the model has no one to disagree with it.&lt;/p&gt;

&lt;p&gt;PolyBrain is my answer to that. It's an open-source multi-agent, multi-model orchestration skill for Hermes Agent. You give it an objective, it breaks the work into roles, runs them in parallel where it can, synthesizes the outputs, and verifies every claim against its cited sources before it reaches you.&lt;/p&gt;

&lt;p&gt;Here's what that looks like end to end.&lt;/p&gt;




&lt;h2&gt;
  
  
  The pipeline
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Objective -&amp;gt; Orchestrator -&amp;gt; [Researcher 1 + Researcher 2 + Builder] -&amp;gt; Synthesizer -&amp;gt; Verifier -&amp;gt; Final Answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Five distinct roles. Each one does exactly one thing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Orchestrator&lt;/strong&gt; - reads the objective, decomposes it into a JSON task plan, assigns roles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Researcher&lt;/strong&gt; - web search and citations - no uncited claims allowed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Builder&lt;/strong&gt; - code, terminal, and file operations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Synthesizer&lt;/strong&gt; - merges all outputs into a single coherent deliverable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verifier&lt;/strong&gt; - checks every claim against its source, returns PASS/FAIL per claim&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The parallel phase is where the time savings come from. Researcher 1 and Researcher 2 run at the same time, alongside the Builder when the plan includes one. The Synthesizer only fires once all of them are done. The Verifier runs last.&lt;/p&gt;
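
&lt;p&gt;To make that ordering concrete, here is a minimal sketch in Python using &lt;code&gt;asyncio&lt;/code&gt;. It is not PolyBrain's actual code: &lt;code&gt;run_role&lt;/code&gt; and &lt;code&gt;run_pipeline&lt;/code&gt; are hypothetical names, and the model call is replaced by a short sleep. It only illustrates the shape of the flow, with the research fan-out followed by two sequential steps.&lt;/p&gt;

```python
import asyncio


# Hypothetical stand-in for a role call; a real run would invoke a model.
async def run_role(role: str, task: str, delay: float = 0.01) -> str:
    await asyncio.sleep(delay)  # simulates model latency
    return f"{role}: {task}"


async def run_pipeline(objective: str) -> dict:
    # Parallel phase: both researchers start together.
    research = await asyncio.gather(
        run_role("researcher-1", f"revenue trends for {objective}"),
        run_role("researcher-2", f"competitors for {objective}"),
    )
    # Sequential phase: synthesize only after all research is in, then verify.
    synthesis = await run_role("synthesizer", " | ".join(research))
    verdict = await run_role("verifier", synthesis)
    return {"research": list(research), "synthesis": synthesis, "verdict": verdict}


result = asyncio.run(run_pipeline("Apple"))
```

&lt;p&gt;Because &lt;code&gt;asyncio.gather&lt;/code&gt; waits for every awaitable it is given, the synthesizer step cannot start early, which is the same guarantee described above.&lt;/p&gt;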




&lt;h2&gt;
  
  
  What makes it different
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Different models per role
&lt;/h3&gt;

&lt;p&gt;This is the part most orchestration frameworks don't do.&lt;/p&gt;

&lt;p&gt;In &lt;code&gt;config.yaml&lt;/code&gt; you assign a different model and provider to each role. Researcher gets a cheaper, faster model. Verifier gets a stronger one. Orchestrator gets whatever you trust most for structured JSON output.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;models&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;orchestrator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-model"&lt;/span&gt;
  &lt;span class="na"&gt;researcher&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-model"&lt;/span&gt;
  &lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-model"&lt;/span&gt;
  &lt;span class="na"&gt;synthesizer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-model"&lt;/span&gt;
  &lt;span class="na"&gt;verifier&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-model"&lt;/span&gt;
  &lt;span class="na"&gt;fallback&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-model"&lt;/span&gt;

&lt;span class="na"&gt;settings&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;max_parallel&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;timeout_sec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;300&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You decide what goes where. PolyBrain doesn't prescribe it - because the right answer depends on what you're running, what you're paying, and what you actually trust.&lt;/p&gt;
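
&lt;p&gt;As an illustration of how a per-role lookup might work, the sketch below resolves a model for a role from a dict shaped like the config above. The function name &lt;code&gt;model_for&lt;/code&gt; is hypothetical, and so is the behaviour of &lt;code&gt;fallback&lt;/code&gt;: this assumes it is used when a role has no model of its own, which may not match what PolyBrain does.&lt;/p&gt;

```python
# Mirrors the shape of the config above; all values are placeholders.
CONFIG = {
    "models": {
        "orchestrator": "your-model",
        "researcher": "your-model",
        "verifier": "your-model",
        "fallback": "your-fallback-model",
    },
    "settings": {"max_parallel": 3, "timeout_sec": 300},
}


def model_for(role: str, config: dict) -> str:
    """Return the model for a role, or the fallback when the role is unset.

    Assumption: 'fallback' covers roles without an explicit entry.
    """
    models = config.get("models", {})
    chosen = models.get(role) or models.get("fallback")
    if not chosen:
        raise KeyError(f"no model configured for role {role!r} and no fallback")
    return chosen


verifier_model = model_for("verifier", CONFIG)  # explicit entry
builder_model = model_for("builder", CONFIG)    # not listed, so fallback
```

&lt;p&gt;Failing loudly when neither a role entry nor a fallback exists is a design choice for the sketch; it surfaces a misconfigured role before any model is called.&lt;/p&gt;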

&lt;h3&gt;
  
  
  Citation enforcement
&lt;/h3&gt;

&lt;p&gt;Researchers are required to include URLs. Uncited claims don't make it to the Synthesizer - they're dropped at the source. This isn't a soft suggestion in the prompt. It's structural: the Verifier checks each surviving claim against its cited source and returns a verdict.&lt;/p&gt;

&lt;p&gt;In the example run below, the Verifier caught a real data discrepancy in Azure revenue figures. That's the point - you want something that pushes back.&lt;/p&gt;
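
&lt;p&gt;The "dropped at the source" rule can be pictured as a plain filter over researcher output. The sketch below is an assumption about how such a filter could look, not PolyBrain's implementation: &lt;code&gt;split_by_citation&lt;/code&gt; is a hypothetical helper, a claim counts as cited if it contains any http(s) URL, and the example URL is a placeholder.&lt;/p&gt;

```python
import re

# Deliberately simple: any http(s) URL in the claim counts as a citation.
URL_PATTERN = re.compile(r"https?://\S+")


def split_by_citation(claims: list[str]) -> tuple[list[str], list[str]]:
    """Separate claims that include at least one URL from those that do not."""
    cited, dropped = [], []
    for claim in claims:
        (cited if URL_PATTERN.search(claim) else dropped).append(claim)
    return cited, dropped


claims = [
    "Revenue grew year over year (https://example.com/filing)",  # placeholder URL
    "The company is widely admired",
]
cited, dropped = split_by_citation(claims)
```

&lt;p&gt;A filter like this only checks that a URL is present. Whether the source actually supports the claim is a separate question, which is the job the Verifier role takes on.&lt;/p&gt;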

&lt;h3&gt;
  
  
  Artifact logging
&lt;/h3&gt;

&lt;p&gt;Every run saves a timestamped folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;.hermes/plans/polybrain/20260528_191548/
├── orchestrator.json
├── task_t1.md
├── task_t2.md
├── synthesis.md
└── verification.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can audit exactly what each role produced and why the final answer looks the way it does.&lt;/p&gt;
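
&lt;p&gt;For readers who want a similar layout in their own tooling, here is a small sketch that creates a folder named in the same &lt;code&gt;YYYYMMDD_HHMMSS&lt;/code&gt; pattern and writes one artifact into it. The helper names &lt;code&gt;create_run_dir&lt;/code&gt; and &lt;code&gt;save_artifact&lt;/code&gt; are hypothetical, and the sketch writes under a temporary directory rather than a real &lt;code&gt;.hermes&lt;/code&gt; folder.&lt;/p&gt;

```python
import tempfile
from datetime import datetime
from pathlib import Path


def create_run_dir(base: Path, now: datetime) -> Path:
    """Create base/YYYYMMDD_HHMMSS and return it."""
    run_dir = base / now.strftime("%Y%m%d_%H%M%S")
    # exist_ok=False so two runs never silently share a folder.
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir


def save_artifact(run_dir: Path, name: str, content: str) -> Path:
    """Write one role's output to a file inside the run folder."""
    path = run_dir / name
    path.write_text(content, encoding="utf-8")
    return path


# A temporary base keeps the sketch from touching a real Hermes install.
base = Path(tempfile.mkdtemp()) / ".hermes" / "plans" / "polybrain"
run_dir = create_run_dir(base, datetime(2026, 5, 28, 19, 15, 48))
save_artifact(run_dir, "synthesis.md", "# Brief\n")
```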




&lt;h2&gt;
  
  
  A real example
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Objective:&lt;/strong&gt; "Create a market brief on Apple with three bullets on revenue trends and two competitors."&lt;/p&gt;

&lt;p&gt;The Orchestrator decomposes this into four tasks - two parallel researchers, a synthesizer, a verifier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;t1 (Researcher)&lt;/strong&gt; - revenue trends - pulls from SEC filings and Apple Newsroom. Returns three years of top-line figures ($383.3B -&amp;gt; $391.0B -&amp;gt; $416.2B) with a source URL for each.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;t2 (Researcher)&lt;/strong&gt; - competitor profiles - Samsung (hardware/smartphones) and Microsoft (cloud/AI). Revenue context and competitive positioning, all cited.&lt;/p&gt;

&lt;p&gt;Both run at the same time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;t3 (Synthesizer)&lt;/strong&gt; - merges both outputs into a clean brief. Preserves inline citations. Drops anything uncited.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;t4 (Verifier)&lt;/strong&gt; - checks every claim against its source. Flags a mismatch in Microsoft's Azure cloud revenue figure and provides the corrected bullet with evidence.&lt;/p&gt;

&lt;p&gt;Total runtime: ~4 minutes. Parallel research phase: ~2 minutes.&lt;/p&gt;
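
&lt;p&gt;The four tasks in this run form a small dependency graph, and the run order follows from it. The sketch below shows one way to derive that order from a JSON plan. The plan shape here (&lt;code&gt;id&lt;/code&gt;, &lt;code&gt;role&lt;/code&gt;, &lt;code&gt;depends_on&lt;/code&gt;) is an assumption made for illustration; the real &lt;code&gt;orchestrator.json&lt;/code&gt; schema may differ, and &lt;code&gt;execution_waves&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
import json

# Hypothetical plan shape; the real orchestrator.json schema may differ.
PLAN_JSON = """
{"tasks": [
  {"id": "t1", "role": "researcher", "depends_on": []},
  {"id": "t2", "role": "researcher", "depends_on": []},
  {"id": "t3", "role": "synthesizer", "depends_on": ["t1", "t2"]},
  {"id": "t4", "role": "verifier", "depends_on": ["t3"]}
]}
"""


def execution_waves(plan: dict) -> list[list[str]]:
    """Group task ids into waves; tasks in one wave can run in parallel."""
    pending = {t["id"]: set(t["depends_on"]) for t in plan["tasks"]}
    done: set[str] = set()
    waves = []
    while pending:
        # A task is ready once all of its dependencies have finished.
        ready = sorted(tid for tid, deps in pending.items() if deps.issubset(done))
        if not ready:
            raise ValueError("cycle or missing dependency in plan")
        waves.append(ready)
        done.update(ready)
        for tid in ready:
            del pending[tid]
    return waves


waves = execution_waves(json.loads(PLAN_JSON))
```

&lt;p&gt;For this plan the first wave holds t1 and t2, which is why the two researchers can overlap while t3 and t4 each wait their turn.&lt;/p&gt;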




&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone into your Hermes skills folder&lt;/span&gt;
git clone &lt;span class="nt"&gt;--depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 https://github.com/mosesman831/PolyBrain.git /tmp/polybrain
&lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; /tmp/polybrain ~/.hermes/skills/research/polybrain

&lt;span class="c"&gt;# Edit config.yaml with your model aliases&lt;/span&gt;
&lt;span class="c"&gt;# Then validate it&lt;/span&gt;
python ~/.hermes/skills/research/polybrain/scripts/validate_config.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then just tell Hermes what you want:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use PolyBrain to research Apple's latest earnings and competitors
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hermes loads the skill. PolyBrain handles the rest.&lt;/p&gt;




&lt;h2&gt;
  
  
  What it doesn't do (yet)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;No persistent state across runs - if it crashes mid-run, you restart from scratch. Hermes Kanban handles durable state natively but PolyBrain doesn't plug into it yet.&lt;/li&gt;
&lt;li&gt;Some models hang in subagent calls - test with &lt;code&gt;hermes chat -q "ping" -m your-model&lt;/code&gt; before committing a model to a role.&lt;/li&gt;
&lt;li&gt;The Verifier can occasionally truncate numbers - PASS/FAIL verdicts are structurally correct, but some models strip leading digits from dollar amounts in the report text.&lt;/li&gt;
&lt;/ul&gt;
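
&lt;p&gt;The hang check in the list above can be scripted so a stuck model fails fast instead of blocking a terminal. This is a sketch under assumptions: &lt;code&gt;responds_in_time&lt;/code&gt; is a hypothetical helper, and the demo runs stand-in Python commands so it works without Hermes installed. The &lt;code&gt;hermes&lt;/code&gt; command is only built as an argument list, not executed.&lt;/p&gt;

```python
import subprocess
import sys


def responds_in_time(command: list[str], timeout_sec: float) -> bool:
    """Run a command and report whether it exits cleanly before the timeout."""
    try:
        completed = subprocess.run(command, capture_output=True, timeout=timeout_sec)
    except subprocess.TimeoutExpired:
        return False  # the child is killed by subprocess.run on timeout
    return completed.returncode == 0


# The command from the list above, split into arguments (not executed here).
hermes_probe = ["hermes", "chat", "-q", "ping", "-m", "your-model"]

# Stand-in commands so the sketch runs without Hermes installed.
quick = responds_in_time([sys.executable, "-c", "print('pong')"], timeout_sec=30)
hung = responds_in_time(
    [sys.executable, "-c", "import time; time.sleep(5)"], timeout_sec=0.5
)
```

&lt;p&gt;Passing &lt;code&gt;hermes_probe&lt;/code&gt; to the same function, with a timeout you are comfortable with, would apply the check to a real model alias.&lt;/p&gt;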




&lt;h2&gt;
  
  
  Why I built it this way
&lt;/h2&gt;

&lt;p&gt;Single-model pipelines have a ceiling. The model can't critique itself meaningfully. There's no disagreement, no verification layer, no separation between "the thing that did the research" and "the thing that checks the research."&lt;/p&gt;

&lt;p&gt;PolyBrain is built around the idea that different roles benefit from different models - and that the value of an orchestration layer is precisely that it enforces structure the models themselves wouldn't maintain.&lt;/p&gt;

&lt;p&gt;It's a Hermes skill, it's config-driven, it's open source. If you're running Hermes Agent and want to try it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/mosesman831/PolyBrain" rel="noopener noreferrer"&gt;github.com/mosesman831/PolyBrain&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Feedback welcome - especially if you find models that work well for specific roles.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was written with the help of AI, based on my own docs, config, and terminal output.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Made with ❤️ by LatticeAG&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agentskills</category>
      <category>agents</category>
      <category>hermes</category>
    </item>
  </channel>
</rss>
