I built multi-model orchestration as a Hermes skill - PolyBrain

#ai #agentskills #agents #hermes

Most AI pipelines use one model for everything.

One model plans the task, does the research, writes the answer, and checks its own work. That's like hiring one person to be your strategist, researcher, analyst, and auditor simultaneously. It doesn't scale - and more importantly, the model has no one to disagree with it.

PolyBrain is my answer to that. It's an open-source multi-agent, multi-model orchestration skill for Hermes Agent. You give it an objective, it breaks the work into roles, runs them in parallel where it can, synthesizes the outputs, and verifies every claim against its cited sources before it reaches you.

Here's what that looks like end to end.

The pipeline

Objective -> Orchestrator -> [Researcher 1 + Researcher 2 + Builder] -> Synthesizer -> Verifier -> Final Answer

Five distinct roles. Each one does exactly one thing:

Orchestrator - reads the objective, decomposes it into a JSON task plan, assigns roles
Researcher - web search and citations - no uncited claims allowed
Builder - code, terminal, and file operations
Synthesizer - merges all outputs into a single coherent deliverable
Verifier - checks every claim against its source, returns PASS/FAIL per claim

The parallel phase is where the time savings come from. Researcher 1 and Researcher 2 run at the same time, alongside the Builder when the plan includes one. The Synthesizer only fires once all of them are done. The Verifier runs last.
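The ordering described above can be sketched with asyncio. This is a minimal illustration, not PolyBrain's actual code: `run_role` is a hypothetical stub that sleeps instead of calling a model, and `MAX_PARALLEL` is an assumed concurrency cap.

```python
import asyncio

MAX_PARALLEL = 3  # assumed concurrency cap for the parallel phase


async def run_role(name: str, seconds: float, sem: asyncio.Semaphore) -> str:
    """Hypothetical stand-in for a subagent call; sleeps instead of calling a model."""
    async with sem:
        await asyncio.sleep(seconds)
        return f"{name}: done"


async def run_pipeline() -> list[str]:
    sem = asyncio.Semaphore(MAX_PARALLEL)
    # Parallel phase: both researchers start together and are awaited as a group.
    research = await asyncio.gather(
        run_role("researcher-1", 0.02, sem),
        run_role("researcher-2", 0.02, sem),
    )
    # Sequential phase: each step waits for the previous one to finish.
    synthesis = await run_role("synthesizer", 0.01, sem)
    verdict = await run_role("verifier", 0.01, sem)
    return [*research, synthesis, verdict]


if __name__ == "__main__":
    for line in asyncio.run(run_pipeline()):
        print(line)
```

With real model calls, the parallel phase costs roughly as much as the slowest researcher rather than the sum of both.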

What makes it different

Different models per role

This is the part most orchestration frameworks don't do.

In config.yaml you assign a different model and provider to each role. For example, the Researcher can get a cheaper, faster model, the Verifier a stronger one, and the Orchestrator whatever you trust most for structured JSON output.

models:
  orchestrator: "your-model"
  researcher: "your-model"
  builder: "your-model"
  synthesizer: "your-model"
  verifier: "your-model"
  fallback: "your-model"

settings:
  max_parallel: 3
  timeout_sec: 300

You decide what goes where. PolyBrain doesn't prescribe it - because the right answer depends on what you're running, what you're paying, and what you actually trust.
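One way to read a config like the one above is a per-role lookup with a fallback. The sketch below assumes the YAML has already been parsed into a dict, and it assumes `fallback` is used when a role has no model of its own; that fallback behavior is my guess at the intent, and `resolve_model` is a hypothetical helper, not part of PolyBrain.

```python
ROLES = ("orchestrator", "researcher", "builder", "synthesizer", "verifier")

# A trimmed version of the config above, as a parsed dict.
# builder and synthesizer are left out on purpose to show the fallback path.
config = {
    "models": {
        "orchestrator": "your-model",
        "researcher": "your-model",
        "verifier": "your-model",
        "fallback": "your-fallback-model",
    },
    "settings": {"max_parallel": 3, "timeout_sec": 300},
}


def resolve_model(cfg: dict, role: str) -> str:
    """Return the model for a role, falling back to models.fallback (assumed behavior)."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    models = cfg.get("models", {})
    model = models.get(role) or models.get("fallback")
    if not model:
        raise ValueError(f"no model configured for role '{role}' and no fallback set")
    return model


for role in ROLES:
    print(f"{role:12} -> {resolve_model(config, role)}")
```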

Citation enforcement

Researchers are required to include URLs. Uncited claims don't make it to the Synthesizer - they're dropped at the source. This isn't a soft suggestion in the prompt. It's structural: the Verifier checks each surviving claim against its cited source and returns a verdict.

In the example run below, the Verifier caught a real data discrepancy in Azure revenue figures. That's the point - you want something that pushes back.
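The two-step idea (drop uncited claims, then check what survives) can be sketched as follows. Everything here is illustrative: the `Claim` type, the helper names, the example.com URL, and the dollar figures are placeholders, and the check is deliberately narrow, comparing only dollar figures against a stubbed source text.

```python
import re
from dataclasses import dataclass
from typing import Optional

URL_RE = re.compile(r"https?://\S+")
FIGURE_RE = re.compile(r"\$\d+(?:\.\d+)?[BM]?")


@dataclass
class Claim:
    text: str
    source_url: Optional[str] = None


def drop_uncited(claims: list) -> list:
    """Keep only claims that carry an http(s) source URL."""
    return [c for c in claims if c.source_url and URL_RE.fullmatch(c.source_url)]


def verify(claim: Claim, sources: dict) -> str:
    """PASS if every dollar figure in the claim appears verbatim in the source text."""
    page = sources.get(claim.source_url, "")
    figures = FIGURE_RE.findall(claim.text)
    ok = bool(figures) and all(fig in page for fig in figures)
    return "PASS" if ok else "FAIL"


# Placeholder data: the URL and numbers are made up for the sketch.
sources = {"https://example.com/report": "Cloud revenue was $10.0B for the year."}
claims = [
    Claim("Cloud revenue was $10.0B.", "https://example.com/report"),
    Claim("Cloud revenue was $12.0B.", "https://example.com/report"),
    Claim("Cloud revenue doubled."),  # no URL, dropped before verification
]

for claim in drop_uncited(claims):
    print(verify(claim, sources), "-", claim.text)
```

In a real run the source text would come from fetching the cited URL, and the comparison would need to handle rounding and units rather than exact string matches.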

Artifact logging

Every run saves a timestamped folder:

.hermes/plans/polybrain/20260528_191548/
├── orchestrator.json
├── task_t1.md
├── task_t2.md
├── synthesis.md
└── verification.md

You can audit exactly what each role produced and why the final answer looks the way it does.
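A folder name like the one above is a plain `YYYYMMDD_HHMMSS` timestamp. The sketch below shows one way to produce that layout; `make_run_dir` and `save_artifact` are hypothetical helpers, not PolyBrain's own functions.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def make_run_dir(root: Path, now: datetime) -> Path:
    """Create root/YYYYMMDD_HHMMSS, matching the folder name format shown above."""
    run_dir = root / now.strftime("%Y%m%d_%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=False)  # refuse to overwrite an earlier run
    return run_dir


def save_artifact(run_dir: Path, name: str, content: str) -> Path:
    """Write one role's output into the run folder and return its path."""
    path = run_dir / name
    path.write_text(content, encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / ".hermes" / "plans" / "polybrain"
        run_dir = make_run_dir(root, datetime(2026, 5, 28, 19, 15, 48))
        save_artifact(run_dir, "synthesis.md", "# Brief\n")
        print(run_dir.name)  # 20260528_191548
```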

A real example

Objective: "Create a market brief on Apple with three bullets on revenue trends and two competitors."

The Orchestrator decomposes this into four tasks - two parallel researchers, a synthesizer, a verifier.

t1 (Researcher) - revenue trends - pulls from SEC filings and Apple Newsroom. Returns three years of top-line figures ($383.3B -> $391.0B -> $416.2B) with a source URL for each.

t2 (Researcher) - competitor profiles - Samsung (hardware/smartphones) and Microsoft (cloud/AI). Revenue context and competitive positioning, all cited.

Both run at the same time.

t3 (Synthesizer) - merges both outputs into a clean brief. Preserves inline citations. Drops anything uncited.

t4 (Verifier) - checks every claim against its source. Flags a mismatch in a competitor cloud revenue figure, provides the corrected bullet with evidence.

Total runtime: ~4 minutes. Parallel research phase: ~2 minutes.
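The four-task plan from this run could be represented roughly like this. The JSON schema (`id`, `role`, `depends_on`) is an assumption for illustration, not the actual format of orchestrator.json; the sketch groups tasks into waves so that everything in one wave can run in parallel.

```python
import json

# Hypothetical plan shape; the real orchestrator.json may differ.
PLAN_JSON = """
{"tasks": [
  {"id": "t1", "role": "researcher",  "depends_on": []},
  {"id": "t2", "role": "researcher",  "depends_on": []},
  {"id": "t3", "role": "synthesizer", "depends_on": ["t1", "t2"]},
  {"id": "t4", "role": "verifier",    "depends_on": ["t3"]}
]}
"""


def execution_waves(plan: dict) -> list[list[str]]:
    """Group task ids into waves; a task is ready once all its dependencies are done."""
    pending = {t["id"]: set(t["depends_on"]) for t in plan["tasks"]}
    done = set()
    waves = []
    while pending:
        ready = sorted(tid for tid, deps in pending.items() if deps <= done)
        if not ready:
            raise ValueError("cycle or unknown dependency in plan")
        waves.append(ready)
        done.update(ready)
        for tid in ready:
            del pending[tid]
    return waves


print(execution_waves(json.loads(PLAN_JSON)))
# [['t1', 't2'], ['t3'], ['t4']]
```

The first wave is the parallel research phase; the synthesizer and verifier each form a wave of their own.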

Getting started

# Clone the repo, then copy it into your Hermes skills folder
git clone --depth=1 https://github.com/mosesman831/PolyBrain.git /tmp/polybrain
mkdir -p ~/.hermes/skills/research
cp -r /tmp/polybrain ~/.hermes/skills/research/polybrain

# Edit config.yaml with your model aliases
# Then validate it
python ~/.hermes/skills/research/polybrain/scripts/validate_config.py

Then just tell Hermes what you want:

Use PolyBrain to research Apple's latest earnings and competitors

Hermes loads the skill. PolyBrain handles the rest.

What it doesn't do (yet)

No persistent state across runs - if it crashes mid-run, you restart from scratch. Hermes Kanban handles durable state natively, but PolyBrain doesn't plug into it yet.
Some models hang in subagent calls - test with hermes chat -q "ping" -m your-model before committing a model to a role.
Verifier can occasionally truncate numbers - PASS/FAIL verdicts are structurally correct but some models strip leading digits from dollar amounts in the report text.
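The hang check mentioned above can be wrapped in a timeout so a stuck model doesn't block your terminal. This is a rough sketch: `responds_in_time` is a hypothetical helper, the 60-second limit is an arbitrary choice, and the only Hermes command used is the ping from the list above.

```python
import subprocess
import sys


def responds_in_time(cmd: list[str], timeout_sec: float) -> bool:
    """Run cmd and report whether it exited cleanly before the timeout."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_sec)
    except subprocess.TimeoutExpired:
        return False  # the process hung and was killed by subprocess.run
    except FileNotFoundError:
        return False  # the executable is not installed or not on PATH
    return result.returncode == 0


if __name__ == "__main__":
    model = "your-model"  # placeholder alias, as in config.yaml
    ok = responds_in_time(["hermes", "chat", "-q", "ping", "-m", model], timeout_sec=60)
    print(f"{model}: {'ok' if ok else 'hung or failed'}")
```

Running this once per candidate model before editing config.yaml gives a quick list of which ones are safe to assign to a role.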

Why I built it this way

Single-model pipelines have a ceiling. The model can't critique itself meaningfully. There's no disagreement, no verification layer, no separation between "the thing that did the research" and "the thing that checks the research."

PolyBrain is built around the idea that different roles benefit from different models - and that the value of an orchestration layer is precisely that it enforces structure the models themselves wouldn't maintain.

It's a Hermes skill, it's config-driven, it's open source. If you're running Hermes Agent and want to try it:

GitHub: github.com/mosesman831/PolyBrain

Feedback welcome - especially if you find models that work well for specific roles.

This article was written with the help of AI, based on my own docs, config, and terminal output.

Made with ❤️ by LatticeAG