Most AI pipelines use one model for everything.
One model plans the task, does the research, writes the answer, and checks its own work. That's like hiring one person to be your strategist, researcher, analyst, and auditor simultaneously. It doesn't scale - and more importantly, the model has no one to disagree with it.
PolyBrain is my answer to that. It's an open-source multi-agent, multi-model orchestration skill for Hermes Agent. You give it an objective, it breaks the work into roles, runs them in parallel where it can, synthesizes the outputs, and verifies every claim against its cited sources before it reaches you.
Here's what that looks like end to end.
The pipeline
Objective -> Orchestrator -> [Researcher 1 + Researcher 2 + Builder] -> Synthesizer -> Verifier -> Final Answer
Five distinct roles. Each one does exactly one thing:
- Orchestrator - reads the objective, decomposes it into a JSON task plan, assigns roles
- Researcher - web search and citations - no uncited claims allowed
- Builder - code, terminal, and file operations
- Synthesizer - merges all outputs into a single coherent deliverable
- Verifier - checks every claim against its source, returns PASS/FAIL per claim
The parallel phase is where the time savings come from. Researcher 1 and Researcher 2 run at the same time, alongside the Builder when the plan includes one. The Synthesizer only fires once all of them are done. The Verifier runs last.
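A minimal sketch of that ordering, assuming each role is an async call. The run_role function is a hypothetical placeholder for a model invocation, not PolyBrain's actual API; the point is only that the research calls are awaited as a group before the sequential steps start.

```python
import asyncio


async def run_role(role: str, task: str) -> str:
    """Hypothetical placeholder for a model call made on behalf of a role."""
    await asyncio.sleep(0.01)  # simulate latency
    return f"[{role}] {task}"


async def run_pipeline(objective: str) -> dict:
    # Parallel phase: both researchers start together and are awaited as a group.
    research = await asyncio.gather(
        run_role("researcher", f"revenue trends for: {objective}"),
        run_role("researcher", f"competitors for: {objective}"),
    )
    # Sequential phase: the synthesizer needs every research output,
    # and the verifier needs the synthesized draft.
    draft = await run_role("synthesizer", " | ".join(research))
    report = await run_role("verifier", draft)
    return {"research": list(research), "draft": draft, "report": report}


result = asyncio.run(run_pipeline("Apple market brief"))
```

With real model calls, the wall-clock cost of the research phase is roughly that of the slowest researcher rather than the sum of both.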
What makes it different
Different models per role
This is the part most orchestration frameworks don't do.
In config.yaml you assign a different model and provider to each role. Researcher gets a cheaper, faster model. Verifier gets a stronger one. Orchestrator gets whatever you trust most for structured JSON output.
models:
  orchestrator: "your-model"
  researcher: "your-model"
  builder: "your-model"
  synthesizer: "your-model"
  verifier: "your-model"
  fallback: "your-model"
settings:
  max_parallel: 3
  timeout_sec: 300
You decide what goes where. PolyBrain doesn't prescribe it - because the right answer depends on what you're running, what you're paying, and what you actually trust.
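To illustrate how a per-role mapping like this could be consumed, here is a rough sketch that resolves a model for each role. The article does not define what the fallback key does, so the behavior here (use fallback when a role has no model set) is an assumption, and resolve_model is a hypothetical helper rather than part of PolyBrain. The config is written as a Python dict so the example runs without a YAML parser.

```python
ROLES = ("orchestrator", "researcher", "builder", "synthesizer", "verifier")

# Mirrors the shape of config.yaml; all values are placeholders.
config = {
    "models": {
        "orchestrator": "your-model",
        "researcher": "your-model",
        "verifier": "your-model",
        "fallback": "your-fallback-model",
    },
    "settings": {"max_parallel": 3, "timeout_sec": 300},
}


def resolve_model(role: str, cfg: dict) -> str:
    """Return the model for a role, using `fallback` when the role is unset.

    Assumption: this is one plausible reading of the `fallback` key.
    """
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    models = cfg.get("models", {})
    model = models.get(role) or models.get("fallback")
    if not model:
        raise ValueError(f"no model configured for {role} and no fallback set")
    return model


assignments = {role: resolve_model(role, config) for role in ROLES}
```

Here builder and synthesizer are left out of the mapping on purpose, so they resolve to the fallback entry.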
Citation enforcement
Researchers are required to include URLs. Uncited claims don't make it to the Synthesizer - they're dropped at the source. This isn't a soft suggestion in the prompt. It's structural: the Verifier checks each surviving claim against its cited source and returns a verdict.
In the example run below, the Verifier caught a real data discrepancy in Azure revenue figures. That's the point - you want something that pushes back.
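The "dropped at the source" step can be pictured as a plain filter that runs before anything reaches the Synthesizer. The sketch below is a simplified stand-in, not PolyBrain's code: it assumes claims arrive as strings and treats any http(s) URL in the text as a citation. The function name and sample claims are made up for illustration.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")


def split_by_citation(claims: list[str]) -> tuple[list[str], list[str]]:
    """Separate claims that contain at least one URL from those that do not."""
    cited, dropped = [], []
    for claim in claims:
        (cited if URL_PATTERN.search(claim) else dropped).append(claim)
    return cited, dropped


claims = [
    "Revenue grew year over year (https://example.com/filing).",
    "Analysts expect continued growth.",
]
cited, dropped = split_by_citation(claims)
```

Only the entries in cited would move on. A URL being present says nothing about whether the source supports the claim, which is the job left to the Verifier.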
Artifact logging
Every run saves a timestamped folder:
.hermes/plans/polybrain/20260528_191548/
├── orchestrator.json
├── task_t1.md
├── task_t2.md
├── synthesis.md
└── verification.md
You can audit exactly what each role produced and why the final answer looks the way it does.
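As a sketch of how a run folder with that naming could be produced, the snippet below builds the timestamp with strftime and writes one artifact. The helper names are hypothetical, and it writes under a temporary directory rather than a real .hermes folder so it is safe to run anywhere.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def create_run_dir(base: Path, now: datetime) -> Path:
    """Create a folder named like 20260528_191548 under `base`."""
    run_dir = base / now.strftime("%Y%m%d_%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir


def save_artifact(run_dir: Path, name: str, content: str) -> Path:
    """Write one role's output into the run folder and return its path."""
    path = run_dir / name
    path.write_text(content, encoding="utf-8")
    return path


# Temporary base directory stands in for the user's home directory.
base = Path(tempfile.mkdtemp()) / ".hermes" / "plans" / "polybrain"
run_dir = create_run_dir(base, datetime(2026, 5, 28, 19, 15, 48))
save_artifact(run_dir, "task_t1.md", "# t1 output\n")
```

Using exist_ok=False means two runs that start in the same second would fail loudly instead of overwriting each other's artifacts.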
A real example
Objective: "Create a market brief on Apple with three bullets on revenue trends and two competitors."
The Orchestrator decomposes this into four tasks - two parallel researchers, a synthesizer, a verifier.
t1 (Researcher) - revenue trends - pulls from SEC filings and Apple Newsroom. Returns three years of top-line figures ($383.3B -> $391.0B -> $416.2B) with a source URL for each.
t2 (Researcher) - competitor profiles - Samsung (hardware/smartphones) and Microsoft (cloud/AI). Revenue context and competitive positioning, all cited.
Both run at the same time.
t3 (Synthesizer) - merges both outputs into a clean brief. Preserves inline citations. Drops anything uncited.
t4 (Verifier) - checks every claim against its source. Flags a mismatch in a competitor cloud revenue figure, provides the corrected bullet with evidence.
Total runtime: ~4 minutes. Parallel research phase: ~2 minutes.
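The four tasks in this example could be written down as a plan with explicit dependencies, which is what lets a scheduler work out which ones may run together. The schema below (id, role, goal, depends_on) is illustrative only and is not PolyBrain's actual orchestrator.json format; execution_batches is a hypothetical helper.

```python
import json

# Hypothetical plan schema; field names are illustrative.
plan_json = """
{
  "tasks": [
    {"id": "t1", "role": "researcher", "goal": "revenue trends", "depends_on": []},
    {"id": "t2", "role": "researcher", "goal": "competitor profiles", "depends_on": []},
    {"id": "t3", "role": "synthesizer", "goal": "merge into brief", "depends_on": ["t1", "t2"]},
    {"id": "t4", "role": "verifier", "goal": "check claims", "depends_on": ["t3"]}
  ]
}
"""


def execution_batches(tasks: list[dict]) -> list[list[str]]:
    """Group task ids into batches; tasks within a batch can run in parallel."""
    remaining = {t["id"]: set(t["depends_on"]) for t in tasks}
    done: set[str] = set()
    batches = []
    while remaining:
        # A task is ready once all of its dependencies have finished.
        ready = sorted(tid for tid, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("cycle or missing dependency in task plan")
        batches.append(ready)
        done.update(ready)
        for tid in ready:
            del remaining[tid]
    return batches


batches = execution_batches(json.loads(plan_json)["tasks"])
print(batches)  # [['t1', 't2'], ['t3'], ['t4']]
```

The first batch is the parallel research phase. The remaining two batches each hold a single task, matching the order Synthesizer then Verifier.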
Getting started
# Clone into your Hermes skills folder
git clone --depth=1 https://github.com/mosesman831/PolyBrain.git /tmp/polybrain
cp -r /tmp/polybrain ~/.hermes/skills/research/polybrain
# Edit config.yaml with your model aliases
# Then validate it
python ~/.hermes/skills/research/polybrain/scripts/validate_config.py
Then just tell Hermes what you want:
Use PolyBrain to research Apple's latest earnings and competitors
Hermes loads the skill. PolyBrain handles the rest.
What it doesn't do (yet)
- No persistent state across runs - if it crashes mid-run, you restart from scratch. Hermes Kanban handles durable state natively but PolyBrain doesn't plug into it yet.
- Some models hang in subagent calls - test with hermes chat -q "ping" -m your-model before committing a model to a role.
- Verifier can occasionally truncate numbers - PASS/FAIL verdicts are structurally correct but some models strip leading digits from dollar amounts in the report text.
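The manual ping check for hanging models can be wrapped in a timeout so that a hang shows up as a clear failure instead of a stuck terminal. This is a sketch under assumptions: it reuses the hermes chat -q "ping" -m invocation quoted in the list, and it assumes the CLI exits with status 0 on success, which the article does not state. The demo calls use the Python interpreter as a stand-in so the snippet runs without Hermes installed.

```python
import subprocess
import sys


def responds_in_time(command: list[str], timeout_sec: float) -> bool:
    """Run a command and report whether it exits cleanly before the timeout."""
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_sec
        )
    except subprocess.TimeoutExpired:
        return False  # treated as a hang
    except FileNotFoundError:
        return False  # CLI not installed or not on PATH
    # Assumption: a zero exit status means the model answered.
    return completed.returncode == 0


def ping_command(model: str) -> list[str]:
    """Build the same invocation as the manual check for a given model alias."""
    return ["hermes", "chat", "-q", "ping", "-m", model]


# Stand-ins that use the Python interpreter so the sketch runs without Hermes.
fast = responds_in_time([sys.executable, "-c", "print('pong')"], timeout_sec=10)
hung = responds_in_time(
    [sys.executable, "-c", "import time; time.sleep(5)"], timeout_sec=0.5
)
```

In practice you would call responds_in_time(ping_command("your-model"), timeout_sec=60) once per model alias before assigning it to a role; the 60-second value is arbitrary.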
Why I built it this way
Single-model pipelines have a ceiling. The model can't critique itself meaningfully. There's no disagreement, no verification layer, no separation between "the thing that did the research" and "the thing that checks the research."
PolyBrain is built around the idea that different roles benefit from different models - and that the value of an orchestration layer is precisely that it enforces structure the models themselves wouldn't maintain.
It's a Hermes skill, it's config-driven, it's open source. If you're running Hermes Agent and want to try it:
GitHub: github.com/mosesman831/PolyBrain
Feedback welcome - especially if you find models that work well for specific roles.
This article was written with the help of AI, based on my own docs, config, and terminal output.
Made with ❤️ by LatticeAG