Prashant Maurya

Hermes Agent's Kanban System Is the Most Underrated Feature in Open Source AI Agents

This is a submission for the Hermes Agent Challenge: Write About Hermes Agent


When people talk about Hermes Agent, they talk about the Skills System and the persistent memory. Those are genuinely impressive. But there's a feature in the v0.12 "Tenacity Release" that I think deserves more attention: the Kanban multi-agent system.

This post is about what it actually does, why it matters, and why most agent frameworks haven't solved the problem it's solving.


The Problem: Agents That Don't Finish

Here's a pattern that anyone who's used AI agents on long tasks will recognize:

You give the agent a complex, multi-step task. It starts well. Somewhere in the middle, something breaks: a tool call fails, a subprocess hangs, the context window fills, or the model gets confused about state. The agent then loops, produces garbage, or just stops. You come back an hour later to find it stuck, or finished with something completely wrong.

This isn't a model intelligence problem. It's a state management and fault tolerance problem. The agent has no durable record of what it's done, what's pending, and what failed. When something goes wrong, there's no recovery path.

Hermes's Kanban system is a direct answer to this.


What the Kanban System Is

The Kanban ships as a durable multi-agent task board — a structured queue of tasks with explicit state transitions, built-in fault tolerance, and automatic recovery.

Tasks on the board have states: todo, in_progress, blocked, done, failed. The board persists across restarts. Agents working on tasks emit heartbeats. If a heartbeat stops, the task is automatically reclaimed and either retried or escalated.
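To make "explicit state transitions" concrete, here is a minimal Python sketch of a task state machine. The five state names come from the paragraph above. The transition table, the `Task` class, and the `move` method are my own assumptions for illustration, not Hermes's actual schema or API.

```python
from dataclasses import dataclass, field

# Hypothetical transition table: the state names come from the post,
# but which moves are legal is an assumption made for this sketch.
ALLOWED = {
    "todo": {"in_progress"},
    "in_progress": {"done", "failed", "blocked"},
    "blocked": {"todo"},
    "failed": {"todo"},
    "done": set(),
}


@dataclass
class Task:
    title: str
    state: str = "todo"
    history: list = field(default_factory=list)

    def move(self, new_state: str) -> None:
        """Apply a transition, rejecting any move the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition: {self.state} -> {new_state}")
        self.history.append((self.state, new_state))
        self.state = new_state


task = Task("write integration tests")
task.move("in_progress")
task.move("done")
```

The point of the table is that a task can never jump from todo straight to done: every change of state is checked and recorded, which is what makes the board auditable.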

The key components:

Heartbeat monitoring — Every active task has a heartbeat timer. If an agent working on a task misses its heartbeat window (it crashed, hung, or the process died), the system detects this automatically.

Zombie detection — A "zombie" is an agent that stopped responding but didn't cleanly exit. The system detects zombie agents and reclaims their tasks rather than leaving them stuck in in_progress forever.
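Heartbeat monitoring and zombie reclaim can be sketched in a few lines. This is an illustration under stated assumptions, not Hermes's implementation: the `Board` class, its method names, the 30-second window, and the choice to release reclaimed tasks back to todo are all hypothetical.

```python
import time

HEARTBEAT_WINDOW = 30.0  # seconds; hypothetical value for this sketch


class Board:
    def __init__(self, clock=time.monotonic):
        self.clock = clock  # injectable so the sketch can be tested without sleeping
        self.tasks = {}     # task_id -> {"state", "agent", "last_beat"}

    def claim(self, task_id, agent):
        self.tasks[task_id] = {
            "state": "in_progress",
            "agent": agent,
            "last_beat": self.clock(),
        }

    def heartbeat(self, task_id):
        self.tasks[task_id]["last_beat"] = self.clock()

    def reclaim_zombies(self):
        """Release in-progress tasks whose agent missed the heartbeat window."""
        now = self.clock()
        reclaimed = []
        for task_id, task in self.tasks.items():
            stale = now - task["last_beat"] > HEARTBEAT_WINDOW
            if task["state"] == "in_progress" and stale:
                task["state"], task["agent"] = "todo", None
                reclaimed.append(task_id)
        return reclaimed


# Drive the board with a fake clock: agent-a keeps beating, agent-b goes silent.
fake_now = [0.0]
board = Board(clock=lambda: fake_now[0])
board.claim("t1", "agent-a")
board.claim("t2", "agent-b")
fake_now[0] = 20.0
board.heartbeat("t1")
fake_now[0] = 45.0
stale = board.reclaim_zombies()
```

In this run only the task held by the silent agent is reclaimed; the other task's last heartbeat is still inside the window.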

Auto-block on incomplete exit — If a task's assigned agent exits without marking the task done or failed, the board automatically moves the task to blocked state. Nothing silently falls through.

Per-task retries — Failed tasks can be configured to automatically retry up to N times before escalating. You set retry policy per task or per board.
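A retry policy of "up to N times before escalating" boils down to a bounded loop. The function below is a generic sketch; `run_with_retries`, its return values, and the `flaky` step are hypothetical names for illustration and do not reflect how Hermes configures retries.

```python
def run_with_retries(step, max_retries):
    """Call step() until it succeeds or the retry budget is spent.

    Returns ("done", result) on success, or ("escalated", last_error)
    once the initial attempt plus max_retries retries have all failed.
    """
    last_error = None
    for _attempt in range(max_retries + 1):
        try:
            return "done", step()
        except Exception as exc:  # a real system would catch narrower errors
            last_error = exc
    return "escalated", last_error


calls = {"n": 0}


def flaky():
    """Hypothetical step that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("tool call failed")
    return "ok"


outcome = run_with_retries(flaky, max_retries=2)
```

With a budget of two retries the third attempt succeeds. With a budget of one, the same step would be escalated instead of silently dropped.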

Hallucination recovery — This one is subtle. When an agent produces output that contradicts its own task log (claims it completed a step it never ran), the board detects the inconsistency and flags it for review rather than silently marking the task done.
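One simple way to picture this check is to compare what the agent claims against what the task log actually recorded. The sketch below assumes a log of step names with a success flag; the log format and the `unverified_claims` helper are hypothetical and only illustrate the idea, not the detection logic Hermes uses.

```python
def unverified_claims(claimed_steps, task_log):
    """Return claimed steps that have no successful entry in the task log."""
    executed = {entry["step"] for entry in task_log if entry["ok"]}
    return [step for step in claimed_steps if step not in executed]


# Hypothetical log: one step ran and passed, one ran and failed.
task_log = [
    {"step": "run_migrations", "ok": True},
    {"step": "run_tests", "ok": False},
]
# The agent's final message claims three completed steps.
claimed = ["run_migrations", "run_tests", "open_pr"]

missing = unverified_claims(claimed, task_log)
flag_for_review = bool(missing)  # do not mark the task done if anything is unverified
```

Here the failed test run and the pull request that was never opened both show up as unverified, so the task would be flagged rather than closed.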


The /goal Command: Staying on Target

Alongside Kanban, the v0.12 release added /goal — what the docs call the "Ralph loop."

/goal Ship the auth module with tests and a PR by end of session

This keeps the agent locked on a target across turns. Instead of interpreting each message independently, the agent evaluates every subsequent action against the declared goal. The intent is to prevent drift: if a sub-task would pull it away from the goal, the agent is supposed to recognize this and get back on track.

Combined with Kanban, this means:

  1. You declare a goal
  2. Hermes decomposes it into a Kanban board of tasks
  3. Subagents pick up tasks and work on them in parallel
  4. Failed tasks get retried; zombie agents get reclaimed; blocked tasks get escalated
  5. The agent tracks progress against the original goal and knows when it's actually done

This is what "the agent finishes what it starts" looks like in practice.


Subagent Delegation: The Parallelism Layer

The Kanban system is most powerful when combined with Hermes's subagent delegation via the delegate_task tool.

A parent agent with a complex task can spawn up to 3 child agents by default (configurable), each with:

  • Isolated context (the subagent knows only what it needs to)
  • Restricted toolsets (it can only use the tools relevant to its task)
  • Its own terminal session (no file-state collisions between agents)

The parent agent coordinates — it doesn't do the work directly. It delegates, monitors progress via the Kanban board, handles escalations, and synthesizes results.

In practice, this looks like:

Parent: "Build a REST API with authentication, tests, and documentation"

→ Subagent 1: Implements the core API endpoints
→ Subagent 2: Writes integration tests
→ Subagent 3: Drafts API documentation

Parent: Monitors all three, handles merge conflicts, synthesizes final output

Without durable state management, parallel subagents are fragile — if one fails, you don't know which one, and recovery is manual. The Kanban board makes parallel execution safe by making task state explicit and recoverable.
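The difference explicit task state makes shows up even in a toy version. The sketch below runs three stand-in "subagents" in a thread pool capped at three workers and records a per-task outcome, so a failure is attributed to a specific task. Everything here (`run_subagent`, `delegate`, the job dictionaries) is hypothetical; it does not call the real delegate_task tool.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_CHILDREN = 3  # mirrors the default cap mentioned above


def run_subagent(name, job):
    """Stand-in for a child agent; a real one would run a model and tools."""
    if job.get("fail"):
        raise RuntimeError(f"{name} crashed")
    return f"{name}: finished {job['task']}"


def delegate(jobs):
    """Run jobs in parallel and record each task's outcome on a board dict."""
    board = {name: {"state": "in_progress", "result": None} for name in jobs}
    with ThreadPoolExecutor(max_workers=MAX_CHILDREN) as pool:
        futures = {
            pool.submit(run_subagent, name, job): name
            for name, job in jobs.items()
        }
        for future in as_completed(futures):
            name = futures[future]
            try:
                board[name] = {"state": "done", "result": future.result()}
            except Exception as exc:
                board[name] = {"state": "failed", "result": str(exc)}
    return board


board = delegate({
    "api": {"task": "core endpoints"},
    "tests": {"task": "integration tests", "fail": True},
    "docs": {"task": "API documentation"},
})
```

After the run, the board shows exactly which task failed and why, while the other two results are preserved. That is the property that makes retrying only the failed piece possible.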


Checkpoints v2: The Safety Net

Running parallel agents doing real work means real risk. A subagent making file changes can go wrong.

Hermes's Checkpoints v2 (also part of the Tenacity Release) handles this. Before any file mutation, the system automatically snapshots the working directory. The checkpoint_manager tracks these snapshots with real pruning — old checkpoints get cleaned up, not accumulated indefinitely.

If something goes wrong:

/rollback

That's it. You're back to before the last file-mutating operation. Combined with the Kanban board's task state, this means a failed multi-agent run doesn't leave you with a partially-mutated codebase in an unknown state.
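For intuition, snapshot-before-mutate with pruning can be sketched using only the standard library. This is not how checkpoint_manager is implemented; the `Checkpoints` class, the retention limit of three, and the full-directory copy strategy are assumptions chosen to keep the example short.

```python
import shutil
import tempfile
from pathlib import Path

MAX_CHECKPOINTS = 3  # hypothetical retention limit


class Checkpoints:
    def __init__(self, workdir, store):
        # store must live outside workdir, or copytree would copy into itself
        self.workdir, self.store = Path(workdir), Path(store)
        self.snapshots = []  # oldest first
        self.counter = 0

    def snapshot(self):
        """Copy the working directory, then prune the oldest snapshots."""
        self.counter += 1
        target = self.store / f"cp-{self.counter}"
        shutil.copytree(self.workdir, target)
        self.snapshots.append(target)
        while len(self.snapshots) > MAX_CHECKPOINTS:
            shutil.rmtree(self.snapshots.pop(0))

    def write(self, relpath, text):
        """Snapshot first, then mutate the file."""
        self.snapshot()
        (self.workdir / relpath).write_text(text)

    def rollback(self):
        """Restore the working directory from the most recent snapshot."""
        latest = self.snapshots.pop()
        shutil.rmtree(self.workdir)
        shutil.copytree(latest, self.workdir)
        shutil.rmtree(latest)


root = Path(tempfile.mkdtemp())
work, store = root / "work", root / "store"
work.mkdir()
store.mkdir()
(work / "auth.py").write_text("v1")

cp = Checkpoints(work, store)
cp.write("auth.py", "v2 (broken)")
cp.rollback()
restored = (work / "auth.py").read_text()
```

Copying the whole directory is the simplest possible strategy and would be slow on a large repository; a production system would more plausibly store diffs or use the version control system's own objects.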


Gateway Auto-Resume: Surviving Restarts

One more piece of the reliability picture: gateway auto-resume.

In previous versions, if the Hermes gateway process restarted (server reboot, OOM kill, network drop), all in-progress agent sessions were lost. You'd have to restart tasks manually.

With the Tenacity Release, the gateway automatically resumes interrupted sessions after restart. The Kanban board state is persisted, in-progress tasks get reclaimed, and the agent picks up roughly where it left off.
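The core idea, persisted state plus a reclaim step on startup, can be shown with SQLite. The table layout, the `open_board` and `resume` helpers, and the decision to send interrupted tasks back to todo are assumptions for this sketch; I have not verified how Hermes stores its board.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_board(path):
    """Open (or create) the on-disk board."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS tasks (id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return db


def resume(db):
    """On startup, release tasks that were in progress when the process died."""
    rows = db.execute(
        "SELECT id FROM tasks WHERE state = 'in_progress' ORDER BY id"
    )
    interrupted = [row[0] for row in rows]
    db.execute("UPDATE tasks SET state = 'todo' WHERE state = 'in_progress'")
    db.commit()
    return interrupted


db_path = Path(tempfile.mkdtemp()) / "board.db"

# First process: record some work, then "crash" with a task still in progress.
db = open_board(db_path)
db.executemany(
    "INSERT INTO tasks VALUES (?, ?)",
    [("t1", "done"), ("t2", "in_progress"), ("t3", "todo")],
)
db.commit()
db.close()

# Second process: reopen the same file and reclaim the interrupted task.
db = open_board(db_path)
reclaimed = resume(db)
states = dict(db.execute("SELECT id, state FROM tasks"))
db.close()
```

Because the board lives on disk, the second process sees the finished task as finished and only re-queues the one that was cut off mid-flight.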

This matters more than it sounds for anyone running Hermes on a VPS or in a container. Process crashes happen. An agent system that survives them gracefully is a different category of tool than one that needs babysitting.


Why This Architecture Is Rare

Most agent frameworks don't have an equivalent answer to durable multi-agent task management. Here's why:

The research community optimizes for single-agent performance. Benchmarks are almost all single-agent: can the agent solve this coding problem, answer this question, complete this task. Multi-agent coordination with fault tolerance is an engineering problem, not a benchmark problem.

Durable state is hard. Most frameworks store task state in memory or simple files. Real durability — heartbeat monitoring, zombie detection, restart recovery — requires more infrastructure investment than most open source projects make.

The failure modes are subtle. An agent that fails loudly is easy to fix. An agent that succeeds incorrectly — marks a task done when it hallucinated the last step — is hard to detect without explicit verification. Most frameworks don't have hallucination recovery in their task management layer.

Hermes is, to my knowledge, the only open source agent framework that ships all of these in a single installable package.


When to Use the Kanban System

The combination of Kanban and subagent delegation is overkill for simple tasks. Use it when:

  • The task takes more than 20–30 minutes to complete
  • The task has multiple independent subtasks that can run in parallel
  • You're running unattended (scheduled cron, overnight batch)
  • The cost of partial completion and unknown state is high (production deployments, large codebases)
  • You need a clear audit trail of what happened

For conversational tasks, quick lookups, or one-off automations, just use regular Hermes chat. The Kanban is for the serious workloads.


Putting It Together

# Start a multi-agent project
/goal Build a complete user authentication module: JWT, refresh tokens, tests, docs

# Hermes decomposes into Kanban tasks, spawns subagents, monitors progress
# You can check status at any point
/kanban status

# If something fails, check what happened
/kanban log

# Roll back if needed
/rollback

The v0.12 "Tenacity Release" included 864 commits, 588 merged PRs, and 282 closed issues (including 13 P0s and 36 P1s). The Kanban system is the centerpiece, but the security wave (WhatsApp rejecting strangers by default, Discord role-allowlists, redaction on by default) and Google Chat as the 20th platform are also worth noting.

The name "Tenacity" is accurate. This release is about making the agent finish what it starts, survive what it can't prevent, and be honest about what went wrong.

That's a harder problem than raw capability — and it's the one that actually matters for production use.


Get started:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

Docs: Subagent Delegation · GitHub Release Notes
