DEV Community

Yehuda_LevS


Polis Protocol v2.0 - The new way to coordinate AI agents

I run several coding agents (Claude Code, Codex, and Gemini CLI) against the same
repositories. Two things kept burning me:

  1. Collisions. Two agents open src/auth/login.py at the same time. One silently overwrites the other. Work is lost, or my afternoon goes to untangling a merge.
  2. Amnesia. Every session starts at zero. The same project-specific gotcha gets re-learned every single time.

A plain git repo leaves coordination to luck, and the problem gets worse with every parallel
agent you add. So I built Polis — a local-first control plane that lives in your repo as a
_polis/ folder of markdown. No server, no database, no proprietary format. If a tool can read
and write markdown, it can participate.

The model

  • Contracts. Every task has an owner, acceptance criteria, and required capability tags.
  • Reservations. An agent reserves the files it's about to touch. An overlapping reservation is rejected deterministically — no model judgement, no race.
  • Lessons + guardrails. When a contract settles, what the team learned is distilled and auto-injected into matching future tasks. The Nth task on a topic starts pre-loaded with the N−1 prior lessons — something a single agent or an unmanaged swarm can't do, because they never accumulate and re-inject outcome-derived knowledge.

uvx polis-protocol init
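
To make the reservation rule concrete, here is a minimal sketch in Python. It is an illustration, not Polis's implementation: the ReservationTable class, its method names, and exact-path matching are all assumptions, and the real tool keeps its state as markdown under _polis/ rather than in memory. What it shows is the property that matters: the check is a plain lookup, so the same request always gets the same answer, and a rejection names the holder.

```python
from dataclasses import dataclass, field


@dataclass
class ReservationTable:
    """Hypothetical in-memory stand-in for file reservations (not Polis's API)."""

    held: dict[str, str] = field(default_factory=dict)  # path -> agent holding it

    def reserve(self, agent: str, paths: list[str]) -> tuple[bool, str]:
        # Check every path first so a rejected request reserves nothing.
        for path in paths:
            holder = self.held.get(path)
            if holder is not None and holder != agent:
                return False, f"{path} is reserved by {holder}"
        for path in paths:
            self.held[path] = agent
        return True, "ok"

    def release(self, agent: str) -> None:
        # Drop every reservation owned by this agent.
        self.held = {p: a for p, a in self.held.items() if a != agent}


table = ReservationTable()
print(table.reserve("claude-code", ["src/auth/login.py"]))
# (True, 'ok')
print(table.reserve("codex", ["src/auth/login.py", "src/auth/session.py"]))
# (False, 'src/auth/login.py is reserved by claude-code')
```

The second request is rejected as a whole, so codex does not end up holding session.py while waiting on login.py. That all-or-nothing behaviour is a design choice of this sketch, not something the tool is confirmed to do.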

The honest part

I shipped a benchmark that tests my own claims — and I let it report where the tool doesn't
win. That candor is the whole point.

polis bench --mode learning

What it wins, reproducibly:

  • Repeat errors: 65% → 8%, an 88% relative reduction. Failures become guardrails that are auto-injected into matching future tasks, so each failure class recurs at most once.
  • Collisions: zero, by construction. Reservations reject overlapping claims and name the holder.
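
The guardrail injection behind the first number can be sketched in a few lines of Python. This is a hedged illustration, not the tool's code: the Guardrail type, the tag-overlap matching rule, the "Known pitfalls" wording, and the sample lessons are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrail:
    """Hypothetical record of one distilled lesson."""

    tags: frozenset[str]  # topics the lesson applies to
    text: str             # the lesson itself


def matching_guardrails(task_tags: set[str], guardrails: list[Guardrail]) -> list[str]:
    """Return lessons whose tags overlap the task's tags, in recorded order."""
    return [g.text for g in guardrails if g.tags & task_tags]


def build_brief(description: str, task_tags: set[str], guardrails: list[Guardrail]) -> str:
    """Prepend matching lessons to the task description handed to an agent."""
    lessons = matching_guardrails(task_tags, guardrails)
    if not lessons:
        return description
    bullet_list = "\n".join(f"- {lesson}" for lesson in lessons)
    return f"Known pitfalls:\n{bullet_list}\n\n{description}"


# Made-up lessons, for illustration only.
guardrails = [
    Guardrail(frozenset({"auth"}), "Login tests need the fake clock fixture."),
    Guardrail(frozenset({"billing"}), "Amounts are stored in cents."),
]
print(build_brief("Add rate limiting to login.", {"auth", "api"}, guardrails))
# Known pitfalls:
# - Login tests need the fake clock fixture.
#
# Add rate limiting to login.
```

Only the auth lesson is injected, because the billing lesson shares no tag with the task. Under this matching rule, a failure recorded once reaches every later task carrying the same tag.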

What it does not win:

  • Routing quality vs. accurate static self-ratings. The bandit beats random and round-robin and recovers ~35–55% of an oracle's gain from outcomes alone — but if your capability cards are already accurate, "trust the card" stays competitive. The bench report states this plainly. The router's real value shows up when the cards are wrong, and in explaining every pick.
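
For readers who haven't met bandit routing, here is a minimal sketch of the idea in Python. The post only says "the bandit", so the choice of UCB1, the Ucb1Router class, and the shape of the explanation string are all assumptions made for illustration. It learns from recorded outcomes alone and returns a reason with every pick.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Ucb1Router:
    """Hypothetical outcome-driven router; UCB1 is an assumed algorithm choice."""

    agents: list[str]
    wins: dict[str, int] = field(default_factory=dict)
    tries: dict[str, int] = field(default_factory=dict)

    def pick(self) -> tuple[str, str]:
        # Try every agent once before comparing scores.
        for agent in self.agents:
            if self.tries.get(agent, 0) == 0:
                return agent, f"{agent}: no outcomes recorded yet"
        total = sum(self.tries.values())
        scores = {}
        for agent in self.agents:
            mean = self.wins.get(agent, 0) / self.tries[agent]
            # Rarely tried agents get a larger bonus, so they are retested.
            bonus = math.sqrt(2 * math.log(total) / self.tries[agent])
            scores[agent] = (mean + bonus, mean, bonus)
        best = max(self.agents, key=lambda a: scores[a][0])
        _, mean, bonus = scores[best]
        return best, f"{best}: success rate {mean:.2f} + exploration bonus {bonus:.2f}"

    def record(self, agent: str, success: bool) -> None:
        # Update the outcome counts after a contract settles.
        self.tries[agent] = self.tries.get(agent, 0) + 1
        self.wins[agent] = self.wins.get(agent, 0) + int(success)


router = Ucb1Router(["claude-code", "codex", "gemini-cli"])
for _ in range(30):
    agent, reason = router.pick()
    # Toy outcome model: only codex ever succeeds at this kind of task.
    router.record(agent, success=(agent == "codex"))
print(router.tries)
```

In this toy run most tasks end up routed to codex, with the other two agents retried occasionally. No capability card is consulted anywhere, which is why this kind of router helps most when the cards are wrong.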

Plug it into your agent over MCP

Every polis is also an MCP server (polis mcp) — zero extra dependencies, stdio transport. Any
MCP client can drive the full lifecycle:

claude mcp add polis -- uvx polis-protocol mcp
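
If you're wiring up your own client instead, this is roughly what "stdio transport" means on the wire. The sketch below builds the opening messages from the MCP specification: JSON-RPC 2.0, one message per line. The helper names, the client name, and the protocol version string are choices made for this example, and the sketch stops at tools/list because it makes no assumption about which tools the Polis server exposes.

```python
import json


def frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def handshake(client_name: str, client_version: str) -> list[bytes]:
    """Messages a client sends to open a session and ask for the tool list."""
    initialize = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # one published MCP revision
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }
    # Notifications carry no id because no response is expected.
    initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
    list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
    return [frame(initialize), frame(initialized), frame(list_tools)]


for line in handshake("example-client", "0.1.0"):
    print(line.decode("utf-8"), end="")
```

A real client writes these lines to the server's stdin and reads replies from its stdout, waiting for the initialize response before sending the rest. Ready-made clients such as Claude Code handle all of this for you.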

Boundaries (the "when NOT to use this")

Polis does not execute agents, replace git, or become a runtime. File reservations are
advisory coordination, not a security boundary. If one well-prompted agent already does the
job, you don't need this. It earns its place the moment two or more agents touch one real repo.

I'd love skeptical takes — especially on whether multi-agent is worth it at all.
