Aman Bhandari

HANDOVER + SYNC: multi-agent coordination without a central scheduler

Three or more Claude Code agents, each owning their own repo. No central scheduler. No shared database. No message bus. Two markdown files at known paths and a single convention that keeps them consistent. That is it.

The protocol is claude-multi-agent-protocol. I run it across four agent positions and four repos in my own research setup: the lab itself, a downstream recorder, a publisher, and a shared commons. This post is the protocol written up as a generalizable pattern — not a tutorial for the specific repos.

The failure mode it prevents: every Claude Code multi-agent setup I have seen attempts to share mutable state, and every one eventually conflates two distinct flows — data (what happened) and intent (what we plan to do next). The conflation is what produces rubber-stamp rewrites, where agent B overwrites agent A's change because it could not distinguish "this is a fact" from "this is a proposal."

Separate the flows. One file per flow. Separate ownership rules.

Flow 1 — Data, one-way, single-writer

HANDOVER.md lives in the upstream agent's repo. It is append-only. It has one writer — the upstream agent. Downstream agents read it, never write to it.

The content is factual: "Latest run completed. New concepts landed: dict internals, reference semantics. Broken: three whiteboard-test attempts on JSON deserialization. Committed: commit hash X."

The shape is chronological. Each entry is tagged with a timestamp or sequence number. Nothing previously written is modified. If the upstream agent was wrong about something, the correction gets appended as a new entry referencing the old one — not an edit to the old entry.
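
As a rough illustration of the append-only rule, here is a minimal Python sketch. The entry header format (`## <sequence> | <timestamp>`) and the function names `last_sequence` and `append_entry` are assumptions made for this example; the post does not specify an on-disk format. The point it shows is that a correction becomes a new entry pointing at the old one, and the existing text is never rewritten.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Hypothetical header format for one entry: "## 12 | 2025-01-01T00:00:00+00:00"
ENTRY_HEADER = re.compile(r"^## (\d+) \| (\S+)$", re.MULTILINE)


def last_sequence(handover_text: str) -> int:
    """Return the highest sequence number in the log, or 0 if it is empty."""
    numbers = [int(m.group(1)) for m in ENTRY_HEADER.finditer(handover_text)]
    return max(numbers, default=0)


def append_entry(handover_text: str, body: str,
                 corrects: Optional[int] = None,
                 now: Optional[datetime] = None) -> str:
    """Return the log with one new entry appended; existing text is untouched."""
    now = now or datetime.now(timezone.utc)
    seq = last_sequence(handover_text) + 1
    lines = [f"## {seq} | {now.isoformat(timespec='seconds')}"]
    if corrects is not None:
        # A correction is a new entry that references the old one.
        lines.append(f"Corrects entry {corrects}.")
    lines.append(body.strip())

    prefix = handover_text
    if prefix and not prefix.endswith("\n"):
        prefix += "\n"
    separator = "\n" if prefix else ""  # blank line between entries
    return prefix + separator + "\n".join(lines) + "\n"
```

Because the function only ever returns the old text plus a suffix, a reviewer can check the invariant with a single `startswith` comparison between the old and new file contents.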

The reason for single-writer append-only: HANDOVER.md is the data source of truth for everything downstream. If any downstream agent can write to it, two agents will write at once, git will produce a merge conflict, and a human will resolve the conflict by picking whichever version looks right — which is how the truth state of the system gets silently corrupted.

Single-writer is boring. Boring is what makes it reliable.

Flow 2 — Intent, bidirectional, per-agent sections

SYNC.md lives in a shared commons repo that every agent has access to. It has bidirectional ownership: each agent owns a section. Every agent reads every section; each agent writes only to their own section.

The content is forward-looking: "I am about to start X. I need Y from upstream. I am blocked on Z. My next three actions are A, B, C."

The shape is per-agent: one section per agent. With five agents, the file has five sections: ## partner, ## observer, ## publisher, ## commons, ## principal. Each section has three fields:

  • Current focus. One sentence on what the agent is working on now.
  • Blocked on. What the agent needs to proceed. Empty if nothing.
  • Next action. The concrete next step the agent intends to take.

Three fields is the minimum that captures intent without encoding a plan. Four fields is where it starts being a planning doc.
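
The section layout described above could be maintained with a helper along these lines. This is a sketch under assumptions: the `Label: value` line format and the names `render_section` and `replace_own_section` are hypothetical, and only the `## name` headings and the three field names come from the post. It shows an agent overwriting its own section while leaving every other section untouched.

```python
import re

# Section headings as described in the post: "## partner", "## observer", ...
SECTION_HEADING = re.compile(r"^## (\S+)[ \t]*$", re.MULTILINE)


def render_section(focus: str, blocked_on: str, next_action: str) -> str:
    """Render the three fields (the 'Label: value' layout is an assumption)."""
    return (
        f"Current focus: {focus}\n"
        f"Blocked on: {blocked_on}\n"
        f"Next action: {next_action}\n"
    )


def replace_own_section(sync_text: str, agent: str, new_body: str) -> str:
    """Overwrite only `agent`'s section; other sections are copied verbatim."""
    if not sync_text.endswith("\n"):
        sync_text += "\n"
    headings = list(SECTION_HEADING.finditer(sync_text))
    for i, match in enumerate(headings):
        if match.group(1) != agent:
            continue
        body_start = match.end() + 1  # skip the newline after the heading
        has_next = i + 1 < len(headings)
        body_end = headings[i + 1].start() if has_next else len(sync_text)
        head, tail = sync_text[:body_start], sync_text[body_end:]
        separator = "\n" if tail else ""  # blank line before the next section
        return head + new_body.rstrip("\n") + "\n" + separator + tail
    raise KeyError(f"no section for agent {agent!r}")
```

Raising when the section is missing, rather than creating it, is a deliberate choice in this sketch: adding a new agent to the file is a change to the roster, not a routine sync.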

Why the two-file split is the specific fix

Most multi-agent setups fail because they use one file for both flows. Either they put planning inline with facts (and agents start editing past facts to make the plan consistent), or they put facts inline with planning (and downstream agents see stale plans mixed with fresh facts and cannot tell which is which).

The split gives each flow the semantics it needs:

  • Data (HANDOVER) needs reliability and history. Single-writer, append-only.
  • Intent (SYNC) needs freshness and bidirectional visibility. Per-agent sections, overwritable within the section, reset on each sync.

Conflating the two makes both worse. Separating them makes both load-bearing.

Single-writer-per-section is what git already gives you

The per-agent section rule on SYNC.md means that in practice, git is the serializer. Two agents writing to different sections normally produce a clean merge. Two agents writing to the same section, which should not happen because each section has one owner, produce a merge conflict that surfaces the bug.

You do not need a coordination service. You do not need locks. You do not need Redis. The section-ownership rule plus git is sufficient for coordination at the 5-10 agent scale. Beyond that, you might need something else. Below that, this is enough.
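
Git only flags edits that collide, so a small check run before committing can catch a write that strays outside an agent's own section. The following Python sketch compares two versions of SYNC.md and reports which sections changed. The function names are hypothetical and this check is not part of the protocol as described; it assumes sections are delimited by `## name` headings.

```python
import re
from typing import Dict, Set

SECTION_HEADING = re.compile(r"^## (\S+)[ \t]*$", re.MULTILINE)


def split_sections(sync_text: str) -> Dict[str, str]:
    """Map each '## name' heading to the stripped text of its section."""
    headings = list(SECTION_HEADING.finditer(sync_text))
    sections = {}
    for i, match in enumerate(headings):
        has_next = i + 1 < len(headings)
        end = headings[i + 1].start() if has_next else len(sync_text)
        sections[match.group(1)] = sync_text[match.end():end].strip()
    return sections


def changed_sections(before: str, after: str) -> Set[str]:
    """Names of sections added, removed, or edited between two versions."""
    old, new = split_sections(before), split_sections(after)
    return {name for name in old.keys() | new.keys()
            if old.get(name) != new.get(name)}


def violates_ownership(before: str, after: str, agent: str) -> Set[str]:
    """Sections touched by this change that `agent` does not own."""
    return changed_sections(before, after) - {agent}
```

An empty result means the change stayed inside the agent's own section; anything else names the sections that were touched by a non-owner.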

The CLAUDE.md precedence rule

Each repo has its own CLAUDE.md (or equivalent agent-identity file). Each agent has its own rules of behavior. SYNC.md does not override those rules.

The precedence rule: on conflict, the repo's own CLAUDE.md wins over anything in the shared SYNC.md. An agent whose repo says "never push to main without review" does not get overridden by a SYNC.md entry from another agent saying "please push your change to main." The identity file of each agent is sovereign.

This matters because without the rule, a malicious or confused SYNC.md entry could instruct another agent to violate its own constraints. With the rule, SYNC.md is advisory for behavior outside the repo's own rules, and irrelevant for behavior governed by those rules.

The .last-processed.md marker

Each downstream agent keeps its own .last-processed.md marker in its own repo. The marker records: "I last processed HANDOVER.md entries up to sequence N at time T." When the agent is asked "any update?", it reads HANDOVER.md from N+1 onward, processes the new entries, and updates the marker.

This is standard offset-based consumption, the same pattern used by Kafka consumers and similar event-log systems. The novelty is that it works with markdown files and git instead of a broker.

Without the marker, each downstream "check for updates" either reprocesses everything or has to remember a sequence number in memory that is lost on restart. With the marker, the agent restarts cleanly, processes incrementally, and carries no in-memory state across sessions.
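
The offset-based read can be sketched in a few lines of Python. Everything about the on-disk format here is an assumption for illustration: entries are taken to start with a header line like `## <sequence> | <timestamp>`, and the marker is taken to hold one sentence in the wording quoted above. The function names are hypothetical too.

```python
import re
from typing import List, Tuple

# Hypothetical entry header: "## 12 | 2025-01-01T00:00:00+00:00"
ENTRY_HEADER = re.compile(r"^## (\d+) \| (\S+)$", re.MULTILINE)
MARKER_LINE = re.compile(r"up to sequence (\d+)")


def read_marker(marker_text: str) -> int:
    """Return the last processed sequence number, or 0 if no marker exists."""
    match = MARKER_LINE.search(marker_text)
    return int(match.group(1)) if match else 0


def new_entries(handover_text: str, last_seq: int) -> List[Tuple[int, str]]:
    """Return (sequence, entry text) pairs with sequence > last_seq, in log order."""
    headers = list(ENTRY_HEADER.finditer(handover_text))
    entries = []
    for i, match in enumerate(headers):
        seq = int(match.group(1))
        has_next = i + 1 < len(headers)
        end = headers[i + 1].start() if has_next else len(handover_text)
        if seq > last_seq:
            entries.append((seq, handover_text[match.start():end].strip()))
    return entries


def write_marker(seq: int, timestamp: str) -> str:
    """Render the marker contents after entries up to `seq` were processed."""
    return (f"I last processed HANDOVER.md entries up to sequence {seq} "
            f"at time {timestamp}.\n")
```

A downstream agent would call `read_marker`, process whatever `new_entries` returns, and write the marker only after processing succeeds, so a crash mid-run leads to reprocessing rather than skipped entries.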

The research-vs-documentation split

In my setup, the research subject is one specific pair (Principal + Partner). The other agents are infrastructure: a recorder that preserves sessions as narrative, and a publisher that renders artifacts for the public. The infrastructure agents read HANDOVER from the research pair; the research pair does not read from them.

This is a deliberate asymmetry. If the downstream agents could write back into the research pair's state, the research subject would be contaminated by its own observers — which is a known failure mode in ethnographic research and a direct failure mode in multi-agent systems where downstream feedback changes upstream behavior.

The HANDOVER direction is irreversible on purpose. Downstream knows about upstream. Upstream does not know about downstream's interpretation. The protocol preserves this.

What this protocol does NOT solve

Three honest limits.

1. It does not coordinate real-time interactions. HANDOVER + SYNC are per-session artifacts. Agents reading each other's files are not reading a live event stream. For anything that needs sub-second coordination, you want a real message bus.

2. It does not enforce the convention. The protocol is discipline, not compilation. If an agent writes to somebody else's section, git raises a merge conflict only when the owner changed that section at the same time; otherwise the write merges cleanly, and a human has to notice. A compiled DSL could enforce section ownership structurally. This protocol does not.

3. It does not scale past 10-ish agents. At that scale, SYNC.md becomes a 500-line file nobody reads. The protocol assumes a small enough team that every agent can read every section in under a minute. If the team is larger, you partition — multiple SYNC.md files by subsystem, or a different coordination pattern entirely.

When to reach for this versus not

Reach for HANDOVER + SYNC when:

  • You have 2-8 Claude Code agents, each in their own repo.
  • The work is async — agents do not need to coordinate in real-time.
  • Data and intent flows are genuinely distinct (facts versus plans).
  • A compiled scheduler would be overkill; a global file would be underkill.

Do not reach for it when:

  • A single repo with good directory structure would suffice.
  • The agents are hot-loop coordinated (inference pipeline, live routing).
  • The team is large enough that SYNC.md becomes unreadable.

In my case, the pattern solved a real coordination problem I was going to hit anyway once the setup grew past two agents. Two files, one convention, zero merge conflicts in normal operation. That is the shape. If it matches your setup, steal it.


Aman Bhandari. Operator of an AI-engineering research lab running Claude Opus as the coaching partner, plus a QA-automation surface shipping against a real sprint workload. Public artifacts: claude-code-agent-skills-framework and claude-code-mcp-qa-automation. github.com/aman-bhandari.