Ram Chandra Samal

Multi-Repo Microservice Changes Are a Coordination Problem. I Solved It With AI Agent Teams.

A walkthrough of RepoOrch — an open-source Claude Code plugin that turns a multi-repo workspace into a deliberating team of AI specialists, with peer-to-peer messaging and a hard read-only safety model.

github.com/architonixlabs/RepoOrch · MIT licensed · v0.3.0


The honest workflow for a multi-repo bug in a microservice org looks like this: open three codebases, read until your eyes blur, hope you didn't miss a contract break in the fourth.

This isn't a code problem. It's a coordination problem. And it's the exact shape of problem that the new Agent Teams primitive in Claude Code is built to solve.

In this post I'll walk through:

  1. Why subagents (the older primitive) can't handle multi-repo change planning
  2. What Agent Teams add — specifically, the mailbox abstraction for peer-to-peer agent messaging
  3. How I built RepoOrch, an open-source plugin that uses this pattern to turn a multi-repo workspace into a deliberating AI team
  4. The propose-only safety model and why it's enforced at the platform layer, not in prose
  5. How knowledge graphs cut per-triage token cost on large workspaces

If you've ever opened five tabs for a single microservice change and thought "there has to be a better division of labor here," this is about that.


1. The shape of the problem

Imagine the workspace below — a perfectly ordinary microservice layout:

my-project/
├── auth-service/
├── payments/
├── notifications/
├── inventory/
└── shipping/

A ticket arrives in your queue:

"Users are getting 401 errors after the auth refactor."

The bug almost certainly lives across multiple repos. Maybe auth-service changed the shape of a JWT claim, and payments still expects the old shape. Maybe notifications never validated the claim properly and is silently dropping events. The only honest way to know is to read each codebase, trace the contract, and hope you didn't miss something.

This is a textbook agentic AI use case — except for one thing: the agents need to talk to each other. And until recently, they couldn't.

2. Why subagents can't do this

The standard agent-orchestration pattern in Claude Code (before Agent Teams) looked like this:

master agent
    ├─→ subagent A (auth-service expert)
    ├─→ subagent B (payments expert)
    └─→ subagent C (notifications expert)

Each subagent gets a question, does some research, and reports back to the master. Subagents cannot talk to each other. They form a tree, not a graph.

That's fine for "go read this file" or "summarize the schema." It falls apart the moment two specialists need to negotiate.

Consider what actually needs to happen for the 401 bug:

  • The auth specialist proposes: "I'll change the JWT sub claim from user_id to UUID."
  • The payments specialist needs to answer: "Does my service depend on the shape of the sub claim?"

In a subagent tree, the only path between them goes through the master. So the master has to:

  1. Understand the auth proposal deeply enough to translate it into a question for payments
  2. Get the payments answer
  3. Translate it back into a constraint on auth
  4. Repeat for every cross-repo edge

This doesn't scale. Worse, it forces the master to mediate contracts it hasn't read closely, and every translation step is a chance to drop a detail, such as how a claim is actually parsed. The result is plans that look complete but have stale assumptions baked in.
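To put rough numbers on "doesn't scale", here's a back-of-the-envelope sketch. It assumes every pair of specialists needs exactly one question-and-answer exchange, which is a simplification, and the function names are made up for illustration.

```javascript
// Illustrative arithmetic only: assumes every pair of specialists exchanges
// exactly one question and one answer. All names here are hypothetical.

// Number of unordered specialist pairs (potential cross-repo edges).
function crossRepoEdges(specialists) {
  return (specialists * (specialists - 1)) / 2;
}

// Direct peer messaging: question + answer = 2 messages per edge.
function directMessages(specialists) {
  return crossRepoEdges(specialists) * 2;
}

// Relayed through the master: the question and the answer are each sent
// twice (specialist -> master -> specialist), so 4 messages per edge, and
// the master has to re-express every one of them.
function relayedMessages(specialists) {
  return crossRepoEdges(specialists) * 4;
}

for (const n of [3, 5, 10]) {
  console.log(
    `${n} repos: ${crossRepoEdges(n)} edges, ` +
      `${directMessages(n)} direct vs ${relayedMessages(n)} relayed messages`,
  );
}
```

The raw message count only doubles. The bigger cost in this model is that every relayed message has to be understood and rephrased inside a single context, the master's.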

3. Agent Teams: the mailbox primitive

Claude Code 2.1.32 introduced Agent Teams. The key difference is one small but transformative addition: each teammate gets a mailbox, and teammates can message each other directly.

master agent
    ├─→ auth specialist ───┐
    │                      │  direct peer messages
    ├─→ payments spec. ────┤  (mailbox)
    │                      │
    └─→ notifications ─────┘
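If the mailbox idea feels abstract, a toy in-memory model may help. To be clear, this is not the Claude Code API: the Team class and its send and drain methods are hypothetical, and the sketch only illustrates the shape of "one inbox per teammate, delivered without a relay".

```javascript
// Toy in-memory model of peer mailboxes. This is NOT the Claude Code API;
// the Team class and its methods are hypothetical, for illustration only.
class Team {
  constructor(names) {
    // One inbox (FIFO queue) per teammate.
    this.inboxes = new Map(names.map((name) => [name, []]));
  }

  // Deliver a message straight to the recipient's inbox, with no relay.
  send(from, to, body) {
    const inbox = this.inboxes.get(to);
    if (!inbox) throw new Error(`unknown teammate: ${to}`);
    inbox.push({ from, body });
  }

  // Remove and return everything waiting for a teammate.
  drain(name) {
    const inbox = this.inboxes.get(name);
    if (!inbox) throw new Error(`unknown teammate: ${name}`);
    return inbox.splice(0, inbox.length);
  }
}

const team = new Team(["auth", "payments", "notifications"]);
team.send("auth", "payments", "Do you read the JWT sub claim directly?");
const received = team.drain("payments");
// Reply to whoever asked; the master never sees this exchange.
team.send("payments", received[0].from, "Yes, and we parse it as an integer.");
```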

Now the conversation that actually needs to happen can happen:

auth → payments: "I'm proposing changing the JWT sub claim from user_id to UUID. Does your service read sub directly, and if so, how do you parse it?"

payments → auth: "Yes, we read sub in auth.middleware.ts:42 and call parseInt() on it. Your change will break us unless we update to UUID parsing in the same release."

That deliberation is what produces a plan you can actually trust. The master doesn't have to be smart enough to mediate the contract — the specialists are.
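Here's a small sketch of why that payments reply matters. The real middleware isn't shown in this post, so parseLegacyUserId is a hypothetical stand-in for "call parseInt() on sub", and the UUIDs are example values.

```javascript
// Hypothetical stand-in for the parsing step payments describes; the real
// middleware is not shown in this post.
function parseLegacyUserId(sub) {
  return Number.parseInt(sub, 10);
}

// Old contract: sub is a numeric user_id.
const legacy = parseLegacyUserId("1042");

// New contract: sub is a UUID (example values, not real IDs).
// parseInt stops at the first non-digit, so a UUID that starts with digits
// yields a plausible-looking but wrong number instead of an error.
const silentlyWrong = parseLegacyUserId("550e8400-e29b-41d4-a716-446655440000");

// A UUID that starts with a letter yields NaN.
const notANumber = parseLegacyUserId("f47ac10b-58cc-4372-a567-0e02b2c3d479");

console.log(legacy, silentlyWrong, notANumber); // 1042 550 NaN
```

Neither result throws, so nothing fails loudly at the parsing step. Whether it shows up as a 401 depends on what the service does with the value afterward, and the specialist that has read that code is the one that can say.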

4. How RepoOrch puts this together

RepoOrch is a Claude Code plugin that operationalizes this pattern for the multi-repo microservice case. Setup is one command:

/repo-orch-setup

The setup runner:

  1. Discovers every git repo under your workspace root.
  2. Indexes each repo (language, frameworks, endpoints, events, dependencies).
  3. Writes an editable context document per repo — and pauses for your review. (This is the most important step. The owns field in the context drives routing later.)
  4. On your confirmation, generates a specialist agent per repo and a registry.json the master uses for routing.
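To make the role of the owns field concrete, here's a minimal sketch of how a registry could drive routing. The registry shape, the keywords, and scoreRepos are all assumptions for illustration. The actual registry.json schema and scoring logic live in the repo and aren't reproduced here.

```javascript
// Hypothetical registry shape. The real registry.json schema is not shown in
// this post; only the existence of an "owns" field is taken from it.
const registry = [
  { repo: "auth-service", owns: ["jwt", "login", "token", "auth"] },
  { repo: "payments", owns: ["checkout", "invoice", "payment"] },
  { repo: "notifications", owns: ["email", "push", "event"] },
];

// Score each repo by how many of its "owns" keywords appear in the ticket.
// Naive substring matching, chosen only to keep the sketch short.
function scoreRepos(ticket, entries) {
  const text = ticket.toLowerCase();
  return entries
    .map(({ repo, owns }) => ({
      repo,
      score: owns.filter((keyword) => text.includes(keyword)).length,
    }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score);
}

const ranked = scoreRepos(
  "Users are getting 401 errors after the auth refactor",
  registry,
);
console.log(ranked); // [ { repo: 'auth-service', score: 1 } ]
```

With these example keywords only auth-service matches, even though payments may well be affected. In a keyword-style router like this one, whatever you leave out of owns is invisible to routing, which is a good reason to take the review pause seriously.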

Then, for any incoming ticket:

/repo-orch-triage "Users are getting 401 errors after the recent auth refactor"

The master reads the registry, scores which repos are likely involved, and spawns the relevant specialists as an Agent Team. They emit VERDICTs, deliberate over cross-repo contracts via their mailboxes, and the master synthesizes a single ordered change plan with:

  • Cross-repo dependency ordering (which repo's change must land first)
  • Risks called out per repo
  • Validation hints (the tests / endpoints to exercise)
  • Zero files modified — which brings us to the safety model.
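One common way to derive a landing order like the one in the first bullet is a topological sort. The sketch below is an assumption about how such an ordering could be computed, not RepoOrch's implementation. The input shape and the landingOrder name are made up for illustration.

```javascript
// Hypothetical input: for each repo in the plan, the repos whose changes
// must land before it. Not RepoOrch's actual plan format.
const mustLandAfter = {
  "auth-service": ["payments", "notifications"],
  payments: [],
  notifications: [],
};

// Kahn's algorithm: repeatedly emit repos with no unmet prerequisites.
function landingOrder(prerequisites) {
  const remaining = new Map(
    Object.entries(prerequisites).map(([repo, deps]) => [repo, new Set(deps)]),
  );
  const order = [];
  while (remaining.size > 0) {
    const ready = [...remaining.keys()]
      .filter((repo) => remaining.get(repo).size === 0)
      .sort(); // alphabetical, so the output is deterministic
    if (ready.length === 0) {
      // Also reached when a prerequisite names a repo that is not in the plan.
      throw new Error("cyclic or unsatisfiable dependency: no valid landing order");
    }
    for (const repo of ready) {
      order.push(repo);
      remaining.delete(repo);
    }
    for (const deps of remaining.values()) {
      for (const repo of ready) deps.delete(repo);
    }
  }
  return order;
}

console.log(landingOrder(mustLandAfter));
// [ 'notifications', 'payments', 'auth-service' ]
```

In this example the consumers learn to accept the new claim shape first, and auth-service starts emitting it last.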

For incidents where the root cause isn't known, there's an adversarial variant:

/repo-orch-deliberate "Payments failing intermittently — unknown root cause"

This spawns the team in a mode where specialists are biased toward challenging each other's hypotheses rather than converging too fast on the first plausible cause.

5. Propose-only, enforced at the platform layer

There is a recurring AI-agent failure mode where the prompt says "do not modify files" and then the agent modifies files anyway. Prose is not a security boundary.

RepoOrch enforces propose-only in two layers:

  1. Tool restriction. Every specialist agent is spawned with tools: Read, Grep, Glob, Bash. No dedicated write, edit, create, or delete tools are available. This isn't a prompt instruction; it's the actual tool inventory the agent sees. Bash is the one tool that could still write, which is what the second layer covers.
  2. PreToolUse hook. A platform-level hook intercepts every Bash call and hard-blocks write-like commands: rm, mv, git commit, git push, sed -i, > redirection, and others. Even a malicious or hallucinated bash invocation is rejected before it runs.

The /repo-orch-triage and /repo-orch-deliberate commands additionally spawn agents with permissionMode: "plan", so the spawning context itself is read + delegate only.
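For a feel of what the Bash layer has to do, here's a deliberately small deny-list sketch. It is not RepoOrch's actual hook: the patterns cover only a few of the commands named above, and isWriteLike and exitCodeFor are hypothetical names. The payload fields (tool_name, tool_input.command) and the "exit code 2 blocks the call" convention reflect my understanding of Claude Code hooks, so treat them as assumptions and check the hooks reference.

```javascript
// Illustrative deny-list check, not RepoOrch's actual hook. The patterns
// cover only some of the commands named above and are far from exhaustive.
const WRITE_LIKE_PATTERNS = [
  /(^|[;&|])\s*(rm|mv)\b/, // rm / mv at the start of a command segment
  /\bgit\s+(commit|push)\b/, // history-changing git subcommands
  /\bsed\s+(-\S*\s+)*-\S*i/, // sed with an in-place flag such as -i or -ni
  />/, // any ">" at all: blocks redirection, and is deliberately conservative
];

function isWriteLike(command) {
  return WRITE_LIKE_PATTERNS.some((pattern) => pattern.test(command));
}

// Takes the parsed hook payload and returns the exit code the hook script
// should use. ASSUMPTION: the payload carries tool_name and
// tool_input.command, and exit code 2 blocks the tool call.
function exitCodeFor(payload) {
  if (payload.tool_name !== "Bash") return 0;
  const command = String(payload.tool_input?.command ?? "");
  return isWriteLike(command) ? 2 : 0;
}

console.log(exitCodeFor({ tool_name: "Bash", tool_input: { command: "git push origin main" } })); // 2
console.log(exitCodeFor({ tool_name: "Bash", tool_input: { command: "git log --oneline" } })); // 0
```

A deny-list like this is easy to get around (think xargs rm or an interpreter one-liner), which is why it only makes sense as a second layer behind the restricted tool inventory, not as the only boundary.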

The v0.2 guarantee, written into the README: the agents produce a plan document. The developer executes it.

6. Knowledge graphs and token savings

The naive way to run a triage is to have every specialist cold-read its repo on every ticket. That works, but the token cost adds up with every ticket. The optional /repo-orch-graph command builds a Claude-native knowledge summary per repo:

/repo-orch-graph    →  Claude reads each repo and writes summary.json
     ↓                  (one-time cost, no API key)
/repo-orch-triage   →  master pre-loads summary.json into a ~600-token
     ↓                  GRAPH_SUMMARY
specialist          →  reads GRAPH_SUMMARY first, targeted file reads
                       only for gaps

This entire flow runs inside the Claude Code session — no Python, no API key, no external service. If no summary exists, the plugin degrades gracefully to direct file reads.
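The "pre-load if present, fall back if not" behavior can be sketched in a few lines. This is a hypothetical helper, not code from the plugin: summary.json's real schema isn't shown here, buildGraphSummary is a made-up name, and the 4-characters-per-token figure is a rough rule of thumb, not a tokenizer.

```javascript
// Hypothetical helper; summary.json's real schema is not shown in this post.
// Token counts are approximated as 4 characters per token, a rough rule of
// thumb rather than a real tokenizer.
const CHARS_PER_TOKEN = 4;

function buildGraphSummary(summary, maxTokens = 600) {
  // No summary available: signal the caller to fall back to direct file reads.
  if (summary == null) return null;

  const budget = maxTokens * CHARS_PER_TOKEN;
  const text = JSON.stringify(summary);
  // Truncate rather than fail when the summary exceeds the budget. The result
  // is prompt text, so it does not need to remain valid JSON.
  return text.length <= budget ? text : text.slice(0, budget);
}

console.log(buildGraphSummary(null)); // null
console.log(buildGraphSummary({ endpoints: ["/login"] })); // {"endpoints":["/login"]}
```

Returning null instead of throwing keeps the fallback path boring: the caller checks for null and does the cold read it would have done anyway.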

On large workspaces this is the difference between "feels expensive" and "feels free." On small workspaces it's not worth the setup; the plugin lets you skip it.

7. Headless mode for CI

For teams that want triage to happen automatically when a Jira ticket or GitHub issue is filed, RepoOrch exposes a headless entry point built on the Claude Agent SDK:

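// Relative import, assuming this handler sits at the workspace root;
// adjust the path to wherever your handler actually lives.
import { runTriage } from './.claude/plugins/repo-orchestrator/automation/repo-orch-triage_runner.mjs';

// In a GitHub/Jira webhook handler, where `issue` is the parsed webhook
// payload and `postComment` is your own helper for the tracker's API:
const plan = await runTriage({
  ticket: issue.body,
  workspaceRoot: '/path/to/your/workspace',
});

await postComment(issue.number, plan);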

This runs in permissionMode: "plan" — same read-only, propose-only constraints as the interactive flow. The output is the same change-plan document, posted as a comment on the source ticket.

What I'd build next

A few directions I'm exploring for v0.4 and beyond:

  • Confidence-weighted plans. The aggregate confidence formula already lives in skills/routing/SKILL.md — surface it more prominently so devs can sort risks by uncertainty, not just severity.
  • Graph-aware deliberation. Today the mailbox is fully connected. For 10+ repo workspaces, routing the mailbox along actual service-dependency edges would cut deliberation overhead.
  • An execute mode behind an explicit consent prompt. Still propose-by-default, but with a guarded path to apply the plan repo-by-repo with diff approval.
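To show what the graph-aware deliberation idea would buy, here's a sketch that compares channel counts. None of this exists in RepoOrch today. The dependency edges are invented for the five-repo example workspace, and peerMap and channelCount are hypothetical names.

```javascript
// Hypothetical dependency edges (caller -> callee); not derived from a real
// workspace and not part of RepoOrch today.
const dependsOn = {
  payments: ["auth-service", "inventory"],
  notifications: ["auth-service"],
  shipping: ["inventory"],
  inventory: [],
  "auth-service": [],
};

// Build an undirected peer map: two specialists may message each other only
// if one service depends on the other. Every dependency must also be a key.
function peerMap(edges) {
  const peers = new Map(Object.keys(edges).map((repo) => [repo, new Set()]));
  for (const [repo, deps] of Object.entries(edges)) {
    for (const dep of deps) {
      peers.get(repo).add(dep);
      peers.get(dep).add(repo);
    }
  }
  return peers;
}

// Count distinct undirected channels in a peer map.
function channelCount(peers) {
  let total = 0;
  for (const set of peers.values()) total += set.size;
  return total / 2;
}

const peers = peerMap(dependsOn);
const repoCount = Object.keys(dependsOn).length;
console.log(
  channelCount(peers),
  "channels instead of",
  (repoCount * (repoCount - 1)) / 2,
);
// 4 channels instead of 10
```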

Try it

# Add the marketplace (one time)
/plugin marketplace add architonixlabs/RepoOrch

# Install the plugin
/plugin install repo-orchestrator@repo-orchestrator

Then run /repo-orch-setup in any workspace that has 2+ git repos as immediate subdirectories.

Source, issues, and discussions: github.com/architonixlabs/RepoOrch

If RepoOrch saves you a single hour of multi-repo triage, a GitHub star is how I know to keep building. And if you try it and it breaks — file an issue, I'll respond. v0.3 just shipped a ten-fix security hardening pass; v0.4 is being shaped by feedback like yours.

Top comments (1)

scarab systems

This is a really interesting angle. The “plans that look complete but have stale assumptions baked in” line especially matches what I’ve been seeing with AI coding workflows.

I’ve been working on this from a slightly different direction: not multi-repo agent orchestration, but repo supervision/diagnostics. The problem I keep coming back to is that AI agents can move fast, but the repo still needs a way to prove what is true — what has actually been built, what is only partially supported, where cleanup debt is accumulating, and when a plan or closeout claim is getting ahead of the evidence.

I like that RepoOrch keeps the agents propose-only and enforces safety at the tool/platform layer instead of relying on prose instructions. That feels like the right direction. I think we’re going to see a lot more tools in this space that don’t replace the agent, but supervise or constrain the environment the agent works inside.