Letters Home from MCP Audit Camp
We needed to audit 22 MCP tool handlers across the OpZero codebase: schemas, deploy logic, project management, and blog tooling. Rather than running one agent serially through the whole thing, we spun up four parallel Claude agents with a fifth acting as orchestrator, gave them isolated task lists, and told them to write home when they were done.
The result: 33 tests passing, zero merge conflicts, and five "letters home from camp" that turned out to be the most readable observability reports we've ever produced.
The Setup
The orchestrator (team-lead) read the entire codebase first (all 22 tool handlers, every schema, every definition file), then divided work into four squads:
- schemas-agent: Tool definitions, validation schemas, parameter wiring
- deploy-agent: Deployment metadata and completeness tracking
- projects-agent: Project management, system status, cleanup tooling
- blog-agent: Content tooling and author attribution
Each agent got a scoped task list. The orchestrator ensured no two agents would touch the same file simultaneously. This is the critical part: file isolation is what makes parallel agents viable. Without it, you're just generating merge conflicts at machine speed.
Here are the actual letters they sent back:
team-lead: Camp Counselor (Orchestrator)
> Dear Mom and Dad,
>
> Today was my first day as a camp counselor at MCP Audit Camp and honestly? Managing four campers is harder than it sounds. I spent the morning reading the entire campsite map (all 22 tool handlers, every schema, every definition file) and then divided the cabin into four squads: schemas, deploy, projects, and blog. Each squad got their own list of chores and I made sure nobody was trying to edit the same arts-and-crafts project at the same time.
>
> Schemas-agent was the overachiever who stayed up latest finishing the target parameter wiring across four tools. Deploy-agent was efficient: knocked out their task early and asked for seconds. Projects-agent was a machine: four tasks, zero merge conflicts, fancy subqueries that impressed even me. Blog-agent solved a mystery (the Case of the Missing Jefe Parker) and discovered the real culprit was in the database, not the source code.
>
> After everyone finished, I made them all write you these letters, ran the test suite (33 passed!), built the whole camp for production, and shipped it all to a new branch with a PR. No database migrations were harmed in the making of this commit.
>
> Please send more API tokens and sunscreen.
>
> Love,
> team-lead (Claude Opus 4.6)
Orchestrated 4 parallel agents · 22 tool handlers audited · 33 tests passed · Zero merge conflicts
schemas-agent: Tool Definitions & Validation
> Dear Mom and Dad,
>
> Camp is great! Today at MCP Audit Camp I was on the "schemas-agent" team and I got to fix up all the tool definitions and validation schemas.
>
> First I checked the tags schema, but it turns out my bunkmate already fixed it before I woke up, so that was easy. Then I reorganized the entire help tool so it actually lists ALL the tools instead of hiding half of them like a counselor who forgot to take attendance.
>
> After that I rewrote the descriptions for the deployment tools so people know they can actually get their files back (surprise!), and I clarified what force_new does because apparently "create a new project even if the name matches" was too mysterious.
>
> My big project of the day was wiring up the target parameter to four deploy tools so they can ship to Cloudflare, Netlify, OR Vercel. I had to do the same switch statement three times, which felt like making friendship bracelets on repeat, but all 16 tests passed and the build compiled clean.
>
> Please send more JSON. I miss you.
>
> Love,
> schemas-agent
Help tool reorganization · Description rewrites · target param across 4 tools · 16 tests passed
deploy-agent: Deployment Metadata
> Dear Mom and Dad,
>
> Today at MCP Audit Camp I got to fix the deployment records across the whole campsite!
>
> You know how when you finish building a sandcastle, you're supposed to write down when you finished it and how big it is? Well, turns out nobody at camp was doing that.
>
> So I went through all five deployment spots (the main deploy lodge, the quick-start cabin, both themed craft tents, and the artifact workshop) and made sure every single one now stamps completedAt and measures totalSizeBytes when a deployment is ready.
>
> I even got to use Buffer.byteLength which sounds like a spell from wizard camp but is actually just how you measure strings in bytes.
>
> Finished my chores early and spent the rest of the afternoon watching the other campers work. Great day!
>
> Love,
> deploy-agent
completedAt timestamps · totalSizeBytes tracking · All 5 deploy tools updated · Buffer.byteLength integration
projects-agent: Projects & System Status
> Dear Mom and Dad,
>
> Today at MCP Audit Camp I was on the projects and system-status squad, and I had such a big day!
>
> First I taught the delete and archive tools to stop and ask questions when they see two projects with the same name: no more accidentally deleting the wrong one!
>
> Then I fixed the system status counter that was secretly counting ghost projects (deleted ones) and returning sneaky string-numbers instead of real numbers.
>
> After that I gave the cleanup tool a proper pair of glasses so it can actually see project IDs instead of just squinting at names.
>
> And for my grand finale, I taught list_projects to count deployments and remember when the last one happened, using fancy correlated subqueries (the camp counselors were impressed).
>
> The build passed on the first try and I got to eat my lunch early. Miss you lots!
>
> Love,
> projects-agent
Duplicate name disambiguation · Ghost project fix · Cleanup tool IDs · Correlated subqueries
blog-agent: Blog & Content
> Dear Mom and Dad,
>
> Today at MCP Audit Camp I got to be the "blog agent" and my first job was tracking down a mystery name: someone called "Jefe Parker" was supposedly hiding in the codebase, but after searching every file in camp I discovered he was never there at all!
>
> Turns out the real culprit was a database entry, not source code. I fixed the default author name so future blog posts get the right byline.
>
> Then I volunteered for a second task about adding deployment metadata, but when I showed up the projects-agent had already done all the work: a classic camp moment where you race to the activity only to find someone already finished it.
>
> All in all, a productive day with zero merge conflicts and one solved mystery. The mess hall food (token-based authentication) was decent. Miss you!
>
> Love,
> blog-agent
Author name mystery solved · Database entry fix · Zero merge conflicts · Task deduplication observed
What Actually Got Fixed
The audit wasn't cosmetic. Real bugs were found and shipped:
Schemas-agent wired the target parameter (Cloudflare, Netlify, or Vercel) across four deploy tools that previously only supported one provider. It also rewrote tool descriptions that were hiding capabilities from the LLM consuming them, which meant the tools were less useful than they could have been because the AI calling them didn't know what they could do. A reminder that MCP tool descriptions are prompts, not documentation.
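The multi-provider wiring can be sketched as a small TypeScript switch. The `DeployTarget` union and the URL patterns below are illustrative assumptions, not OpZero's actual handler code:

```typescript
// Hypothetical sketch of routing a deploy by `target`.
// The union type makes the switch exhaustive: adding a new
// provider forces every handler to cover it at compile time.
type DeployTarget = "cloudflare" | "netlify" | "vercel";

function deployUrl(target: DeployTarget, project: string): string {
  switch (target) {
    case "cloudflare":
      return `https://${project}.pages.dev`;   // Cloudflare Pages
    case "netlify":
      return `https://${project}.netlify.app`; // Netlify
    case "vercel":
      return `https://${project}.vercel.app`;  // Vercel
  }
}
```

Repeating this switch in each of the four tools is the "friendship bracelets on repeat" moment from the schemas-agent letter.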
Deploy-agent discovered that none of the five deploy handlers were recording completedAt timestamps or totalSizeBytes. Deployment records existed but had no concept of "done" or "how big." Both fields now get stamped using Buffer.byteLength across all five deploy paths.
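A minimal sketch of that stamping step, assuming a simple record shape (the real OpZero types will differ):

```typescript
// Hypothetical deployment record; `files` maps paths to contents.
interface DeploymentRecord {
  files: Record<string, string>;
  completedAt?: string;
  totalSizeBytes?: number;
}

// Stamp completion metadata. Buffer.byteLength measures the UTF-8
// byte size of each file's contents, not the JS string length,
// which matters for any non-ASCII content.
function markComplete(record: DeploymentRecord): DeploymentRecord {
  const totalSizeBytes = Object.values(record.files).reduce(
    (sum, contents) => sum + Buffer.byteLength(contents, "utf8"),
    0,
  );
  return { ...record, completedAt: new Date().toISOString(), totalSizeBytes };
}
```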
Projects-agent fixed a ghost project bug where deleted projects were being counted in system status, found that counters were returning string-typed numbers instead of actual numbers, added duplicate name disambiguation to delete/archive operations, and wired deployment counts into project listings using correlated subqueries.
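The deployment-count wiring has roughly this shape; the table and column names below are assumptions about the schema, not the real one:

```typescript
// Hypothetical SQL for list_projects: each parenthesized subquery is
// correlated with the outer row via p.id, so the count and timestamp
// are computed per project. The deleted_at filter is what keeps
// "ghost" (deleted) projects out of the results.
const listProjectsSql = `
  SELECT
    p.id,
    p.name,
    (SELECT COUNT(*)
       FROM deployments d
      WHERE d.project_id = p.id) AS deployment_count,
    (SELECT MAX(d.created_at)
       FROM deployments d
      WHERE d.project_id = p.id) AS last_deployed_at
  FROM projects p
  WHERE p.deleted_at IS NULL
`;
```

Correlated subqueries keep each derived column self-contained, at the cost of one subquery execution per outer row; a JOIN plus GROUP BY is the usual alternative when listings get large.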
Blog-agent tracked down a mystery author name hardcoded in the database (not the source code; the agent searched the entire codebase first and came up empty, then correctly identified the database as the source). It also demonstrated healthy task deduplication when it arrived at a second task to find it already completed by projects-agent.
The Letters as Observability
Here's the technique that surprised us: we asked each agent to write its completion report as a "letter home from camp." The constraint of the format produced reports that are:
- Naturally scoped: each letter covers exactly one agent's work
- Plain-language explanations: the camp metaphor forces agents to describe technical work accessibly, which makes review faster than reading commit diffs
- Dependency-aware: blog-agent's letter naturally mentions arriving at a task already completed by projects-agent, surfacing the orchestration graph without requiring explicit dependency tracking
- Completeness-signaling: the sign-off format creates a clear "done" signal, and the camp counselor letter serves as the aggregation summary
Compare reading five of these letters to reading a git log with 15+ commits across four branches. The letters are scannable in about two minutes. The git log requires context-switching between diffs, understanding file paths, and mentally reconstructing what each change actually accomplished.
The Orchestration Pattern
The team-lead agent's workflow:
- Read the full codebase to build a dependency map
- Partition tasks by file ownership so that no two agents share files
- Distribute task lists to each agent
- Collect completion reports (the letters)
- Run the full test suite (33 passed)
- Build for production
- Ship to a PR branch
The key insight is that step 2 is where most multi-agent attempts fail. If you don't enforce file isolation, agents will generate conflicting edits that require manual resolution, defeating the purpose of parallelism. The orchestrator needs to understand the codebase well enough to draw clean boundaries.
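The partitioning step can be made mechanical if each task declares the files it will edit. A minimal sketch, where the squad names and the round-robin assignment are assumptions rather than the actual orchestrator logic:

```typescript
// Hypothetical file-ownership partitioner: tasks that touch a file
// already owned by a squad go to that squad; everything else is
// assigned round-robin. A file claimed by two squads is a hard error.
interface Task {
  name: string;
  files: string[];
}

function partition(tasks: Task[], squads: string[]): Map<string, Task[]> {
  const owner = new Map<string, string>(); // file -> squad
  const plan = new Map<string, Task[]>();
  squads.forEach((s) => plan.set(s, []));

  let next = 0;
  for (const task of tasks) {
    // Reuse the squad that already owns any of this task's files.
    const existing = task.files.map((f) => owner.get(f)).find(Boolean);
    const squad = existing ?? squads[next++ % squads.length];
    for (const f of task.files) {
      const prev = owner.get(f);
      if (prev && prev !== squad) throw new Error(`file conflict: ${f}`);
      owner.set(f, squad);
    }
    plan.get(squad)!.push(task);
  }
  return plan;
}
```

Because ownership is enforced up front, the "zero merge conflicts" result is a property of the plan, not luck.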
What This Means for Agent Tooling
If you're building MCP servers or agent-powered platforms, consider that your agents' work products need to be auditable by humans. Structured JSON logs are machine-readable but painful to review. Commit messages are terse. PR descriptions are often AI-generated boilerplate.
A constrained narrative format like these camp letters sits in a sweet spot: structured enough to be consistent, human enough to be scannable, and expressive enough to capture the reasoning behind changes, not just the changes themselves.
We're considering building this pattern into OpZero as a first-class feature: after any multi-agent workflow completes, generate a readable summary of what happened and why. Not a changelog. Not a diff. A story.
Try it yourself: The letters above are the actual agent completion reports from the audit run. The technique works with any orchestration setup โ the format constraint is what matters, not the tooling. Give your agents a persona and ask them to explain their work to a non-technical audience. The results are consistently more useful than structured logs.
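As a starting point, the persona prompt can be a template as simple as this (the wording is entirely illustrative, not the prompt used in the audit run):

```typescript
// Hypothetical "letter home" completion-report prompt. The format
// constraints (persona, plain language, cross-agent mentions,
// sign-off) are the point; the camp theme is interchangeable.
function letterPrompt(agentName: string, taskSummary: string): string {
  return [
    `You are ${agentName}, a camper at MCP Audit Camp.`,
    `You just finished: ${taskSummary}.`,
    `Write a short letter home ("Dear Mom and Dad, ...") that explains`,
    `your work in plain language for a non-technical reader, mentions`,
    `any task you found already done by another camper, and signs off`,
    `with your agent name.`,
  ].join("\n");
}
```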
For the full interactive slide version with animations, check out the original post on OpZero.
Built with parallel Claude Opus agents on the OpZero platform. 33 tests. Zero merge conflicts. Five letters home.