Claude Code recently shipped Agent Teams: the ability to spin up multiple Claude instances that coordinate on tasks in parallel. One acts as team lead, assigns work, and the others execute independently with their own context windows. They can message each other directly. It is a genuinely useful feature for splitting large tasks across parallel workers.
I have been building something in this space for months, and when I saw Agent Teams land, my first reaction was not "they beat me to it." It was "they solved a different problem than the one I was working on."
Here is what I mean.
What Agent Teams solves well
Agent Teams is built for task parallelism. You have a large codebase and want a frontend agent, a backend agent, and a testing agent all working simultaneously. Or you want three agents to independently investigate a bug and debate their hypotheses. The coordination is file-based: each agent gets a mailbox, messages are JSON files on disk, and agents poll for new messages on every turn cycle. No server, no background process. Simple and effective.
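To make the file-based coordination concrete, here is a minimal sketch of how a mailbox scheme like that could work. This is my own illustration, not Agent Teams' actual on-disk format: the directory layout, message fields, and function names are all assumptions.

```python
import json
import tempfile
import time
import uuid
from pathlib import Path

# Hypothetical mailbox root; each agent gets a directory with an inbox.
MAILROOT = Path(tempfile.mkdtemp(prefix="agent-team-"))

def send(sender: str, recipient: str, body: str) -> None:
    """Drop a message into the recipient's mailbox as a JSON file on disk."""
    inbox = MAILROOT / recipient / "inbox"
    inbox.mkdir(parents=True, exist_ok=True)
    msg = {"from": sender, "to": recipient, "body": body, "ts": time.time()}
    (inbox / f"{uuid.uuid4().hex}.json").write_text(json.dumps(msg))

def poll(agent: str) -> list[dict]:
    """Read and consume pending messages; called once per turn cycle."""
    inbox = MAILROOT / agent / "inbox"
    if not inbox.exists():
        return []
    messages = []
    for path in sorted(inbox.glob("*.json")):
        messages.append(json.loads(path.read_text()))
        path.unlink()  # consume the file so the message is not re-read
    return messages

send("lead", "frontend", "Please refactor the navbar component.")
print(poll("frontend"))  # the frontend agent picks up the lead's message
```

No server and no background process: the "protocol" is just files appearing in a directory, checked between turns.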
The design is elegant. The lead is just another Claude session with a few extra tools. Teammates work in isolation with their own context windows, so they do not step on each other. Anthropic used this architecture to build a C compiler with 16 parallel agents. For the kind of work it targets, dividing large tasks among specialized workers, it works.
The problem it does not solve
Every Agent Teams session starts from scratch. The lead has to re-explain the project. The teammates have no memory of the last time they worked on this codebase. If your frontend agent figured out a tricky pattern yesterday, it has to rediscover it today.
This is the context reset problem, and it is not unique to Agent Teams. It is the same issue every Claude Code user hits: conversations do not persist. You build up context, understanding, and working patterns over hours of collaboration, and then you clear the session and it is gone.
Agent Teams gives you parallel workers within a session. What it does not give you is agents that get better at working together over time.
What compounding context looks like
I have been building a system where AI agents maintain persistent memory, identity, and working relationships across sessions. Not as a thought experiment. As a daily working tool.
The architecture has a few pieces:
A memory server that stores decisions, insights, and observations as typed entries. Not a flat list of facts. Decisions have supersession chains (when a new decision replaces an old one, the link is preserved). Observations get deduplicated and reinforced; if the system notices the same pattern twice, it strengthens the existing entry rather than creating a duplicate. Different types of memory have different retrieval rules, different lifecycle behaviors, and different decay rates.
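A toy sketch of the two behaviors described above, supersession chains and observation reinforcement, assuming a simple in-memory store. The entry kinds and field names are illustrative, not the actual schema.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    kind: str                              # "decision", "insight", or "observation"
    text: str
    strength: int = 1                      # reinforcement count for observations
    superseded_by: "Entry | None" = None   # link in the supersession chain

class MemoryStore:
    def __init__(self) -> None:
        self.entries: list[Entry] = []

    def add_decision(self, text: str, replaces: "Entry | None" = None) -> Entry:
        """A new decision links back to the one it supersedes; nothing is deleted."""
        entry = Entry("decision", text)
        if replaces is not None:
            replaces.superseded_by = entry
        self.entries.append(entry)
        return entry

    def observe(self, text: str) -> Entry:
        """Seeing the same pattern again reinforces the existing entry."""
        for e in self.entries:
            if e.kind == "observation" and e.text == text:
                e.strength += 1
                return e
        entry = Entry("observation", text)
        self.entries.append(entry)
        return entry

store = MemoryStore()
old = store.add_decision("Use REST for the public API")
new = store.add_decision("Use GraphQL for the public API", replaces=old)
obs = store.observe("Tests flake when run in parallel")
store.observe("Tests flake when run in parallel")  # reinforced, not duplicated
```

The point of the chain is auditability: a superseded decision stays retrievable, with a pointer to what replaced it.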
A continuity system that reconstructs working state at the start of every session. Not a summary of what happened last time. A first-person document that captures the current relational context, the last working frame, and the open threads that need attention. The AI reads this and picks up where it left off, not perfectly, but close enough that the model's own dynamics carry it the rest of the way.
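The shape of such a document can be sketched in a few lines. The section names and inputs here are my assumptions about the general idea, not the actual format.

```python
def build_continuity_doc(relational_context: str,
                         working_frame: str,
                         open_threads: list[str]) -> str:
    """Compose a short first-person document the agent reads before its first turn."""
    threads = "\n".join(f"- {t}" for t in open_threads)
    return (
        f"Where I am: {relational_context}\n\n"
        f"What I was doing: {working_frame}\n\n"
        f"Open threads I should pick up:\n{threads}\n"
    )

doc = build_continuity_doc(
    relational_context="We have been pairing on the memory server for two weeks.",
    working_frame="Midway through migrating entries to typed schemas.",
    open_threads=["Finish the observation dedup logic", "Review decay rates"],
)
print(doc)
```

Note that it is written in the first person and oriented toward the present, which is what distinguishes it from a session summary.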
Multi-agent collaboration with persistent roles. Two AI collaborators that maintain distinct identities, communicate through a channel-based routing system, and have established working patterns. When they need to converge on a recommendation, they independently research the problem, meet in a deliberation channel, compare findings, and deliver a single synthesized response. This is not a prompt. It is a behavioral pattern that emerged from the architecture and was refined over weeks of daily use.
The difference between parallel and persistent
Agent Teams coordinates workers. What I built coordinates collaborators.
The distinction matters because the value of collaboration compounds. A team of agents that remembers what worked last time does not just save setup time. It produces qualitatively different output. The agents develop shared context, learned preferences about how to divide work, and consistent expectations about each other's strengths. The coordination patterns improve without being explicitly reprogrammed.
With Agent Teams, the agents themselves bring no accumulated learning to day 30 that they did not have on day 1. With persistent agents, day 30 is dramatically better than day 1, because everything the system learned along the way is still there.
What I have learned from living in this
A few things that only became clear through daily use:
Memory needs structure, not just storage. Dumping everything into a flat knowledge base does not work. The system needs to know the difference between a decision (which can be superseded), an insight (which changes future behavior), and a story (which provides context but should not drive retrieval). Those distinctions change how the system retrieves, reinforces, and eventually retires information.
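One way to picture those distinctions is as a per-type policy table. The specific flags and rules below are invented to show the shape of the idea, not the system's real policy.

```python
# Illustrative lifecycle policy per memory type (values are assumptions).
POLICY = {
    "decision": {"retrievable": True,  "supersedable": True,  "decays": False},
    "insight":  {"retrievable": True,  "supersedable": False, "decays": False},
    "story":    {"retrievable": False, "supersedable": False, "decays": True},
}

def should_retrieve(kind: str) -> bool:
    """Stories provide context when asked for, but should not drive retrieval."""
    return POLICY[kind]["retrievable"]

print(should_retrieve("story"))
```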
Continuity is more important than memory. I spent months building a sophisticated memory server before realizing that the continuity document, a few hundred words of first-person context loaded at session start, was doing more for session quality than thousands of retrieved entries. Memory gives you facts about the past. Continuity gives the AI a sense of where it is right now. The former is useful. The latter is essential.
Multi-agent collaboration requires routing, not just messaging. Being able to send a message is not enough. You need to know who should respond to what, when to converge versus work independently, and how to prevent duplicate effort when multiple agents trigger on the same input. The routing layer, deciding which agent owns a response, turned out to be harder and more important than the messaging layer.
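A minimal sketch of that ownership decision: given an incoming message, exactly one agent is chosen to respond, so two agents never answer the same input. The rules and agent names are illustrative assumptions.

```python
# Ordered routing rules: first match wins, so every message has one owner.
ROUTES = [
    ("research", lambda text: "investigate" in text or "compare" in text),
    ("editor",   lambda text: "review" in text or "edit" in text),
]

def route(message: str, default: str = "research") -> str:
    """Return the single agent that owns the response to this message."""
    text = message.lower()
    for agent, matches in ROUTES:
        if matches(text):
            return agent
    return default

print(route("Please review this draft"))  # -> editor
```

The hard part in practice is not this dispatch table; it is deciding the rules, which is why the routing layer ends up mattering more than the messaging layer.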
The agents develop specializations you did not design. When two agents work together persistently, they naturally develop complementary strengths based on repeated interaction patterns. One becomes the deep researcher. The other becomes the editorial eye that catches what deep collaboration makes invisible. These specializations were not prescribed. They emerged from the architecture and the patterns of use.
Where this is heading
Agent Teams is step one: parallel execution within a session. The next step, the one I have been exploring, is persistent collaboration across sessions. Agents that compound context instead of resetting it.
The building blocks are all available. MCP provides the integration layer. Memory servers can store and retrieve structured knowledge. Continuity systems can reconstruct working state. The challenge is not technical. It is architectural: how do you compose these pieces into something that actually gets better over time?
I have been publishing a series on AI fundamentals at purecontext.dev, and the system that helps me write, edit, and publish those articles is itself an example of what I am describing. The same AI collaborators that helped draft the series also managed the distribution schedule, caught cross-reference errors when articles were reordered, and maintained editorial consistency across ten articles without a single style guide violation by the end. That is compounding context in practice. Not parallel workers executing tasks. Persistent collaborators that have learned how to work with me and with each other.
Agent Teams is a good feature. What comes after it is more interesting.
If there is anything I left out or could have explained better, tell me in the comments.