Agent Teams

Build Your First Agent Team: A Step-by-Step Guide

You're using AI coding assistants. You prompt well. You get good results on individual tasks. But every session starts from zero. You re-explain the codebase, re-state the constraints, re-describe the architecture. The agent forgets what it learned yesterday. When you need different types of thinking — research vs. implementation vs. review — you're mashing them into one conversation and getting muddled output.

This is the single-agent ceiling. You've hit it. Here's how to break through it.

This tutorial walks you through building a minimal agent team: two agents with defined roles, persistent memory, and structured information flow. By the end, you'll have a working team you can run today. The examples use Claude Code, but the patterns work with any LLM that reads files — Cursor, Copilot, Aider, or a custom setup.


The Problem a Team Solves

Here's the situation that should feel familiar. You have a project — a SaaS app, a data pipeline, an internal tool. You use an AI assistant for development. Some sessions you need it to research an approach. Other sessions you need it to implement. Sometimes you need it to review what it built last week.

The agent can do all of these things. But it can't do them well at the same time, and it can't remember what it learned across sessions.

Research requires breadth — exploring options, reading documentation, comparing approaches. Implementation requires depth — focused execution within constraints already decided. Review requires distance — evaluating work against standards without the bias of having just written it. When you ask one agent to do all three, it conflates them. The research phase bleeds into premature implementation. The implementation ignores what the research found. The review is toothless because the agent doesn't want to criticise its own work.

Separate agents with separate roles fix this. Not because the AI is different — it's the same model. Because the context is different. Each agent reads different instructions, carries different memory, and approaches the work from a different angle.


Step 1: Set Up the Project Structure

Create this directory structure in your project root:

```
agents/
├── team-lead/
│   ├── brief.md
│   ├── memory.md
│   └── scratchpad.md
├── researcher/
│   ├── brief.md
│   ├── memory.md
│   └── scratchpad.md
└── shared/
    └── project-context.md
CLAUDE.md
```

```bash
mkdir -p agents/team-lead agents/researcher agents/shared
touch agents/team-lead/brief.md agents/team-lead/memory.md agents/team-lead/scratchpad.md
touch agents/researcher/brief.md agents/researcher/memory.md agents/researcher/scratchpad.md
touch agents/shared/project-context.md
touch CLAUDE.md
```

Every agent gets its own directory. No agent writes to another agent's directory. Shared context lives in agents/shared/. This isn't arbitrary tidiness — the directory structure is the information architecture. If you can't tell what an agent does from its directory contents, the role isn't clear enough.
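Since the directory structure is the contract, it's worth checking mechanically. A minimal sketch of such a check (a hypothetical helper, not part of the article's setup; file names follow the structure above):

```python
from pathlib import Path

# The three files every agent directory must contain.
REQUIRED = ("brief.md", "memory.md", "scratchpad.md")

def check_layout(root="agents"):
    """Return a list of problems with the agent directory layout.

    An empty list means every agent directory has its brief, memory,
    and scratchpad, and the shared context file exists.
    """
    problems = []
    root = Path(root)
    for agent_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        if agent_dir.name == "shared":
            continue  # shared/ holds context, not an agent
        for name in REQUIRED:
            if not (agent_dir / name).exists():
                problems.append(f"{agent_dir.name}: missing {name}")
    if not (root / "shared" / "project-context.md").exists():
        problems.append("shared: missing project-context.md")
    return problems
```

Running it after `mkdir`/`touch` (or in CI) catches a half-created agent before a session silently runs without its brief.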


Step 2: Write the Shared Context

The shared context file grounds every agent in the same reality. Write it once; every agent's brief references it.

agents/shared/project-context.md

```markdown
# Project Context

## What This Is
[Your project name] — a [what it does] serving [who it serves].

## Current State
- Stage: [MVP / growth / mature]
- Users: [number, if known]
- Tech stack: [languages, frameworks, infrastructure]
- Key constraint: [the thing that most shapes decisions right now]

## Architecture
[2-3 sentences on how the system is structured. Not a full architecture doc —
just enough that an agent can reason about where things live and why.]

## Active Problems
- [The thing you're actually working on this week]
- [The second thing, if there is one]
```

Fill this in with real numbers and real constraints. Agents without business context make plausible-sounding recommendations that don't fit your actual situation. A content strategist that doesn't know the company has 50 users will plan for scale it doesn't have. A researcher that doesn't know you're on Postgres will evaluate MongoDB solutions.


Step 3: Write the Team Lead Brief

The team lead is a router, not a manager. It reads all agent states, decides who runs next, and provides direction. It does NOT duplicate analysis or tell other agents how to think.

agents/team-lead/brief.md

```markdown
# Team Lead — Agent Brief

## Context
Read `agents/shared/project-context.md` for full project context.

## Role
You are the team lead for [project name]. You coordinate a team of AI agents,
each with a defined role and its own persistent memory.

Your job is to assess the current state of the project, decide which agent
should run next, and provide direction for that agent's session. You maintain
strategic priorities and track progress across sessions.

You have authority to:
- Decide which agent runs next and what it works on
- Update strategic priorities based on new information
- Propose changes to team composition (adding or retiring agents)

You do NOT:
- Do research yourself — that's the researcher's job
- Write implementation code — that's for implementation agents
- Duplicate analysis that another agent has already done
- Make irreversible decisions (deploying, publishing) without human review

## Starting Intelligence
- Read `agents/shared/project-context.md` — project context and constraints
- Read `agents/researcher/memory.md` — current state of research efforts
- Check your own `memory.md` for priorities and recent decisions

## Approach
Start each session by reading memory and assessing state. What's changed?
What's the highest-priority open question? Which agent is best positioned
to make progress on it?

When the human gives a specific direction, route it to the right agent.
When the human says "continue" or gives no direction, identify the most
important next step and run it.

Keep your own memory thin. You track routing state — who ran last, what
they found, what's next. You don't carry detailed analysis. That lives
in the specialist agents' files.

## What Good Looks Like
The team makes progress every session. No agent sits idle while important
work waits. No two agents duplicate effort. The human can check your
memory.md at any time and understand where the project stands.

## Memory Protocol
- `memory.md` — current priorities, agent states, next actions. Under 200 lines.
- `scratchpad.md` — session workspace, cleared at start of each session.
- Session start: read memory.md, read each agent's memory.md
- Session end: update memory.md with decisions made and next actions
```

Notice what's NOT in this brief: no step-by-step session scripts, no tool-calling sequences, no worked examples of good output. The brief sets the game board. The agent figures out how to play. Over-specified processes produce brittle agents that fail when conditions change.

The "you do NOT" section is the most important part. Without explicit boundaries, agents drift into adjacent domains within 2-3 sessions. A team lead told to "coordinate" will start doing research, writing code, and making strategic decisions that should be distributed across the team.


Step 4: Write the Specialist Brief

The researcher handles investigation — exploring approaches, reading docs, evaluating options. It produces structured findings. It doesn't decide what to do with them.

agents/researcher/brief.md

```markdown
# Researcher — Agent Brief

## Context
Read `agents/shared/project-context.md` for full project context.

## Role
You are the researcher for [project name]. You investigate technical questions,
evaluate approaches, and produce structured findings for the team lead to
act on.

You have authority to:
- Choose which sources to consult and how deep to go
- Assess confidence levels in your findings
- Recommend approaches based on your research

You do NOT:
- Make strategic decisions — you present findings, the team lead decides
- Write implementation code — you research approaches, others implement
- Start investigating new topics without direction from the team lead

## Starting Intelligence
- Read `agents/shared/project-context.md` — project context and constraints
- Read your own `memory.md` for ongoing research threads
- Check `agents/team-lead/memory.md` for current priorities and your assignments

## Approach
Research with a clear question in mind. State the question explicitly at
the start of each investigation. Explore multiple approaches before
recommending one. Flag your confidence level: high (tested/verified),
medium (well-sourced but untested), low (informed speculation).

Structure findings so the team lead can make a decision without re-doing
the research. Lead with the recommendation, then the evidence.

## What Good Looks Like
Your findings resolve open questions. The team lead reads your output and
can make a decision. You don't produce "here are 12 options" dumps — you
produce "here's what I'd do and why, with alternatives if the constraints
change."

## Memory Protocol
- `memory.md` — active research threads, key findings, open questions.
  Under 200 lines.
- `scratchpad.md` — session workspace, cleared at start of each session.
- Session start: read memory.md, check team lead's memory for assignments
- Session end: update memory.md with findings and open threads
- When a research thread is complete, archive the detail to a topic file
  in your directory. Keep only the conclusion in memory.md.
```

The role boundaries between team lead and researcher are doing real work here. The researcher doesn't decide what to investigate — it gets direction. The team lead doesn't do research — it reads findings. This separation means each agent can go deep in its domain without stepping on the other.


Step 5: Set Up Three-Tier Memory

Each agent gets three tiers of memory. This isn't optional — it's the difference between an agent team that learns and one that starts from zero every session.

Hot tier: memory.md — loaded every session. Under 200 lines, always. This is the information the agent can't function without. Current priorities, recent decisions, next actions. The 200-line limit forces discipline. Without it, memory files grow unbounded until the agent is context-stuffing itself into confusion.

Warm tier: topic files and scratchpad.md — not loaded by default, but the agent knows where to find them. The scratchpad is cleared each session; it's workspace for in-progress thinking. Topic files persist — structured research, analysis, reference material pulled when relevant.

Cold tier: archive files — historical records. Monthly summaries. The agent touches these only when investigating something specific.

Initialise memory for both agents:

agents/team-lead/memory.md

```markdown
# Team Lead — Memory

## Current Priorities
1. [Your most important current objective]

## Agent States
| Agent | Last Run | Status | Key Finding |
|-------|----------|--------|-------------|
| Researcher | — | Not yet run | — |

## Next Actions
- Run researcher on: [first research question]

## Recent Decisions
[None yet — first session]
```

agents/researcher/memory.md

```markdown
# Researcher — Memory

## Active Research Threads
[None yet — awaiting first assignment from team lead]

## Key Findings
[None yet]

## Open Questions
[None yet]
```

After each session, the agent updates its memory. At session end, everything in the scratchpad gets triaged: promote to hot (next session needs it), promote to warm (enduring reference), archive to cold (historical record), or discard (the default — most session work doesn't need to persist).
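The mechanical parts of this protocol (wipe the scratchpad, enforce the 200-line cap) can be scripted. A minimal sketch, assuming the file layout above; the function name and warning format are hypothetical, and the triage decisions themselves stay with the agent:

```python
from pathlib import Path

HOT_LINE_LIMIT = 200  # the cap from the memory protocol above

def end_session(agent_dir):
    """Clear the scratchpad and flag hot memory that exceeds the cap.

    Triage (promote to hot/warm, archive, discard) is a judgment call
    the agent makes in-session; this only enforces the mechanics.
    Returns a warning string if memory.md is over the limit, else None.
    """
    agent = Path(agent_dir)
    # Scratchpad is per-session workspace: wipe it for the next session.
    (agent / "scratchpad.md").write_text("")
    lines = (agent / "memory.md").read_text().splitlines()
    if len(lines) > HOT_LINE_LIMIT:
        return f"{agent.name}: memory.md is {len(lines)} lines (limit {HOT_LINE_LIMIT})"
    return None
```

A warning rather than a hard failure fits the intent: the limit is a discipline-forcing signal for the next session's triage, not a crash condition.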


Step 6: Wire Up the Coordination

The CLAUDE.md file (or equivalent for your tool) is the entry point. It tells the runtime which agent to load and how the team is structured.

CLAUDE.md

```markdown
# [Project Name] — Agent Team

## How This Works
This project uses an AI agent team. Each agent has a defined role, its own
brief, and persistent memory across sessions.

## Team Structure
- **Team Lead** (`agents/team-lead/brief.md`) — routes work, tracks priorities
- **Researcher** (`agents/researcher/brief.md`) — investigates questions,
  produces findings

## Session Protocol
1. Read the relevant agent's brief
2. Read that agent's `memory.md`
3. Do the work
4. Update `memory.md` with findings and next actions
5. Clear `scratchpad.md`

## Information Flow
- Team lead reads: all agent memory files, shared context
- Researcher reads: own memory, team lead's memory (for assignments),
  shared context
- Each agent writes only to its own directory
```

The information flow section is the wiring diagram. Each agent knows its readers (from its brief) and each reader knows where to look. Nobody guesses. This eliminates the most common coordination failure: agents producing work that nobody reads, or reading stale information from the wrong place.
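The write rule is simple enough to check in code. A sketch of a guard for the "each agent writes only to its own directory" rule (a hypothetical helper, not something the article's setup requires):

```python
from pathlib import Path

def may_write(agent_name, target_path, root="agents"):
    """Return True only if the write obeys the flow rules above:
    an agent writes solely inside its own directory."""
    target = Path(target_path).resolve()
    own = (Path(root) / agent_name).resolve()
    # Allowed if the target is the agent's directory or anything under it.
    return own == target or own in target.parents
```

Resolving both paths before comparing means tricks like `agents/researcher/../team-lead/memory.md` are caught too.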


Step 7: Run Your First Session

Start a session with the team lead. Give it a real question — something you've been thinking about for your project.

Example first prompt:

Read your brief and memory. The first priority is to investigate [your question — e.g., "whether we should migrate from REST to GraphQL for the mobile client"]. Assign this to the researcher with clear direction on what we need to know.

The team lead will:

  1. Read its brief and memory
  2. Understand the question
  3. Write a research assignment (updating its own memory with the assignment)

Then start a new session for the researcher:

Read your brief and memory. Check the team lead's memory for your assignment. Do the research.

The researcher will:

  1. Read its brief and the team lead's memory
  2. Pick up the assignment
  3. Do the research
  4. Write findings to its scratchpad, then consolidate to memory

Then return to the team lead:

Read your brief and memory. The researcher has completed their investigation. Review their findings and decide next steps.

This is the basic loop: team lead directs, specialist executes, team lead reviews and routes. Two agents, clear roles, persistent state.
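Once the loop stabilises, the prompts themselves can be generated. A minimal sketch of the prompt construction (the wording mirrors the example prompts above; the headless `claude -p` invocation in the comment is an assumption to verify against your tool's documentation):

```python
def session_prompt(agent, direction=""):
    """Build the standard session-opening prompt for an agent.

    `direction` carries the human's instruction for this session,
    e.g. a research question or "Do the research."
    """
    base = "Read your brief and memory."
    if agent == "researcher":
        # The researcher also picks up its assignment from the team lead.
        base += " Check the team lead's memory for your assignment."
    return f"{base} {direction}".strip()

# A session could then be launched headlessly, for example
# (assumption -- check your tool's CLI before relying on this):
#   subprocess.run(["claude", "-p", session_prompt("researcher", "Do the research.")])
```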


Step 8: Watch What the Team Learns

After your first full cycle, check the memory files. This is where it gets interesting.

The team lead's memory now has a real decision logged, a research summary, and a concrete next action. The researcher's memory has a completed thread and the methodology it used.

Next session, neither agent starts from zero. The team lead remembers what was decided. The researcher remembers what was found. The conversation picks up where it left off, not where it started.

After 2-3 cycles, you'll notice something else: the agents start identifying things you didn't ask about. The researcher flags a related concern it noticed during investigation. The team lead notices a pattern across sessions. This is the team learning — not because of any magic, but because persistent memory plus defined roles creates compounding context.

You'll also notice where your briefs are wrong. Maybe the researcher keeps making strategic recommendations you didn't ask for — the boundaries need tightening. Maybe the team lead is too thin and you're losing context between sessions — the memory needs restructuring. This is expected. Your first briefs won't be perfect. The point is that you can see what's wrong and fix it, because the roles and memory are explicit, not hidden in a conversation history you can't inspect.


What to Do Next

You have a working two-agent team. Here's where to go from here, roughly in order of value:

Add a second specialist. When you notice the researcher handling two types of work that need different perspectives — say, technical investigation and codebase analysis — that's a signal to split the role. The test: has this type of work recurred across 3+ sessions, and would it benefit from its own memory and perspective?

Introduce self-modification. Give agents permission to update their own briefs when they discover something isn't working. Add a constraint: every change must be documented with reasoning. This is the difference between an agent team that executes and one that learns. An agent that notices its boundaries are too loose can tighten them. One that discovers a missing responsibility can add it. The brief improves every session.

Build the warm tier. As research accumulates, create topic files in each agent's directory. A researcher that can pull its own previous analysis of authentication approaches — without loading it every session — makes better recommendations than one working from a blank slate.

Add a journal. A journal/ directory where the team lead writes a brief entry each session. What happened, what was decided, what was surprising. This becomes the cold tier — historical record you search when you need to understand why a decision was made three weeks ago.

None of this requires a framework, a platform, or a subscription. It's files, directories, and well-written briefs. The patterns are simple. The value is in the discipline of maintaining them.


The methodology behind these patterns — three-tier memory, role design with negative rights, file-based coordination, self-modifying briefs — comes from running agent teams in production. Each pattern was learned the hard way: by watching what breaks when you don't have it.
