AI coding agents can't fit a large codebase in context. Ask one to audit 800 files and it reads some and skips the rest. I've tried Claude, ChatGPT, and Deep Research for refactoring, type safety, and architecture audits. Decent answers every time, but I could always find things they missed just by reading the code myself.
Fan-out auditing trades compute for thoroughness: many parallel agents, each with a tiny scope.
The idea
Instead of one agent doing a shallow pass on 500 files, you launch 200 agents each doing a deep pass on 5-8 files. Model performance degrades as input size grows: an agent reading 5 files will catch things an agent reading 500 won't.
How it works
The prompt is a Claude Code slash command (~300 lines of markdown). The orchestrator:
- Runs one grep to find relevant files
- Groups files into slices of 5-8, keeping same-directory files together
- Shows the slice plan and waits for confirmation
- Launches agents in batches of 10, each writing findings to its own .md file in the repo
- After all phase 1 agents complete, launches phase 2 agents that read ~12 phase 1 files each and identify cross-cutting patterns
- Writes a final synthesis from the phase 1 and phase 2 files
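The slicing step is the only mildly fiddly part. Here's a minimal sketch of how the orchestrator's grouping rule could work; the real prompt does this in markdown instructions rather than code, and the function name and flush heuristic here are my own assumptions:

```python
from collections import defaultdict

def make_slices(paths, min_size=5, max_size=8):
    """Group file paths into slices of 5-8 files, keeping
    same-directory files together where possible."""
    by_dir = defaultdict(list)
    for p in sorted(paths):
        by_dir[p.rsplit("/", 1)[0] if "/" in p else "."].append(p)

    slices, current = [], []
    for files in by_dir.values():
        for f in files:
            current.append(f)
            if len(current) >= max_size:
                slices.append(current)
                current = []
        # flush a directory's remainder once it reaches the minimum,
        # so slices rarely straddle directory boundaries
        if len(current) >= min_size:
            slices.append(current)
            current = []
    if current:
        slices.append(current)
    return slices
```

Keeping same-directory files in one slice matters because related code (a module and its helpers) tends to live together, and an agent that sees all of it produces better findings than one that sees fragments.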
Why small batches matter
This is the core insight. AI performance degrades with input size. Give an agent 5 files and it reads every line, considers each one against the criteria, and produces specific findings with line numbers. Give it 50 files and it starts pattern-matching on file names, skips files that "look fine," and produces vague observations.
The fan-out pattern forces thoroughness by keeping each agent's scope small enough that it has no excuse to skim.
Why every agent writes to a file
The first version of this used agents that returned findings in their response, and the orchestrator summarized them. That's lossy. An agent finds 8 specific issues with line numbers, the orchestrator compresses them to "several type safety issues found," and the detail is gone.
When every agent writes to its own .md file in the repo, nothing gets lost. The orchestrator synthesizes from files, not from compressed return values in context. You can also watch the files appear in real time in your editor, which is useful for long runs.
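The file-per-agent convention is simple to sketch. This is an illustrative stand-in, not the repo's actual code; the `audit-output` directory name and function names are assumptions:

```python
from pathlib import Path

OUT_DIR = Path("audit-output")  # hypothetical output directory

def write_findings(slice_id: int, findings: list) -> Path:
    """Persist one agent's findings to its own markdown file,
    so detail is never lost to summarization in the orchestrator."""
    OUT_DIR.mkdir(exist_ok=True)
    path = OUT_DIR / f"slice-{slice_id:03d}.md"
    body = f"# Slice {slice_id}\n\n" + "\n".join(f"- {f}" for f in findings)
    path.write_text(body)
    return path

def collect_findings() -> list:
    """The orchestrator synthesizes from files on disk,
    not from compressed return values in its context window."""
    return [p.read_text() for p in sorted(OUT_DIR.glob("slice-*.md"))]
```

Zero-padded slice IDs keep the files sorted in the editor, which is what makes the watch-them-appear-in-real-time workflow pleasant.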
Phase 2: cross-cutting patterns
Individual agents can only see their slice. If 9 different slices all have the same issue, no single agent knows that. Phase 2 agents each read ~12 phase 1 output files and look for patterns that span multiple slices. This is where findings like "this same function is reimplemented in 8 modules" emerge.
Phase 1 runs on Sonnet (file reading is straightforward). Phase 2 runs on Opus (reasoning across 12 reports to spot non-obvious patterns is harder).
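Batching the phase 1 reports for phase 2 is the same grouping idea one level up. A minimal sketch, assuming a batch size of ~12 as described above:

```python
def phase2_batches(report_paths, batch_size=12):
    """Split phase 1 report files into groups of ~12; each phase 2
    agent reads one group and looks for patterns spanning slices."""
    reports = sorted(report_paths)
    return [reports[i:i + batch_size]
            for i in range(0, len(reports), batch_size)]
```

For the 201-slice run described below, this yields 17 phase 2 agents, each reasoning over a dozen reports rather than one orchestrator reasoning over all 201.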
What it's good for
I've used the same pattern for:
- Copy audits: checking user-facing text against a style guide or tropes list
- Refactoring: finding duplicated logic, consolidation opportunities, dead code
- Selling point discovery: reading every file to find features worth marketing
- Architecture audits: checking module boundaries, dependency violations, pattern compliance
You swap the reference document and the pre-filter grep. The fan-out mechanics stay the same.
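Conceptually, each audit type is just a pair of knobs on the same machinery. The file names and grep patterns below are hypothetical examples, not what the actual prompt ships with:

```python
# Hypothetical per-audit configuration; the fan-out mechanics are shared.
AUDITS = {
    "copy": {
        "reference": "docs/style-guide.md",          # what to check against
        "pregrep": r"aria-label|placeholder|title=", # user-facing strings
    },
    "refactor": {
        "reference": "docs/refactoring-criteria.md",
        "pregrep": r"\.(ts|tsx|py)$",                # all source files
    },
}

def audit_config(kind: str) -> dict:
    return AUDITS[kind]
```

Everything downstream of the pre-filter (slicing, agent launch, per-file output, phase 2 synthesis) is identical across audit types.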
What it doesn't fix
A human who knows the codebase inside out will still catch things this misses. The AI still can't reason about high-level architecture decisions or understand business context that isn't in the code. But the difference between "AI read some files and gave vague observations" and "AI read every file and gave findings with line numbers" is worth the 29 minutes.
The test run
I used it to check all user-facing text in my product (SlopWeaver, ~800 source files) against tropes.fyi (a catalog of AI writing tells).
Results: 201 slices, 809 files inspected, 220 output files, 180+ findings.
It scales to any repo size. A repo with 10,000 files would produce more slices and take longer, but the same prompt works.
Stack: Claude Code, Claude Sonnet 4.6 (phase 1), Claude Opus 4.6 (phase 2).
The prompt is open source: github.com/lachiejames/fan-out-audit
One markdown file, drop it into .claude/commands/. The repo includes the full output from this audit (all 220 files) so you can browse what it produces before running it.