Daniel Westgaard

Posted on • Originally published at riftmap.dev

You don't need a virtual monorepo. You need a graph.

In the past few weeks, two engineers have published some of the most concrete writeups I've seen of how to give AI coding agents context across more than one repository: Owen Zanzal's The Virtual Monorepo Pattern (March 23, 2026) and Rafferty Uy's Repo-of-Repos (May 2, 2026). Both describe the same shape of solution. Bundle your repos into a workspace folder. Hand the agent a system map. Ship.

Both posts are good engineering. They're worth reading carefully. They also describe the strongest version of an approach that stops working at exactly the scale where the problem starts to matter, and the failure mode is the kind that's painful to retrofit.

What follows is an argument with the pattern, not with the authors. Both posts are the kind of thing I want more of: specific, shipped, written from real use. The disagreement is about where the pattern's ceiling is, and what's worth building above it.

This is the third post in a loose series. The first looked at Meta's tribal knowledge engine and the structural-vs-semantic decomposition hidden inside it. The second walked through three teams converging on cross-repo context as runtime infrastructure. This one is about the alternative response, why I think it's a stopgap rather than a destination, and the asymmetric upgrade path that makes the difference matter.

What the workspace pattern gets right

Owen Zanzal's piece is the cleanest articulation. Three files in a workspace directory:

  1. .repos, a bash script that clones every relevant repo into a structured local folder.
  2. CLAUDE.md, a hand-written system map describing how services relate to each other.
  3. README.md, a deeper narrative for why the system is structured the way it is.
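For illustration, here's a minimal sketch of what the clone-or-pull step behind a .repos file might look like. The org, repo names, and workspace path are invented, and Zanzal's actual script is bash; this sketch uses Python only to stay consistent with the other examples in this post.

```python
import subprocess
from pathlib import Path

# Hypothetical org and repo list; the real .repos script enumerates every
# repo the agent should see.
ORG = "git@github.com:example-org"
REPOS = ["payments-api", "payments-worker", "shared-schemas"]
WORKSPACE = Path("workspace/repos")

def git_command(repo: str, workspace: Path) -> list[str]:
    """Pull if the repo is already cloned locally, otherwise clone it fresh."""
    dest = workspace / repo
    if dest.exists():
        return ["git", "-C", str(dest), "pull", "--ff-only"]
    return ["git", "clone", f"{ORG}/{repo}.git", str(dest)]

def sync_workspace() -> None:
    """Bring every repo in the list into the workspace folder."""
    WORKSPACE.mkdir(parents=True, exist_ok=True)
    for repo in REPOS:
        subprocess.run(git_command(repo, WORKSPACE), check=True)
```

The whole pattern really is this small, which is part of its appeal: one script, two markdown files, no infrastructure.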

He calls this the virtual monorepo pattern. The thesis line is the part worth taking seriously: "the problem isn't repo structure. The problem is context visibility. Your AI assistant doesn't need to live in a monorepo. It just needs to see one. You don't need a monorepo. You need a monorepo view."

I think that's a correct assessment, and it's the right diagnosis for most teams in this position. They do not actually want to migrate to a monorepo. The migration cost is real, the political friction across teams is real, and the CI/CD changes are real. The workspace pattern delivers a wider context window without paying any of those costs. It's purely additive. It exists alongside the existing repo structure. Nothing breaks.

Rafferty Uy's repo-of-repos pattern, published a week ago, is a more polished version of the same idea, named after the agent (Tony) it powers. An outer "agent" repo pulls in all related repos under a repos/ folder as workspace folders, while commits still flow back to each underlying repo's own origin. He ships it as a GitHub template. He explicitly names three things that have changed in the last year that make the pattern work right now: context engineering is now a discipline, context windows got significantly larger, and "agents grep before they answer."

I want to single out the second of those because it's the load-bearing assumption of the entire pattern, and the rest of this post is going to push on it.

The load-bearing assumption

The workspace pattern works as long as two things hold simultaneously:

  1. The bundled workspace fits in the agent's context window.
  2. Grepping over that workspace is cheap enough that the agent reaches for it routinely.

Both of those hold for Owen Zanzal at 35 repos. They hold for Rafferty Uy at the size he describes. They will hold for many readers of this post. However, I don't think they hold at 200 repos, and the failure isn't gradual.

Frontier model context windows are around a million tokens at the upper end. A typical mid-sized service repository runs to several hundred thousand lines of code, much of it dependencies and tests. Bundle 200 of them and you don't fit. You don't fit in the next generation either, because organisations grow repos faster than frontier labs ship context length increases.

Even when the workspace fits in the window, the workspace doesn't load itself into the prompt. Agents grep before they answer, in Rafferty's framing. Grep is O(N) over workspace size. Asking "what depends on X" by grepping 200 repos costs tokens, time, and latency that don't scale, and the answer is fuzzy at best. Meta's published number on this is the cleanest data point in the public record: a graph lookup answers "what depends on X" in around 200 tokens; the same question by exploration costs around 6,000. That's a 30x reduction. It isn't a frontier-model problem. It's an architecture problem.
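To make that asymmetry concrete, here's a toy sketch (all service names invented) of the data structure behind the cheap answer: invert the dependency edges once, on push, and every "what depends on X" question afterwards is a dictionary lookup rather than a scan.

```python
from collections import defaultdict

# Toy edge list: service -> services it depends on (hypothetical names).
DEPS = {
    "checkout": ["payments-api", "shared-schemas"],
    "billing": ["payments-api"],
    "payments-api": ["shared-schemas"],
}

# Invert the edges once, e.g. on every push -- not once per question.
REVERSE = defaultdict(set)
for src, targets in DEPS.items():
    for target in targets:
        REVERSE[target].add(src)

def dependents_of(service: str) -> set[str]:
    """Direct consumers: a dictionary lookup, not a grep over 200 repos."""
    return set(REVERSE.get(service, set()))

def blast_radius(service: str) -> set[str]:
    """Everything transitively affected by a change to `service`."""
    seen, stack = set(), [service]
    while stack:
        for dep in REVERSE.get(stack.pop(), set()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen
```

The inversion is the whole trick: the O(N) work happens once per push, off the agent's token budget, instead of once per question inside it.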

There's also a third issue, smaller in any single session but compounding over time. The CLAUDE.md system map at the heart of the workspace pattern is a hand-written context file. The Gloaguen et al. paper from ETH Zurich and LogicStar.ai (February 2026) studied this directly across four coding agents on 138 niche-repository tasks plus SWE-bench Lite. Their finding: developer-written context files give a marginal +4% improvement in agent success rate, at +19% inference cost. LLM-generated context files give a -3% effect at +20% cost. Across the board, "context files do not provide effective overviews." Agents take the same number of steps to find relevant files whether a context file is present or not.

That doesn't make CLAUDE.md useless. The +4% case is real. But it caps the upside of the pattern, and it makes maintenance non-negotiable. Owen Zanzal himself names this cost explicitly: "Keeping .repos in sync. New repos need to be added to the script. Repos that change names or move need to be updated. This is low-friction but not zero-friction." That's true at 35 repos. At 200, with five new repos a quarter, it is a part-time job nobody's job description includes.

How the pattern actually breaks

I want to be specific about how this breaks, because abstract failure modes don't land. Five concrete things go wrong as you scale.

One. The bundled workspace exceeds the context window. The agent silently truncates or the editor refuses to load the folder. You can mitigate this with multiple smaller workspaces scoped to different domains, which Owen Zanzal recommends as a workaround, but you've now manually re-implemented domain partitioning that the dependency graph would have given you for free.

Two. Grepping a 200-repo workspace becomes a 30x token tax on every "what depends on X" query. This isn't theoretical. Even Stripe, which has a real monorepo with hundreds of millions of lines of mostly Ruby code, doesn't dump everything into the agent's context. Their published architecture for the Minions agent system uses directory-scoped rule files that attach automatically as the agent traverses the filesystem, "rather than a single global context dump that would overflow any model's window." If Stripe at monorepo scale can't dump it all, neither can your virtual monorepo at 200 repos.
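The directory-scoped idea itself fits in a few lines. The rule-file name below is invented, since Stripe hasn't published its convention; the point is only that the agent loads the rule files on the path from the repo root down to the file it's touching, and nothing else.

```python
from pathlib import Path

RULE_FILE = "AGENT_RULES.md"  # hypothetical filename, not Stripe's actual convention

def rules_for(target: Path, root: Path) -> list[Path]:
    """Rule files from the repo root down to the target file's directory,
    in order. Context attaches as the agent traverses; no global dump."""
    rules = []
    current = root
    for part in target.relative_to(root).parent.parts:
        candidate = current / RULE_FILE
        if candidate.exists():
            rules.append(candidate)
        current = current / part
    if (current / RULE_FILE).exists():
        rules.append(current / RULE_FILE)
    return rules
```

Scoping by directory means the context an agent carries grows with the depth of where it's working, not with the size of the whole workspace.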

Three. CLAUDE.md decays. New service added, old service deprecated, schema changes, ownership boundaries shift. The system map drifts from reality. By the time someone notices, the agent has been confidently shipping changes against a stale model of the system. This is the failure mode Meta's engineering team named in their April 6 post on tribal knowledge: "context that decays is worse than no context." It's the reason their system uses a self-refreshing critic swarm. The workspace pattern has no equivalent, because nothing's running against it.

Four. Workspace gives the agent read access. It does not give the agent dependency awareness. The agent can grep through every repo in the workspace and still ship a change to the API repo that breaks the consumer it never opened that session. The cross-repo dependency graph is exactly the data structure that closes that gap, and grepping the workspace doesn't reconstruct it.

Five. You can't bolt the substrate on cheaply once the workspace pattern has settled in. Once teams are committing CLAUDE.md updates, scripts depend on .repos layout, and onboarding tells new engineers "open the agent workspace," migrating to a queryable graph is a second project, not a refinement of the first. The substrate has to exist before you need it.

The point both authors gesture toward but don't reach

Here's what surprised me when I read Owen Zanzal's piece carefully. His final section is called "Where This Pattern Can Go." It reads:

Auto-generating CLAUDE.md from service metadata, API contracts, and event schemas, keeping the system map current as the system changes.

Dependency graphs derived from import analysis, Terraform dependency trees, or event topic mappings, and feeding those into the AI context automatically.

Architecture doc generation as a CI artifact, so the README stays in sync with the actual system rather than drifting toward fiction.

That is the substrate. He has named it precisely. He has written the bullet points for what comes next. The only thing he hasn't done is build it.

This is the pattern with most workspace posts I've read. The author solves the immediate problem at their current scale, lists "auto-generated dependency graph" as future work, and ships. Rafferty Uy's repo-of-repos template assumes the agent grepping the workspace is the operating model and never asks at what scale grepping stops being affordable. Both posts are written for the scale where the workspace is the right level of investment. The substrate question opens up at a different scale, and neither author was writing at that scale.

I think the better read is: the workspace pattern is the stopgap, and the substrate is what it grows into. Specifically, the auto-generated dependency graph bullet at the end of Owen Zanzal's post is doing the load-bearing work. Without it, the pattern hits the maintenance ceiling somewhere between 50 and 100 repos. With it, the maintenance ceiling moves from "engineer with a weekend" to "platform that auto-discovers."

That second thing is what Riftmap exposes today. It's what Mabl built by hand into their 850-line Repo Coordination Graph in Part 1 and now extended to 100+ repositories with explicit blast-radius-for-tech-debt framing in Part 2. It's the cross-repo dependency index Meta produced as a byproduct of their tribal knowledge engine. It's what Harness, the CD vendor, called the "knowledge graph" that lets you see "the blast radius of every change before it merges" in their April 1 essay on Source Context Management. Four independent teams have built it or named it as a primitive in the past six weeks. The workspace pattern is the thing you ship while you wait for that primitive to exist for your stack. If it already exists, the pattern is the alternative bet, not the better one.

Stripe is the proof

If I had to pick one piece of evidence that the workspace pattern alone doesn't carry you, it's Stripe.

Stripe runs an internal coding agent system called Minions. As of February 2026, Minions produced over 1,300 merged pull requests per week, all written entirely by AI, all reviewed by humans. They run against Stripe's actual codebase, which is hundreds of millions of lines of mostly Ruby. Stripe is not 35 repos. They are the maximum case for what "monorepo plus large model" can do.

And it isn't enough. Stripe built an MCP server called Toolshed that exposes nearly 500 internal tools to Minions. They built directory-scoped rule files instead of a global context dump because, in their own published words, the global dump "would overflow any model's window." They built pre-warmed devboxes, deterministic verification gates, and a custom fork of Block's Goose agent. Signadot's piece Coding Agents Are Only as Good as the Signals You Feed Them puts it cleanly: "Stripe did not achieve this volume simply by pointing a large language model at its monorepo."

If the team with the world's most demanding monorepo can't simply bundle everything and let the agent figure it out, the bet that a virtual monorepo will solve it for the rest of us might be structurally unsound. The substrate is what you build because the workspace alone, even when the workspace is real, isn't where the leverage lives.

The asymmetric upgrade path

Here's how I think about the buy/build decision for teams in this range.

At 30 to 50 repos with one engineer who has a weekend. The workspace pattern is genuinely fine. Build it. Owen Zanzal's three files will get you most of the way. Rafferty Uy's repo-of-repos template will get you the rest. Don't overthink it.

At 100+ repos and growing. The workspace pattern is a maintenance bet you're going to lose. CLAUDE.md will decay, .repos will drift, and the agent will gain false confidence in stale information. The fix is structural, not procedural. The dependency graph derived from source code, refreshed on every push, is what closes the gap, and you want it before you need it.
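A minimal sketch of what "derived from source code, refreshed on every push" can mean in practice, assuming Python repos and an invented mapping from top-level packages to the repos that own them (real systems would cover more languages and discover the ownership map too):

```python
import ast
from pathlib import Path

def imported_packages(repo_path: Path) -> set[str]:
    """Top-level packages a repo imports, recomputed from code, not prose."""
    packages = set()
    for source in repo_path.rglob("*.py"):
        try:
            tree = ast.parse(source.read_text())
        except SyntaxError:
            continue  # skip files the parser can't read
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                packages.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                packages.add(node.module.split(".")[0])
    return packages

def cross_repo_edges(repos: dict[str, Path],
                     owner_of: dict[str, str]) -> set[tuple[str, str]]:
    """Edges (consumer, producer) wherever one repo imports a package
    another repo owns. `owner_of` maps package name -> owning repo."""
    edges = set()
    for repo, path in repos.items():
        for pkg in imported_packages(path):
            producer = owner_of.get(pkg)
            if producer and producer != repo:
                edges.add((repo, producer))
    return edges
```

Run that in CI on every push and the edge list can never drift from the code, because it is the code. That's the structural difference from a hand-maintained CLAUDE.md.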

At any scale, if the substrate already exists for your stack. The workspace is the alternative bet, not the better one. If Riftmap or something like it can already auto-discover your dependency graph from a read-only org token, the workspace pattern is solving by hand a problem you no longer have to.

The asymmetry that matters is on the upgrade path. Going from no-workspace to workspace is a weekend. Going from workspace-with-decayed-CLAUDE.md-and-stale-dependency-bullets-in-prose to a queryable graph is a project. The migration tax goes up the longer you wait. "We'll do the graph later" is how teams end up with a context file they don't trust and a dependency model they have to recompute by reading.

Closing

Owen Zanzal's deeper principle from the closing of his post is "context beats structure for AI effectiveness." That's a good line and I agree with it as far as it goes. The structural codebase organisation isn't what determines whether an agent ships safe code; the context the agent has is.

The version of that principle I'd push for is one step further: structure beats context that has to be rewritten by hand. The kind of context that wins is the kind that updates when the code does, by construction, on every push. The workspace pattern delivers context. The substrate delivers context that doesn't decay.

I believe if you're going to invest in either, invest in the one that doesn't need a maintainer.

The architectural bet hasn't changed. Deterministic parsers first. Graph as the durable layer. AI as the layer on top, anchored to verified structure rather than reconstructing it from grep across a workspace that someone has to remember to update.


If you're running AI coding agents across more than a handful of repositories and finding yourself maintaining a CLAUDE.md that's drifting from reality, that's the gap. You can build the substrate yourself, the way Mabl did. Or you can start a free scan and let the parsers find it.


Sources referenced

  • Owen Zanzal, The "Virtual Monorepo" Pattern: How I Gave Claude Code Full-System Context Across 35 Repos — medium.com/devops-ai, March 23, 2026
  • Rafferty Uy, Repo-of-Repos: Tony's Multi-Repo Workspace for AI Coding Agents — raffertyuy.com, May 2, 2026
  • Ompragash Viswanathan, Harness, Your Repo Is a Knowledge Graph. You Just Don't Query It Yet. — harness.io/blog, April 1, 2026
  • Geoff Cooney, mabl, How We Built a System for AI Agents to Ship Real Code Across 75+ Repos (Part 1 of 2) — mabl.com, April 8, 2026
  • mabl, How We Built a System for AI Agents to Ship Real Code Across 75+ Repos (Part 2 of 2) — mabl.com, April 28, 2026
  • Engineering at Meta, How Meta used AI to map tribal knowledge in large-scale data pipelines — engineering.fb.com, April 6, 2026
  • Gloaguen et al., ETH Zurich and LogicStar.ai, Do Context Files Help Coding Agents? — arxiv.org/abs/2602.11988, February 2026
  • Signadot, Coding Agents Are Only as Good as the Signals You Feed Them — thenewstack.io, April 2026
  • Riftmap, AI coding agents need cross-repo context — riftmap.dev/blog, May 12, 2026
  • Riftmap, Meta needed 50+ AI agents to map their tribal knowledge — riftmap.dev/blog, May 8, 2026

Appendix: structured summary

Claim: Two patterns have emerged for giving AI coding agents context across multiple repositories. The workspace pattern (Owen Zanzal, Rafferty Uy) bundles repos into a local workspace and hands the agent a hand-written system map. The substrate pattern (Mabl, Meta, Harness, Riftmap) builds a queryable cross-repo dependency graph derived from source code. Both work at small scale. Only the substrate survives the transition past ~100 repos, because the workspace pattern's load-bearing assumptions (workspace fits the context window, grep is cheap, hand-written context stays current) all break at the same point.

Evidence:

  • Frontier model context windows top out around 1M tokens; 200 mid-sized service repos do not fit, and the next generation of models will not close the gap because organisations add repos faster than context lengths grow.
  • Meta's published data: graph lookup for "what depends on X" costs ~200 tokens; the same answered by exploration costs ~6,000. A 30x architecture-not-model gap.
  • Gloaguen et al. (arXiv:2602.11988): hand-written context files give +4% agent success rate at +19% cost. LLM-generated context files give -3% at +20% cost. Context files do not provide effective overviews.
  • Stripe Minions (1,300+ merged PRs/week, hundreds of millions of LOC): even with a real monorepo, Stripe uses directory-scoped rule files because a global context dump "would overflow any model's window." The Toolshed MCP server exposes ~500 tools to agents.
  • Harness, mabl, Meta, Riftmap (and adjacent: Augment, Depwire, Modulus) have all named the queryable dependency graph as a substrate primitive in the past six weeks.

Architectural takeaway: The workspace pattern is a stopgap with a known maintenance ceiling between 50 and 100 repos. The substrate is what it grows into. Owen Zanzal's "Where This Pattern Can Go" section names auto-generated dependency graphs from import analysis, Terraform trees, and event topic mappings as the next step; the substrate is exactly that, exposed as a queryable graph rather than as bullets in prose. The migration tax goes up the longer you wait, so substrate-first is the correct call for any team above 100 repos and any team that expects to be there within a year.

Audience: Platform engineers, DevOps leads, and engineering managers running AI coding agents across more than a handful of repositories, especially those weighing whether to invest in a virtual-monorepo workspace or in a queryable dependency substrate.



Top comments (4)

Theo Valmis

The ceiling argument is the key one. The workspace pattern works when cross-repo relationships are stable enough to document in a hand-written CLAUDE.md. It breaks when dependencies shift faster than anyone updates the map -- which is exactly the situation where you most need accurate context.

What you're pointing at is that the workspace pattern and the graph pattern solve orthogonal problems. The workspace pattern answers 'how does the agent see multiple repos.' The graph answers 'how does the agent know which relationships between repos are currently true.' Most teams build the workspace layer and call it done, then discover the graph problem when an agent makes a cross-repo change that was valid six months ago but isn't anymore.

Daniel Westgaard

I really think "orthogonal" is the right word and I wish I'd used it in the post. Workspace gives the agent spatial visibility across repos. The graph gives it temporal truth about which relationships currently hold. Different axes, both needed.

The thing that ties them together is that staleness is a property of any map, not just CLAUDE.md. A hand-drawn architecture diagram in Confluence has the same problem: it's a graph, but it's a static one, so it drifts the moment a dependency changes. The 'valid six months ago but isn't anymore' failure mode shows up there too. The real axis isn't workspace versus graph, it's static versus live.

Curious how you've seen teams discover the graph problem in practice? Is it usually the loud version, a cross-repo change that breaks production, or something quieter, like an agent confidently building on a contract that's been deprecated for months?

Theo Valmis

The quieter version seems more common from what I've seen.

Not necessarily a dramatic production outage, but gradual architectural drift: agents or engineers building against contracts, patterns, or dependencies that are technically still reachable but no longer organizationally valid.

The dangerous part is that these failures often look locally correct. The code compiles, tests pass, the dependency still exists somewhere in the graph, but the architectural intent has already shifted.

That's where I think governance starts becoming distinct from topology. The graph can tell you a relationship exists. The harder problem is determining whether that relationship is still allowed, recommended, deprecated, superseded, or violating some newer invariant.

Feels like autonomous systems amplify this because they optimize for reachable structure unless something explicitly constrains them otherwise.

Daniel Westgaard

"Technically reachable but organizationally invalid" names the gap precisely: the graph is correct about what exists, the org's intent has moved on, and nothing in structural topology tells you which side of that line you're on.

This is why I think governance has to live AS edges in the graph rather than alongside it. Deprecated, superseded, owned-by, do-not-build-against; first-class entities, queryable next to the structural deps. CODEOWNERS and @deprecated are repo-local stabs at this. ADRs are the docs-layer stab. Neither crosses repo boundaries.

One thing I'm genuinely unsure about as a builder: do you think these constraints can be inferred from existing signals (commit patterns, ADR mentions, deprecation comments) or do they have to be declared explicitly? My instinct is hybrid; inference to seed, explicit to confirm, but inference quality probably degrades fast on older codebases where the original intent is buried.