I scanned two public GitHub organisations and counted every cross-repository edge between them. The share that is infrastructure rather than code runs from a third to almost all. The share that is a code symbol is zero.
Spend a couple of days reading the launches that landed this spring and you start to notice what they have in common. GitLab shipped Orbit. Sourcegraph rebuilt its homepage around an AI code-review demo. Both are good, and both are graphs of your code. Both sell you on the same kind of question. Who calls this function. What inherits from this interface. What does changing this method signature touch. Every query in every demo is a code query.
There is a question underneath all of it that none of those launches answers. Symbol graphs are excellent. I will say that more than once in this post and mean it every time. But how much of what actually holds a real organisation together are they even looking at? Nobody has put a number on it. So I scanned two real organisations and counted.
Two graphs, and only one of them gets measured
The argument about these graphs has been had. I have had it twice, at length. There is the symbol graph: the functions, classes and imports inside your code, the thing SCIP indexes and Sourcegraph serves at the top of its category. And there is the artifact graph: the base images, Terraform modules, Helm charts and reusable CI templates your repositories share, bound together by edges that were never code. A Dockerfile FROM line. A Terraform source block. A GitLab CI include. Different nodes, different edges, a different parser entirely. I have worked that distinction through in full for Sourcegraph and for GitLab Orbit, and I would rather link those than repeat them.
What none of us did, in any of those posts, was measure the split. The whole debate is about which graph matters. It has been conducted without the one number that would settle the shape of it. In a real organisation, how much of the cross-repo coupling lives in the artifact graph a symbol tool cannot see? That number is not on the open web. I went and got it.
One clarification before the data, because the words collide. "Artifact dependency graph" also means something specific in supply-chain security, where it records what went into building a single binary. That is a different graph pointed in the opposite direction. This post is about the cross-repo sense: what consumes your shared artifacts across the org.
What I scanned, and how I counted
Two public GitHub organisations, chosen to sit at opposite ends on purpose. Prometheus, the monitoring stack, is about as code-centric as an org gets. Go-native, 58 repositories, the kind of place where you would expect the coupling to live in go.mod and nowhere else. Cloud Posse is the opposite by design. Its entire reason to exist is publishing reusable terraform-aws-* modules, so if any organisation's coupling lives in infrastructure, it is that one. I am not hiding that choice. The point of the pair is to bracket the range, and you cannot bracket a range without picking the ends.
Both were scanned on the same day, 2026-06-25, on the same build of Riftmap (v1.6.6, commit 273794e). Prometheus came back with 58 repositories scanned, Cloud Posse with 242.
Here is the rule every number in this post is counted by, stated once so you can hold me to it.
Counts are directly-declared, in-organisation, cross-repository dependency edges from a single fresh full scan of each GitHub org, where both the consuming repo and the producing repo are inside the org and are not the same repo; references in test/ and examples/ trees are excluded. The canonical headline unit is one distinct consumer-repo to producer edge; raw manifest references are reported alongside. The "no symbol-graph representation" claim is scoped to the infrastructure-artifact bucket only (Terraform sources, Docker FROMs, CI uses and includes, Kubernetes images): those edges contain no code symbols or import statements. The language-package bucket (Go, npm, Python) is manifest-declared and partially representable by a symbol indexer via import resolution.
In plain terms. An edge only counts if it stays inside the org and crosses a repository boundary. A dependency that leaves for the public registry does not count. I count what is declared, not the transitive closure underneath it. And the strong claim, the one about edges a symbol graph cannot represent, applies to the infrastructure layer, not to the package edges a symbol index can partly resolve through imports. I will come back to that last point and give the symbol graph its due.
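The rule can be sketched in a few lines of Python. This is an illustration of the counting convention only, not Riftmap's code: the `Reference` record, the field names and the toy repositories are all hypothetical, and keying an edge on the (consumer, producer) pair is my reading of "one distinct consumer-repo to producer edge".

```python
from dataclasses import dataclass

# Hypothetical record shape for one raw manifest reference.
@dataclass(frozen=True)
class Reference:
    consumer: str  # repo that declares the dependency
    producer: str  # repo that publishes the artifact ("" if outside the org)
    path: str      # manifest path inside the consumer repo
    kind: str      # e.g. "terraform_source", "docker_from", "go_mod"

EXCLUDED_DIRS = {"test", "examples"}

def in_excluded_tree(path: str) -> bool:
    """True if any directory component of the path is test/ or examples/."""
    return any(part in EXCLUDED_DIRS for part in path.split("/")[:-1])

def count_edges(refs, org_repos):
    """Return the distinct (consumer, producer) edges that satisfy the rule."""
    edges = set()
    for r in refs:
        if r.consumer not in org_repos or r.producer not in org_repos:
            continue  # one end sits outside the org
        if r.consumer == r.producer:
            continue  # self-reference, not a cross-repo edge
        if in_excluded_tree(r.path):
            continue  # test/ and examples/ trees are excluded
        edges.add((r.consumer, r.producer))
    return edges

# Toy data: six raw references collapse to two edges.
org = {"svc-a", "svc-b", "base-images", "tf-modules"}
refs = [
    Reference("svc-a", "base-images", "Dockerfile", "docker_from"),
    Reference("svc-a", "base-images", "docker/worker/Dockerfile", "docker_from"),
    Reference("svc-a", "svc-a", "go.mod", "go_mod"),
    Reference("svc-b", "base-images", "examples/demo/Dockerfile", "docker_from"),
    Reference("svc-b", "", "go.mod", "go_mod"),
    Reference("svc-b", "tf-modules", "main.tf", "terraform_source"),
]
edges = count_edges(refs, org)
print(len(refs), "references ->", len(edges), "edges")
```

The gap between the two numbers is why every headline figure in this post is quoted as edges, with the raw reference count alongside.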
One more convention. Every figure here is the product's all-confidence count, the same number app.riftmap.dev shows and anyone can re-run, with a stricter 0.8-confidence floor reported alongside wherever it changes the picture.
I will tell you how I know the count is honest, because the way I found out is the best evidence I have. The first scan undercounted Cloud Posse badly. It parsed 1,361 Terraform Registry references like source = "cloudposse/label/null" and resolved exactly none of them to a repository inside the org, because the resolver was treating the registry namespace as somewhere external. The scan had found a gap in my own tool. I fixed the resolver, re-scanned, and Cloud Posse's resolved edges went from 199 to 490 while terraform-null-label's consumers went from 9 to 147. Prometheus came back identical, edge for edge. A change that moves the Terraform-native org two and a half times over and does not touch a single edge in the Go-native one is the cleanest evidence I can give you that the count measures what it claims to.
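For the curious, the shape of that fix can be sketched as follows. It leans on the Terraform Registry convention that a module addressed as `<namespace>/<name>/<provider>` is published from a repository named `terraform-<provider>-<name>`. The function name is hypothetical, it is not the actual Riftmap resolver, and treating the registry namespace as equal to the scanned org name is an assumption.

```python
def registry_source_to_repo(source: str, org: str):
    """Map a registry address '<namespace>/<name>/<provider>' to the in-org
    repository name 'terraform-<provider>-<name>', or None if it does not apply."""
    if source.startswith("."):
        return None  # local path such as ./modules/vpc
    address = source.split("//", 1)[0]  # drop any //submodule suffix
    parts = address.split("/")
    if len(parts) != 3:
        return None  # git URL, hostname-prefixed address, or something else
    namespace, name, provider = parts
    if namespace.lower() != org.lower():
        return None  # published by a different namespace: external
    return f"terraform-{provider}-{name}"

print(registry_source_to_repo("cloudposse/label/null", "cloudposse"))
print(registry_source_to_repo("hashicorp/consul/aws", "cloudposse"))
```

The first call yields `terraform-null-label`, which is the repository the 147 consumers resolve to; the second stays unresolved because the namespace is not the org being scanned.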
Where the edges actually live
Prometheus: a third of the coupling is already invisible
Start with the org that should be friendliest to the symbol-graph view, and the one I have scanned before. 214 cross-repo edges, from 271 references. The breakdown reads exactly like a Go shop. 132 go.mod requires, 40 GitHub Actions uses, 38 Dockerfile FROM lines, 3 Kubernetes image references, 1 npm dependency.
Add up the edges a symbol graph cannot represent, the Actions and the Docker images and the Kubernetes references, and you get 38% of the total. Call it a third, in the most code-centric org I could find. At the stricter confidence floor it is 32%, which is the number I would defend hardest. Either way, a third of what holds Prometheus together is infrastructure no symbol indexer sees.
And the other two-thirds? Those are go.mod requires. Not function calls. Manifest lines a build tool resolves later against a module registry. I will come back to what a symbol graph does with those. For now the only figure that matters is the one at the bottom of the table. Of 214 cross-repo edges, the number that are code symbols is zero.
Cloud Posse: almost all of it is infrastructure
Now the other end. 490 edges, from 1,320 references. Here I have to be careful with the confidence split, because it is the one place a sharp reader can push, so let me push first. Of the 408 edges resolved at high confidence, 407 are infrastructure artifacts and exactly one is a language package. That is 99.75%. The product's headline number is 490 rather than 408, and the difference is 82 lower-confidence heuristic edges that mix real infrastructure couplings with a handful of Go import paths and some noise. I hold those out of the claim. That is why the share is anchored to the 408 I can stand behind, not the 490.
The composition of the part I can stand behind. 359 Terraform module sources, 36 Actions, 8 Docker base images, 4 reusable workflows, and that single go.mod require. Cloud Posse is an extreme, and it is an extreme by construction. That is the reason it is here. It shows you where an organisation built entirely out of shared infrastructure actually sits, and it sits at almost exactly 100%. The symbol count, again, is zero.
Put the two ends together. The infrastructure share of cross-repo coupling runs from 38% in the most code-centric org I could pick to 99.75% in the most infrastructure-centric one. The number that does not move is the symbol count. In both organisations it is zero.
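If you want to check the arithmetic, both shares fall straight out of the breakdowns quoted above. The edge counts are the ones in this post; the dictionary keys and the `infra_share` helper are illustrative names of my own.

```python
# Edge kinds treated as infrastructure artifacts (no symbols, no imports).
INFRA_KINDS = {
    "terraform_source", "actions_uses", "docker_from",
    "reusable_workflow", "k8s_image",
}

# All-confidence breakdown for Prometheus (214 edges).
prometheus = {
    "go_mod": 132, "actions_uses": 40, "docker_from": 38,
    "k8s_image": 3, "npm": 1,
}

# High-confidence breakdown for Cloud Posse (408 of the 490 edges).
cloud_posse_high_conf = {
    "terraform_source": 359, "actions_uses": 36, "docker_from": 8,
    "reusable_workflow": 4, "go_mod": 1,
}

def infra_share(breakdown):
    """Return (infrastructure edges, total edges, share) for one breakdown."""
    total = sum(breakdown.values())
    infra = sum(n for kind, n in breakdown.items() if kind in INFRA_KINDS)
    return infra, total, infra / total

for name, breakdown in [("Prometheus", prometheus),
                        ("Cloud Posse", cloud_posse_high_conf)]:
    infra, total, share = infra_share(breakdown)
    print(f"{name}: {infra}/{total} = {share:.2%}")
```

Prometheus comes out at 81 of 214, which rounds to 38%, and Cloud Posse at 407 of 408, which is the 99.75%.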
One module, 147 repositories
Composition is the shape of the thing. Fan-in is where it bites. So here is the single most-consumed artifact in either org, and it retires a guess I made in writing once.
In the glossary entry for the artifact graph I wrote, as a hypothetical, that a shared Terraform module might be sourced by forty repositories. The measured figure for Cloud Posse's terraform-null-label is 147. Every one of those 147 repositories pulls it through the same line, module "this" { source = "cloudposse/label/null" }, sitting in a context.tf. Behind it the next most-shared modules are route53-cluster-hostname at 15, security-group at 13, iam-role at 9.
Sit with the 147 for a moment. That is one module, and changing it re-plans 147 repositories across the org. The edge that connects it to each of them is a single string in a manifest. There is no function call. There is no import statement. There is nothing in any symbol table. A symbol graph could index every line of Go in Cloud Posse and represent precisely none of that 147-repository blast radius, because the blast radius was never written in code.
A note on the number, so it is unimpeachable. 156 repositories reference the module in raw declarations. 147 of those resolve to distinct in-org consumers once self-references and test trees are stripped out. 147 is the figure I stand behind.
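Fan-in itself is a simple aggregation once the edges are resolved. A minimal sketch, assuming edges are (consumer, producer) pairs; the repository names and counts below are toy data, not the measured Cloud Posse figures.

```python
from collections import Counter

def fan_in(edges):
    """Count distinct consumer repos per producer from (consumer, producer) pairs."""
    return Counter(producer for _, producer in set(edges))

# Toy edges; the duplicate pair is counted once.
edges = [
    ("svc-a", "shared-label-module"),
    ("svc-b", "shared-label-module"),
    ("svc-b", "shared-label-module"),
    ("svc-c", "shared-label-module"),
    ("svc-a", "base-image"),
    ("svc-c", "base-image"),
    ("svc-a", "ci-templates"),
]

for producer, consumers in fan_in(edges).most_common():
    print(producer, consumers)
```

The hard part is not this count. It is resolving each raw declaration to the right producer repository in the first place.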
What a symbol graph does see
I have said symbol graphs are excellent twice now, and I want to make the concession real, because the claim I am making is narrower than "symbol tools are blind" and the difference is the whole point.
A symbol graph is not blind to all of this. For the language-package edges, Prometheus's 132 go.mod requires, SCIP carries the relationship. Its external-symbol mechanism resolves an import in one repository to the package that defines it in another, with version metadata attached. Ask Sourcegraph whether node_exporter depends on client_golang and it can tell you, and it would be right. I am not going to pretend otherwise to make my number look bigger.
What a symbol graph cannot do is two separate things. The first is the headline. It cannot represent the infrastructure edges at all. A Dockerfile FROM, a Terraform source, a GitLab CI include, an Actions uses. None of them contains a symbol or an import for an indexer to resolve, so none of them appears in a symbol index. That is the infrastructure bucket, the one that ran from a third to nearly all, and it is structural. It is not a coverage gap someone closes next quarter. The second is subtler, and it applies even to the package edges a symbol graph can see. An index will tell you node_exporter imports client_golang. It will not tell you which version constraint each of the 23 consumers is pinned to, which of them float to your next release, or which reference is hiding in a second go.mod three directories down. That resolution is the work. It is the thing the artifact graph does and the symbol index does not.
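To make "no symbol or import to resolve" concrete, here is roughly what those declarations look like to a tool that does read them. This is a deliberately naive regex sketch, not how a production parser should work (real HCL and YAML need real parsers), and the image and action names in the samples are hypothetical.

```python
import re

# One naive pattern per manifest kind. Each captures the string that names
# the producing artifact; none of these strings is a code symbol.
PATTERNS = {
    "docker_from": re.compile(
        r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.MULTILINE | re.IGNORECASE
    ),
    "terraform_source": re.compile(
        r'^\s*source\s*=\s*"([^"]+)"', re.MULTILINE
    ),
    "actions_uses": re.compile(
        r"^\s*-?\s*uses:\s*[\"']?([^\"'\s#]+)", re.MULTILINE
    ),
}

def extract_declarations(kind: str, text: str):
    """Return every artifact reference of the given kind declared in the text."""
    return PATTERNS[kind].findall(text)

dockerfile = "FROM registry.example.com/base/golang:1.22\nCOPY . /app\n"
terraform = 'module "this" {\n  source = "cloudposse/label/null"\n}\n'
workflow = (
    "jobs:\n  build:\n    steps:\n"
    "      - uses: example-org/shared-actions/setup@v1\n"
)

print(extract_declarations("docker_from", dockerfile))
print(extract_declarations("terraform_source", terraform))
print(extract_declarations("actions_uses", workflow))
```

Every captured value is an opaque string that only means something once it is resolved against a registry or a repository, which is exactly the step a symbol index never takes.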
So here is the precise version of the claim, the one I will defend against anyone at Sourcegraph or GitLab. Not one cross-repo edge in either organisation is a symbol. And the infrastructure share of those edges, the part no symbol index reaches at all, runs from a third to nearly all.
The graph that was carrying the weight
A symbol graph can read every line of code in your organisation and still not see the base-image bump about to take a dozen services down, or the one-line source reference binding 147 repositories to a single module. Those edges were never code. They were declarations, sitting in manifests, waiting for a build tool to resolve them long after the symbol indexer had finished its pass. The graph that maps them is a different graph. And in the two organisations I measured, it was carrying somewhere between a third and very nearly all of the weight.
None of which makes that graph exotic. Past a certain repository count it is as load-bearing as the symbol graph, and a serious setup needs both. The requirements are not subtle either. It has to be parsed from source, so it stays current as the repositories change instead of going stale the moment someone forgets to update a catalogue. It has to be queryable, so the engineer reaching for it mid-incident and the agent reaching for it at planning time get the same answer. And it has to read across whatever platforms you actually run on, because the edge that bites is the one running from a GitHub service onto a base image published from GitLab. That is the primitive. It is not a product. It is the thing the industry has spent this year slowly working out that it needs, one launch at a time.
Where I build that primitive, it is called Riftmap. One read-only token, a GitLab or GitHub organisation parsed across twelve ecosystems, and the artifact graph comes back as something you can use. A visual blast radius in the UI, the same graph over an API for the agents that need it at planning time. Parsed, not inferred. Auto-discovered, never declared. The numbers in this post came out of it, and you can point it at your own org and get yours.
A few questions, answered directly
What fraction of cross-repo dependencies are infrastructure rather than code?
In two real public organisations measured in June 2026, the infrastructure-artifact share of in-org cross-repository edges ranged from 38% in a Go-native stack (Prometheus) to 99.75% in a Terraform-native one (Cloud Posse). It varies with what the organisation builds, but in both cases zero of the cross-repo edges were code symbols. The infrastructure edges are Terraform source blocks, Dockerfile FROM lines, CI includes and reusable workflows, none of which a symbol graph represents.
How many repositories typically depend on a single shared Terraform module?
In the Cloud Posse organisation, one module (terraform-null-label) is sourced by 147 repositories, each through a single-line source reference in a Terraform manifest. High-fan-in shared artifacts like this are the widest-blast-radius and least-visible edges in a polyrepo: changing one of them re-plans every consumer, and the edge to each consumer is a declaration in a manifest rather than anything in code.
Can a symbol graph such as SCIP, Sourcegraph or GitLab Orbit index infrastructure dependency edges?
No. Infrastructure edges like a Dockerfile FROM, a Terraform source or an Actions uses contain no code symbols and no import statements, so a symbol indexer represents none of them. A symbol graph can resolve language-package dependencies such as a Go import or an npm package through import resolution, but the infrastructure layer requires a different parser entirely.