<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Daniel Westgaard</title>
    <description>The latest articles on DEV Community by Daniel Westgaard (@danielwe).</description>
    <link>https://dev.to/danielwe</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3876610%2F1d0996da-2e1c-4cac-979f-f2a9d33d8b15.jpg</url>
      <title>DEV Community: Daniel Westgaard</title>
      <link>https://dev.to/danielwe</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/danielwe"/>
    <language>en</language>
    <item>
      <title>Declared, inferred, registered: the three ways a tool knows a cross-repo dependency exists</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Sat, 04 Jul 2026 13:37:47 +0000</pubDate>
      <link>https://dev.to/danielwe/declared-inferred-registered-the-three-ways-a-tool-knows-a-cross-repo-dependency-exists-3ca7</link>
      <guid>https://dev.to/danielwe/declared-inferred-registered-the-three-ways-a-tool-knows-a-cross-repo-dependency-exists-3ca7</guid>
      <description>&lt;p&gt;Three lines were open in three tabs on my screen last week, and all three declared a dependency that crosses a repository boundary.&lt;/p&gt;

&lt;p&gt;The first was a Helm chart. In &lt;code&gt;argoproj/argo-helm&lt;/code&gt;, the &lt;code&gt;argo-cd&lt;/code&gt; chart's &lt;code&gt;Chart.yaml&lt;/code&gt; carries a &lt;code&gt;dependencies:&lt;/code&gt; block:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;dependencies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-ha&lt;/span&gt;
    &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;4.38.0&lt;/span&gt;
    &lt;span class="na"&gt;repository&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://dandydeveloper.github.io/charts/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second was Terraform. In &lt;code&gt;cloudposse/terraform-aws-vpc&lt;/code&gt;, the root &lt;code&gt;main.tf&lt;/code&gt; has a &lt;code&gt;module&lt;/code&gt; block whose &lt;code&gt;source&lt;/code&gt; points at another repo entirely:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"label"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cloudposse/label/null"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"0.25.0"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The third was a Dockerfile. In &lt;code&gt;cilium/cilium&lt;/code&gt;, &lt;code&gt;images/cilium/Dockerfile&lt;/code&gt; builds its release stage from a base image passed in as a build argument:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;${CILIUM_RUNTIME_IMAGE}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;release&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run &lt;code&gt;grep&lt;/code&gt; across the org for any of these and you get a partial answer. It finds the string &lt;code&gt;redis-ha&lt;/code&gt;, but not that the chart resolves against a &lt;code&gt;Chart.lock&lt;/code&gt; you would have to read separately. It finds &lt;code&gt;cloudposse/label/null&lt;/code&gt;, but has no idea that this registry short-address maps to the &lt;code&gt;cloudposse/terraform-null-label&lt;/code&gt; repo. It finds &lt;code&gt;${CILIUM_RUNTIME_IMAGE}&lt;/code&gt; and stops, because the real image name is bound somewhere else. Point a symbol graph at the same three files and it finds nothing at all. None of these is a programming-language symbol. No compiler and no SCIP indexer parses a &lt;code&gt;Chart.yaml&lt;/code&gt;, an HCL &lt;code&gt;module&lt;/code&gt; block, or a Dockerfile instruction as source code.&lt;/p&gt;

&lt;p&gt;Here is the claim I want to plant before we go further. Before you merge and run a change, a cross-repo dependency can be known to a tool in three ways: &lt;strong&gt;declared&lt;/strong&gt; in a manifest the machine already executes, &lt;strong&gt;inferred&lt;/strong&gt; from statistical signal, or &lt;strong&gt;registered&lt;/strong&gt; in a catalog a human maintains. (A fourth mode, observing the edge at runtime, needs the change already running, which is exactly what you do not have before merge. More on that below.) Those three regimes are not three qualities of the same thing. They are three different answers to the question &lt;em&gt;how did the tool come to know this edge exists at all&lt;/em&gt;, and each one buys a different, structural failure mode.&lt;/p&gt;

&lt;h2&gt;
  
  
  "Parsed, not inferred" stages a two-horse race and quietly drops a third runner
&lt;/h2&gt;

&lt;p&gt;The slogan that half the industry and I reach for is "parsed, not inferred." It is a good slogan, and it does less work than it sounds like. It stages a two-horse race: on one side the tool that reads what a manifest says, on the other the tool that guesses from embeddings and model output. That framing is real, but it hides the regime that a large share of platform teams quietly run on, which is neither parsed nor inferred. It is &lt;em&gt;registered&lt;/em&gt;: an edge some human typed into a catalog, that no machine executes and no model produced.&lt;/p&gt;

&lt;p&gt;So the honest split is three-way. Declared, inferred, registered. This is a different axis from the one I drew in &lt;a href="https://riftmap.dev/blog/can-ai-check-blast-radius-of-pr-before-merge/" rel="noopener noreferrer"&gt;an earlier post in this series&lt;/a&gt;, where the taxonomy was symbol / live-state / artifact. That split is about &lt;em&gt;which layer of the stack an edge lives on&lt;/em&gt;. This one is about &lt;em&gt;how a tool knows the edge is there&lt;/em&gt;. They compose. An artifact-layer edge can be declared, inferred, or registered, and the same Terraform &lt;code&gt;module source&lt;/code&gt; can show up in all three tools by three different routes. The rest of this post is about that second axis, because it is where the word "parsed" is quietly carrying an argument it never actually made.&lt;/p&gt;

&lt;h2&gt;
  
  
  Declared: the edge the machine already executes
&lt;/h2&gt;

&lt;p&gt;A declared dependency is one written into a manifest that the machine already reads and executes to do its job. Nobody adds a &lt;code&gt;FROM&lt;/code&gt; line to document a dependency. They add it because the build will not produce an image without it. The dependency edge is a side effect of a file that has to be correct for the system to run at all, which is what makes it deterministic. &lt;code&gt;terraform init&lt;/code&gt; resolves the &lt;code&gt;module source&lt;/code&gt; or the plan fails. &lt;code&gt;helm dependency update&lt;/code&gt; pulls the chart named in &lt;code&gt;dependencies:&lt;/code&gt; or the release is incomplete. The edge is not a description of the system. It is part of the system.&lt;/p&gt;

&lt;h3&gt;
  
  
  What counts as declared
&lt;/h3&gt;

&lt;p&gt;The declared regime is wide, and it is precise. Each ecosystem has its own construct, and the point is to name the construct rather than wave at "config files":&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Helm &lt;code&gt;Chart.yaml&lt;/code&gt; &lt;code&gt;dependencies:&lt;/code&gt; entries (name, version, repository), which Riftmap reads as &lt;code&gt;helm_dependency&lt;/code&gt; edges.&lt;/li&gt;
&lt;li&gt;Terraform &lt;code&gt;module { source = ... }&lt;/code&gt; blocks, read as &lt;code&gt;terraform_module&lt;/code&gt; edges. Registry short-addresses and git URLs count as cross-repo; a bare &lt;code&gt;./&lt;/code&gt; local path does not, nor do full registry URLs like &lt;code&gt;registry.terraform.io/&lt;/code&gt; or &lt;code&gt;app.terraform.io/&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Dockerfile &lt;code&gt;FROM&lt;/code&gt; lines, read as &lt;code&gt;docker_base_image&lt;/code&gt; edges.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;go.mod&lt;/code&gt; &lt;code&gt;require&lt;/code&gt; directives, and the &lt;code&gt;replace&lt;/code&gt; directives that quietly redirect them.&lt;/li&gt;
&lt;li&gt;GitHub Actions &lt;code&gt;uses:&lt;/code&gt; values, whether they point at &lt;code&gt;owner/repo@ref&lt;/code&gt; or a reusable &lt;code&gt;.github/workflows/x.yml@ref&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;GitLab CI &lt;code&gt;include:&lt;/code&gt; in its several forms (&lt;code&gt;project:&lt;/code&gt;, &lt;code&gt;remote:&lt;/code&gt;, &lt;code&gt;component:&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Kustomize remote &lt;code&gt;resources:&lt;/code&gt; and &lt;code&gt;bases:&lt;/code&gt;, read as &lt;code&gt;kustomize_resource&lt;/code&gt; edges.&lt;/li&gt;
&lt;li&gt;npm &lt;code&gt;package.json&lt;/code&gt; dependencies, including the &lt;code&gt;npm:&lt;/code&gt; alias and &lt;code&gt;git+&lt;/code&gt; forms where the imported name and the actual package differ.&lt;/li&gt;
&lt;li&gt;Ansible, where the precision matters. A role's &lt;code&gt;meta/main.yml&lt;/code&gt; &lt;code&gt;dependencies:&lt;/code&gt; list emits a role-to-role edge (&lt;code&gt;ansible_role&lt;/code&gt;) regardless of how the string is dotted. A task in a playbook that calls a three-segment FQCN like &lt;code&gt;polaris.infrastructure.deploy&lt;/code&gt; emits a collection edge (&lt;code&gt;ansible_collection&lt;/code&gt;). Two different files, two different edge types. An FQCN in &lt;code&gt;meta/main.yml&lt;/code&gt; still resolves as a role dependency and not a collection reference, because that file is what Ansible reads when it loads a role's dependencies.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That precision is the whole personality of the declared regime. The edge is not "there is a dependency somewhere in this YAML." It is a named construct with a known grammar and a known resolution step.&lt;/p&gt;
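
&lt;p&gt;To make "a named construct with a known grammar" concrete, here is a minimal Python sketch of reading Dockerfile &lt;code&gt;FROM&lt;/code&gt; lines. It is not Riftmap's parser. The function name &lt;code&gt;parse_from_lines&lt;/code&gt; and the tuple shape are hypothetical, and it assumes the common single-line form only: optional leading flags such as &lt;code&gt;--platform&lt;/code&gt;, an optional &lt;code&gt;AS&lt;/code&gt; alias, and references back to an earlier build stage.&lt;/p&gt;

```python
def parse_from_lines(dockerfile_text):
    """Return (image, stage_alias, is_templated) for each external FROM.

    Illustrative sketch only: continuation lines and parser directives
    are not handled.
    """
    edges = []
    stages = set()
    for raw in dockerfile_text.splitlines():
        tokens = raw.strip().split()
        if not tokens or tokens[0].upper() != "FROM":
            continue
        # Drop flags such as --platform=linux/amd64 that precede the image.
        args = [t for t in tokens[1:] if not t.startswith("--")]
        if not args:
            continue
        image = args[0]
        alias = None
        if len(args) == 3 and args[1].upper() == "AS":
            alias = args[2]
        # A FROM that names an earlier stage points inside this file,
        # so it is not a base-image edge.
        if image not in stages:
            edges.append((image, alias, "$" in image))
        if alias:
            stages.add(alias)
    return edges
```

&lt;p&gt;Run against the cilium line quoted earlier, a parser shaped like this would report &lt;code&gt;${CILIUM_RUNTIME_IMAGE}&lt;/code&gt; with the templated flag set, which is the cue that the image name still has to be resolved from a build argument.&lt;/p&gt;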

&lt;h3&gt;
  
  
  Why grep only half-sees it
&lt;/h3&gt;

&lt;p&gt;Grep finds the string and misses the meaning, because in every one of these constructs the literal text is not the resolvable target. A Helm &lt;code&gt;version:&lt;/code&gt; can be a semver range rather than a pinned version. A Terraform registry short-address like &lt;code&gt;cloudposse/label/null&lt;/code&gt; has to be resolved through the registry's naming convention before you know which repo backs it. A Dockerfile &lt;code&gt;FROM ${VAR}&lt;/code&gt; names a variable, not an image. A GitLab CI &lt;code&gt;include:&lt;/code&gt; has five distinct shapes and an unqualified shorthand that silently resolves to local-or-remote depending on the string. An npm dependency can be declared under an alias, so the name in the code and the package actually installed are different strings. Grep sees text. The declared edge is text plus a resolution rule, and grep does not run the rule.&lt;/p&gt;
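
&lt;p&gt;The Terraform case can be sketched in a few lines of Python. Modules on the public Terraform Registry are published from repositories named &lt;code&gt;terraform-PROVIDER-NAME&lt;/code&gt;, and the short-address is written &lt;code&gt;NAMESPACE/NAME/PROVIDER&lt;/code&gt;, so the rule is a string rearrangement. The function name is hypothetical, the sketch assumes the registry namespace matches the repo owner, and it ignores private registries and version constraints.&lt;/p&gt;

```python
def registry_address_to_repo(source):
    """Map NAMESPACE/NAME/PROVIDER to NAMESPACE/terraform-PROVIDER-NAME.

    Returns None for anything that is not a three-part registry
    short-address (local paths, URLs, git:: sources, host-qualified forms).
    """
    if source.startswith((".", "git::")) or "://" in source:
        return None
    # An optional //subdir suffix selects a submodule inside the same repo.
    address = source.split("//", 1)[0]
    parts = address.split("/")
    if len(parts) != 3 or not all(parts):
        return None
    namespace, name, provider = parts
    # A dotted first segment is a hostname (for example github.com/owner/repo).
    if "." in namespace:
        return None
    return f"{namespace}/terraform-{provider}-{name}"


print(registry_address_to_repo("cloudposse/label/null"))
```

&lt;p&gt;This is the step grep never takes: the string it matched and the repo it should have pointed you at share no substring longer than the owner name.&lt;/p&gt;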

&lt;h3&gt;
  
  
  Why a symbol graph misses it
&lt;/h3&gt;

&lt;p&gt;For most of these constructs, a symbol graph does not miss the edge so much as never look at it, because a symbol graph indexes programming-language symbols and none of these are symbols. Helm, Terraform, Docker, GitLab CI, GitHub Actions, Ansible, Kustomize. A compiler-accurate indexer like Sourcegraph's SCIP has nothing to say about any of them, because they are not code it compiles. This is not a knock on Sourcegraph. Symbol graphs and &lt;a href="https://riftmap.dev/blog/symbol-graphs-and-artifact-graphs/" rel="noopener noreferrer"&gt;artifact graphs are different categories&lt;/a&gt;, and Sourcegraph is genuinely excellent at the category it is in.&lt;/p&gt;

&lt;p&gt;I want to be fair about the two exceptions. For &lt;code&gt;go.mod&lt;/code&gt; and &lt;code&gt;package.json&lt;/code&gt;, the import path is itself a language-level symbol. Sourcegraph's own writeup on &lt;a href="https://sourcegraph.com/blog/cross-repository-code-navigation" rel="noopener noreferrer"&gt;cross-repository code navigation&lt;/a&gt; describes how SCIP's external symbols carry cross-repository dependency information across the languages it indexes, without calling out any ecosystem by name. A Go import path and an npm package name are exactly that kind of symbol, so I read those as edges a symbol graph &lt;em&gt;can&lt;/em&gt; resolve cross-repo when cross-repo indexing is configured. That is my inference from how Go and npm name their imports, not a claim on Sourcegraph's page about those two ecosystems. It is a real capability, and a heavier lift most installs skip. The manifest parser reads the literal &lt;code&gt;require&lt;/code&gt; or &lt;code&gt;dependencies&lt;/code&gt; value regardless of indexing, and it still catches the cases a symbol resolver handles less cleanly: the &lt;code&gt;npm:&lt;/code&gt; alias, the renamed module path, the non-registry git source. Different mechanisms, overlapping coverage, and I would rather concede the overlap than pretend it away.&lt;/p&gt;

&lt;p&gt;The scale is the part that does not fit in a code review. A full Riftmap scan of the &lt;code&gt;cloudposse&lt;/code&gt; GitHub org (242 repos, completed 2026-07-02) found 147 repos declaring a dependency on &lt;code&gt;cloudposse/terraform-null-label&lt;/code&gt; via a Terraform &lt;code&gt;module { source = "cloudposse/label/null" }&lt;/code&gt; block. 138 were on the current &lt;code&gt;0.25.0&lt;/code&gt;. 9 were pinned behind. Nearly every one of those references sits at the same place, &lt;code&gt;context.tf&lt;/code&gt; line 24, the line cloudposse's own module template generates. That is one declared edge, in one construct, in one org, repeated across 147 repos on a single templated line. It is exactly the kind of signal that is trivial to parse and impossible to hold in your head across 242 repositories.&lt;/p&gt;

&lt;p&gt;The honest failure mode of the declared regime is coverage. A parser is software. It only sees the ecosystems someone wrote a parser for, and it has the blind spots any parser has. If a team declares a dependency in a format nobody has written a parser for, the edge is real and the tool does not see it. That is a genuine limit, and it is a different kind of limit from the two that follow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Inferred: the edge guessed from statistical signal
&lt;/h2&gt;

&lt;p&gt;An inferred dependency is one a tool produces from statistical signal rather than reading it from a declaration. Embedding proximity. Name similarity. Model output. Co-change history. This is the regime that reaches for a coupling nobody wrote down in any artifact at all: two files that always move together in the commit history, a service whose vocabulary sits close to another's in embedding space, a natural-language question about the codebase answered from summaries rather than a parse. When there is no manifest entry and no catalog record to read, inference is the only thing left that can even suggest the edge exists. That is a real place on the map, and declared parsing does not stand on it.&lt;/p&gt;

&lt;p&gt;The failure mode is that inference has no ground truth. It produces a probability that an edge exists, and probabilities are wrong at a rate. This is measured, not folklore. When Richardeau et al. asked a range of LLMs to reproduce Zachary's Karate Club graph, every model got it wrong. The benchmark has 34 nodes and 78 known edges. The best model still added two edges that are not in the graph. Edge-count outputs across models ranged from 8 to 153 against a ground truth of 78 (&lt;a href="https://arxiv.org/abs/2409.00159" rel="noopener noreferrer"&gt;arXiv:2409.00159&lt;/a&gt;). In a code-specific setting it is sharper. On a 15-question architecture-discovery suite against the Shopizer repo, an AST-derived dependency graph scored 15 out of 15. An LLM-extracted knowledge graph scored 13. A vector-only baseline scored 6 (&lt;a href="https://arxiv.org/abs/2601.08773" rel="noopener noreferrer"&gt;arXiv:2601.08773&lt;/a&gt;). The same study documents a coverage failure distinct from being wrong: the LLM extraction pass skipped 377 files outright, so the graph it built was missing large parts of the dependency surface, not just occasionally mistaken about the parts it covered.&lt;/p&gt;

&lt;p&gt;The confidence score is the tell. An inferred edge comes with a number that means "how likely we think this edge is real," and that number is doing load-bearing work, because without it you cannot separate the edges the tool is sure about from the ones it guessed. Turn the threshold up and you drop real edges. Turn it down and you admit false ones. There is no setting that gives you both, because the underlying quantity is a belief, not a fact. Inference is the right tool when nothing is written down anywhere. It is strictly worse than reading the file when the edge is already declared, because guessing at an edge that is sitting in plain text can only add error to something you could have simply read.&lt;/p&gt;
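
&lt;p&gt;The threshold trade-off is easy to see on a toy example. The edges, scores, and ground-truth labels below are invented for illustration and do not come from any tool or from the studies cited above. The only assumption that matters is that at least one false edge scores higher than at least one real edge, which is the situation the paragraph above describes.&lt;/p&gt;

```python
from collections import namedtuple

Edge = namedtuple("Edge", "source target score is_real")

# Hypothetical inferred edges with made-up scores and labels.
EDGES = [
    Edge("billing", "ledger", 0.92, True),
    Edge("billing", "search", 0.81, False),
    Edge("checkout", "billing", 0.66, True),
    Edge("checkout", "ledger", 0.58, False),
    Edge("search", "catalog", 0.41, True),
]


def sweep(edges, threshold):
    """Return (real edges dropped, false edges admitted) at one threshold."""
    kept = [e for e in edges if e.score >= threshold]
    dropped_real = sum(1 for e in edges if e.is_real and e not in kept)
    admitted_false = sum(1 for e in kept if not e.is_real)
    return dropped_real, admitted_false


for threshold in (0.4, 0.6, 0.9):
    print(threshold, sweep(EDGES, threshold))
```

&lt;p&gt;With this data, no threshold returns zero on both counts: any cut low enough to keep every real edge also keeps a false one, and any cut high enough to exclude every false edge also drops a real one.&lt;/p&gt;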

&lt;h2&gt;
  
  
  Registered: the edge a human wrote in a catalog
&lt;/h2&gt;

&lt;p&gt;A registered dependency is one a human wrote into a catalog that no machine executes. Backstage represents it as &lt;code&gt;spec.dependsOn&lt;/code&gt; in a &lt;code&gt;catalog-info.yaml&lt;/code&gt;, which the catalog processor turns into a directional relation at ingestion. Port represents it as a relation between blueprints, single or many, which its docs frame as the software catalog as a dynamic graph database. And I want to concede the real thing first, because it is real: catalogs model relationships that parsing simply cannot see. Ownership. On-call. Which team you page. The tier of a service. There is no manifest the build executes that declares who owns a repo, and a good catalog is the right home for that.&lt;/p&gt;

&lt;p&gt;The failure mode is drift, because a registered edge is true only as of the last human edit, and nothing executes it to force a correction. This is not a competitor's insinuation. It is Backstage's own documented behaviour: &lt;a href="https://github.com/backstage/backstage/issues/20030" rel="noopener noreferrer"&gt;issue #20030&lt;/a&gt; describes how unregistering an entity leaves related entities carrying stale relationships until a later processing pass, and a Group page will show a live warning about relationships to entities that no longer exist. Port's CTO makes the maintenance case directly, though as an interested party. As he puts it on Port's blog, &lt;a href="https://www.port.io/blog/what-are-the-technical-disadvantages-of-backstage" rel="noopener noreferrer"&gt;"YAMLs require maintenance when code changes occur. This results in outdated information that can affect operations and decision-making…"&lt;/a&gt;. And from the adoption side, Roadie, a Backstage-ecosystem vendor and not a rival, reports two customers reaching &lt;a href="https://roadie.io/blog/3-strategies-for-a-complete-software-catalog" rel="noopener noreferrer"&gt;88% and 90% catalog completeness&lt;/a&gt; over roughly four months of active effort. That last number is the one I keep coming back to. Even funded, deliberate catalog work plateaus below 100%, because the catalog is a second job that competes with shipping, and the parts nobody remembered to update are silently indistinguishable from the parts that are current. I have written more on &lt;a href="https://riftmap.dev/blog/backstage-alternatives/" rel="noopener noreferrer"&gt;why teams quietly abandon the catalog&lt;/a&gt; elsewhere.&lt;/p&gt;
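
&lt;p&gt;One narrow slice of drift can at least be detected mechanically: a &lt;code&gt;spec.dependsOn&lt;/code&gt; entry that points at an entity the catalog no longer contains. The Python sketch below is not part of Backstage. It assumes the &lt;code&gt;catalog-info.yaml&lt;/code&gt; documents have already been loaded into dictionaries, that every reference is written in full &lt;code&gt;kind:namespace/name&lt;/code&gt; form, and that a missing namespace means &lt;code&gt;default&lt;/code&gt;. The function names are hypothetical.&lt;/p&gt;

```python
def entity_ref(entity):
    """Build the full kind:namespace/name reference for one catalog entity."""
    meta = entity["metadata"]
    namespace = meta.get("namespace", "default")
    return f"{entity['kind']}:{namespace}/{meta['name']}".lower()


def find_dangling_depends_on(entities):
    """List (source_ref, target_ref) pairs whose target is not in the catalog."""
    known = {entity_ref(e) for e in entities}
    dangling = []
    for entity in entities:
        for target in entity.get("spec", {}).get("dependsOn", []):
            if target.lower() not in known:
                dangling.append((entity_ref(entity), target))
    return dangling
```

&lt;p&gt;A check like this only catches references to entities that are gone. It says nothing about the harder case, an entry that still resolves but no longer describes a dependency the service actually has.&lt;/p&gt;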

&lt;h2&gt;
  
  
  A fourth mode: discovered at runtime, and why it is unavailable before merge
&lt;/h2&gt;

&lt;p&gt;There is a fourth way to know a cross-repo edge exists, and it is neither declared, inferred, nor registered: you can observe it at runtime. Because it lives on a different axis from the other three, I want to name it and set it aside cleanly rather than fold it into inference. A service mesh, DNS, live traffic, a database connection resolved in production. That edge is &lt;em&gt;discovered&lt;/em&gt; by watching the system run, and it is genuinely powerful, because it is the only thing that sees the undeclared HTTP calls a service makes to three others through environment variables injected at runtime, calls no manifest declares. This is the live-state layer, the subject of &lt;a href="https://riftmap.dev/blog/can-ai-check-blast-radius-of-pr-before-merge/" rel="noopener noreferrer"&gt;the first post in this series&lt;/a&gt;. &lt;a href="https://sixdegree.ai/blog/blast-radius-analysis" rel="noopener noreferrer"&gt;SixDegree&lt;/a&gt; calls it "discovered", and their tie-break rule, prefer discovered over declared when the two conflict, is correct for the question it answers. Runtime observation is righter than any manifest about what is talking to what &lt;em&gt;right now&lt;/em&gt;. What it cannot tell you is anything about a change that has not been merged yet, because you cannot observe the traffic of a base-image bump that does not exist in production. The thing you want the blast radius of has not run. That is why this series is about blast radius &lt;em&gt;before merge&lt;/em&gt;, and before merge the edges you can actually know are the declared, inferred, and registered ones.&lt;/p&gt;

&lt;h2&gt;
  
  
  Does a parsed dependency edge need a confidence score?
&lt;/h2&gt;

&lt;p&gt;Riftmap parses deterministically and still puts a confidence score on every edge, and those two facts only sound contradictory until you see what the score measures. It is not inference confidence. There is no model, so the number can never mean "we think this edge is real." It is resolution confidence: how cleanly the declared reference matched a known target in your org. The resolver's own dataclass documents it in one line. &lt;code&gt;"""Resolution confidence. 1.0 = exact match; lower = heuristic."""&lt;/code&gt; Every value below 1.0 comes from a string or path comparison, an &lt;code&gt;if&lt;/code&gt;, not a probability.&lt;/p&gt;

&lt;p&gt;The external precedent for this distinction is, again, Sourcegraph. Their &lt;a href="https://sourcegraph.com/docs/code-navigation/precise-code-navigation" rel="noopener noreferrer"&gt;precise vs search-based code navigation&lt;/a&gt; split does the same thing one layer up. Precise navigation is compiler-accurate when a SCIP index exists. Search-based navigation is what Sourcegraph falls back to, in their own words, "when precise navigation is not available." Neither mode is doubt about whether a symbol is real. The distinction is match quality on how the reference was resolved, precise index versus heuristic search. A declared-edge resolution score is the same shape of thing, one layer down at the artifact level.&lt;/p&gt;

&lt;p&gt;This is where I need to reconcile something honestly, because &lt;a href="https://riftmap.dev/blog/blast-radius-gate-merge-pipeline/" rel="noopener noreferrer"&gt;an earlier post in this series set a merge gate at &lt;code&gt;min_confidence=0.8&lt;/code&gt;&lt;/a&gt;, and it would be easy to read that as "declared edges are always at least 0.8." They are not. The 0.8 floor excludes a separate regex-heuristic layer that scans files no formal parser owns, whose findings sit at 0.4 to 0.7 by design, plus a few declared edges that resolved fuzzily. The score moves for two deterministic reasons: ambiguous declaration syntax, or an imperfect string match to a known target. Neither reason is doubt about existence. The live cloudposse graph proves it: the &lt;code&gt;terraform-null-label&lt;/code&gt; edge I pulled from the production API resolves at &lt;strong&gt;0.9&lt;/strong&gt;, not because anyone is unsure the edge exists, but because turning &lt;code&gt;cloudposse/label/null&lt;/code&gt; into the &lt;code&gt;cloudposse/terraform-null-label&lt;/code&gt; repo took a documented naming-convention rule rather than an exact string match. A &lt;code&gt;${var}&lt;/code&gt;-templated Terraform &lt;code&gt;source&lt;/code&gt; lands at 0.5. The four &lt;code&gt;${VAR}&lt;/code&gt;-templated &lt;code&gt;FROM&lt;/code&gt; lines in that cilium Dockerfile land at 0.7, each of them a real, declared base-image edge whose confidence is lower only because the image name is bound through a build argument. The number answers "how cleanly did this resolve." It never answers "do we think this is real," because nothing in the pipeline is guessing.&lt;/p&gt;
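
&lt;p&gt;As a rough illustration of a score built only from comparisons, here is a Python sketch. It is not Riftmap's resolver: the function name, its arguments, and the &lt;code&gt;0.0&lt;/code&gt; returned for an unresolved reference are assumptions made for the example. The &lt;code&gt;1.0&lt;/code&gt;, &lt;code&gt;0.9&lt;/code&gt;, &lt;code&gt;0.7&lt;/code&gt;, and &lt;code&gt;0.5&lt;/code&gt; values mirror the cases described above.&lt;/p&gt;

```python
def resolve(ecosystem, reference, known_repos):
    """Return (target_or_None, score) using string comparisons only.

    Hypothetical sketch: every branch is an if, and nothing here
    estimates whether the edge exists.
    """
    if "${" in reference:
        # The edge is declared, but the target name is bound elsewhere.
        score = 0.7 if ecosystem == "docker" else 0.5
        return None, score
    if reference in known_repos:
        # The declared string is exactly a known repo.
        return reference, 1.0
    if ecosystem == "terraform":
        parts = reference.split("/")
        if len(parts) == 3 and all(parts):
            namespace, name, provider = parts
            candidate = f"{namespace}/terraform-{provider}-{name}"
            if candidate in known_repos:
                # Matched through the registry naming convention.
                return candidate, 0.9
    # Assumption for this sketch: an unresolved reference scores 0.0.
    return None, 0.0
```

&lt;p&gt;Calling it twice with the same inputs returns the same pair, which is the property that separates a resolution score from an existence probability.&lt;/p&gt;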

&lt;h2&gt;
  
  
  The three regimes differ in what keeps each edge honest
&lt;/h2&gt;

&lt;p&gt;The three regimes differ in the thing that decides whether they stay true: what keeps each edge honest.&lt;/p&gt;

&lt;p&gt;A declared edge is kept honest by the machine that executes it. Get a &lt;code&gt;FROM&lt;/code&gt; line wrong and the build breaks. Get a &lt;code&gt;module source&lt;/code&gt; wrong and &lt;code&gt;terraform init&lt;/code&gt; fails. The manifest is not honest because humans are diligent about it. It is honest because it is load-bearing, and the same machine that consumes the edge re-reads it on every run. An inferred edge is kept honest by nothing. There is no build that fails when the model guesses wrong; you re-roll the dice and get a different graph. A registered edge is kept honest by human diligence alone. Nothing executes a catalog, so it rots at exactly the rate that attention wanders, which is quickly.&lt;/p&gt;

&lt;p&gt;I have to concede a point the research made me sharpen, because a reader who knows the build-dependency-error literature will catch it otherwise. Declared is not infallible. It is true &lt;em&gt;to the manifest&lt;/em&gt;, not true to the world. A Helm &lt;code&gt;dependencies:&lt;/code&gt; entry nobody pruned, a Terraform &lt;code&gt;module&lt;/code&gt; block whose &lt;code&gt;source&lt;/code&gt; still points at code no longer wired into any resource. These are declared-but-dead edges, the same failure shape a catalog has. Declared and registered are both things somebody wrote down, and both can be stale while still parsing cleanly. The difference is not that declared never goes stale. It is that a declared edge lives in the file the machine runs, so it is cheaper to keep current than a catalog is: nobody has to remember to edit it, because the machine re-reads it every time it runs, and a wrong one tends to announce itself by breaking something. A catalog entry that goes wrong just sits there, wrong and quiet.&lt;/p&gt;

&lt;p&gt;That is why the declared regime is the substrate I would want under an agent making a small cross-repo infra change. You cannot ask an agent to maintain a catalog, and you cannot trust it to guess. What you can do is hand it the edges the org already declared, kept current by the same machines that already depend on them being correct, including the base-image or shared-module dependency that used to live only in the head of the engineer who just left.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Riftmap sits
&lt;/h2&gt;

&lt;p&gt;Riftmap lives in the declared regime, on purpose. It reads the edges your manifests already declare. Terraform &lt;code&gt;module source&lt;/code&gt;, Dockerfile &lt;code&gt;FROM&lt;/code&gt;, Helm &lt;code&gt;dependencies:&lt;/code&gt;, GitLab CI &lt;code&gt;include:&lt;/code&gt;, and the rest. It reads them deterministically, with no model anywhere in the parsing path, across an entire GitHub or GitLab organisation from one read-only token. No catalog YAML to maintain, because the edges are parsed straight from the files that already exist and re-read on every scan. It is not trying to be the inference tool for undeclared runtime calls, and it is not a catalog. It is the substrate: the cross-repo artifact graph the org already declared but never had assembled in one place. If you want to see what your own org declares that no single repo's clone can show you, &lt;a href="https://app.riftmap.dev/?utm_source=blog&amp;amp;utm_medium=post&amp;amp;utm_campaign=blast-radius&amp;amp;utm_content=declared-inferred-registered" rel="noopener noreferrer"&gt;run a scan&lt;/a&gt; against a read-only token and look at the graph before you bump the next base image.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;About Riftmap&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Riftmap maps cross-repo dependencies across your entire GitLab or GitHub organisation: Terraform, Docker, CI templates, Helm, and more. One read-only token. No YAML to maintain.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Common questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What's the difference between declared, inferred, and registered dependencies?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A declared dependency is written into a manifest the machine already executes to do its job, like a Dockerfile &lt;code&gt;FROM&lt;/code&gt; line or a Terraform &lt;code&gt;module source&lt;/code&gt;, so it is deterministic and re-read on every run. An inferred dependency is guessed from statistical signal such as embeddings or LLM output, so it comes with a probability and no ground truth. A registered dependency is one a human typed into a catalog like Backstage or Port, which no machine executes, so it is accurate only as of the last edit. The three differ in what keeps the edge honest: the machine that runs it, nothing, or human diligence alone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do dependency-mapping tools actually detect dependencies, and are the edges parsed or inferred?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It depends on the tool, and "parsed vs inferred" hides a third option. Some tools parse the edge from a manifest declaration deterministically, some infer it from statistical signal like embeddings or model output, and some read it from a human-maintained catalog. Parsing gives you an edge that is true to the manifest and self-correcting because the machine re-reads it; inference gives you probabilistic coverage of couplings nothing declares; a catalog gives you relationships like ownership that neither can see, at the cost of drift. Observing an edge at runtime is a fourth mode, but it needs the change already running, so it cannot tell you the blast radius of something not yet merged.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do grep and symbol graphs miss infrastructure dependencies?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Grep finds the literal string but not the resolution rule behind it: a Terraform registry short-address, a Helm semver range, or a Dockerfile &lt;code&gt;FROM ${VAR}&lt;/code&gt; is text plus a rule that maps it to an actual repo, and grep does not run the rule. Symbol graphs index programming-language symbols, and Helm, Terraform, Dockerfile, GitLab CI, GitHub Actions, Ansible, and Kustomize constructs are not symbols any compiler parses. For &lt;code&gt;go.mod&lt;/code&gt; and npm the import path is a language symbol, so a symbol graph can resolve those cross-repo when cross-repo indexing is configured, which most installs skip.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does a parsed dependency edge need a confidence score?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not to say whether the edge exists. A parsed edge is read from a declaration, not guessed, so there is no probability that it is real. A confidence score on a parsed edge measures resolution quality instead: how cleanly the declared reference matched a known target, where 1.0 is an exact match and lower means a documented heuristic like a naming convention was needed. That is the same distinction Sourcegraph draws between precise and search-based navigation, and it is a different quantity from the existence-probability an inference tool attaches to a guessed edge.&lt;/p&gt;

</description>
      <category>crossrepodependencies</category>
      <category>blastradius</category>
      <category>dependencygraph</category>
      <category>platformengineering</category>
    </item>
    <item>
      <title>How to add a blast-radius gate to your merge pipeline</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Fri, 03 Jul 2026 08:56:40 +0000</pubDate>
      <link>https://dev.to/danielwe/how-to-add-a-blast-radius-gate-to-your-merge-pipeline-4jm2</link>
      <guid>https://dev.to/danielwe/how-to-add-a-blast-radius-gate-to-your-merge-pipeline-4jm2</guid>
      <description>&lt;p&gt;&lt;em&gt;A pull request to a repository that a hundred others build on should not merge with one approval from a phone. Here is a CI gate that routes the review by measured downstream exposure, in two HTTP calls and about forty lines, on GitLab CI or GitHub Actions.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Someone opens a one-line pull request. It bumps the default in a shared Terraform module, or edits the &lt;code&gt;FROM&lt;/code&gt; line in a base image, or changes an &lt;code&gt;include&lt;/code&gt; in a CI template. The plan is clean. The diff is three characters. CI goes green, one reviewer approves on their phone between meetings, and it merges. Then the next &lt;code&gt;terraform init&lt;/code&gt; in six other repositories resolves the new version, and the people who own those repositories find out from their own pipelines.&lt;/p&gt;

&lt;p&gt;The change was correct in isolation. What went wrong was the review. A repository that a hundred others build on had exactly one person look at the thing before it shipped, and that person had no way to see, from inside the pull request, who was standing downstream.&lt;/p&gt;

&lt;p&gt;The industry's answer to this has arrived as a wave of pre-merge blast-radius gates, and they are worth taking seriously. &lt;a href="https://github.com/overmindtech/actions" rel="noopener noreferrer"&gt;Overmind&lt;/a&gt; ships a GitHub Action that submits each pull request's Terraform plan and comments the blast radius straight onto the PR. An &lt;a href="https://dev.to/aws-builders/terraform-plan-shows-what-youre-changing-blast-radius-shows-what-youre-breaking-3324"&gt;open-source project&lt;/a&gt; reads live dependency relationships out of AWS Config and fails the build with a threshold gate when a change fans out too far. Amazon's answer, after its own change-failure numbers moved, was blunter: require senior sign-off on AI-assisted changes from junior and mid-level engineers. Three gates, and every one of them checks a different graph.&lt;/p&gt;

&lt;p&gt;A blast-radius merge gate is only ever as good as the graph it queries. For the class of change that most needs a second pair of eyes (a base image bump, a shared module rename, a CI-template edit), the graph you want is the artifact graph: which repositories declare a build-time dependency on the thing this pull request changes. That is the graph none of the gates above reads, because a &lt;code&gt;FROM&lt;/code&gt; bump has no Terraform plan, no running resource, and no code symbol, and it is the one you can query from CI today in two HTTP calls. This post makes &lt;a href="https://riftmap.dev/blog/can-ai-check-blast-radius-of-pr-before-merge/" rel="noopener noreferrer"&gt;the earlier three-graph post&lt;/a&gt; operational: the same argument, turned into a job you can paste into a pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  The gate is two GET requests
&lt;/h2&gt;

&lt;p&gt;The whole gate is two GET requests and a threshold. You need one thing that is not in the pipeline, a Riftmap graph of your organisation, which is a one-off read-only scan I will come back to at the end. Given that, the gate resolves itself, because both platforms hand a CI job the repository's own path for free. It is &lt;code&gt;$CI_PROJECT_PATH&lt;/code&gt; on GitLab and &lt;code&gt;${{ github.repository }}&lt;/code&gt; on GitHub Actions, and that is exactly what the lookup call takes. Nested GitLab subgroups are included: on a project at &lt;code&gt;platform/runtime/base-images&lt;/code&gt;, &lt;code&gt;$CI_PROJECT_PATH&lt;/code&gt; is that whole three-segment path, which is exactly the form the scan stores, so the lookup matches nested namespaces without any massaging.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Resolve owner/repo to its Riftmap id.&lt;/span&gt;
&lt;span class="nv"&gt;REPO_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-sf&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-Key: &lt;/span&gt;&lt;span class="nv"&gt;$RIFTMAP_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://api.riftmap.dev/api/v1/repositories/lookup?full_path=&lt;/span&gt;&lt;span class="nv"&gt;$REPO_PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.id'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# 2. Ask who declares a dependency on it.&lt;/span&gt;
curl &lt;span class="nt"&gt;-sf&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-Key: &lt;/span&gt;&lt;span class="nv"&gt;$RIFTMAP_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://api.riftmap.dev/api/v1/repositories/&lt;/span&gt;&lt;span class="nv"&gt;$REPO_ID&lt;/span&gt;&lt;span class="s2"&gt;/impact?max_depth=3&amp;amp;min_confidence=0.8"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The impact call walks the dependency graph outward from your repository and returns every repository that depends on it, each tagged with a &lt;code&gt;depth&lt;/code&gt; and a &lt;code&gt;confidence&lt;/code&gt;, plus a &lt;code&gt;total_affected&lt;/code&gt; count. Depth 1 is who breaks first: the repositories whose manifests name yours directly, a &lt;a href="https://riftmap.dev/blog/how-to-find-every-consumer-of-your-terraform-module/" rel="noopener noreferrer"&gt;&lt;code&gt;source&lt;/code&gt; block resolving to your module&lt;/a&gt;, a &lt;code&gt;FROM&lt;/code&gt; line pinned to your image, an &lt;code&gt;include&lt;/code&gt; pointing at your template. Deeper hops are the amplification. &lt;code&gt;min_confidence&lt;/code&gt; defaults to &lt;code&gt;0.8&lt;/code&gt;, which drops the heuristic matches and keeps the edges Riftmap parsed rather than guessed, and for a gate you want it there. (That number is resolution confidence, not existence probability, and the &lt;code&gt;0.8&lt;/code&gt; floor is doing something more specific than it looks; a companion post works through &lt;a href="https://riftmap.dev/blog/declared-inferred-registered/" rel="noopener noreferrer"&gt;how a tool knows an edge exists at all&lt;/a&gt; and what the score actually measures.)&lt;/p&gt;

&lt;p&gt;One thing has to be honest before you wire this to anything, because it decides whether the whole idea is useful or noise. The count is the standing consumer population as of the last scan, at the level of the whole repository. It is not a diff of which consumers your specific change breaks. A repository with 147 downstream consumers returns 147 whether this pull request renames an output every one of them uses or fixes a typo in a comment. So this is a gate on &lt;strong&gt;exposure, not on breakage&lt;/strong&gt;. Read the rest of this post with that framing and it stays sharp. Sell it to your team as a breakage detector, and the first person to run it on a busy shared repository will watch it fire on every pull request, including their own README fix, and quietly conclude the tool is broken. It is not measuring danger. It is measuring how many people a mistake here could reach.&lt;/p&gt;

&lt;h2&gt;
  
  
  The GitLab CI recipe
&lt;/h2&gt;

&lt;p&gt;Here is the entire gate as a GitLab CI job that runs on every merge request. It needs one thing configured, a masked CI/CD variable called &lt;code&gt;RIFTMAP_API_KEY&lt;/code&gt; holding a read-only Riftmap key (mint one labelled &lt;code&gt;ci&lt;/code&gt; so you can revoke it independently of the rest).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .gitlab-ci.yml&lt;/span&gt;
&lt;span class="na"&gt;blast-radius&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;alpine:3.20&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_PIPELINE_SOURCE == "merge_request_event"&lt;/span&gt;
  &lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;RIFTMAP_BASE_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.riftmap.dev/api/v1"&lt;/span&gt;
    &lt;span class="na"&gt;THRESHOLD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10"&lt;/span&gt;                    &lt;span class="c1"&gt;# direct consumers that warrant the review lane&lt;/span&gt;
  &lt;span class="na"&gt;before_script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;apk add --no-cache curl jq&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;# GitLab hands the job this repo's path for free.&lt;/span&gt;
      &lt;span class="s"&gt;REPO_ID=$(curl -sf -H "X-API-Key: $RIFTMAP_API_KEY" \&lt;/span&gt;
        &lt;span class="s"&gt;"$RIFTMAP_BASE_URL/repositories/lookup?full_path=$CI_PROJECT_PATH" | jq -r '.id')&lt;/span&gt;

      &lt;span class="s"&gt;# A repo Riftmap has not scanned yet returns nothing. Skip loudly rather than pass silently.&lt;/span&gt;
      &lt;span class="s"&gt;if [ -z "$REPO_ID" ] || [ "$REPO_ID" = "null" ]; then&lt;/span&gt;
        &lt;span class="s"&gt;echo "Repo not in the Riftmap graph yet; skipping blast-radius check."&lt;/span&gt;
        &lt;span class="s"&gt;exit 0&lt;/span&gt;
      &lt;span class="s"&gt;fi&lt;/span&gt;

      &lt;span class="s"&gt;IMPACT=$(curl -sf -H "X-API-Key: $RIFTMAP_API_KEY" \&lt;/span&gt;
        &lt;span class="s"&gt;"$RIFTMAP_BASE_URL/repositories/$REPO_ID/impact?max_depth=3&amp;amp;min_confidence=0.8")&lt;/span&gt;

      &lt;span class="s"&gt;DIRECT=$(echo "$IMPACT" | jq '[.affected_repositories[] | select(.depth == 1)] | length')&lt;/span&gt;
      &lt;span class="s"&gt;TOTAL=$(echo "$IMPACT"  | jq '.total_affected')&lt;/span&gt;

      &lt;span class="s"&gt;echo "Downstream consumers: $DIRECT direct, $TOTAL transitive."&lt;/span&gt;
      &lt;span class="s"&gt;if [ "$DIRECT" -ge "$THRESHOLD" ]; then&lt;/span&gt;
        &lt;span class="s"&gt;echo "Over threshold ($THRESHOLD): this change touches a high-fan-in repository."&lt;/span&gt;
      &lt;span class="s"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the complete gate. It resolves the repository, asks who depends on it, counts the direct consumers, and prints the number. It makes only GET requests, so it never trips Riftmap's rate limits, which apply to writes and not reads, and it holds no cloud credentials, because the cloud was never involved. The one guard that earns its place is the empty-&lt;code&gt;REPO_ID&lt;/code&gt; check: a repository the scan has not reached yet, and every brand-new repository is one of those, returns nothing from the lookup, and without the guard the rest of the job would quietly compute nothing and go green while printing blank counts. That is the single worst failure mode a gate can have, looking like it ran and found no exposure when it simply never ran. Skipping loudly keeps the gate honest about the one thing it is entitled to speak on, which is repositories Riftmap has actually scanned. The job then exits 0, which is deliberate. What to do with the number is the next section, and blocking the merge is the option you should reach for last, not first.&lt;/p&gt;

&lt;h2&gt;
  
  
  The same gate in GitHub Actions
&lt;/h2&gt;

&lt;p&gt;The GitHub Actions version is the identical two calls wearing GitHub's pull-request plumbing. &lt;code&gt;curl&lt;/code&gt; and &lt;code&gt;jq&lt;/code&gt; are already on the &lt;code&gt;ubuntu-latest&lt;/code&gt; runner, so there is no install step, and the repository path arrives as &lt;code&gt;${{ github.repository }}&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/blast-radius.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Blast radius&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;blast-radius&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;RIFTMAP_BASE_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://api.riftmap.dev/api/v1&lt;/span&gt;
      &lt;span class="na"&gt;THRESHOLD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10"&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Measure downstream exposure&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;RIFTMAP_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.RIFTMAP_API_KEY }}&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;# GitHub hands the job this repo's path for free.&lt;/span&gt;
          &lt;span class="s"&gt;REPO_ID=$(curl -sf -H "X-API-Key: $RIFTMAP_API_KEY" \&lt;/span&gt;
            &lt;span class="s"&gt;"$RIFTMAP_BASE_URL/repositories/lookup?full_path=${{ github.repository }}" | jq -r '.id')&lt;/span&gt;

          &lt;span class="s"&gt;# A repo Riftmap has not scanned yet returns nothing. Skip loudly rather than pass silently.&lt;/span&gt;
          &lt;span class="s"&gt;if [ -z "$REPO_ID" ] || [ "$REPO_ID" = "null" ]; then&lt;/span&gt;
            &lt;span class="s"&gt;echo "Repo not in the Riftmap graph yet; skipping blast-radius check."&lt;/span&gt;
            &lt;span class="s"&gt;exit 0&lt;/span&gt;
          &lt;span class="s"&gt;fi&lt;/span&gt;

          &lt;span class="s"&gt;IMPACT=$(curl -sf -H "X-API-Key: $RIFTMAP_API_KEY" \&lt;/span&gt;
            &lt;span class="s"&gt;"$RIFTMAP_BASE_URL/repositories/$REPO_ID/impact?max_depth=3&amp;amp;min_confidence=0.8")&lt;/span&gt;

          &lt;span class="s"&gt;DIRECT=$(echo "$IMPACT" | jq '[.affected_repositories[] | select(.depth == 1)] | length')&lt;/span&gt;
          &lt;span class="s"&gt;TOTAL=$(echo "$IMPACT"  | jq '.total_affected')&lt;/span&gt;

          &lt;span class="s"&gt;echo "### Blast radius: $DIRECT direct consumers, $TOTAL transitive" &amp;gt;&amp;gt; "$GITHUB_STEP_SUMMARY"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Store the key as an Actions secret named &lt;code&gt;RIFTMAP_API_KEY&lt;/code&gt;. Same forty lines, same two calls, same read-only key. The only real difference between the platforms shows up when you want the gate to say something on the pull request rather than in the job log, which is where they diverge, and it is worth being exact about why.&lt;/p&gt;

&lt;h2&gt;
  
  
  The self-updating version of CODEOWNERS
&lt;/h2&gt;

&lt;p&gt;The useful version of this gate does not block the merge. It routes the review. The instinct with a number and a threshold is to fail the build, and that instinct is the one thing to unlearn here, because the far more valuable job the number can do is decide who should be looking at the change.&lt;/p&gt;

&lt;p&gt;Think about what &lt;code&gt;CODEOWNERS&lt;/code&gt; already does for you. It routes review by path: touch files under &lt;code&gt;modules/&lt;/code&gt;, and the platform team is added as a reviewer automatically. This gate routes review by measured downstream exposure instead: this repository currently has N repositories building on it, so a change to it deserves an owner of that shared surface on the pull request, not just whoever opened it. The difference from a hand-written &lt;code&gt;CODEOWNERS&lt;/code&gt; line is that the number tracks the graph. A module that grows from three consumers to forty crosses your threshold on its own, the day the fortieth repository adds the dependency, with nobody remembering to edit a rule. A module that loses its consumers drops out of the lane the same way. It is &lt;code&gt;CODEOWNERS&lt;/code&gt; that maintains itself against what is actually downstream, rather than against what someone believed was downstream the last time they touched the file.&lt;/p&gt;

&lt;p&gt;That reframing is also why the exposure-not-breakage limit from earlier stops mattering. Routing review by exposure never needed to know whether your change was breaking. It only needs to know how many teams are downstream, because that is what makes pulling in a senior reviewer proportionate. You are sizing a coordination cost, not predicting a failure, and the count is exactly the right instrument for sizing a coordination cost.&lt;/p&gt;

&lt;p&gt;Underneath the routing sits a policy split I &lt;a href="https://riftmap.dev/blog/ai-doesnt-understand-blast-radius/" rel="noopener noreferrer"&gt;proposed in an earlier post&lt;/a&gt; and never actually shipped. A change with no external consumers gets the fast lane, because there is nothing downstream to coordinate and speed is free. A change under the threshold passes with the consumer list posted as a courtesy, so the author knows what they are near. A change over the threshold, or one that touches a repository tagged customer-critical, engages the review lane. Amazon reached for seniority as its proxy because seniority is trivial to encode: junior author, therefore review. Downstream exposure is the proxy Amazon actually meant. A senior engineer changing a shared base image needs the extra eyes more than a junior engineer fixing a log line in a leaf service, and only the graph can tell those two apart.&lt;/p&gt;
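
&lt;p&gt;A minimal sketch of that three-way split, assuming the direct count and threshold from the recipes above. &lt;code&gt;review_lane&lt;/code&gt; and its optional third argument are hypothetical names, and where the customer-critical tag comes from is left to you. It treats a count equal to the threshold as over it, matching the &lt;code&gt;-ge&lt;/code&gt; test in the jobs.&lt;/p&gt;

```shell
# Hypothetical helper: map an exposure count to one of the three lanes.
# Usage: review_lane DIRECT THRESHOLD [CRITICAL]
review_lane() {
  direct=$1
  threshold=$2
  critical=${3:-false}   # "true" when the repository is tagged customer-critical

  if [ "$critical" = "true" ] || [ "$direct" -ge "$threshold" ]; then
    echo "review"        # at or over the threshold, or customer-critical
  elif [ "$direct" -eq 0 ]; then
    echo "fast"          # nothing downstream to coordinate
  else
    echo "courtesy"      # under the threshold: pass and post the consumer list
  fi
}

review_lane 0 10         # no external consumers
review_lane 4 10         # a few consumers, under the threshold
review_lane 147 10       # high fan-in
review_lane 2 10 true    # small fan-in, but tagged customer-critical
```

&lt;p&gt;The four calls print &lt;code&gt;fast&lt;/code&gt;, &lt;code&gt;courtesy&lt;/code&gt;, &lt;code&gt;review&lt;/code&gt; and &lt;code&gt;review&lt;/code&gt;. Keeping the decision in one function means the label, the comment and any later hard stop all read the same lane.&lt;/p&gt;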

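&lt;p&gt;Turning the number into a review is where the platforms differ. On GitHub, the built-in token does it for free. Add &lt;code&gt;permissions: { contents: read, pull-requests: write }&lt;/code&gt; and one step. Each &lt;code&gt;run:&lt;/code&gt; step starts a fresh shell, so either append these lines to the measuring step or have that step write &lt;code&gt;DIRECT&lt;/code&gt; and &lt;code&gt;TOTAL&lt;/code&gt; to &lt;code&gt;$GITHUB_ENV&lt;/code&gt; first. The workflow has no checkout, so &lt;code&gt;gh&lt;/code&gt; also needs &lt;code&gt;GH_REPO&lt;/code&gt; set to &lt;code&gt;${{ github.repository }}&lt;/code&gt; to know which repository the pull request belongs to, and the label has to exist already:&lt;br&gt;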
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;GH_TOKEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ github.token }}&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;PR=${{ github.event.pull_request.number }}&lt;/span&gt;
          &lt;span class="s"&gt;if [ "$DIRECT" -ge "$THRESHOLD" ]; then&lt;/span&gt;
            &lt;span class="s"&gt;gh pr edit "$PR" --add-label "blast-radius/high"&lt;/span&gt;
            &lt;span class="s"&gt;gh pr comment "$PR" --body \&lt;/span&gt;
              &lt;span class="s"&gt;"This change affects **$DIRECT** repositories directly ($TOTAL transitively). Engaging the shared-artifact review lane."&lt;/span&gt;
          &lt;span class="s"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
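
&lt;p&gt;The step above reads &lt;code&gt;$DIRECT&lt;/code&gt; and &lt;code&gt;$TOTAL&lt;/code&gt;, which were set in the measuring step, and a separate &lt;code&gt;run:&lt;/code&gt; step does not inherit shell variables. One way to hand them over, sketched here, is to append them to the file named by &lt;code&gt;$GITHUB_ENV&lt;/code&gt; at the end of the measuring step. &lt;code&gt;export_counts&lt;/code&gt; is a hypothetical helper name, and the &lt;code&gt;mktemp&lt;/code&gt; fallback exists only so the sketch runs outside Actions.&lt;/p&gt;

```shell
# Hypothetical helper: append the two counts to the file GitHub Actions
# reads step-to-step environment variables from (one NAME=value per line).
# tee -a appends to the file and also echoes the values into the job log.
export_counts() {
  printf 'DIRECT=%s\nTOTAL=%s\n' "$1" "$2" | tee -a "$GITHUB_ENV"
}

# Outside Actions, fall back to a scratch file so the sketch still runs.
GITHUB_ENV=${GITHUB_ENV:-$(mktemp)}

# In the workflow this would be: export_counts "$DIRECT" "$TOTAL"
export_counts 12 40
```

&lt;p&gt;Every later step in the same job then sees &lt;code&gt;DIRECT&lt;/code&gt; and &lt;code&gt;TOTAL&lt;/code&gt; as ordinary environment variables.&lt;/p&gt;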



&lt;p&gt;On GitLab the merge-request notes API needs a token with &lt;code&gt;api&lt;/code&gt; scope, and the pipeline's own &lt;code&gt;$CI_JOB_TOKEN&lt;/code&gt; will not post notes, so store a project access token as a masked variable (&lt;code&gt;GITLAB_NOTE_TOKEN&lt;/code&gt;) and call the notes endpoint directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;      curl &lt;span class="nt"&gt;-sf&lt;/span&gt; &lt;span class="nt"&gt;--request&lt;/span&gt; POST &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s2"&gt;"PRIVATE-TOKEN: &lt;/span&gt;&lt;span class="nv"&gt;$GITLAB_NOTE_TOKEN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--data-urlencode&lt;/span&gt; &lt;span class="s2"&gt;"body=This change affects &lt;/span&gt;&lt;span class="nv"&gt;$DIRECT&lt;/span&gt;&lt;span class="s2"&gt; repositories directly (&lt;/span&gt;&lt;span class="nv"&gt;$TOTAL&lt;/span&gt;&lt;span class="s2"&gt; transitively)."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CI_API_V4_URL&lt;/span&gt;&lt;span class="s2"&gt;/projects/&lt;/span&gt;&lt;span class="nv"&gt;$CI_PROJECT_ID&lt;/span&gt;&lt;span class="s2"&gt;/merge_requests/&lt;/span&gt;&lt;span class="nv"&gt;$CI_MERGE_REQUEST_IID&lt;/span&gt;&lt;span class="s2"&gt;/notes"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is what that comment looks like on a pull request to a genuinely high-fan-in repository. When Riftmap &lt;a href="https://riftmap.dev/blog/can-ai-check-blast-radius-of-pr-before-merge/" rel="noopener noreferrer"&gt;scanned Cloud Posse&lt;/a&gt;, &lt;code&gt;terraform-null-label&lt;/code&gt; came back with 147 direct consumers, 61% of the whole organisation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Blast radius: 147 repositories build on this&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This change is to &lt;code&gt;terraform-null-label&lt;/code&gt;, which 147 repositories in the organisation declare as a direct dependency. Engaging the shared-artifact review lane.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is the comment the forty-line recipe produces from the impact call alone. Pull the consumer view for the artifact as well, the same shape the worked example in the Cloud Posse scan post returns, and the comment can carry the &lt;a href="https://riftmap.dev/blog/version-constraints-across-real-terraform-estates/" rel="noopener noreferrer"&gt;version detail&lt;/a&gt; that tells a reviewer where the coordination actually lands: of those 147, 138 are on the latest tag and 9 are lagging across six older ones. The reviewer arrives already knowing there are nine repositories to nudge onto the new version, not 147 to panic about. That is the difference between an exposure number and a coordination plan, and it is why the number is worth surfacing where a human will read it.&lt;/p&gt;
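
&lt;p&gt;As a rough sketch of how that version line could be produced: this assumes the consumer view has already been flattened to one pinned version per consumer, one per line, which is an assumption about your own preprocessing rather than a documented response shape. &lt;code&gt;summarise_pins&lt;/code&gt; is a hypothetical name and the tags in the example are made up.&lt;/p&gt;

```shell
# Hypothetical helper: read one pinned version per line on stdin and
# summarise how many consumers are on the latest tag versus older ones.
# Usage: summarise_pins LATEST_TAG
summarise_pins() {
  latest=$1
  awk -v latest="$latest" '
    {
      # Appending "" forces a string comparison, so 1.4 and 1.40 stay distinct.
      if (($1 "") == (latest "")) current++
      else { lagging++; seen[$1] = 1 }
    }
    END {
      for (v in seen) older++
      printf "%d on %s, %d lagging across %d older tags\n", current, latest, lagging, older
    }'
}

# Made-up sample: five consumers, three of them on the latest tag.
printf '%s\n' v1.4.0 v1.4.0 v1.3.2 v1.4.0 v1.1.0 | summarise_pins v1.4.0
```

&lt;p&gt;On the sample input this prints &lt;code&gt;3 on v1.4.0, 2 lagging across 2 older tags&lt;/code&gt;, which is the sentence a reviewer needs in the comment.&lt;/p&gt;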

&lt;p&gt;If you do want the gate to hard-stop a merge, that is a one-line change and an opt-in, not the default. Exit non-zero over the threshold, and make the job mandatory: a required status check in GitHub branch protection, or the &lt;em&gt;Pipelines must succeed&lt;/em&gt; merge check on GitLab:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;      &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DIRECT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-ge&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$THRESHOLD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;exit &lt;/span&gt;1 &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start with the label and the comment. Reach for &lt;code&gt;exit 1&lt;/code&gt; only on the handful of repositories where a large blast radius genuinely should stop the world, and even then, expect to spend a week tuning &lt;code&gt;THRESHOLD&lt;/code&gt; before anyone trusts a red check that came from a consumer count.&lt;/p&gt;

&lt;p&gt;One last discipline, because the neighbours are already crossing it. The open-source AWS Config gate has an &lt;code&gt;ai-gate&lt;/code&gt; mode, and Port's guide has an LLM reason over catalogue relations and score the risk. Both are reasonable, and a written risk narrative is a genuinely nice thing to drop into a pull request. But it is a judgement, and you should not block a merge on a judgement that can vary between two runs on the same diff. The consumer count is not a judgement. It is a graph traversal that is either right or wrong about who declares the dependency, and it is the part you can safely automate a routing decision on. Gate on the enumeration. Treat the narrative as advice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this gate stops
&lt;/h2&gt;

&lt;p&gt;This gate sees one layer of dependency, and three kinds of blast radius sit outside it. Being exact about all three is the difference between a tool your team keeps and one they mute.&lt;/p&gt;

&lt;p&gt;The first is the one already covered: it enumerates, it does not diff. It answers "who is downstream," not "does this change break them." You keep the false alarms down by holding &lt;code&gt;min_confidence&lt;/code&gt; at &lt;code&gt;0.8&lt;/code&gt; and, if a particular repository is noisy, by only running the job when the files that actually declare interfaces change, through &lt;code&gt;paths:&lt;/code&gt; on GitHub or &lt;code&gt;changes:&lt;/code&gt; on GitLab. What you never do is describe it as catching breaking changes, because it does not, and the copy that says it does is the copy that gets the tool uninstalled.&lt;/p&gt;
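
&lt;p&gt;If you would rather keep that filter inside the script than in &lt;code&gt;paths:&lt;/code&gt; or &lt;code&gt;changes:&lt;/code&gt;, the same idea can be sketched in shell. This is a swapped-in alternative, not the platform feature: &lt;code&gt;touches_interface&lt;/code&gt; is a hypothetical name, the patterns are placeholders for whatever files declare interfaces in your repository, and the list of changed paths is assumed to come from something like &lt;code&gt;git diff --name-only&lt;/code&gt; against the target branch.&lt;/p&gt;

```shell
# Hypothetical helper: succeed if any changed path on stdin is a file
# that declares an interface. The patterns are placeholders to adapt.
touches_interface() {
  while IFS= read -r path; do
    case "$path" in
      variables.tf|outputs.tf|*/variables.tf|*/outputs.tf|Dockerfile|*/Dockerfile|templates/*.yml)
        return 0 ;;
    esac
  done
  return 1
}

# Example with a hand-written list of changed paths.
if printf '%s\n' README.md modules/vpc/outputs.tf | touches_interface; then
  echo "interface files changed: run the blast-radius check"
else
  echo "no interface files changed: skip"
fi
```

&lt;p&gt;The example takes the first branch, because &lt;code&gt;modules/vpc/outputs.tf&lt;/code&gt; matches. A pull request that only touches the README would take the second.&lt;/p&gt;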

&lt;p&gt;The second is freshness. The graph is a scan artifact, not live state, and a gate makes a merge decision, so eventual consistency with your scan cadence is a sharper caveat here than the same lag would be in a dashboard. State it symmetrically and it reads as engineering judgement rather than apology. Overmind and the AWS Config gate read live state, so their answer is current to the second, and they pay for it: both need cloud credentials inside the pipeline and a real plan step before they can say anything at all. Riftmap reads a graph built by a scan, so the answer is current to your scan cadence, and that is precisely why it is two GET requests with no cloud credentials in CI and no &lt;code&gt;terraform plan&lt;/code&gt; in the job. One buys freshness with access. The other buys cheapness with lag. For a coarse routing decision on a mature shared artifact, whose fan-in barely moves week to week, the lag is not what bites you, and you can surface it explicitly anyway. Every repository the lookup returns carries &lt;code&gt;last_scanned_at&lt;/code&gt; and &lt;code&gt;last_activity_at&lt;/code&gt;, so keep the whole lookup payload instead of pulling out only the id, and check the two against each other:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# capture the whole lookup response, not just .id&lt;/span&gt;
&lt;span class="nv"&gt;REPO&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-sf&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-Key: &lt;/span&gt;&lt;span class="nv"&gt;$RIFTMAP_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$RIFTMAP_BASE_URL&lt;/span&gt;&lt;span class="s2"&gt;/repositories/lookup?full_path=&lt;/span&gt;&lt;span class="nv"&gt;$REPO_PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;REPO_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$REPO&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.id'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nv"&gt;STALE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$REPO&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="s1"&gt;'if .last_activity_at &amp;gt; .last_scanned_at then true else false end'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;# if/fi, so an up-to-date repo does not end the script on a failed test&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$STALE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"true"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Note: this repo has changed since Riftmap last scanned it; the count may be behind."&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The third is scope, and it is where the honest line with the live-state tools lives. This gate is artifact-scoped and source-scoped. Overmind's is plan-scoped and runtime-scoped. Overmind sees the resource someone created in the console that your plan is about to touch, and the &lt;a href="https://riftmap.dev/blog/gitlab-orbit-and-the-artifact-layer/" rel="noopener noreferrer"&gt;GitLab Blast Radius Reviewer&lt;/a&gt; walks the symbol graph and prunes any change with no public symbols, so it catches an exported function's callers and never sees a &lt;code&gt;FROM&lt;/code&gt; bump at all. This gate sees the six repositories whose &lt;code&gt;FROM&lt;/code&gt; line resolves to the image you just rebuilt, and cannot see a runtime HTTP call that no manifest declares. A serious platform team at scale plausibly wants more than one of these running side by side, because they are blind in opposite directions, and the artifact layer is the one that has been &lt;a href="https://riftmap.dev/blog/riftmap-vs-overmind/" rel="noopener noreferrer"&gt;empty until now&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pull requests nobody looks at twice
&lt;/h2&gt;

&lt;p&gt;A merge gate is a bet about which mistakes are worth stopping to look at. The blast-radius gates shipping this year all make that bet on a graph, and the graph decides which mistakes the gate can even see. Bet on the live-cloud graph and you catch the console resource and miss the cross-repo consumer. Bet on the artifact graph and you catch the change your reviewer, and your coding agent, both think is safe: the &lt;code&gt;FROM&lt;/code&gt; line, the &lt;code&gt;source&lt;/code&gt; ref, the shared module whose consumer list nobody has counted since the person who set it up handed in their notice. And it matters more for the agent than the human, because a person opening that pull request at least half-remembers what is downstream, whereas an agent making the same change from one repository's clone knows nothing about the other repositories at all. Wire the number to a review lane, and the pull requests with the largest blast radius stop being the ones that merge with a single approval from a phone. They become the ones the right person was pulled in to see.&lt;/p&gt;

&lt;p&gt;None of this runs until Riftmap has a graph of your organisation to answer the two calls, and that graph is the part worth having. It is one read-only token across your GitLab group or GitHub organisation, no per-repo config and no YAML catalogue to keep current, and once it exists the blast radius of any repository is a single API call. The recipe above is just the reason the scan pays for itself. And there is deliberately no Riftmap Action or GitLab component yet: the gate is forty lines you can read, own, and change, and I would rather ship that than a black box you have to trust. If you would use a maintained drop-in instead, tell me, and I will build it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Questions engineers actually ask
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;How do I add a blast radius check to my CI pipeline?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Compose two Riftmap API calls in a CI job. Look the repository up by its path (&lt;code&gt;$CI_PROJECT_PATH&lt;/code&gt; on GitLab, &lt;code&gt;${{ github.repository }}&lt;/code&gt; on GitHub Actions) to get its id, then call the impact endpoint to get every repository that declares a dependency on it. Count the direct consumers and comment or route on a threshold. It runs in seconds, needs only a read-only Riftmap key, and works the same whether a person or an agent opened the pull request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I fail a pull request when it affects too many repos?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Set a threshold on the direct downstream consumer count, exit the CI job non-zero when a change is over it, and mark that job as a required check in branch protection. The more useful default is not to block but to route: keep the job advisory and use the number to request review from the owners of the shared surface when exposure is high. Gate on the consumer count, which is a deterministic graph traversal, rather than on an AI-generated risk score, which is a judgement that can vary between runs.&lt;/p&gt;
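
&lt;p&gt;As a minimal sketch of that threshold logic, assuming the impact response carries an &lt;code&gt;affected_repositories&lt;/code&gt; list whose entries have &lt;code&gt;full_path&lt;/code&gt; and &lt;code&gt;depth&lt;/code&gt; fields: the function names, the threshold value, and the &lt;code&gt;blocking&lt;/code&gt; switch are illustrative choices of mine rather than part of any Riftmap API, and the sample data is made up.&lt;/p&gt;

```python
from typing import Any


def direct_consumers(impact: dict[str, Any]) -> list[str]:
    """Return the full_path of every depth-1 entry in a parsed impact response."""
    return [
        repo["full_path"]
        for repo in impact.get("affected_repositories", [])
        if repo.get("depth") == 1
    ]


def gate_decision(impact: dict[str, Any], threshold: int, blocking: bool = False) -> str:
    """Return "pass", "route" (advisory, request review) or "block" (fail the job)."""
    count = len(direct_consumers(impact))
    if count > threshold:
        return "block" if blocking else "route"
    return "pass"


# Made-up sample in the assumed response shape: two direct consumers, one transitive.
sample = {
    "affected_repositories": [
        {"full_path": "acme/service-a", "depth": 1},
        {"full_path": "acme/service-b", "depth": 1},
        {"full_path": "acme/service-c", "depth": 2},
    ]
}

decision = gate_decision(sample, threshold=1)
# Only a "block" decision should fail the CI job; "route" stays advisory.
exit_code = 1 if decision == "block" else 0
print(decision, exit_code)
```

&lt;p&gt;Keeping the decision a pure function of the consumer count is what makes the gate deterministic: the same graph and the same threshold always give the same answer, whoever opened the pull request.&lt;/p&gt;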

&lt;p&gt;&lt;strong&gt;Do I need cloud credentials to check blast radius in CI?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not for the artifact layer. Riftmap answers from a dependency graph it built during a one-off scan of your organisation, so the pipeline needs only a read-only Riftmap API key and never touches your cloud. Live-state tools like Overmind and the AWS Config-based gates do need cloud access, because they read the current state of your running infrastructure at pull-request time. The trade-off is scope: the artifact graph sees cross-repo build-time dependencies, while live-state tools see runtime resources, including ones created outside your IaC.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can an AI coding agent run the same blast-radius check?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes, and it matters more for the agent than for a human. A person opening the pull request often half-remembers what is downstream, whereas an agent making the same cross-repo change from a single repository's clone knows nothing about the other repositories at all. The same impact call gives either one the downstream consumer list before the merge, which is context the agent structurally cannot reconstruct on its own.&lt;/p&gt;

</description>
      <category>blastradius</category>
      <category>cicd</category>
      <category>platformengineering</category>
      <category>githubactions</category>
    </item>
    <item>
      <title>Can AI check the blast radius of a PR before you merge?</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Thu, 02 Jul 2026 14:22:46 +0000</pubDate>
      <link>https://dev.to/danielwe/can-ai-check-the-blast-radius-of-a-pr-before-you-merge-50k0</link>
      <guid>https://dev.to/danielwe/can-ai-check-the-blast-radius-of-a-pr-before-you-merge-50k0</guid>
      <description>&lt;p&gt;&lt;em&gt;Three products shipped the same promise this quarter: see the blast radius of your change before you merge. They mean three different dependency graphs — and each is blind to a kind of breakage the others catch. Picking a tool is really picking which graph gets consulted.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;In the past few weeks I have watched three different products promise the same sentence. Overmind's homepage now leads with engineers seeing the &lt;a href="https://overmind.tech" rel="noopener noreferrer"&gt;blast radius of their changes, before they merge&lt;/a&gt;, delivered by simulating the live cloud environment. A GitLab hackathon project shipped an MIT-licensed &lt;a href="https://medium.com/@poojabhavani19/pre-merge-blast-radius-analysis-with-the-gitlab-orbit-knowledge-graph-f49f3e181e89" rel="noopener noreferrer"&gt;Blast Radius Reviewer agent&lt;/a&gt; to the Duo catalog that runs pre-merge cross-project impact analysis on the Orbit knowledge graph. And Port published an &lt;a href="https://docs.port.io/guides/all/calculate-blast-radius-with-ai/" rel="noopener noreferrer"&gt;official guide&lt;/a&gt; to calculating blast radius with AI before production deploys.&lt;/p&gt;

&lt;p&gt;So, can AI check the blast radius of a PR before you merge? Yes, if there is a dependency graph for it to query. The interesting question is which graph, because at least three different graphs are being sold under that sentence right now, and each one sees a different class of change break.&lt;/p&gt;

&lt;p&gt;I build one of these tools, so I have an obvious interest here. I am going to try to earn your trust the boring way, by being precise about what each graph genuinely sees, where each one structurally stops, and then walking one real pre-merge check end to end on a real public organisation, with the actual API responses.&lt;/p&gt;

&lt;h2&gt;
  
  
  One phrase, three graphs
&lt;/h2&gt;

&lt;p&gt;"Blast radius before merge" currently describes at least three products that answer three different questions. The symbol layer asks what code calls the code you changed. The live-state layer asks what running cloud resources your plan touches. The artifact layer asks which repositories build on the thing you changed. Each layer is genuinely good at something the other two structurally cannot do, and none of them is lying to you. They are just answering different questions with the same vocabulary.&lt;/p&gt;

&lt;p&gt;I have made a version of this argument before, scoped to Terraform: &lt;a href="https://riftmap.dev/blog/riftmap-vs-overmind/" rel="noopener noreferrer"&gt;Terraform blast radius is three questions&lt;/a&gt;, the in-config graph, the live-cloud graph, and the cross-repo graph. A pull request is a more general object than a Terraform plan. A PR can be application code, which pulls the symbol layer into the picture, and the old in-config visualisers were never pre-merge gates in any serious sense. There is also a fourth kind of graph on the market, the modeled catalog, which I will come to at the end; it is less a layer than a maintenance regime. So for the question in this post's title, the map has three territories. Here is each one, honestly.&lt;/p&gt;

&lt;h2&gt;
  
  
  The symbol layer: what code calls this code
&lt;/h2&gt;

&lt;p&gt;The symbol layer answers "who calls, imports, or references the code symbols this diff touches", and for application code it is the right layer to ask. The most interesting recent example is the Blast Radius Reviewer agent built on GitLab's Orbit knowledge graph. It extracts the public symbols from the diff, walks direct and transitive references outward up to three hops, weights risk by hop distance, pulls in reachable security findings and ownership, and lands a recommendation on the merge request. When Orbit is unavailable it degrades loudly to single-repo search and labels the result as partial rather than pretending it saw everything. It is well built, it is free, and it is exactly what I would want reviewing a Go or TypeScript change in a GitLab-native org. I have written before about how seriously I take &lt;a href="https://riftmap.dev/blog/gitlab-orbit-and-the-artifact-layer/" rel="noopener noreferrer"&gt;Orbit as a platform&lt;/a&gt;, and this agent is a good example of why.&lt;/p&gt;

&lt;p&gt;Now look at the assumption stated in its own writeup: changes that touch no public symbols are pruned early, because internal-only changes cannot have a cross-project blast radius. For application code, that is a reasonable heuristic. For infrastructure, it is exactly backwards. A Dockerfile &lt;code&gt;FROM&lt;/code&gt; line, a Terraform &lt;code&gt;source&lt;/code&gt; block, a Helm &lt;code&gt;Chart.yaml&lt;/code&gt; dependency, a GitLab CI &lt;code&gt;include&lt;/code&gt; directive: none of these is a public symbol. A diff that bumps them contains nothing for a symbol graph to traverse, so the highest-blast-radius PRs in an infrastructure estate are the ones this layer scores as impactless, by design rather than by defect.&lt;/p&gt;

&lt;p&gt;This is not a niche gap. When I &lt;a href="https://riftmap.dev/blog/cross-repo-edge-composition/" rel="noopener noreferrer"&gt;counted every cross-repo edge in two real organisations&lt;/a&gt;, not one was a code symbol. Symbol graphs and artifact graphs are &lt;a href="https://riftmap.dev/blog/symbol-graphs-and-artifact-graphs/" rel="noopener noreferrer"&gt;different categories&lt;/a&gt;, and the difference is the whole point of this post.&lt;/p&gt;

&lt;h2&gt;
  
  
  The live-state layer: what running resources the plan touches
&lt;/h2&gt;

&lt;p&gt;The live-state layer answers "what, in the cloud that is actually running, does this Terraform plan affect", and nothing else on this list can answer that. Overmind is the serious product here. It takes the plan JSON from your PR, queries your AWS, GCP, or Kubernetes environment in real time through read-only access, and maps the affected resources plus everything that depends on them, including resources created outside Terraform entirely, through the console or CloudFormation or a script someone ran in 2021. Those click-ops resources appear in no manifest anywhere. A parser can never find them, because there is nothing to parse. If your worry is "this plan will take down something nobody wrote down", Overmind's layer is the only one that can see it, and I have a lot of respect for it.&lt;/p&gt;

&lt;p&gt;Its structural boundary is the same thing that makes it powerful: it reasons from a plan against live state. The repositories that consume your module have not planned anything yet. When you change a shared module, the breakage does not happen in your plan. It happens later, in the plans of a hundred downstream repos, one at a time, as each of them eventually bumps. At the moment your PR is open there is no live signal in the place the damage will actually land. Overmind's graph is plan-scoped and runtime-scoped. Riftmap's is artifact-scoped and source-scoped. A serious platform team at scale plausibly wants both, and I mean that as a description of the architecture rather than as diplomacy. The &lt;a href="https://riftmap.dev/blog/riftmap-vs-overmind/" rel="noopener noreferrer"&gt;full comparison&lt;/a&gt; is its own post.&lt;/p&gt;

&lt;h2&gt;
  
  
  The artifact layer: which repos build on what you changed
&lt;/h2&gt;

&lt;p&gt;The artifact layer answers "which repositories declare a build-time dependency on the thing this PR changes", and in today's crop of pre-merge tools it is the question nobody else is answering. Nothing in the current wave walks Terraform &lt;code&gt;source&lt;/code&gt; blocks, Dockerfile &lt;code&gt;FROM&lt;/code&gt; lines, Helm chart dependencies, or CI &lt;code&gt;include&lt;/code&gt; directives across the repositories that were never checked out. So rather than argue it abstractly, here is one real pre-merge check, end to end.&lt;/p&gt;

&lt;p&gt;The organisation is &lt;a href="https://github.com/cloudposse" rel="noopener noreferrer"&gt;Cloud Posse&lt;/a&gt;, which maintains one of the largest public Terraform module estates there is. Riftmap scanned the org on 2026-07-02 with one read-only token: 242 repositories in about thirteen and a half minutes, zero errors, no per-repo configuration. The PR we will pretend to open is against &lt;a href="https://github.com/cloudposse/terraform-null-label" rel="noopener noreferrer"&gt;terraform-null-label&lt;/a&gt;, their naming and tagging convention module. Say it is a release-prep PR for a new tag that renames an output.&lt;/p&gt;

&lt;p&gt;Start with the two layers above, because the result is instructive. The diff touches HCL &lt;code&gt;output&lt;/code&gt; and &lt;code&gt;locals&lt;/code&gt; blocks, so there are no public symbols to extract; a symbol-layer reviewer prunes this change as having no cross-project impact. And the module declares no &lt;code&gt;resource&lt;/code&gt;, &lt;code&gt;data&lt;/code&gt;, or &lt;code&gt;module&lt;/code&gt; blocks at all, only &lt;code&gt;locals&lt;/code&gt;, &lt;code&gt;variable&lt;/code&gt;, and &lt;code&gt;output&lt;/code&gt; blocks. It provisions nothing, so &lt;code&gt;terraform plan&lt;/code&gt; shows nothing to add, change, or destroy, and there is no live infrastructure for a plan-simulation layer to map. Both layers report, correctly by their own definitions, that this PR touches nothing.&lt;/p&gt;

&lt;p&gt;Now ask the artifact graph:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"https://api.riftmap.dev/api/v1/repositories/{repo_id}/impact?max_depth=10"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$RIFTMAP_TOKEN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source_repository"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"terraform-null-label"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"full_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cloudposse/terraform-null-label"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"affected_repositories"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"full_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cloudposse/terraform-aws-acm-request-certificate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"depth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"full_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cloudposse/terraform-aws-alb"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;                     &lt;/span&gt;&lt;span class="nl"&gt;"depth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"full_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cloudposse/terraform-aws-alb-ingress"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;             &lt;/span&gt;&lt;span class="nl"&gt;"depth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"total_affected"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;147&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"max_depth_reached"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That response is real and only trimmed. 147 repositories, 61% of the org's 242 repos, declare terraform-null-label as a direct Terraform-module dependency (deduplicated, confidence ≥ 0.8, and intra-org only, a caveat I will come back to). The module that both other layers scored as impactless is the single highest-blast-radius artifact in the estate. Every one of those 147 edges sits at depth 1. This is not a deep chain amplifying a small number; it is 147 repositories importing one module directly, a module that contains no infrastructure at all, only the convention everything else is named by.&lt;/p&gt;

&lt;p&gt;The consumers view is where the pre-merge decision actually gets made, because it carries versions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"artifact"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"artifact_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"terraform_module"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cloudposse/terraform-null-label"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"consumer_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;147&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"latest_version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0.25.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"consumers_on_latest"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;138&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"consumers_lagging"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"state_breakdown"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pinned"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;139&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"floating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"branch"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"absent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"consumers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"repository"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"full_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cloudposse/terraform-aws-elastic-beanstalk-application"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"version_constraint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0.25.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="nl"&gt;"version_constraint_state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pinned"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"source_file"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"context.tf"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"source_line"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"repository"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"full_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cloudposse/terraform-aws-cloudwatch-flow-logs"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"version_constraint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tags/0.3.1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"version_constraint_state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"branch"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"source_file"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kinesis.tf"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"source_line"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"repository"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"full_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cloudposse/terraform-aws-multi-az-subnets"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"version_constraint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0.24.1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="nl"&gt;"version_constraint_state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pinned"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"source_file"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"public.tf"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nl"&gt;"source_line"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the &lt;code&gt;source_file&lt;/code&gt; and &lt;code&gt;source_line&lt;/code&gt; fields. Each edge points at the exact line of the exact manifest where the dependency is declared. Nothing here is guessed. The resolver collapses three declaration forms onto the same module: Terraform-registry version pins like &lt;code&gt;0.25.0&lt;/code&gt;, git-tag refs like &lt;code&gt;tags/0.3.1&lt;/code&gt;, and one raw git URL with no version at all (Cloud Posse's &lt;code&gt;geodesic&lt;/code&gt; image, which references the module that way and sits just outside these 147 as a lower-confidence edge).&lt;/p&gt;
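
&lt;p&gt;To make "collapsing declaration forms" concrete, here is a small sketch of the idea. It is my own illustration, not Riftmap's resolver: the two regular expressions, the function name, and the example source strings are assumptions, and it only handles a registry address of the form &lt;code&gt;namespace/name/provider&lt;/code&gt; and a GitHub URL with an optional &lt;code&gt;?ref=&lt;/code&gt;. The registry mapping relies on the Terraform Registry convention that a module's repository is named &lt;code&gt;terraform-PROVIDER-NAME&lt;/code&gt;.&lt;/p&gt;

```python
import re
from typing import Optional

# Registry address: namespace/name/provider, e.g. cloudposse/label/null
REGISTRY = re.compile(r"^([\w-]+)/([\w-]+)/([\w-]+)$")
# GitHub URL with optional git:: prefix, optional .git suffix, optional ?ref=
GITHUB = re.compile(
    r"^(?:git::)?https://github\.com/([\w-]+)/([\w.-]+?)(?:\.git)?(?:\?ref=(.+))?$"
)


def normalise(source: str, version: Optional[str] = None):
    """Map a Terraform module source string to (owner/repo, version-or-ref).

    Returns None for forms this sketch does not understand.
    """
    m = REGISTRY.match(source)
    if m:
        namespace, name, provider = m.groups()
        # Registry convention: the backing repo is terraform-PROVIDER-NAME.
        return (f"{namespace}/terraform-{provider}-{name}", version)
    m = GITHUB.match(source)
    if m:
        owner, repo, ref = m.groups()
        return (f"{owner}/{repo}", ref)
    return None


# Three hypothetical declarations that should land on the same module.
declared = [
    normalise("cloudposse/label/null", "0.25.0"),
    normalise("git::https://github.com/cloudposse/terraform-null-label.git?ref=tags/0.3.1"),
    normalise("https://github.com/cloudposse/terraform-null-label.git"),
]
modules = {module for module, _ in declared}
print(modules)
```

&lt;p&gt;All three forms resolve to one module identity while keeping their own version or ref, which is the property that lets a consumer count be deduplicated per repository.&lt;/p&gt;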

&lt;p&gt;Two honest wrinkles, because precision is the product here. First, zero of the 147 consumers float a version range. Every one is fixed to an exact version or git ref (139 &lt;code&gt;pinned&lt;/code&gt; and 8 &lt;code&gt;branch&lt;/code&gt; in the &lt;code&gt;state_breakdown&lt;/code&gt; above, where &lt;code&gt;branch&lt;/code&gt; covers refs like &lt;code&gt;tags/0.3.1&lt;/code&gt;), so publishing a new tag breaks nobody at the moment of release. What the count tells you is more useful: who has to coordinate the upgrade. The version distribution then tells you how well that coordination has gone historically. 138 consumers sit on the latest release and nine are scattered across six older tags, some dating back to the &lt;code&gt;tags/0.3.x&lt;/code&gt; era. Not one repo floats, yet the org is still fragmented across seven distinct versions. That drift is what &lt;a href="https://riftmap.dev/blog/version-constraints-across-real-terraform-estates/" rel="noopener noreferrer"&gt;version constraints look like in real estates&lt;/a&gt; generally, and it is the evidence that "everyone pins" and "everyone is current" are very different sentences.&lt;/p&gt;
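
&lt;p&gt;The same distribution can be recomputed from the &lt;code&gt;consumers&lt;/code&gt; list itself. The sketch below uses only the three consumers shown in the trimmed response above, so its counts are for that sample and not for the full 147. The function name is mine, and treating "lagging" as any &lt;code&gt;version_constraint&lt;/code&gt; that differs from &lt;code&gt;latest_version&lt;/code&gt; is a simplifying assumption rather than Riftmap's definition.&lt;/p&gt;

```python
from collections import Counter
from typing import Any


def summarise_versions(view: dict[str, Any]) -> dict[str, Any]:
    """Summarise constraint states, lagging repos and distinct versions."""
    latest = view["latest_version"]
    consumers = view["consumers"]
    states = Counter(c["version_constraint_state"] for c in consumers)
    # Assumption: a consumer lags when its constraint string is not the latest version.
    lagging = sorted(
        c["repository"]["full_path"]
        for c in consumers
        if c["version_constraint"] != latest
    )
    distinct = sorted({c["version_constraint"] for c in consumers})
    return {"states": dict(states), "lagging": lagging, "distinct_versions": distinct}


# The three consumers from the trimmed response, reduced to the fields used here.
view = {
    "latest_version": "0.25.0",
    "consumers": [
        {"repository": {"full_path": "cloudposse/terraform-aws-elastic-beanstalk-application"},
         "version_constraint": "0.25.0", "version_constraint_state": "pinned"},
        {"repository": {"full_path": "cloudposse/terraform-aws-cloudwatch-flow-logs"},
         "version_constraint": "tags/0.3.1", "version_constraint_state": "branch"},
        {"repository": {"full_path": "cloudposse/terraform-aws-multi-az-subnets"},
         "version_constraint": "0.24.1", "version_constraint_state": "pinned"},
    ],
}

summary = summarise_versions(view)
print(summary["states"], len(summary["distinct_versions"]))
```

&lt;p&gt;The list of lagging repositories is the useful output for a release-prep PR, because it names who has to open an upgrade PR of their own.&lt;/p&gt;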

&lt;p&gt;Second, when a consumer does bump, null-label's outputs feed resource names and tags, so the consumer's own plan can light up with renames, and at that moment, one consumer at a time, the live-state layer sees it. The artifact layer answers the question that comes before any of those plans exist: who has to open that PR at all.&lt;/p&gt;

&lt;p&gt;There is a fashionable objection to this layer, which is that parsed manifests capture declared dependencies rather than real ones. At the artifact layer the objection dissolves, because the declaration is the mechanism. The build executes the &lt;code&gt;FROM&lt;/code&gt; line. &lt;code&gt;terraform init&lt;/code&gt; resolves the &lt;code&gt;source&lt;/code&gt; block. The pipeline includes the &lt;code&gt;include&lt;/code&gt;. There is no separate runtime truth that the manifest is a stale snapshot of; the manifest is the instruction the machines follow. Undeclared runtime calls between services are a real blind spot, and they belong to a different layer of dependency entirely, one that runtime observation is the right tool for. Which repos build on your module is not a runtime question. It is written down, deterministically, at a file and line number, in repos you have never cloned. It just is not written down in the repo the PR is in, which is why neither the agent that opened the PR nor the reviewer reading it can see it, and why &lt;a href="https://riftmap.dev/blog/inferred-context-is-not-a-dependency-graph/" rel="noopener noreferrer"&gt;inferring it&lt;/a&gt; with embeddings or an LLM's confidence score is the wrong tool for a merge gate.&lt;/p&gt;

&lt;p&gt;The caveat I promised: all of these counts are intra-org. Cloud Posse's modules are used enormously across the public Terraform ecosystem, and none of that external usage appears in this scan. Within their own 242 repositories, the numbers above are exact. In the world, the true blast radius is far larger and nobody's graph sees all of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the catalogs sit
&lt;/h2&gt;

&lt;p&gt;Port's blast-radius guide is real and worth taking seriously, and it works on the fourth kind of graph I flagged earlier: a modeled one. You describe your services and their relations as catalog entities, and an AI agent reasons over those relations to produce a risk score and a deployment analysis. The output is a judgement, and the graph it judges from is as accurate as the last person who updated the catalog. I have written about &lt;a href="https://riftmap.dev/blog/modeled-graphs-and-parsed-graphs/" rel="noopener noreferrer"&gt;modeled graphs versus parsed graphs&lt;/a&gt; and about why hand-maintained catalogs &lt;a href="https://riftmap.dev/blog/the-catalog-maintenance-trap/" rel="noopener noreferrer"&gt;drift toward fiction&lt;/a&gt;, so I will keep it to one sentence here: a modeled graph answers with the accuracy of its YAML, and an enumeration beats a judgement anywhere an enumeration is available.&lt;/p&gt;

&lt;h2&gt;
  
  
  The blast radius the AI reports is the blast radius of the graph
&lt;/h2&gt;

&lt;p&gt;The answer to this post's title was never really in doubt. An agent can query any graph you hand it, and every product I have named will genuinely put something called a blast radius on your PR. What you are choosing when you pick one is not whether AI checks your change. It is which graph gets consulted, and therefore which category of breakage stays invisible. Symbol graphs cannot see the &lt;code&gt;FROM&lt;/code&gt; line. Live-state graphs cannot see the repos that have not planned yet. Artifact graphs cannot see the click-ops instance someone made in the console. The teams that get this right will not be the ones that bought the best AI. They will be the ones that knew which question their estate actually needed answered, and made sure a graph existed that could answer it before the merge button did.&lt;/p&gt;

&lt;p&gt;If your estate's risk lives where Cloud Posse's does, in the modules, images, charts, and templates that a hundred repos quietly build on, that graph is the artifact graph, and &lt;a href="https://riftmap.dev/blog/blast-radius-gate-merge-pipeline/" rel="noopener noreferrer"&gt;checking it from CI or an agent&lt;/a&gt; is one HTTP call.&lt;/p&gt;

&lt;p&gt;This post split the graphs by which layer an edge lives on; a companion piece works the orthogonal axis, &lt;a href="https://riftmap.dev/blog/declared-inferred-registered/" rel="noopener noreferrer"&gt;how a tool comes to know an edge exists at all&lt;/a&gt;, whether it is declared in a manifest, inferred from statistical signal, or registered in a catalog.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Can AI check the blast radius of a PR before you merge?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes, if there is a dependency graph for it to query at PR time. The AI resolves what the PR changes, requests the set of affected consumers from the graph, and reports the result before merge. Without a graph, an AI reviewer only sees the repository the PR is in, so cross-repo blast radius stays invisible until something breaks downstream. The practical question is which graph it queries: symbol graphs see code references, live-state graphs see running cloud resources, and artifact graphs see which repositories build on the changed artifact.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What tools show blast radius on a pull request before merge?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Three kinds, by layer. Symbol-layer tools like Sourcegraph and GitLab Orbit (including the open-source Blast Radius Reviewer agent) trace which code references the symbols in a diff. Live-state tools like Overmind simulate a Terraform plan against real-time AWS, GCP, or Kubernetes state and comment the affected resources on the PR. Artifact-layer tools like Riftmap enumerate the repositories whose manifests (Terraform &lt;code&gt;source&lt;/code&gt; blocks, Dockerfile &lt;code&gt;FROM&lt;/code&gt; lines, Helm dependencies, CI includes) consume what the PR changes. They answer different questions, and infrastructure-heavy estates usually need the third.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does Claude Code or Cursor know the blast radius of a change?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not on their own. Coding agents see the repository they have cloned or indexed, so a change to a shared module, base image, or CI template looks safe because the consumers live in repos the agent never opened. Given a queryable dependency graph, though, an agent can check blast radius at planning time with one API call and get the affected repositories back before it opens the PR. The agent is not the limitation; the missing graph is. More on that in &lt;a href="https://riftmap.dev/blog/claude-code-cursor-cross-repo-context/" rel="noopener noreferrer"&gt;Claude Code, Cursor, and the graph neither sees&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is blast radius analysis deterministic or AI-generated?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It depends on the layer, and honest tools are clear about which parts are which. Graph traversal is deterministic: "147 repositories declare this module at these file and line numbers" is an enumeration, and it is either right or wrong. Risk narratives, severity scores, and deployment recommendations are judgements, usually LLM-generated, layered on top of whichever graph the tool has. A reasonable rule for merge gates: gate on the enumeration, treat the narrative as advice.&lt;/p&gt;

</description>
      <category>blastradius</category>
      <category>impactanalysis</category>
      <category>pullrequests</category>
      <category>aicoding</category>
    </item>
    <item>
      <title>You can index every repo in Cursor. It still can't tell you what breaks.</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Wed, 01 Jul 2026 11:44:12 +0000</pubDate>
      <link>https://dev.to/danielwe/you-can-index-every-repo-in-cursor-it-still-cant-tell-you-what-breaks-4j5i</link>
      <guid>https://dev.to/danielwe/you-can-index-every-repo-in-cursor-it-still-cant-tell-you-what-breaks-4j5i</guid>
      <description>&lt;p&gt;&lt;em&gt;How to tune Cursor's codebase index for a large monorepo or a multi-repo workspace — and the one question no amount of tuning makes it answer.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Open a monorepo of any real size in Cursor and the first thing you meet is the indexing bar, and the first thing you learn is that it can take a while. A repository in the tens of thousands of files can sit there for a long time if you point Cursor at the root and walk away.&lt;/p&gt;

&lt;p&gt;So you do the sensible thing. You read the docs, you write a &lt;code&gt;.cursorignore&lt;/code&gt;, you open the package you actually work in instead of the whole tree, and the bar that took an age now takes a minute. &lt;code&gt;@Codebase&lt;/code&gt; gets faster and sharper. This is real work, it is worth doing, and most of this post is how to do it well.&lt;/p&gt;

&lt;p&gt;Then, one afternoon, you are about to change something shared. A base image half the org builds on. A module three Terraform stacks call. A contract package a dozen services import. And you ask the fast, well-tuned index the one question that actually matters before you press go. What breaks. And the answer comes back confident, and quick, and wrong in a way you cannot see.&lt;/p&gt;

&lt;p&gt;Here is the claim this post runs on. You can tune Cursor's index to cover a fifty-repo workspace and make &lt;code&gt;@Codebase&lt;/code&gt; genuinely fast and useful, and it still will not tell you which repositories break when you change a shared module, because the index answers similarity and a declared dependency is not a similarity relationship. Everything up to that ceiling is worth doing, and I will spend most of the post doing it. The ceiling itself is real, and no amount of tuning moves it, because it was never a tuning problem.&lt;/p&gt;

&lt;p&gt;There is a version of this post all over the internet right now, and most of it is good. The best one I have read walks &lt;a href="https://www.iamraghuveer.com/posts/cursor-codebase-indexing-monorepo/" rel="noopener noreferrer"&gt;multi-repo workspaces and per-service &lt;code&gt;.cursorignore&lt;/code&gt; files&lt;/a&gt; carefully and calls the result a microservices graph explorer. I want to be fair to that framing, because the tuning it describes is correct and I am about to repeat a fair amount of it. But "the index spans all my services" and "the index maps how my services depend on each other" are two different claims, and the distance between them is the whole second half of this post. I have made the &lt;a href="https://riftmap.dev/blog/the-repo-your-agent-didnt-clone/" rel="noopener noreferrer"&gt;structural version of that argument before&lt;/a&gt;, and shown &lt;a href="https://riftmap.dev/blog/claude-code-cursor-cross-repo-context/" rel="noopener noreferrer"&gt;how to wire the missing graph into Claude Code and Cursor&lt;/a&gt;. This one starts somewhere more practical. It starts with your index actually being slow.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Cursor's index actually is
&lt;/h2&gt;

&lt;p&gt;Cursor's index is a semantic search index, and knowing that precisely is what makes the rest of this post make sense. When you open a workspace, Cursor splits your code into chunks along syntactic boundaries, runs each chunk through a custom embedding model to get a vector, and stores those vectors in a &lt;a href="https://cursor.com/blog/secure-codebase-indexing" rel="noopener noreferrer"&gt;remote vector database&lt;/a&gt; built on Turbopuffer, keyed by an obfuscated path and line range. Your source is not kept server-side. Chunks are uploaded to be embedded, but only the embeddings and masked metadata are stored, and the client resolves a match back to the local code when the agent needs it. A Merkle tree tracks which files changed so re-indexing only touches what moved, and the index syncs roughly every five minutes.&lt;/p&gt;
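
&lt;p&gt;The change-tracking idea can be sketched in a few lines. This is a minimal illustration of Merkle-style hashing in general, not Cursor's implementation: the nested-dict tree, &lt;code&gt;node_hash&lt;/code&gt;, and &lt;code&gt;changed_files&lt;/code&gt; are hypothetical names, and the sketch recomputes hashes on every call and ignores deletions for the sake of brevity.&lt;/p&gt;

```python
import hashlib

def node_hash(node):
    """Hash a file (str) by its content, a directory (dict) by its children's hashes."""
    if isinstance(node, str):
        return hashlib.sha256(node.encode()).hexdigest()
    joined = "".join(name + node_hash(child) for name, child in sorted(node.items()))
    return hashlib.sha256(joined.encode()).hexdigest()

def changed_files(old, new, prefix=""):
    """List file paths in `new` that differ from `old`, skipping unchanged subtrees."""
    if old is not None and node_hash(old) == node_hash(new):
        return []  # identical subtree: nothing below it needs re-embedding
    if isinstance(new, str):
        return [prefix]
    old_children = old if isinstance(old, dict) else {}
    out = []
    for name, child in sorted(new.items()):
        path = prefix + "/" + name if prefix else name
        out.extend(changed_files(old_children.get(name), child, path))
    return out

# Hypothetical workspace snapshots: directories are dicts, files are strings.
before = {"api": {"main.go": "a", "util.go": "b"}, "web": {"app.ts": "c"}}
after = {"api": {"main.go": "a", "util.go": "b2"}, "web": {"app.ts": "c"}}
print(changed_files(before, after))
```

&lt;p&gt;The point of the tree shape is the early return: an unchanged directory is dismissed with one hash comparison, so the cost of a sync tracks what changed rather than the size of the workspace.&lt;/p&gt;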

&lt;p&gt;When you search, your query becomes a vector too, and Cursor returns the chunks whose vectors sit nearest yours. That is what &lt;code&gt;@Codebase&lt;/code&gt; is underneath. Nearest-neighbour search over embeddings. The official description is exact about it: it returns the most semantically similar code, even when the matching chunk does not contain the words you searched for. Cursor's &lt;a href="https://cursor.com/blog/secure-codebase-indexing" rel="noopener noreferrer"&gt;own evaluation&lt;/a&gt; puts semantic search at around 12.5% more accurate than grep alone on large codebases, with the gain growing as the codebase grows, and that is a real result I have no interest in talking down.&lt;/p&gt;
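
&lt;p&gt;A minimal sketch of that retrieval step, assuming toy three-dimensional vectors in place of real embeddings. The chunk ids, the vectors, and the function names are all hypothetical, and a production vector database would use an approximate index rather than a full sort.&lt;/p&gt;

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest(query_vec, index, k=2):
    """Return the k chunk ids whose vectors are most similar to the query."""
    ranked = sorted(index, key=lambda cid: cosine(query_vec, index[cid]), reverse=True)
    return ranked[:k]

# Hypothetical embeddings, keyed by path and line range.
INDEX = {
    "auth/token.go:10-42": [0.9, 0.1, 0.0],
    "auth/session.go:5-30": [0.8, 0.3, 0.1],
    "billing/invoice.go:1-25": [0.0, 0.2, 0.9],
}

top = nearest([1.0, 0.2, 0.0], INDEX)
print(top)
```

&lt;p&gt;Nothing in this loop knows what a chunk imports or requires. The only relation it can rank by is distance between vectors.&lt;/p&gt;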

&lt;p&gt;In 2026 this got more capable, and it is worth being current about, because the workflow changed. You mostly do not type &lt;code&gt;@Codebase&lt;/code&gt; any more. Cursor's Agent &lt;a href="https://cursor.com/docs/agent/tools/search" rel="noopener noreferrer"&gt;picks the search strategy itself&lt;/a&gt;, combining a fast custom grep it calls Instant Grep with semantic search, and it can spawn an Explore subagent that runs many searches in parallel without bloating the main context. Cursor's own line is that you do not choose the tool, you describe what you need and the Agent decides. This is a genuine improvement. But notice what did not change underneath the sophistication. Instant Grep matches strings. Semantic search matches meaning. Both are ways of finding text that resembles other text, and neither resolves a reference in one repository to the artifact another repository builds. The agent got much better at choosing which kind of resemblance to look for. It did not gain a new kind of edge to look over.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tuning the index for a large or multi-repo codebase
&lt;/h2&gt;

&lt;p&gt;The single lever that matters most on a large codebase is how much you ask Cursor to index, because both indexing cost and query noise scale with file count. Cursor's own numbers are blunt about the cost: a large repository indexed naively can take hours to reach its first query, and on the largest repos the ninety-ninth-percentile time-to-first-query is &lt;a href="https://cursor.com/blog/secure-codebase-indexing" rel="noopener noreferrer"&gt;over four hours&lt;/a&gt; before their teammate-index-sharing trick kicks in, with semantic search unavailable until the index is at least 80% built. One &lt;a href="https://www.rapidevelopers.com/cursor-tutorial/how-to-manage-cursor-ai-s-context-window-when-developing-large-monorepos-with-multiple-packages" rel="noopener noreferrer"&gt;monorepo tutorial&lt;/a&gt; clocks an 8,800-file repo at seven to twelve hours from the root, cut to minutes with the right exclusions. Everything below is a way of pointing the index at the code that is load-bearing for your task and keeping everything else out of it. Do them in roughly this order.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scope the index: open the package, not the root
&lt;/h3&gt;

&lt;p&gt;The highest-leverage move on a monorepo is to not open the monorepo. Opening a package directory as the workspace root makes Cursor treat that directory as the whole codebase and index only within it, and on a large tree that is the difference between a minute and several hours.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="c"&gt;# Indexes everything under the root, slowly
&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt; /&lt;span class="n"&gt;path&lt;/span&gt;/&lt;span class="n"&gt;to&lt;/span&gt;/&lt;span class="n"&gt;monorepo&lt;/span&gt;

&lt;span class="c"&gt;# Indexes one package, fast
&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt; /&lt;span class="n"&gt;path&lt;/span&gt;/&lt;span class="n"&gt;to&lt;/span&gt;/&lt;span class="n"&gt;monorepo&lt;/span&gt;/&lt;span class="n"&gt;packages&lt;/span&gt;/&lt;span class="n"&gt;api&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A shell alias per package you live in (&lt;code&gt;alias ca='cursor /path/to/monorepo/packages/api'&lt;/code&gt;) makes this frictionless. The cost is that references outside the package are no longer in the index, which is fine right up until your task actually crosses a package boundary, and then it is precisely the problem the second half of this post is about.&lt;/p&gt;

&lt;h3&gt;
  
  
  The two ignore files, and which one you actually want
&lt;/h3&gt;

&lt;p&gt;Cursor has two ignore files and they do different jobs, and mixing them up is the most common configuration mistake I see. &lt;code&gt;.cursorignore&lt;/code&gt; is a complete block: a file listed there is not indexed, not read, and not available even when you &lt;code&gt;@&lt;/code&gt;-mention it, as though it did not exist. &lt;code&gt;.cursorindexingignore&lt;/code&gt; is narrower: it keeps a file out of the index and out of search results, but the file stays readable, so you can still pull it in with &lt;code&gt;@Files&lt;/code&gt; when you genuinely need it.&lt;/p&gt;

&lt;p&gt;The practical rule the field has settled on is short. Reach for &lt;code&gt;.cursorindexingignore&lt;/code&gt; first, because it is the reversible choice, and promote a path to &lt;code&gt;.cursorignore&lt;/code&gt; only when the AI should never see it, like a secret.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="c"&gt;# .cursorindexingignore
# Kept out of the index, still reachable with @Files.
&lt;/span&gt;&lt;span class="n"&gt;tests&lt;/span&gt;/&lt;span class="n"&gt;fixtures&lt;/span&gt;/
&lt;span class="n"&gt;e2e&lt;/span&gt;/&lt;span class="n"&gt;recordings&lt;/span&gt;/
&lt;span class="n"&gt;packages&lt;/span&gt;/&lt;span class="n"&gt;legacy&lt;/span&gt;/

&lt;span class="c"&gt;# .cursorignore
# Invisible to indexing AND to all AI features.
&lt;/span&gt;.&lt;span class="n"&gt;env&lt;/span&gt;*
&lt;span class="n"&gt;secrets&lt;/span&gt;/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
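
&lt;p&gt;The difference between the two files comes down to a small decision table, sketched here under loud assumptions: &lt;code&gt;matches&lt;/code&gt; is a deliberately simplified matcher (root-anchored directory prefixes plus &lt;code&gt;fnmatch&lt;/code&gt; globs), not the gitignore-style matching Cursor actually applies, and &lt;code&gt;visibility&lt;/code&gt; is a hypothetical helper for reasoning about a path, not a Cursor API.&lt;/p&gt;

```python
from fnmatch import fnmatch

def matches(path, patterns):
    """Simplified matching: 'dir/' matches paths under dir, anything else is a glob."""
    for pat in patterns:
        if pat.endswith("/"):
            if path.startswith(pat):
                return True
        elif fnmatch(path, pat) or fnmatch(path.rsplit("/", 1)[-1], pat):
            return True
    return False

def visibility(path, cursorignore, indexingignore):
    """Model what each ignore file takes away from a path."""
    blocked = matches(path, cursorignore)  # .cursorignore blocks everything
    return {
        "indexed": not blocked and not matches(path, indexingignore),
        "readable": not blocked,  # .cursorindexingignore leaves the file readable
    }

CURSORIGNORE = [".env*", "secrets/"]
INDEXINGIGNORE = ["tests/fixtures/", "e2e/recordings/", "packages/legacy/"]

for p in ["src/main.py", "tests/fixtures/big.json", ".env.local"]:
    print(p, visibility(p, CURSORIGNORE, INDEXINGIGNORE))
```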



&lt;p&gt;There is one detail in Cursor's &lt;a href="https://cursor.com/docs/reference/ignore-file" rel="noopener noreferrer"&gt;default indexing exclusions&lt;/a&gt; worth knowing, because it is quietly on-topic. Cursor already skips lockfiles by default: &lt;code&gt;package-lock.json&lt;/code&gt;, &lt;code&gt;yarn.lock&lt;/code&gt;, &lt;code&gt;go.sum&lt;/code&gt;, and the rest. Those are the files that record the exact resolved version of every transitive dependency, which is to say the single most precise dependency information in your repository is the first thing the index throws away. It throws it away for sensible reasons: lockfiles are enormous and read as noise to a similarity search. Hold onto that, though, because it is a small preview of the larger point. The index is tuned to find code that reads like your question, and dependency records do not read like anything.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multiple repositories: the multi-root workspace
&lt;/h3&gt;

&lt;p&gt;For genuinely separate repositories, rather than packages in one tree, Cursor supports &lt;a href="https://cursor.com/docs/agent/tools/search" rel="noopener noreferrer"&gt;multi-root workspaces&lt;/a&gt;. A &lt;code&gt;.code-workspace&lt;/code&gt; file lists several folder roots, Cursor indexes all of them, and Agent can reach across the set. A workable pattern for a large estate is to group repositories into a few workspace files by domain (payments, identity, and the API gateway in one; catalogue, search, and recommendations in another) and switch between them rather than opening everything at once. One caveat to know going in: features that assume a single git root, like worktrees, are disabled in a multi-root workspace.&lt;/p&gt;
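
&lt;p&gt;A sketch of generating those per-domain workspace files, assuming the standard &lt;code&gt;folders&lt;/code&gt; list of &lt;code&gt;path&lt;/code&gt; entries that &lt;code&gt;.code-workspace&lt;/code&gt; files use. The domain names and relative repository paths are hypothetical, and the sketch only builds the JSON bodies rather than writing them to disk.&lt;/p&gt;

```python
import json

def workspace_file(repo_paths):
    """Build the JSON body of a .code-workspace file with one folder root per repo."""
    return json.dumps({"folders": [{"path": p} for p in repo_paths]}, indent=2)

# Hypothetical grouping of sibling checkouts by domain.
DOMAINS = {
    "payments": ["../payments", "../identity", "../api-gateway"],
    "discovery": ["../catalogue", "../search", "../recommendations"],
}

FILES = {name + ".code-workspace": workspace_file(repos) for name, repos in DOMAINS.items()}

for filename, body in FILES.items():
    print(filename)
    print(body)
```

&lt;p&gt;Keeping the grouping in one small script makes it cheap to regenerate the workspace files when a repository moves between domains.&lt;/p&gt;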

&lt;p&gt;Done well, this is genuinely useful. Open &lt;code&gt;web-app&lt;/code&gt;, &lt;code&gt;orders-service&lt;/code&gt;, and the shared contract repo together, and "how does the orders service validate a token" becomes one question instead of four context switches. This is the setup the multi-repo guides call a microservices graph explorer, and I understand why they reach for it. When every service sits in one index, &lt;code&gt;@Codebase&lt;/code&gt; stops being a single-service lookup and starts answering questions that range across the whole set.&lt;/p&gt;

&lt;p&gt;But the word doing too much work in "microservices graph explorer" is graph. The index now spans your services. It does not map them. It can surface the code in &lt;code&gt;orders-service&lt;/code&gt; that mentions the contract, and the code in the contract repo that defines it, because both are text and both might resemble your query. What it cannot do is tell you that &lt;code&gt;orders-service&lt;/code&gt; declares a dependency on that contract, and that &lt;code&gt;billing-service&lt;/code&gt;, which you did not open, declares one too. Spanning a set of repositories and mapping the edges between them are different operations, and the index only performs the first one.&lt;/p&gt;

&lt;h3&gt;
  
  
  Project rules and hierarchical ignore
&lt;/h3&gt;

&lt;p&gt;Project rules and hierarchical ignore are two smaller levers worth setting. Rules live in &lt;a href="https://cursor.com/docs/context/rules" rel="noopener noreferrer"&gt;&lt;code&gt;.cursor/rules/*.mdc&lt;/code&gt;&lt;/a&gt; now, one or more files that describe your architecture and conventions and load into context by relevance. If you are still carrying a root &lt;code&gt;.cursorrules&lt;/code&gt; file, note that it is legacy and ignored in Agent mode, so migrating it is overdue. Rules are where you tell the agent how the monorepo fits together and which boundaries not to cross, and they do help. Notice, though, that they help by you writing the structure down, which makes them a hand-maintained description with the same &lt;a href="https://riftmap.dev/blog/the-catalog-maintenance-trap/" rel="noopener noreferrer"&gt;decay problem every written map has&lt;/a&gt;: the description is only as current as the last engineer to update it, and it drifts from the code at exactly the speed the code changes.&lt;/p&gt;

&lt;p&gt;Hierarchical Cursor Ignore, a setting rather than a file, lets Cursor walk up parent directories collecting &lt;code&gt;.cursorignore&lt;/code&gt; files, so you can keep a global exclusion set at the root of a large monorepo and let each package layer its own on top. It is the right tool for keeping a big tree's index configuration from repeating itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  The freshness you're actually working with
&lt;/h3&gt;

&lt;p&gt;Cursor's index is as current as your checkout, and no more, and it is worth being honest with yourself about what that means. The index reflects the files on your disk. It does not pull your colleagues' commits, so a function a teammate added to the identity service this morning is not in your index until you pull and Cursor re-indexes the file. And even for your own work the index trails the actual files by a sync interval. The consequence is a clean line: the index is reliable for the stable shape of a system, and unreliable for the change that landed an hour ago in a repository you did not open. The recent change and the unopened repository are both outside it. Keep that boundary in mind, because it compounds with the one the next section is about.&lt;/p&gt;

&lt;h2&gt;
  
  
  The question a tuned index still can't answer
&lt;/h2&gt;

&lt;p&gt;A perfectly tuned Cursor index still cannot tell you which repositories break when you change a shared module, because it answers similarity and a cross-repo dependency is a declared edge, not a similarity relationship. Do all of it. Scope the index to the package, split the workspace by domain, get the file count into the low thousands, write the rules, keep it fresh. You now have a fast, lean, accurate semantic index across every repository you care about, and &lt;code&gt;@Codebase&lt;/code&gt; is as good as it gets. Now ask it the question you opened this whole workflow to answer. You are about to bump a base image, retire a shared module, or change a contract. Which repositories break.&lt;/p&gt;

&lt;p&gt;Here is what happens, concretely, and you can check me on it because the org is public. Take the &lt;a href="https://riftmap.dev/showcase/prometheus/" rel="noopener noreferrer"&gt;Prometheus organisation&lt;/a&gt;: as of Riftmap's May 2026 scan, fifty-six repositories, and when you &lt;a href="https://riftmap.dev/blog/what-56-prometheus-repos-depend-on/" rel="noopener noreferrer"&gt;parse the dependency edges&lt;/a&gt; between them, a hundred and eighty-eight cross-repository edges. A handful of repositories carry most of it, &lt;code&gt;prometheus/common&lt;/code&gt; with twenty-five dependents, &lt;code&gt;client_model&lt;/code&gt; with twenty-four, &lt;code&gt;procfs&lt;/code&gt; with twenty-three, &lt;code&gt;client_golang&lt;/code&gt; with twenty-two. (Those counts drift a little between scans, which is exactly why they are dated here; the &lt;a href="https://riftmap.dev/showcase/prometheus/" rel="noopener noreferrer"&gt;live showcase&lt;/a&gt; always renders the current number.) Clone all fifty-six, open them in one immaculately tuned Cursor multi-root workspace, wait for the index to finish, and ask &lt;code&gt;@Codebase&lt;/code&gt;: what depends on &lt;code&gt;client_golang&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You will get chunks. Files that mention &lt;code&gt;client_golang&lt;/code&gt;, code that resembles your query, the definition itself. What you will not get is the list of twenty-two repositories that declare a &lt;code&gt;require&lt;/code&gt; on it, because that list is not a similarity relationship. It is twenty-two &lt;code&gt;go.mod&lt;/code&gt; files, in twenty-two repositories, each with a line naming &lt;code&gt;client_golang&lt;/code&gt; and a version. A &lt;code&gt;go.mod&lt;/code&gt; require line and the &lt;code&gt;client_golang&lt;/code&gt; source it points at share almost no tokens, nothing an embedding would place near the other. The edge is not latent in the text, waiting to be retrieved. It was declared once, in a manifest, and it is either parsed from that manifest or it is not found.&lt;/p&gt;

&lt;p&gt;And it is finer than a repository count, which is the part that should give you pause before a change. The same scan finds &lt;code&gt;prometheus/common&lt;/code&gt; required twice from &lt;code&gt;prometheus/prometheus&lt;/code&gt; alone, two separate manifests in one repository, each naming &lt;code&gt;common&lt;/code&gt; with its own version, and the graph tracks them as two references rather than folding them into a single repo-to-repo edge.&lt;/p&gt;

&lt;p&gt;"We bumped the dependency" and "we bumped every reference to the dependency" are different statements, and Go monorepos are exactly where that difference hides. A parser surfaces each of those references as its own edge, because it read each manifest and knows how many there are. A similarity index has no concept of "the second &lt;code&gt;go.mod&lt;/code&gt; that requires this". It has chunks, ranked by resemblance, and resemblance was never going to count references in files it treats as prose.&lt;/p&gt;

&lt;p&gt;This is not a tuning failure, and that distinction matters. There is no &lt;code&gt;.cursorignore&lt;/code&gt; you could write, no workspace split, no rule, that turns a nearest-neighbour search into a dependency resolver. The index is answering the question it was built to answer, which is "what code resembles this", and it answers it well. The question a breaking change asks is "what declares a dependency on this", and that is a different question with a different data structure behind it. You cannot tune your way from one to the other, because they were never the same machine.&lt;/p&gt;

&lt;h2&gt;
  
  
  The graph that answers it is parsed, not tuned
&lt;/h2&gt;

&lt;p&gt;The edges the index can't retrieve are not missing; they are declared in the manifests you already have, in constructs built for exactly this. The contract in a &lt;code&gt;go.mod&lt;/code&gt; require or a &lt;code&gt;package.json&lt;/code&gt; dependency. The module in a Terraform &lt;code&gt;source&lt;/code&gt; block. The image in a Dockerfile &lt;code&gt;FROM&lt;/code&gt;. The chart in a &lt;code&gt;Chart.yaml&lt;/code&gt; dependency. The template in a GitLab CI &lt;code&gt;include:project&lt;/code&gt; or a reusable Actions &lt;code&gt;uses:&lt;/code&gt;. I spent &lt;a href="https://riftmap.dev/blog/series/find-every-consumer/" rel="noopener noreferrer"&gt;a whole series&lt;/a&gt; walking those one ecosystem at a time. They are deterministic. Parsed, not inferred. The dependency graph across your organisation already exists, declared and unassembled, in files a similarity index reads as text and a parser reads as edges. This is the difference between &lt;a href="https://riftmap.dev/blog/inferred-context-is-not-a-dependency-graph/" rel="noopener noreferrer"&gt;inferred context and a dependency graph&lt;/a&gt;, and tuning the index is orthogonal to it: a better index is a better answer to a different question.&lt;/p&gt;
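
&lt;p&gt;To make "parsed, not inferred" concrete for one of those constructs, here is a minimal sketch that reads the base images out of a Dockerfile. The registry and image names are hypothetical, &lt;code&gt;base_images&lt;/code&gt; is an illustrative helper rather than any tool's API, and the sketch does not expand &lt;code&gt;ARG&lt;/code&gt; variables or handle line continuations.&lt;/p&gt;

```python
def base_images(dockerfile_text):
    """Return external base images named by FROM lines, skipping earlier build stages."""
    stages = set()
    images = []
    for line in dockerfile_text.splitlines():
        parts = line.strip().split()
        if not parts or parts[0].upper() != "FROM":
            continue
        # Drop flags such as --platform=linux/amd64.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if not args:
            continue
        image = args[0]
        if image not in stages and image != "scratch":
            images.append(image)
        if len(args) == 3 and args[1].upper() == "AS":
            stages.add(args[2])  # later FROM lines may refer to this stage by name
    return images

# Hypothetical multi-stage Dockerfile.
DOCKERFILE = """\
FROM --platform=linux/amd64 registry.example.com/base/golang:1.22 AS build
RUN go build ./...
FROM registry.example.com/base/runtime:3
COPY --from=build /app /app
FROM build AS test
"""

print(base_images(DOCKERFILE))
```

&lt;p&gt;Each returned string is an edge candidate: resolve the image name to the repository that publishes it and you have one declared dependency, with the tag the consumer pinned.&lt;/p&gt;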

&lt;p&gt;&lt;a href="https://riftmap.dev/for-agents/" rel="noopener noreferrer"&gt;Riftmap&lt;/a&gt; is that graph. It parses those manifests across your entire GitHub or GitLab organisation from one read-only token, resolves each reference to the repository that owns the artifact, and returns the answer the index can't, every repository that depends on the thing you are about to change, with the version each one declared. It is the &lt;a href="https://riftmap.dev/showcase/prometheus/" rel="noopener noreferrer"&gt;Prometheus graph above&lt;/a&gt;, for your own org. If you want that graph in front of the agent rather than in a browser tab, wiring it into Cursor and Claude Code is &lt;a href="https://riftmap.dev/blog/claude-code-cursor-cross-repo-context/" rel="noopener noreferrer"&gt;its own post&lt;/a&gt;, because the graph is useful to the engineer holding the pager first and the agent second. Either way it is the same move. Stop asking a tool built for resemblance to answer a question about dependency, and hand over a graph that was parsed for exactly that.&lt;/p&gt;

&lt;p&gt;Tune Cursor's index until it is perfect and you have made it excellent at finding the code that resembles your question. The repositories that break when you change a shared module were never going to resemble your question. They were a set of &lt;code&gt;FROM&lt;/code&gt; lines and &lt;code&gt;require&lt;/code&gt; blocks and &lt;code&gt;source&lt;/code&gt; references, declared once across repositories you may not have even opened, waiting to be read. The index reads them as text. You need something that reads them as edges.&lt;/p&gt;

&lt;h2&gt;
  
  
  Questions teams ask
&lt;/h2&gt;

&lt;p&gt;The same questions come up whenever I help someone tune this, so here they are, answered straight.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I speed up Cursor indexing on a large monorepo?&lt;/strong&gt; Index less. A repository in the tens of thousands of files can take hours to index from the root, so the highest-leverage move is to open the package you work in as the workspace root rather than the whole tree. Add a &lt;code&gt;.cursorignore&lt;/code&gt; for build output, dependencies, and generated files, aim for a few thousand indexed files rather than tens of thousands, and use &lt;code&gt;.cursorindexingignore&lt;/code&gt; for large directories you still want to &lt;code&gt;@&lt;/code&gt;-mention occasionally. Most slow-index complaints come down to indexing &lt;code&gt;node_modules&lt;/code&gt; and vendored code nobody needed in the index in the first place.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Should I use &lt;code&gt;.cursorignore&lt;/code&gt; or &lt;code&gt;.cursorindexingignore&lt;/code&gt;?&lt;/strong&gt; Use &lt;code&gt;.cursorindexingignore&lt;/code&gt; unless the AI should never see the file at all. &lt;code&gt;.cursorindexingignore&lt;/code&gt; keeps a file out of the index and out of search results but leaves it readable, so you can still pull it in with &lt;code&gt;@Files&lt;/code&gt;, which makes it the reversible, lower-risk choice for large or noisy directories. Reserve &lt;code&gt;.cursorignore&lt;/code&gt; for things that must be fully invisible, like secrets or files you never want referenced, because it blocks reading and &lt;code&gt;@&lt;/code&gt;-mentioning as well as indexing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does Cursor's &lt;code&gt;@Codebase&lt;/code&gt; understand dependencies between repositories?&lt;/strong&gt; Not in the sense a breaking change needs. &lt;code&gt;@Codebase&lt;/code&gt; is nearest-neighbour search over an embedding index, so it returns the code most similar to your query, which is a different set from the repositories that declare a dependency on what you are changing. A &lt;code&gt;go.mod&lt;/code&gt; require line, or a Dockerfile &lt;code&gt;FROM&lt;/code&gt;, and the repository it points at are not similar text, so no similarity search reliably connects them. Indexing more repositories widens the pool of code &lt;code&gt;@Codebase&lt;/code&gt; can match against, but a cross-repo dependency edge has to be parsed from a manifest, not retrieved by similarity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can Cursor index multiple repositories at once?&lt;/strong&gt; Yes. A multi-root workspace, defined in a &lt;code&gt;.code-workspace&lt;/code&gt; file, can hold several repository roots, and Cursor indexes all of them and lets Agent search across the set. That makes &lt;code&gt;@Codebase&lt;/code&gt; span your repositories, which is genuinely useful for understanding a system. It does not make the index map how those repositories depend on each other, which is a separate thing that comes from parsing manifests rather than from a wider index.&lt;/p&gt;

</description>
      <category>cursor</category>
      <category>monorepo</category>
      <category>codebaseindexing</category>
      <category>multirepo</category>
    </item>
    <item>
      <title>Claude Code reads your clone. Cursor reads similarity. Neither sees the graph.</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Tue, 30 Jun 2026 09:19:21 +0000</pubDate>
      <link>https://dev.to/danielwe/claude-code-reads-your-clone-cursor-reads-similarity-neither-sees-the-graph-487i</link>
      <guid>https://dev.to/danielwe/claude-code-reads-your-clone-cursor-reads-similarity-neither-sees-the-graph-487i</guid>
      <description>&lt;p&gt;&lt;em&gt;Both agents can be handed more than one repository. Here is how to wire cross-repo blast radius into each, and the exact point where Claude Code's clone and Cursor's index each stop.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;You have Claude Code open in one repository and Cursor in another. The task in front of you is small, and the kind agents are good at. Bump a base image. Tighten a variable on a Terraform module. Delete a CI job that you are fairly sure three other projects still call. Before you let the agent run, you want the one thing it cannot tell you. Who else breaks.&lt;/p&gt;

&lt;p&gt;Both tools have a multi-repo story now, so it feels like a question they should be able to answer. Claude Code has &lt;code&gt;/add-dir&lt;/code&gt; and native MCP. Cursor indexes an entire workspace and answers questions about it. You could be forgiven for assuming that somewhere in there is a feature that knows your &lt;code&gt;orders-service&lt;/code&gt; consumes the contract you are about to change. There is not, and the reason is different for each tool. Knowing exactly where each one stops is the difference between wiring up something that genuinely helps and trusting something that quietly does not.&lt;/p&gt;

&lt;p&gt;So here is the claim this post runs on. Claude Code and Cursor can both be handed more than one repository, and neither can tell you which repositories a base-image or Terraform change will break, because Claude Code only sees the repositories you checked out, and Cursor's index answers similarity, not dependency. The edge that breaks the other repo was declared in a manifest, and a manifest is neither a file the agent cloned nor a chunk that embeds near your query.&lt;/p&gt;

&lt;p&gt;I have covered the &lt;a href="https://riftmap.dev/blog/how-to-give-copilot-cross-repo-context/" rel="noopener noreferrer"&gt;Copilot version of this question&lt;/a&gt; before, and made the case for &lt;a href="https://riftmap.dev/blog/the-repo-your-agent-didnt-clone/" rel="noopener noreferrer"&gt;why this blindness is structural rather than a gap a smarter model closes&lt;/a&gt;. This is the Claude Code and Cursor version, and it gets concrete about the wiring.&lt;/p&gt;

&lt;h2&gt;
  
  
  What each agent can actually see today
&lt;/h2&gt;

&lt;p&gt;What Claude Code and Cursor can see across repositories is different for each, and it moves fast enough that this section carries a date. As of June 2026, here is the precise shape of it.&lt;/p&gt;

&lt;p&gt;Claude Code sees the directory you started it in, recursing up the tree to load any &lt;code&gt;CLAUDE.md&lt;/code&gt; it finds along the way. You extend that reach with the &lt;a href="https://code.claude.com/docs/en/claude-directory" rel="noopener noreferrer"&gt;&lt;code&gt;--add-dir&lt;/code&gt; flag, the &lt;code&gt;/add-dir&lt;/code&gt; command, or &lt;code&gt;permissions.additionalDirectories&lt;/code&gt;&lt;/a&gt; in &lt;code&gt;.claude/settings.json&lt;/code&gt;. It reads and greps those trees, and it can edit them. What it does not do is hold a dependency graph of them. Add MCP servers and it gains tools that can. So Claude Code's cross-repo reach is exactly the set of directories you granted, navigated by reading and search, plus whatever tools you wired in.&lt;/p&gt;

&lt;p&gt;Cursor's reach is its index. It chunks your code, embeds each chunk with a custom model, and stores the vectors in a &lt;a href="https://cursor.com/blog/secure-codebase-indexing" rel="noopener noreferrer"&gt;remote vector database&lt;/a&gt;, and &lt;code&gt;@Codebase&lt;/code&gt; answers by finding the chunks whose embeddings sit nearest your query. &lt;a href="https://cursor.com/docs/agent/tools/search" rel="noopener noreferrer"&gt;Multi-root workspaces are supported&lt;/a&gt;, so several repositories can be indexed at once and Agent can reach all of them. The index covers the repositories you opened, it trails your local checkout by a sync interval, and it answers by similarity.&lt;/p&gt;

&lt;p&gt;Notice what both give you. A way to put more repositories in front of the agent. Notice what neither gives you. A way to turn "the agent can &lt;em&gt;see&lt;/em&gt; these repositories" into "the agent knows how these repositories &lt;em&gt;depend&lt;/em&gt; on each other." That second thing is the whole post, and the rest of it is, first, why each tool stops short of it, and then how to hand each tool the part it is missing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Claude Code: widen the window, write the map, add the tool
&lt;/h2&gt;

&lt;p&gt;Claude Code gives you three ways to reach across repositories, and they map cleanly onto the three families every agent's users reach for, in roughly this order. Each one solves a real problem. The first two stop at the same wall, and the third is where the fix actually goes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Widen the window with /add-dir
&lt;/h3&gt;

&lt;p&gt;The first instinct is access, and Claude Code makes it cheap. Pass &lt;code&gt;--add-dir&lt;/code&gt; at launch, run &lt;code&gt;/add-dir&lt;/code&gt; mid-session, or set &lt;code&gt;permissions.additionalDirectories&lt;/code&gt; in &lt;code&gt;.claude/settings.json&lt;/code&gt; to make a set of sibling directories part of the project, so everyone working in that area gets them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# At launch&lt;/span&gt;
claude &lt;span class="nt"&gt;--add-dir&lt;/span&gt; ../orders-service &lt;span class="nt"&gt;--add-dir&lt;/span&gt; ../platform-charts

&lt;span class="c"&gt;# Or permanently, in .claude/settings.json&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"permissions"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"additionalDirectories"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"../orders-service"&lt;/span&gt;, &lt;span class="s2"&gt;"../platform-charts"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each granted directory becomes readable and editable, and as of a recent version Claude Code can even load the &lt;code&gt;CLAUDE.md&lt;/code&gt; from an added directory if you set &lt;code&gt;CLAUDE_CODE_ADDITIONAL_DIRECTORIES_CLAUDE_MD=1&lt;/code&gt;. At a handful of tightly coupled repositories, with someone keeping the set current, this is the right first move and it costs an afternoon. I want to be fair to it before drawing the line.&lt;/p&gt;

&lt;p&gt;The line is that access is not selection. You chose which directories to add, and you chose them from memory. Nothing in &lt;code&gt;/add-dir&lt;/code&gt; tells you that a third repository consumes the same contract and is not in the set at all. And the failure that bites hardest is quieter still. You can only add a directory that exists on your disk, so the repository you forgot to clone is not a directory you can add. Widening the window shows the agent more rooms. It does not hand it the floor plan, and I made the longer version of that argument in &lt;a href="https://riftmap.dev/blog/repo-access-was-never-the-hard-part/" rel="noopener noreferrer"&gt;Repo access was never the hard part&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Write the map in CLAUDE.md
&lt;/h3&gt;

&lt;p&gt;The second instinct is to write the structure down. This is the &lt;code&gt;CLAUDE.md&lt;/code&gt; and &lt;code&gt;AGENTS.md&lt;/code&gt; layer: files that describe how the system fits together, loaded at session start, held for the whole session, and layered at the enterprise, user, and project levels. More than sixty thousand repositories now carry one. The best-documented version is Mabl's, an 850-line coordination graph their agents query at planning time, and it works.&lt;/p&gt;

&lt;p&gt;It also decays, and the engineers writing these files know it. The sharpest line I have read on the pattern comes from someone who &lt;a href="https://karun.me/blog/2026/03/26/structuring-claude-code-for-multi-repo-workspaces/" rel="noopener noreferrer"&gt;layered the files at the org, team, and repo levels&lt;/a&gt; and concluded: "I learned not to list repos here. Lists go stale. Instead, tell Claude where to look." That is the trap in one sentence. A hand-written map has to be kept current by humans at the same pace the agents are changing the repositories, and navigating by a stale map does not feel stale to the agent. It feels fast, right up until the change lands. A map that decays is the &lt;a href="https://riftmap.dev/blog/the-catalog-maintenance-trap/" rel="noopener noreferrer"&gt;developer-portal catalog problem&lt;/a&gt; wearing new clothes, and it loses to the same thing: a graph parsed from the source, which cannot drift from the source because it is read from it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Add the graph as an MCP tool
&lt;/h3&gt;

&lt;p&gt;The third instinct is the right one. Stop describing the structure and hand the agent a tool that holds it. MCP is native to Claude Code, configured with &lt;code&gt;claude mcp add&lt;/code&gt; or a committable &lt;code&gt;.mcp.json&lt;/code&gt; at the project root, and this is where the fix belongs.&lt;/p&gt;

&lt;p&gt;The catch is what is usually on the other end of that tool call. Most of what you can plug in here is a symbol index or a semantic one. It follows imports and call edges, or it embeds your code and retrieves by similarity. Both stop at the language boundary, the same wall I walked in the &lt;a href="https://riftmap.dev/blog/the-repo-your-agent-didnt-clone/" rel="noopener noreferrer"&gt;flagship&lt;/a&gt;: a &lt;a href="https://riftmap.dev/blog/symbol-graphs-and-artifact-graphs/" rel="noopener noreferrer"&gt;symbol graph&lt;/a&gt; answers "who calls this function" and never sees a &lt;code&gt;FROM&lt;/code&gt; line. The tool Claude Code actually needs at this layer is one that resolves a Dockerfile &lt;code&gt;FROM&lt;/code&gt; to the repository that builds the image, and a Terraform &lt;code&gt;source&lt;/code&gt; to the repository that owns the module. That tool is an &lt;a href="https://riftmap.dev/what-is-an-artifact-dependency-graph/" rel="noopener noreferrer"&gt;artifact graph&lt;/a&gt;, and wiring it in is the back half of this post.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cursor: the index is the context model, and it answers the wrong question
&lt;/h2&gt;

&lt;p&gt;Cursor's cross-repo story is its index, and the index is built to answer a different question than the one a breaking change asks. This is the part worth slowing down on, because Cursor is genuinely good and the distinction is easy to miss.&lt;/p&gt;

&lt;p&gt;Here is how it works. Cursor splits your code into chunks, runs each chunk through an embedding model to get a vector, and stores those vectors in a database. When you ask &lt;code&gt;@Codebase&lt;/code&gt; something, your question becomes a vector too, and Cursor returns the chunks whose vectors sit nearest yours. The official description is exact about this: it returns the most semantically similar code chunks, even when they do not contain the keywords you used. Across a multi-root workspace it does that over every repository you opened.&lt;/p&gt;

&lt;p&gt;I want to be generous here, because it deserves it. Semantic search is a real improvement over grep. Cursor's &lt;a href="https://cursor.com/docs/agent/tools/search" rel="noopener noreferrer"&gt;own research&lt;/a&gt; puts it at around 12.5% more accurate on large codebases, and the gain grows with size. Open &lt;code&gt;web-app&lt;/code&gt;, &lt;code&gt;orders-service&lt;/code&gt;, and the shared contract repo in one workspace, and "how does the orders service validate a token" becomes one question instead of four context switches. For understanding a system, exploring it, finding where a concept lives, it is excellent, and nothing below is a knock on that.&lt;/p&gt;

&lt;p&gt;The line is that semantic search returns the code most similar to your query, and a declared dependency is not a similarity relationship. A Dockerfile line that reads &lt;code&gt;FROM platform/base-go:1.21&lt;/code&gt; and the repository that builds &lt;code&gt;platform/base-go&lt;/code&gt; share almost no tokens, no structure, nothing an embedding would place near the other. The edge between them is not latent in the text, waiting to be retrieved. It was declared once, in a manifest, and it is either parsed from that manifest or it is not found.&lt;/p&gt;

&lt;p&gt;Ask &lt;code&gt;@Codebase&lt;/code&gt; "what depends on this base image" and you get the files that mention base images, which is a different set from the repositories that inherit this one. This is the difference between &lt;a href="https://riftmap.dev/blog/inferred-context-is-not-a-dependency-graph/" rel="noopener noreferrer"&gt;inferred context and a dependency graph&lt;/a&gt;, and it is sharpest exactly where Cursor is most loved.&lt;/p&gt;
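&lt;p&gt;To make the contrast concrete, here is a minimal sketch of what "parsed, not retrieved" means for a base image. It is an illustration only, not how Riftmap or Cursor is implemented: the repository names, the &lt;code&gt;DOCKERFILES&lt;/code&gt; mapping, and both helper functions are hypothetical, and the parsing ignores build arguments and registry mirrors.&lt;/p&gt;

```python
import re

# Matches a Dockerfile FROM instruction and captures the image reference.
# An optional "--platform=..." flag before the image is tolerated.
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)",
    re.IGNORECASE | re.MULTILINE,
)


def base_images(dockerfile_text):
    """Return the image names declared in FROM lines, tag and digest stripped."""
    names = []
    for ref in FROM_RE.findall(dockerfile_text):
        name = ref.split("@")[0]
        # Only the segment after the last "/" can carry a ":tag" suffix;
        # an earlier colon would be a registry port.
        head, sep, tail = name.rpartition("/")
        names.append(head + sep + tail.split(":")[0])
    return names


def dependents_of(image, dockerfiles_by_repo):
    """List the repositories whose Dockerfile declares `image` as a base."""
    return sorted(
        repo
        for repo, text in dockerfiles_by_repo.items()
        if image in base_images(text)
    )


# Hypothetical organisation: repository name mapped to Dockerfile contents.
DOCKERFILES = {
    "orders-service": "FROM platform/base-go:1.21\nCOPY . /src\n",
    "web-app": "FROM node:20 AS build\nRUN npm ci\n",
    "billing-service": "FROM platform/base-go:1.20 AS build\nFROM scratch\n",
}

print(dependents_of("platform/base-go", DOCKERFILES))
```

&lt;p&gt;No embedding is involved. The answer falls out of the &lt;code&gt;FROM&lt;/code&gt; lines themselves, which is also why a repository missing from the input is simply missing from the answer.&lt;/p&gt;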

&lt;p&gt;Two more limits sit underneath that one, and they bite even if you set the similarity question aside. The index covers only the repositories you opened in the workspace, and someone chose that &lt;code&gt;.code-workspace&lt;/code&gt; folder list, from memory, which is the catalog trap again in a different file. And the index trails your checkout by a sync interval, so by Cursor's own guidance it is reliable for stable architecture and unreliable for recent change. The repository you did not open and the push from this morning are both outside it. Cursor indexes what looks related. The thing you need before a base-image bump is what is declared dependent, and those are different graphs built by different machinery.&lt;/p&gt;

&lt;h2&gt;
  
  
  The wall both share, and the graph that clears it
&lt;/h2&gt;

&lt;p&gt;Widen the window, write the map, or index the workspace, and you reach the same wall from three directions. &lt;code&gt;/add-dir&lt;/code&gt; gives access without selection. &lt;code&gt;CLAUDE.md&lt;/code&gt; gives structure that decays. The index gives similarity, never the declared edge. None of them resolves a &lt;code&gt;source&lt;/code&gt; block to the repository that owns the module, because that was never the job any of them was built to do.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;How it reaches across repos&lt;/th&gt;
&lt;th&gt;What that answers well&lt;/th&gt;
&lt;th&gt;What it can't resolve&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Claude Code&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Reads the directories you grant it with &lt;code&gt;/add-dir&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Anything inside a repository you checked out&lt;/td&gt;
&lt;td&gt;Which repositories a change affects, and any repo you didn't clone&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cursor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Semantic search over an embedding index of the open workspace&lt;/td&gt;
&lt;td&gt;"Where does this live", "how does this work"&lt;/td&gt;
&lt;td&gt;A declared dependency, which is not a similarity relationship&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;But the edges that matter are already written down. The contract package in a &lt;code&gt;package.json&lt;/code&gt; dependency or a &lt;code&gt;go.mod&lt;/code&gt; require. The module in a Terraform &lt;code&gt;source&lt;/code&gt; block. The image in a Dockerfile &lt;code&gt;FROM&lt;/code&gt;. The chart in a &lt;code&gt;Chart.yaml&lt;/code&gt; dependency. The template in a GitLab CI &lt;code&gt;include:project&lt;/code&gt; or a reusable Actions &lt;code&gt;uses:&lt;/code&gt;. I spent &lt;a href="https://riftmap.dev/blog/series/find-every-consumer/" rel="noopener noreferrer"&gt;a whole series&lt;/a&gt; walking those edges one ecosystem at a time. They are deterministic. Parsed, not inferred. The graph that should be answering the agent's question already exists in your organisation's manifests, unassembled. The rest of this post is assembling it once and wiring it into Claude Code and Cursor.&lt;/p&gt;
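&lt;p&gt;As a rough sketch of what "assembling" means, assume the manifests have already been parsed into (consumer, artifact) pairs and that you know which repository publishes each artifact. Every name below, including the &lt;code&gt;docker:&lt;/code&gt;, &lt;code&gt;npm:&lt;/code&gt;, and &lt;code&gt;terraform:&lt;/code&gt; prefixes, is made up for illustration. Resolving ownership is the hard part in practice, and it is stubbed out here as a lookup table.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical declared edges, one per manifest line: (consumer repo, artifact).
DECLARED_EDGES = [
    ("orders-service", "docker:platform/base-go"),      # Dockerfile FROM
    ("billing-service", "docker:platform/base-go"),     # Dockerfile FROM
    ("orders-service", "npm:@myorg/contracts"),         # package.json dependency
    ("infra-live", "terraform:myorg/tf-modules//vpc"),  # Terraform source
]

# Hypothetical ownership table: which repository publishes each artifact.
ARTIFACT_OWNERS = {
    "docker:platform/base-go": "platform-images",
    "npm:@myorg/contracts": "contracts",
    "terraform:myorg/tf-modules//vpc": "tf-modules",
}


def assemble_dependents(edges, owners):
    """Invert consumer-to-artifact edges into owner repo mapped to its consumers."""
    graph = defaultdict(set)
    for consumer, artifact in edges:
        owner = owners.get(artifact)
        # Skip artifacts nobody in the organisation owns, and self-references.
        if owner is None or owner == consumer:
            continue
        graph[owner].add(consumer)
    return {owner: sorted(consumers) for owner, consumers in graph.items()}


print(assemble_dependents(DECLARED_EDGES, ARTIFACT_OWNERS))
```

&lt;p&gt;The inversion is the useful step. Each manifest only records what its own repository consumes, and the question before a breaking change runs the other way.&lt;/p&gt;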

&lt;h2&gt;
  
  
  Wiring the graph into Claude Code and Cursor
&lt;/h2&gt;

&lt;p&gt;A cross-repo dependency graph helps either agent in three ways, and they are worth doing in order, because each one is more setup and more enforcement than the last. The graph here is &lt;a href="https://riftmap.dev/for-agents/" rel="noopener noreferrer"&gt;Riftmap&lt;/a&gt;, which parses these edges across your whole GitHub or GitLab organisation from one read-only token and serves the result over an HTTP API. You can adopt the same architecture with any parsed graph, including one you build yourself.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tier one: let the agent call the API
&lt;/h3&gt;

&lt;p&gt;The simplest version works today and installs nothing. Tell the agent the graph is an HTTP call, and let it make the call with the shell tool it already has. The &lt;a href="https://docs.riftmap.dev/agents/overview" rel="noopener noreferrer"&gt;recommended pattern&lt;/a&gt; is three endpoints. Resolve the working tree to a node, hydrate that node's context in one round-trip, and ask for the transitive cascade when the change actually warrants it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Resolve the working tree's clone URL to a Riftmap repo&lt;/span&gt;
&lt;span class="nv"&gt;REPO_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://api.riftmap.dev/api/v1/repositories/lookup?url=https://github.com/myorg/platform-charts"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-Key: &lt;/span&gt;&lt;span class="nv"&gt;$RIFTMAP_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.id'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# One round-trip: the repo, its dependencies, its dependents, its artifacts&lt;/span&gt;
curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"https://api.riftmap.dev/api/v1/repositories/&lt;/span&gt;&lt;span class="nv"&gt;$REPO_ID&lt;/span&gt;&lt;span class="s2"&gt;/context"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-Key: &lt;/span&gt;&lt;span class="nv"&gt;$RIFTMAP_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# When the change is breaking, ask for the transitive blast radius&lt;/span&gt;
curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"https://api.riftmap.dev/api/v1/repositories/&lt;/span&gt;&lt;span class="nv"&gt;$REPO_ID&lt;/span&gt;&lt;span class="s2"&gt;/impact?max_depth=3"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-Key: &lt;/span&gt;&lt;span class="nv"&gt;$RIFTMAP_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To make the agent reach for this without being reminded each time, put the instruction where it already reads. A line in &lt;code&gt;CLAUDE.md&lt;/code&gt; for Claude Code, a rule in &lt;code&gt;.cursor/rules/&lt;/code&gt; for Cursor, along the lines of: before planning any change to a shared artifact (a base image, a Terraform module, a Helm chart, a CI template), call the Riftmap API to list the dependents and fold them into the plan.&lt;/p&gt;

&lt;p&gt;This is not theoretical. I have a &lt;a href="https://riftmap.dev/for-agents/" rel="noopener noreferrer"&gt;recorded Claude Code session&lt;/a&gt; doing exactly this. The prompt asks it to delete a &lt;code&gt;helm:deploy&lt;/code&gt; job, because Helm deploys are moving into an umbrella chart, and to flag who would break. The agent calls the API on its own, surfaces 51 consumers across the organisation, and sorts them by whether they pin a release tag, and so have a grace period, or float on &lt;code&gt;main&lt;/code&gt;, and so break on the next pipeline run. It can make that distinction because every dependency edge the API returns carries the &lt;code&gt;version_constraint&lt;/code&gt; the consumer declared. It also flags, unprompted, a caveat about the granularity of one edge, which is the kind of self-correction you want from an agent about to make a breaking change. That is the entire point. The structural account is in front of the agent before the first edit, not discovered in CI twenty minutes later.&lt;/p&gt;
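&lt;p&gt;The pinned-versus-floating split is simple enough to sketch. The only field taken from the API here is &lt;code&gt;version_constraint&lt;/code&gt;. The &lt;code&gt;repo&lt;/code&gt; key, the sample payload, and the set of refs treated as floating are assumptions for illustration, so check them against the real response before relying on this.&lt;/p&gt;

```python
# Refs treated as floating. This set is an assumption; adjust it to your branches.
FLOATING_REFS = {"", "main", "master", "HEAD", "latest"}


def split_by_pinning(dependents):
    """Return (pinned, floating) repo names based on each declared constraint."""
    pinned, floating = [], []
    for dep in dependents:
        # A missing or empty constraint is treated the same as a floating ref.
        constraint = (dep.get("version_constraint") or "").strip()
        bucket = floating if constraint in FLOATING_REFS else pinned
        bucket.append(dep["repo"])
    return sorted(pinned), sorted(floating)


# Hypothetical response fragment; the real payload shape may differ.
SAMPLE = [
    {"repo": "orders-service", "version_constraint": "v2.3.1"},
    {"repo": "billing-service", "version_constraint": "main"},
    {"repo": "web-app", "version_constraint": None},
]

pinned, floating = split_by_pinning(SAMPLE)
print("grace period:", pinned)
print("breaks on next pipeline run:", floating)
```

&lt;p&gt;Treating a missing constraint as floating is the cautious default, since an unpinned consumer is the one that breaks first.&lt;/p&gt;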

&lt;h3&gt;
  
  
  Tier two: make it a first-class tool with MCP
&lt;/h3&gt;

&lt;p&gt;Calling &lt;code&gt;curl&lt;/code&gt; from a prompt works, but the agent has to be told the shape of the call each time, and the call lives in your instructions rather than in the tool list where it belongs. Both Claude Code and Cursor are MCP clients, so the more native option is to expose the graph as MCP tools.&lt;/p&gt;

&lt;p&gt;Riftmap does not ship its own MCP server yet, and the &lt;a href="https://docs.riftmap.dev/agents/mcp-cli-roadmap" rel="noopener noreferrer"&gt;reason is worth stating plainly&lt;/a&gt;, because it is the honest one: the endpoints are the load-bearing piece, packaging them is mechanical, and the build is being held until real users ask for it rather than designed for a workflow nobody has yet. If you want it, the roadmap page takes an issue, and concrete demand is what unblocks it.&lt;/p&gt;

&lt;p&gt;Until it lands, you can bridge the gap in about five minutes, because Riftmap publishes a static OpenAPI schema and there are mature generators that turn any OpenAPI spec into an MCP server. Point one at the schema, give it the API base and your key, and the three endpoints become tools the agent sees in its tool list. For Claude Code, in a committable &lt;code&gt;.mcp.json&lt;/code&gt; at the project root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"riftmap"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@ivotoby/openapi-mcp-server"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"OPENAPI_SPEC_PATH"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://app.riftmap.dev/openapi.json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"API_BASE_URL"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://api.riftmap.dev/api/v1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"API_HEADERS"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"X-API-Key:${RIFTMAP_API_KEY}"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cursor reads the same &lt;code&gt;mcpServers&lt;/code&gt; format, so the identical block drops into &lt;code&gt;.cursor/mcp.json&lt;/code&gt;, with one change: Cursor's variable syntax for the key is &lt;code&gt;${env:RIFTMAP_API_KEY}&lt;/code&gt; rather than &lt;code&gt;${RIFTMAP_API_KEY}&lt;/code&gt;. One detail in that config is worth knowing rather than copying blind. The schema is served from &lt;code&gt;app.riftmap.dev&lt;/code&gt; and the API answers on &lt;code&gt;api.riftmap.dev&lt;/code&gt;, which is why the base URL is set explicitly instead of being inferred from the spec's own host.&lt;/p&gt;

&lt;p&gt;With either client, "who depends on &lt;code&gt;platform-charts&lt;/code&gt;" is now a tool call the agent makes itself, in Plan Mode or mid-task, with no HTTP in the prompt. When Riftmap's own server ships it will be a thinner version of the same thing, &lt;code&gt;pipx install riftmap-mcp&lt;/code&gt; against the same endpoints, and switching to it is a config line.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tier three: make it a gate, not a hint
&lt;/h3&gt;

&lt;p&gt;A tool the agent may call is still a tool the agent may skip. For the highest-stakes changes you want the graph between the change and &lt;code&gt;main&lt;/code&gt;, not merely available to the planner. There are two honest places to put it.&lt;/p&gt;

&lt;p&gt;In the loop, Claude Code runs hooks, skills, and subagents. A skill the agent invokes during planning, or a check on the commit boundary, can make the dependents query a step that has to happen before a change to a shared artifact proceeds. This keeps the graph in the agent's path rather than its discretion, though it depends on the agent's cooperation to fire.&lt;/p&gt;

&lt;p&gt;At review, the cleaner gate is CI, because it does not depend on the agent cooperating at all. On any pull request that touches a shared component, the pipeline calls &lt;code&gt;/impact&lt;/code&gt;, posts the consumer list as a comment, and the human reviewing the agent's change is checking it against the same structural account the agent planned with, not against memory. This is the architecture I think the whole category lands on, and it is the one &lt;a href="https://www.mabl.com/blog/how-we-built-a-system-for-ai-agents-to-ship-real-code-across-75-repos" rel="noopener noreferrer"&gt;Mabl built by hand&lt;/a&gt; before running agents across 75 repositories on top of it.&lt;/p&gt;
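&lt;p&gt;The comment-building half of that CI step might look something like the sketch below. It assumes the impact response has already been reduced to a list of entries with a &lt;code&gt;repo&lt;/code&gt; name and a &lt;code&gt;depth&lt;/code&gt;. Both field names are hypothetical, and posting the result to the pull request is left to whatever your CI system provides.&lt;/p&gt;

```python
def render_impact_comment(component, impacted):
    """Build a Markdown comment body from a list of impacted repositories."""
    if not impacted:
        return "No declared consumers of `{}` were found.".format(component)
    lines = [
        "Repositories affected by a change to `{}`: {}".format(
            component, len(impacted)
        ),
        "",
    ]
    # Direct consumers first, then transitive ones, alphabetical within a depth.
    for item in sorted(impacted, key=lambda entry: (entry["depth"], entry["repo"])):
        if item["depth"] == 1:
            kind = "direct"
        else:
            kind = "transitive, depth {}".format(item["depth"])
        lines.append("- `{}` ({})".format(item["repo"], kind))
    return "\n".join(lines)


# Hypothetical, already-flattened impact result.
IMPACTED = [
    {"repo": "web-app", "depth": 2},
    {"repo": "orders-service", "depth": 1},
]

print(render_impact_comment("platform-charts", IMPACTED))
```

&lt;p&gt;Sorting by depth puts the repositories that break first at the top of the comment, which is where a reviewer looks.&lt;/p&gt;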

&lt;p&gt;What makes either gate trustworthy rather than confidently wrong is the freshness contract. Every repository the API returns carries &lt;code&gt;last_scanned_at&lt;/code&gt; and &lt;code&gt;last_activity_at&lt;/code&gt;, and the single rule is that if the repository has been pushed to since Riftmap last scanned it, the graph is treated as stale. For an interactive agent that means warn and proceed with the caveat. For a CI gate it means trigger a rescan and re-poll before the merge. A gate that knows when it is out of date is the opposite of the stale &lt;code&gt;CLAUDE.md&lt;/code&gt; that feels fast right up until it is wrong.&lt;/p&gt;
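&lt;p&gt;The freshness rule fits in a few lines. This sketch assumes both fields arrive as ISO 8601 strings, which is an assumption about the payload rather than something stated above. The mode names and the returned action strings are invented for the example.&lt;/p&gt;

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp, accepting a trailing Z for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_stale(repo):
    """True when the repository was pushed to after its last scan."""
    scanned = parse_timestamp(repo["last_scanned_at"])
    active = parse_timestamp(repo["last_activity_at"])
    # The later of the two is the scan only when nothing was pushed since.
    return max(scanned, active) != scanned


def gate_action(repo, mode):
    """Map freshness to an action: agents warn, CI rescans before merging."""
    if not is_stale(repo):
        return "trust"
    return "rescan-and-repoll" if mode == "ci" else "warn-and-proceed"


# Hypothetical repository record: pushed to 75 minutes after the last scan.
STALE_REPO = {
    "last_scanned_at": "2026-06-30T08:00:00Z",
    "last_activity_at": "2026-06-30T09:15:00Z",
}

print(gate_action(STALE_REPO, "agent"))
print(gate_action(STALE_REPO, "ci"))
```

&lt;p&gt;The same check serves both gates. Only the response to a stale answer differs between an interactive session and a merge.&lt;/p&gt;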

&lt;h2&gt;
  
  
  The graph you hand the agent, not the one it tries to infer
&lt;/h2&gt;

&lt;p&gt;Give Claude Code every repository you have checked out, and give Cursor an index of all of them, and you have given each agent a faster way to read what is already in front of it. You have not given either one the list of repositories that break when it bumps the base image. That list was never a file Claude Code could clone or a chunk Cursor could embed. It was a set of &lt;code&gt;FROM&lt;/code&gt; lines and &lt;code&gt;source&lt;/code&gt; blocks sitting in repositories the agent never opened, declared once and never assembled. The agent cannot infer it, because it was never there to infer. It has to be parsed and handed over.&lt;/p&gt;

&lt;p&gt;Riftmap is that graph, built from one read-only token across your GitHub or GitLab organisation. It parses the manifests that already declare these edges across twelve ecosystems and resolves each reference to the repository that owns the artifact, then serves the result two ways. An interactive blast-radius view for the engineer who owns the estate and holds the pager, and an HTTP API, three endpoints with an OpenAPI schema and a freshness field on every response, that Claude Code and Cursor can call during planning, or that you can put in front of a merge. Auto-discovered, never catalogued. Parsed, not inferred. The MCP server is coming when enough people ask. Until then it is one config block away.&lt;/p&gt;

&lt;h2&gt;
  
  
  Questions teams ask
&lt;/h2&gt;

&lt;p&gt;The same questions come up whenever I help someone wire this in, so here they are, answered straight.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I give Claude Code cross-repo dependency awareness?&lt;/strong&gt; Claude Code sees the repositories you grant it with &lt;code&gt;/add-dir&lt;/code&gt; and reads them, but it does not build a dependency graph across them. To give it cross-repo blast radius, expose a parsed dependency graph as a tool it calls during planning. Either let it call the Riftmap HTTP API from its shell, resolving the repo with &lt;code&gt;lookup&lt;/code&gt; and then asking for &lt;code&gt;context&lt;/code&gt; or &lt;code&gt;impact&lt;/code&gt;, or wrap the OpenAPI schema as an MCP server in &lt;code&gt;.mcp.json&lt;/code&gt;. Then it can ask which repositories a change affects before it edits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does Cursor's @Codebase understand cross-repo dependencies?&lt;/strong&gt; Not in the sense a breaking change needs. &lt;code&gt;@Codebase&lt;/code&gt; is semantic search over an embedding index, so it returns the code most similar to your query, which is a different set from the repositories that declare a dependency on what you are changing. A Dockerfile &lt;code&gt;FROM&lt;/code&gt; line and the repository that builds that base image are not similar text, so no embedding search reliably connects them. Cross-repo dependency edges have to be parsed from manifests, not retrieved by similarity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I add a Riftmap MCP server to Claude Code or Cursor?&lt;/strong&gt; Not a first-party one yet. It is on the roadmap, deferred until there is real demand. Today you can bridge the gap in a few minutes: Riftmap publishes a static OpenAPI schema at &lt;code&gt;app.riftmap.dev/openapi.json&lt;/code&gt;, and a generic OpenAPI-to-MCP server turns it into MCP tools you register in &lt;code&gt;.mcp.json&lt;/code&gt; for Claude Code or &lt;code&gt;.cursor/mcp.json&lt;/code&gt; for Cursor. Or skip MCP entirely and have the agent call the three HTTP endpoints directly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why doesn't /add-dir tell Claude Code which repositories a change affects?&lt;/strong&gt; Because &lt;code&gt;/add-dir&lt;/code&gt; grants access, not selection. It makes the directories you name readable and editable, but you chose those directories from memory, and it cannot add a repository that is not checked out on your disk. Knowing which repositories to add is the cross-repo dependency question itself, and that answer comes from a parsed graph, not from a wider window.&lt;/p&gt;

</description>
      <category>aicodingagents</category>
      <category>crossrepocontext</category>
      <category>claudecode</category>
      <category>cursor</category>
    </item>
    <item>
      <title>What 208 kubernetes-sigs repos actually depend on</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Mon, 29 Jun 2026 12:20:16 +0000</pubDate>
      <link>https://dev.to/danielwe/what-208-kubernetes-sigs-repos-actually-depend-on-19jh</link>
      <guid>https://dev.to/danielwe/what-208-kubernetes-sigs-repos-actually-depend-on-19jh</guid>
      <description>&lt;p&gt;&lt;em&gt;A scan of the kubernetes-sigs organisation, rendered as a graph: 208 repos, 1,128 cross-repo dependencies, and 152 repos that depend on &lt;code&gt;yaml&lt;/code&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;If you've worked on a Kubernetes operator, you have a rough mental model of how kubernetes-sigs fits together. controller-runtime and controller-tools at the framework layer, kubebuilder above them as scaffolding, cluster-api and its army of cloud providers off to one side, kustomize and kind as standalone tools, dozens of CSI drivers each doing their own thing, and an unglamorous foundation of small utility libraries (&lt;code&gt;sigs.k8s.io/yaml&lt;/code&gt;, &lt;code&gt;sigs.k8s.io/json&lt;/code&gt;, &lt;code&gt;sigs.k8s.io/structured-merge-diff&lt;/code&gt;) underneath the lot.&lt;/p&gt;

&lt;p&gt;What you probably haven't seen is what that mental model looks like rendered as a graph.&lt;/p&gt;

&lt;p&gt;This is the second post in the series. The &lt;a href="https://riftmap.dev/blog/what-56-prometheus-repos-depend-on/" rel="noopener noreferrer"&gt;first scanned the Prometheus org&lt;/a&gt; with &lt;a href="https://riftmap.dev" rel="noopener noreferrer"&gt;Riftmap&lt;/a&gt; and used it as a calibration target: 56 repos, well-understood, you could check the scanner against your own intuition. The shape mostly matched. kubernetes-sigs is the harder test. 208 public repos, structured very differently from Prometheus, no single owning team that holds the whole thing in their head.&lt;/p&gt;

&lt;p&gt;The shape mostly matches here too. There are also a few specific things it doesn't see, which I'll get into below.&lt;/p&gt;

&lt;p&gt;The biggest single observation, before any of the screenshots: kubernetes-sigs is a federation. Where Prometheus had a hub-and-spoke shape with &lt;code&gt;client_golang&lt;/code&gt; and &lt;code&gt;prometheus/common&lt;/code&gt; at the centre, kubernetes-sigs has a thin shared utility layer at the bottom and otherwise-independent projects on top. That's not a flaw in the scan. It's what a 208-repo SIG-governed org actually looks like.&lt;/p&gt;

&lt;h2&gt;
  
  
  The shape of the org at a glance
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmbto15ch6wfpn26drom4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmbto15ch6wfpn26drom4.png" alt="Riftmap dashboard view of the kubernetes-sigs org showing 208 repos, 1,128 cross-repo dependencies, 528 distinct artifacts, the top-impact panel, and a dependency breakdown by ecosystem" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The numbers from a single scan: 208 repositories, 1,128 cross-repo dependencies, 528 distinct artifacts consumed somewhere in the org. The breakdown is heavily Go-skewed (857 Go module references, 257 git URL references, 8 GitHub Actions, 5 Kubernetes manifests, 1 Helm), which is what you'd expect for a Go-native operator ecosystem.&lt;/p&gt;

&lt;p&gt;A Kubernetes contributor could probably predict the lower half of the impact list, but maybe not the top of it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;kubernetes-sigs/yaml&lt;/code&gt; — 152 dependents&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;kubernetes-sigs/json&lt;/code&gt; — 142 dependents&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;kubernetes-sigs/randfill&lt;/code&gt; — 110 dependents&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;kubernetes-sigs/controller-runtime&lt;/code&gt; — 109 dependents&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;kubernetes-sigs/structured-merge-diff&lt;/code&gt; — 102 dependents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The calibration moment here isn't quite the same as Prometheus. With Prometheus the top of the ranking confirmed the mental model: &lt;code&gt;prometheus/common&lt;/code&gt;, &lt;code&gt;client_golang&lt;/code&gt;, &lt;code&gt;client_model&lt;/code&gt;, the predictable centre. The top of the kubernetes-sigs ranking has controller-runtime where you'd expect it at #4, but the three repos above it are utility libraries most contributors couldn't pick out of a lineup. &lt;code&gt;sigs.k8s.io/yaml&lt;/code&gt; is a thin wrapper around &lt;code&gt;gopkg.in/yaml.v2&lt;/code&gt; with some helpers. &lt;code&gt;sigs.k8s.io/json&lt;/code&gt; is a small JSON unmarshalling library (case-sensitive, integer-preserving decoding). &lt;code&gt;sigs.k8s.io/randfill&lt;/code&gt; is a randomised-value fuzzer used in test generation.&lt;/p&gt;

&lt;p&gt;These three sit at the top because every operator-flavoured repo in the org imports them, directly or one hop away through controller-runtime or apimachinery, and Riftmap counts every direct import. The story isn't that they're surprising. It's that the most-depended-on things in a 208-repo Kubernetes org are not the ones with public-facing names. That's how shared infrastructure works.&lt;/p&gt;

&lt;p&gt;The out-degree list tells a complementary story. Most-importing repos in the org:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Repo&lt;/th&gt;
&lt;th&gt;Out-degree&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cluster-api&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;29&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;&lt;code&gt;kueue&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;29&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cluster-api-provider-aws&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;29&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cluster-api-provider-azure&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cluster-api-operator&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The cluster-api family dominates. Each provider imports the core, the shared utilities, and a dozen integration libraries. Combine the two rankings and you get the actual shape of the federation: a small utility floor at the bottom, controller-runtime as the framework layer, and the cluster-api family as the densest coordination cluster on top.&lt;/p&gt;
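&lt;p&gt;To make the two rankings concrete, here is a minimal sketch of how in-degree and out-degree fall out of an edge list. The edges are hypothetical, not scan output, and counting each consumer once per dependency is an assumption about how "dependents" is defined, not a description of Riftmap's internals.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical edges as (consumer, dependency) pairs; not real scan output.
EDGES = [
    ("cluster-api", "controller-runtime"),
    ("cluster-api", "yaml"),
    ("cluster-api-provider-aws", "cluster-api"),
    ("cluster-api-provider-aws", "controller-runtime"),
    ("cluster-api-provider-aws", "yaml"),
    ("kueue", "controller-runtime"),
    ("kueue", "yaml"),
    ("kueue", "yaml"),  # a second declaration in the same repo
]


def degrees(edges):
    """Return (in_degree, out_degree), counting distinct repos on each side."""
    dependents = defaultdict(set)    # dependency: repos that import it
    dependencies = defaultdict(set)  # consumer: repos it imports
    for consumer, dependency in edges:
        dependents[dependency].add(consumer)
        dependencies[consumer].add(dependency)
    in_degree = {repo: len(users) for repo, users in dependents.items()}
    out_degree = {repo: len(deps) for repo, deps in dependencies.items()}
    return in_degree, out_degree


in_degree, out_degree = degrees(EDGES)
# Highest in-degree first, ties broken alphabetically.
ranking = sorted(in_degree.items(), key=lambda item: (-item[1], item[0]))
print(ranking)
```

&lt;p&gt;Using sets means the duplicate &lt;code&gt;kueue&lt;/code&gt; declaration adds nothing to the count, which is the difference between counting declarations and counting dependents.&lt;/p&gt;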

&lt;h2&gt;
  
  
  The graph view
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgswypltdqnkxjz47zj06.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgswypltdqnkxjz47zj06.png" alt="Riftmap default graph view of the kubernetes-sigs org after auto-clustering — the 208 underlying repos folded into cluster groups, with the cluster-api family as a dense group, CSI drivers as a looser constellation, and a wide flat row of standalone projects" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The default graph view shows the kubernetes-sigs org after auto-clustering — all 208 repos folded into a navigable set of cluster nodes. The first thing worth noticing is what's missing. Prometheus's graph had eight tight clusters around clear anchors (&lt;code&gt;common&lt;/code&gt;, &lt;code&gt;client_model&lt;/code&gt;, &lt;code&gt;procfs&lt;/code&gt;, &lt;code&gt;exporter-toolkit&lt;/code&gt;, &lt;code&gt;promci&lt;/code&gt;) and a single dense hub-and-spoke pattern. kubernetes-sigs has a much flatter shape. The cluster-api family forms a recognisable group. The CSI drivers form a loose constellation. The kubebuilder-adjacent repos cluster together. And a wide row of standalone projects (kind, kustomize, gateway-api, kueue, metrics-server) sit on their own.&lt;/p&gt;

&lt;p&gt;That isn't a clustering failure. The auto-clustering groups repos with similar dependency profiles, and on kubernetes-sigs there genuinely are fewer profiles to find. Most cluster-api providers look like each other and cluster together. Most CSI drivers look like each other and cluster together. But &lt;code&gt;kind&lt;/code&gt; doesn't look like &lt;code&gt;kustomize&lt;/code&gt;, and neither looks like &lt;code&gt;gateway-api&lt;/code&gt;, and the graph faithfully reports that.&lt;/p&gt;

&lt;p&gt;The clustering is what makes 208 repos legible at this zoom. Without it you get a hairball. With it you can see, at a glance, which parts of the org are coupled and which parts are independent neighbours.&lt;/p&gt;
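&lt;p&gt;One way to read "similar dependency profiles" is as set overlap. The sketch below uses Jaccard similarity over hypothetical import sets. It is an illustration only: Riftmap's clustering method isn't public, and nothing here should be taken as its algorithm.&lt;/p&gt;

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a.union(b)
    if not union:
        return 0.0
    return len(a.intersection(b)) / len(union)


# Hypothetical in-org import sets, chosen for illustration only.
PROFILES = {
    "cluster-api-provider-aws": {"cluster-api", "controller-runtime", "yaml", "json"},
    "cluster-api-provider-azure": {
        "cluster-api", "controller-runtime", "yaml", "structured-merge-diff",
    },
    "kind": {"yaml"},
}

providers = jaccard(
    PROFILES["cluster-api-provider-aws"], PROFILES["cluster-api-provider-azure"]
)
standalone = jaccard(PROFILES["kind"], PROFILES["cluster-api-provider-aws"])
print(providers, standalone)
```

&lt;p&gt;With these made-up sets the two providers share three of five distinct imports while &lt;code&gt;kind&lt;/code&gt; shares one of four, so any threshold-based grouping would put the providers together and leave &lt;code&gt;kind&lt;/code&gt; on its own.&lt;/p&gt;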

&lt;h2&gt;
  
  
  The pattern at the centre
&lt;/h2&gt;

&lt;p&gt;I picked &lt;code&gt;cluster-api&lt;/code&gt; for the focus-mode shot. controller-runtime would be the easier choice. It's the closest analogue to &lt;code&gt;client_golang&lt;/code&gt; from the Prometheus post, with 109 in-org consumers radiating outward. cluster-api is more interesting. It sits at #10 on the in-degree ranking (38 dependents) but tied for #1 on out-degree (29 imports). That combination is rare. Most heavy importers aren't also heavy producers. Most central repos don't pull in two dozen of their siblings. cluster-api does both.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F70u7gtm6u69k59kp9lyj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F70u7gtm6u69k59kp9lyj.png" alt="Riftmap focus mode on kubernetes-sigs/cluster-api showing the repo at the centre with cluster-api-provider-aws, -azure, -gcp, -vsphere, -ibmcloud, -openstack, -cloudstack and the cluster-api-operator radiating out on one side and controller-runtime, the shared utility imports, and the metrics-server observability subgraph on the other" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What focus mode shows: cluster-api sits in the middle of a two-sided neighbourhood. On the consumer side, the cluster-api-provider family (-aws, -azure, -gcp, -vsphere, -ibmcloud, -openstack, -cloudstack, and a long tail) fans in. On the dependency side, cluster-api pulls in controller-runtime, the shared utilities (yaml, json, randfill, structured-merge-diff), apimachinery integrations, and an observability subgraph including metrics-server.&lt;/p&gt;

&lt;p&gt;You don't need to be a Cluster API maintainer to read this graph. A new engineer joining one of the provider teams could open it on their first day and see the coordination layer that ties their work to everyone else's, plus the upstream layer they share with cluster-api itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the receipts live
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fw66r5yxlltk9bjm0przc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fw66r5yxlltk9bjm0przc.png" alt="Side panel detail view for cluster-api showing 38 in-org consumers, dependencies grouped by ecosystem, and the metrics-server Helm chart reference highlighted at hack/observability/metrics-server/kustomization.yaml line 5, version 3.13.0, confidence 0.9" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click any node and the side panel opens with file-level evidence. For cluster-api the worth-pausing-on detail is this one: cluster-api consumes the metrics-server Helm chart at version 3.13.0, referenced from &lt;code&gt;hack/observability/metrics-server/kustomization.yaml:5&lt;/code&gt;. Confidence 0.9.&lt;/p&gt;

&lt;p&gt;That's not a Go module edge. It's a Helm chart referenced from a kustomization file, in a different repo, in a directory most contributors never open. The Prometheus post leaned hard on go.mod parsing: three separate &lt;code&gt;go.mod&lt;/code&gt; files in &lt;code&gt;prometheus/prometheus&lt;/code&gt;, each with its own pinned version of &lt;code&gt;client_golang&lt;/code&gt;. Cross-ecosystem edges like the metrics-server one are where the cost of not having this kind of graph compounds. If metrics-server publishes a breaking change in the chart and cluster-api's observability hack directory still references 3.13.0, no Go tooling will flag it. The Helm chart edge sits at confidence 0.9 rather than 1.0 because chart references resolve through registry name matching rather than exact module paths. Across the full scan, 83% of edges resolve at confidence 1.0 and the rest at 0.6 to 0.9. The confidence is part of the receipt, not a hedge.&lt;/p&gt;

&lt;p&gt;A second receipt worth noting, on the Go side: &lt;code&gt;sigs.k8s.io/yaml&lt;/code&gt; is imported by 149 distinct repos via two raw module paths (&lt;code&gt;sigs.k8s.io/yaml&lt;/code&gt; and &lt;code&gt;github.com/kubernetes-sigs/yaml&lt;/code&gt;). Both resolve to the same artifact. 212 total declarations from those 149 unique repos. The gap is the multi-module monorepos importing yaml from several &lt;code&gt;go.mod&lt;/code&gt; files. &lt;code&gt;kubernetes-sigs/cluster-addons&lt;/code&gt; is the strongest example: it's a multi-module monorepo with submodules for bootstrap, coredns, dashboard, flannel, metrics-server, nodelocaldns and others. Each produces its own &lt;code&gt;go_module&lt;/code&gt; artifact (&lt;code&gt;sigs.k8s.io/cluster-addons/bootstrap&lt;/code&gt;, &lt;code&gt;sigs.k8s.io/cluster-addons/coredns&lt;/code&gt;, and so on). Each submodule's yaml import is tracked separately. "We updated yaml in the root go.mod" and "we updated every reference to yaml" remain different statements.&lt;/p&gt;
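&lt;p&gt;The gap between declarations and unique repos can be sketched in a few lines. The alias table, repo names and &lt;code&gt;go.mod&lt;/code&gt; paths below are hypothetical stand-ins, not rows from the scan. The point is only that two raw module paths map to one artifact and that a multi-module repo contributes several declarations but one repo.&lt;/p&gt;

```python
# Hypothetical alias table: raw module path to canonical artifact name.
ALIASES = {
    "sigs.k8s.io/yaml": "kubernetes-sigs/yaml",
    "github.com/kubernetes-sigs/yaml": "kubernetes-sigs/yaml",
}

# Hypothetical declarations: (repo, go.mod location, required module path).
DECLARATIONS = [
    ("cluster-addons", "bootstrap/go.mod", "sigs.k8s.io/yaml"),
    ("cluster-addons", "coredns/go.mod", "sigs.k8s.io/yaml"),
    ("cluster-addons", "dashboard/go.mod", "sigs.k8s.io/yaml"),
    ("single-module-repo", "go.mod", "sigs.k8s.io/yaml"),
    ("legacy-path-repo", "go.mod", "github.com/kubernetes-sigs/yaml"),
]


def summarise(declarations, artifact):
    """Return (total declarations, unique repos) that resolve to the artifact."""
    matching = [row for row in declarations if ALIASES.get(row[2]) == artifact]
    repos = {repo for repo, _, _ in matching}
    return len(matching), len(repos)


total, unique = summarise(DECLARATIONS, "kubernetes-sigs/yaml")
print(total, unique)
```

&lt;p&gt;Here five declarations collapse to three repos, the same kind of gap as 212 declarations from 149 repos.&lt;/p&gt;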

&lt;p&gt;One related quirk: cluster-addons' &lt;code&gt;kubeproxy/go.mod&lt;/code&gt; declares &lt;code&gt;module addon-operators/kubeproxy&lt;/code&gt; (no domain, no &lt;code&gt;sigs.k8s.io&lt;/code&gt; prefix). Riftmap surfaces it exactly as declared. When you scan an org, the right behaviour is to report what's in source rather than to normalise it.&lt;/p&gt;

&lt;h2&gt;
  
  
  If I changed cluster-api, what breaks?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F9x241i8i206srq1fcaud.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F9x241i8i206srq1fcaud.png" alt="Riftmap Impact Mode canvas view for cluster-api showing the focused neighborhood with red cascade edges radiating outward to all 38 affected repos including cluster-api-provider-aws, -azure, -gcp, -vsphere, -ibmcloud, -openstack, -cloudstack, cluster-api-operator, cluster-api-addon-provider-helm, kueue, and the transitive tail" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Toggle Impact Mode on the same focus view and the canvas lights up. Red cascade edges radiate from cluster-api through every repo in its blast radius. 38 repos are affected at maximum depth 3: the seven cluster-api-providers you've heard of, the ones you haven't (kubevirt, packet, hetzner, and a handful more), cluster-api-operator, cluster-api-addon-provider-helm, the standalone consumers like kueue, and a small transitive tail.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftnw4d8bjqva45g3wk7k1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftnw4d8bjqva45g3wk7k1.png" alt="Impact tab in the Riftmap side panel for cluster-api showing 38 repos affected at maximum depth 3 with depth-1 badges on each affected repo and a .md export button at the top" width="402" height="1025"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The side panel has the list with depth labels and the file:line that would need to change in each downstream repo. There's a Markdown export button in the panel. Click it and you get a copyable list of every affected repo with its evidence pointers, ready to paste into a deprecation announcement, a v1beta migration RFC, or a release-notes block. That loop, "show me the blast radius and let me hand the list to the team that owns each downstream repo," is the same workflow the Prometheus post described. The list is just longer here.&lt;/p&gt;
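&lt;p&gt;The idea behind a depth-limited blast radius is a breadth-first walk over reverse dependency edges. The sketch below assumes a hypothetical &lt;code&gt;CONSUMERS&lt;/code&gt; map and invented downstream repo names. It shows the general technique and a plain Markdown rendering, not Riftmap's implementation or its export format.&lt;/p&gt;

```python
from collections import deque

# Hypothetical reverse edges: dependency to the repos that consume it.
CONSUMERS = {
    "cluster-api": ["cluster-api-provider-aws", "cluster-api-operator"],
    "cluster-api-provider-aws": ["downstream-a"],
    "downstream-a": ["downstream-b"],
    "downstream-b": ["downstream-c"],
}


def blast_radius(root, consumers, max_depth=3):
    """Breadth-first walk over consumers, recording the shortest depth per repo."""
    depths = {}
    queue = deque([(root, 0)])
    while queue:
        repo, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the depth cap
        for consumer in consumers.get(repo, []):
            if consumer != root and consumer not in depths:
                depths[consumer] = depth + 1
                queue.append((consumer, depth + 1))
    return depths


def to_markdown(root, depths):
    """Render the affected repos as a Markdown bullet list, nearest first."""
    lines = [f"Repos affected by a change to {root}:"]
    for repo, depth in sorted(depths.items(), key=lambda item: (item[1], item[0])):
        lines.append(f"- {repo} (depth {depth})")
    return "\n".join(lines)


depths = blast_radius("cluster-api", CONSUMERS)
print(to_markdown("cluster-api", depths))
```

&lt;p&gt;Breadth-first order matters here: each repo is labelled with the shortest path back to the changed repo, and anything beyond the cap (&lt;code&gt;downstream-c&lt;/code&gt; in this example) is left out.&lt;/p&gt;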

&lt;p&gt;See it live&lt;/p&gt;

&lt;p&gt;Try the same Impact Mode yourself&lt;/p&gt;

&lt;p&gt;The kubernetes-sigs scan isn't published as a live demo yet — but the Prometheus org is, with the exact interactive Impact Mode described above. Click any repo and watch its blast radius cascade across the graph.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://riftmap.dev/showcase/prometheus/" rel="noopener noreferrer"&gt;Explore the live graph →&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Riftmap didn't see (and why)
&lt;/h2&gt;

&lt;p&gt;Same honesty section as the Prometheus post. Three categories.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real parser gaps.&lt;/strong&gt; Riftmap can't currently parse:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;pyproject.toml&lt;/code&gt;, &lt;code&gt;requirements.txt&lt;/code&gt;, and &lt;code&gt;setup.py&lt;/code&gt;, so the four detected Python packages in the org (&lt;code&gt;inference-perf&lt;/code&gt;, &lt;code&gt;jobset&lt;/code&gt;, &lt;code&gt;k8s-agent-sandbox&lt;/code&gt;, &lt;code&gt;kubespray_component_hash_update&lt;/code&gt;) show as produced artifacts but with no in-org consumption resolved&lt;/li&gt;
&lt;li&gt;Terraform module consumption from a downstream repo's &lt;code&gt;module "x" { source = "..." }&lt;/code&gt; blocks. Riftmap detects produced Terraform roots and modules (6 and 4 respectively) but doesn't yet resolve the consumer side&lt;/li&gt;
&lt;li&gt;Ansible playbook and collection consumption. Same pattern: 1 collection and 3 playbooks produced, zero consumption resolved&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Cargo.toml&lt;/code&gt;, so the org's one Rust experiment sits unconnected&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are real, named gaps. They are roughly the next four on the parser roadmap, in that order. Current ecosystem coverage is documented in the &lt;a href="https://riftmap.dev/blog/auto-discovering-infrastructure-dependencies-across-10-ecosystems/" rel="noopener noreferrer"&gt;auto-discovery write-up&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Correctly parsed, deliberately not drawn.&lt;/strong&gt; Most CSI driver repos sit as leaves with low or no in-degree. &lt;code&gt;aws-ebs-csi-driver&lt;/code&gt;, &lt;code&gt;azuredisk-csi-driver&lt;/code&gt;, &lt;code&gt;gcp-compute-persistent-disk-csi-driver&lt;/code&gt; and their siblings have no kubernetes-sigs consumers. That's correct. They consume &lt;code&gt;k8s.io/...&lt;/code&gt; upstreams (different org) and external SDKs (&lt;code&gt;aws-sdk-go&lt;/code&gt;, &lt;code&gt;azure-sdk-for-go&lt;/code&gt;), and they produce their own container images that get consumed by cluster operators outside the org. Their dependency graph leaves kubernetes-sigs and never comes back. Drawing them as connected when they aren't would be the bug.&lt;/p&gt;

&lt;p&gt;The GitHub Actions edge count is similarly low (8 in-org references) for the same reason. kubernetes-sigs repos overwhelmingly use &lt;code&gt;actions/checkout&lt;/code&gt;, &lt;code&gt;actions/setup-go&lt;/code&gt;, and &lt;code&gt;actions/cache&lt;/code&gt; from the &lt;code&gt;actions/*&lt;/code&gt; org, which sits outside the scan. Actions references from one kubernetes-sigs repo to another are genuinely sparse. The eight that exist are real.&lt;/p&gt;

&lt;p&gt;The single in-org Helm edge looks low until you check the chart sources. kubernetes-sigs charts mostly depend on external bases (bitnami, cert-manager, ingress-nginx, prometheus-community), all of which live in other orgs. The one in-org Helm edge that exists is the metrics-server reference above.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Federation means less clustering.&lt;/strong&gt; Riftmap's auto-clustering finds fewer cohesive groups on kubernetes-sigs than on Prometheus, and that reflects the org rather than a clustering shortcoming. cluster-api providers cluster tightly because they share dependency profiles. CSI drivers form a looser group. Standalone projects (kind, kustomize, gateway-api, kueue) cluster with whichever repos they happen to share imports with, which often isn't very many. The flat shape is the finding.&lt;/p&gt;

&lt;p&gt;The polyglot Python packages are parser gaps. The unconnected CSI drivers are correct silence. Both look the same in the graph at first glance. They are not the same thing, and the difference is what the side panel is for.&lt;/p&gt;

&lt;h2&gt;
  
  
  Methodology, briefly
&lt;/h2&gt;

&lt;p&gt;Before Prometheus and kubernetes-sigs, Riftmap was validated against two private adversarial test groups designed to mimic production orgs. The first (27 repos, 83 expected edges) covers 20 intentional edge cases: diamond dependencies, dual-artifact repos, nested CI includes, ARG-based Docker &lt;code&gt;FROM&lt;/code&gt; lines, Terraform subdirectory syntax, &lt;code&gt;COPY --from&lt;/code&gt; cross-repo references, and version lag across pinned consumers. The second (55 repos, ~135 expected edges across nine ecosystems) adds cross-language artifact reuse, circular Go module deps, multi-artifact single repos, dependency chains five levels deep, and unsupported-ecosystem node rendering. Both have hand-verified ground-truth edge lists. Every scanner change is tested against them.&lt;/p&gt;

&lt;p&gt;The kubernetes-sigs scan completed in 13m 34s across all 208 repos, with 0 skipped and no errors. 83% of edges resolved at confidence 1.0 (exact path matches); the remainder at 0.6 to 0.9 (Helm chart references and Kubernetes image references that resolve through name matching rather than exact paths).&lt;/p&gt;

&lt;p&gt;None of the heuristics involved in artifact resolution are public. The scan output is.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two asks
&lt;/h2&gt;

&lt;p&gt;If you work on a kubernetes-sigs repo and you spot something Riftmap got wrong about it (a missing edge, a misattributed version, a parser failure I should add to the roadmap), email me at &lt;a href="mailto:daniel@riftmap.dev"&gt;daniel@riftmap.dev&lt;/a&gt;. I'll fix the parser and credit you in the follow-up.&lt;/p&gt;

&lt;p&gt;If you're running something similar in your own org and want to see what your graph looks like, Riftmap is at &lt;a href="https://riftmap.dev/" rel="noopener noreferrer"&gt;riftmap.dev&lt;/a&gt;. The kubernetes-sigs scan took about 13 minutes for 208 repos. Yours probably finishes in a fraction of that.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>kubernetessigs</category>
      <category>dependencygraph</category>
      <category>crossrepodependencies</category>
    </item>
    <item>
      <title>Change failure rate is up 30% — here's how to measure yours in an afternoon</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Mon, 29 Jun 2026 12:18:58 +0000</pubDate>
      <link>https://dev.to/danielwe/change-failure-rate-is-up-30-heres-how-to-measure-yours-in-an-afternoon-1ong</link>
      <guid>https://dev.to/danielwe/change-failure-rate-is-up-30-heres-how-to-measure-yours-in-an-afternoon-1ong</guid>
      <description>&lt;p&gt;&lt;em&gt;A practitioner's guide to calculating your team's CFR without a vendor platform — the DORA formula, the SQL, and the AI-assisted vs human-authored split nobody is publishing yet.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Cortex's 2026 benchmark says change failure rate has risen about 30% industry-wide since AI coding adoption accelerated. The number has been quoted in every engineering newsletter I read. It keeps showing up in LinkedIn posts. I cited it myself in my &lt;a href="https://riftmap.dev/blog/ai-doesnt-understand-blast-radius/" rel="noopener noreferrer"&gt;last post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here's the uncomfortable follow-up question: &lt;em&gt;what's yours?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Most platform teams I've worked with couldn't give me a number. They could estimate a direction — "it feels worse lately" — but the actual percentage wasn't anywhere. And without a number, the 30% headline is just other people's data. You can't improve what you haven't measured.&lt;/p&gt;

&lt;p&gt;This post walks through how to compute your team's CFR in an afternoon using data you already have, and how to split it in a way nobody is doing yet: AI-assisted PRs vs. human-authored. You don't need a vendor platform for any of this.&lt;/p&gt;

&lt;h2&gt;
  
  
  What CFR actually measures
&lt;/h2&gt;

&lt;p&gt;DORA's definition, lifted from the source:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The percentage of changes to production or releases to users that result in degraded service and subsequently require remediation — a hotfix, rollback, fix forward, or patch.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's the whole thing. Three details matter and they're the ones most vendor posts get slightly wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Only production counts.&lt;/strong&gt; A test that fails in CI isn't a change failure. A canary that catches a bad deploy before it reaches real users isn't one either. If your release engineering is working, a lot of would-be failures never count — which is the point.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation has to happen.&lt;/strong&gt; A deployment that's merely suboptimal isn't a failure. The question is whether it needed a rollback, hotfix, fix-forward, or patch after the fact. "We wrote a Jira ticket" isn't remediation; "we pushed another deploy to fix the first one" is.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The denominator is changes, not deployments.&lt;/strong&gt; If you push three deploys and two of them are fix-only remediations of the first, you made one change, not three. Fix-only deploys come out of both the numerator and the denominator — they are neither new changes nor new failures in the sense CFR measures.&lt;/p&gt;

&lt;p&gt;So:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Number of changes   = Production deployments − Fix-only deployments
CFR                 = Failed changes ÷ Number of changes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
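&lt;p&gt;As a quick worked example of the formula, here is a small sketch. The function name and the quarter's numbers are made up for illustration.&lt;/p&gt;

```python
def change_failure_rate(production_deploys, fix_only_deploys, failed_changes):
    """CFR = failed changes / (production deployments - fix-only deployments)."""
    changes = production_deploys - fix_only_deploys
    if changes == 0:
        return None  # no changes in the period, so the rate is undefined
    return failed_changes / changes


# Hypothetical quarter: 120 production deploys, 20 of them fix-only, 9 failed changes.
cfr = change_failure_rate(120, 20, 9)
print(f"{cfr:.1%}")

# The three-deploy case from the text: one change, two fix-only follow-ups.
single = change_failure_rate(3, 2, 1)
print(f"{single:.0%}")
```

&lt;p&gt;Dividing by all 120 deploys instead would give 7.5% rather than 9%, which is why the fix-only deploys have to come out of the denominator.&lt;/p&gt;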



&lt;p&gt;&lt;a href="https://dora.dev/research/2025/dora-report/" rel="noopener noreferrer"&gt;DORA's 2025 report&lt;/a&gt; found that about 16.7% of teams maintain CFR at 4% or below — that's the elite band. Most teams sit well above it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 90-minute version
&lt;/h2&gt;

&lt;p&gt;You need three things, all of which you already have somewhere.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. A list of production deployments.&lt;/strong&gt; From your CI (GitHub Actions, GitLab CI, Jenkins, CircleCI, Argo), filtered to production environment only, successful runs only. Most of these systems have an API or a database you can query. If you can get &lt;code&gt;deployment_id&lt;/code&gt;, &lt;code&gt;service&lt;/code&gt;, &lt;code&gt;deployed_at&lt;/code&gt;, and &lt;code&gt;commit_sha&lt;/code&gt;, you're set.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. A list of production incidents.&lt;/strong&gt; From PagerDuty, Opsgenie, Incident.io, your internal spreadsheet — wherever your on-call logs live. Filter to anything that required an engineering response. You want &lt;code&gt;incident_id&lt;/code&gt;, &lt;code&gt;service&lt;/code&gt;, &lt;code&gt;started_at&lt;/code&gt;, and ideally the SHA or deployment that was identified as the root cause.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. A rule for joining them.&lt;/strong&gt; The simplest rule that works: an incident "belongs to" a deployment if the incident started within some window after the deployment, on the same service. A 24-hour window is standard; some teams use 48 hours for services with slow-burn failure modes. This isn't causal attribution — it's a proxy, and it's close enough.&lt;/p&gt;

&lt;p&gt;Here's the shape of the query once both datasets are in the same place:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;changes&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;deployment_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;deployed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;commit_sha&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;deployments&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;environment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'production'&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'success'&lt;/span&gt;
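    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;deployed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'90 days'&lt;/span&gt;  &lt;span class="c1"&gt;-- last 90 days&lt;/span&gt;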
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="n"&gt;is_fix_only&lt;/span&gt;  &lt;span class="c1"&gt;-- exclude rollbacks/hotfixes&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;failed_changes&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deployment_id&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;changes&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
  &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;incidents&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;service&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;started_at&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deployed_at&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deployed_at&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'24 hours'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deployment_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                       &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_changes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;fc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deployment_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                      &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;failed_changes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;fc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deployment_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;NULLIF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deployment_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;change_failure_rate&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;changes&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;failed_changes&lt;/span&gt; &lt;span class="n"&gt;fc&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;deployment_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your deployment tool doesn't track &lt;code&gt;is_fix_only&lt;/code&gt;, the practical workaround is a convention — require engineers to prefix fix-only PRs with &lt;code&gt;fix:&lt;/code&gt; or tag them with a &lt;code&gt;fix-only&lt;/code&gt; label, and filter on that. The data gets better once you start asking for it.&lt;/p&gt;

&lt;p&gt;Run the query over the last 90 days. That's your CFR. Shorter windows are too volatile to trust; longer ones smooth over recent changes in how you deliver and are slow to react to regressions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The cut nobody is making yet
&lt;/h2&gt;

&lt;p&gt;Here's where it gets interesting. The Cortex 30% number is an aggregate. It tells you the industry has gotten worse. It doesn't tell you &lt;em&gt;which of your PRs&lt;/em&gt; are driving your team's number.&lt;/p&gt;

&lt;p&gt;You can find out.&lt;/p&gt;

&lt;p&gt;Tag your PRs. There are several reasonable ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PR label.&lt;/strong&gt; Add an &lt;code&gt;ai-assisted&lt;/code&gt; label manually at review time. Lowest overhead, most honest, relies on the author.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PR template checkbox.&lt;/strong&gt; "Did you use AI coding tools in this PR?" as a checkbox that a small bot reads and labels accordingly. Works well for teams with a review culture that already uses templates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commit trailer.&lt;/strong&gt; &lt;code&gt;AI-Assisted: yes&lt;/code&gt; or a &lt;code&gt;Co-authored-by: ...&lt;/code&gt; line pointing at a bot account. Survives rebases and is machine-readable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool-reported attribution.&lt;/strong&gt; Some tooling (&lt;a href="https://github.com/git-ai-project/git-ai" rel="noopener noreferrer"&gt;Git AI's open standard&lt;/a&gt; on Git Notes is a good example) can record which ranges of a diff were model-authored at the source, before the PR is even opened. Heavier setup, higher fidelity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Any of these is fine. The worst option is to defer tagging until you "find the right platform." Pick a convention, write it down, roll it out on Monday.&lt;/p&gt;
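
&lt;p&gt;As a minimal sketch of the commit-trailer option, the snippet below classifies a commit message. It assumes the trailer key is spelled &lt;code&gt;AI-Assisted&lt;/code&gt; and that &lt;code&gt;yes&lt;/code&gt; or &lt;code&gt;true&lt;/code&gt; counts as tagged. Both are conventions you would pick yourself, not a standard, and the function name is hypothetical.&lt;/p&gt;

```python
import re

# Hypothetical convention: a trailer line such as "AI-Assisted: yes"
# in the last paragraph of the commit message.
TRAILER = re.compile(r"^AI-Assisted:\s*(\S+)\s*$", re.IGNORECASE | re.MULTILINE)


def is_ai_assisted(commit_message):
    """Return True when the final paragraph carries an affirmative trailer."""
    paragraphs = commit_message.strip().split("\n\n")
    match = TRAILER.search(paragraphs[-1])
    if match is None:
        return False
    return match.group(1).lower() in {"yes", "true"}


message = "Add retry to billing client\n\nAI-Assisted: yes\nReviewed-by: Sam"
print(is_ai_assisted(message))
```

&lt;p&gt;Only the last paragraph is inspected, so a mention of the key in the body of the message does not count. Git's own &lt;code&gt;git interpret-trailers&lt;/code&gt; command is probably the sturdier way to read trailers in a real pipeline.&lt;/p&gt;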

&lt;p&gt;Once PRs are tagged, split the CFR query two ways:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- AI-assisted PRs&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;changes&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;commit_sha&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;sha&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;ai_assisted_prs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Human-authored PRs (NOT IN matches no rows if ai_assisted_prs.sha contains a NULL)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;changes&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;commit_sha&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;sha&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;ai_assisted_prs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you have two CFRs. Compare them.&lt;/p&gt;
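
&lt;p&gt;If the deployment data is easier to reach from a script than from SQL, the same split can be sketched in Python. The record shape here (&lt;code&gt;sha&lt;/code&gt;, &lt;code&gt;failed&lt;/code&gt;, &lt;code&gt;is_fix_only&lt;/code&gt;) and the function names are assumptions made for illustration, not a schema any tool provides.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    sha: str
    failed: bool        # needed a rollback, hotfix, or fix-forward
    is_fix_only: bool   # remediation deploys are not new changes


def change_failure_rate(changes):
    """Failed changes divided by all changes; fix-only deploys are excluded."""
    counted = [c for c in changes if not c.is_fix_only]
    if not counted:
        return None  # no data is not the same as a 0% failure rate
    return sum(c.failed for c in counted) / len(counted)


def split_cfr(changes, ai_assisted_shas):
    """Return (ai_assisted_cfr, human_cfr) for one measurement window."""
    ai = [c for c in changes if c.sha in ai_assisted_shas]
    human = [c for c in changes if c.sha not in ai_assisted_shas]
    return change_failure_rate(ai), change_failure_rate(human)


window = [
    Change("a1", failed=True, is_fix_only=False),
    Change("a2", failed=False, is_fix_only=False),
    Change("b1", failed=False, is_fix_only=False),
    Change("b2", failed=False, is_fix_only=False),
    Change("b3", failed=True, is_fix_only=True),
]
ai_cfr, human_cfr = split_cfr(window, ai_assisted_shas={"a1", "a2"})
print(ai_cfr, human_cfr)  # 0.5 0.0
```

&lt;p&gt;Both rates should come from the same window, and a gap between them is only as trustworthy as the number of changes on each side.&lt;/p&gt;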

&lt;p&gt;If your AI-assisted CFR is meaningfully higher than your human-authored CFR — and, based on every public benchmark from the last six months, it probably is — you have your own version of the 30% number. Not an industry aggregate. Your team's aggregate, on your codebase, for your definition of failure. That number is the one that actually motivates change.&lt;/p&gt;

&lt;p&gt;It's also a fair number in a way the industry stat isn't. If your AI-assisted CFR is &lt;em&gt;lower&lt;/em&gt; than your human CFR, that tells you something real too — your team has figured out how to use these tools well, and the finding is worth internal publicity.&lt;/p&gt;


&lt;h2&gt;
  
  
  What to do with the number
&lt;/h2&gt;

&lt;p&gt;I wrote most of this in the &lt;a href="https://riftmap.dev/blog/ai-doesnt-understand-blast-radius/" rel="noopener noreferrer"&gt;previous post&lt;/a&gt;, so I'll keep it brief.&lt;/p&gt;

&lt;p&gt;The patterns that reduce CFR at teams I've seen up close are the boring ones. Smaller PRs. Trunk-based development with feature flags instead of long-lived branches. Canary deploys with automatic rollback. Strong ownership over shared infrastructure artifacts. And — the one most teams skip — visibility into the &lt;a href="https://riftmap.dev/blog/infrastructure-dependency-problem/" rel="noopener noreferrer"&gt;cross-repo blast radius&lt;/a&gt; of a change before it merges, so that the review can ask the right question rather than a generic one.&lt;/p&gt;

&lt;p&gt;What doesn't work is adding process layers that slow every change without discriminating by risk. The goal isn't to slow the agents down; it's to route high-blast-radius changes through more scrutiny than low-blast-radius ones.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;How is change failure rate calculated?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CFR equals the number of failed changes divided by the number of changes, over a given time window. A failed change is a production deployment that required remediation — a rollback, hotfix, fix-forward, or patch. Fix-only deployments are excluded from both sides of the ratio because they aren't new changes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is a good change failure rate?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;DORA's 2025 data suggests that about 16.7% of teams achieve a CFR of 4% or lower, which is the elite band. A CFR in the 0–15% range generally indicates a mature delivery process. Above 30% typically points at gaps in testing, release safety, or ownership clarity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Should I include staging or pre-production failures in CFR?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. CFR is a production-only metric by DORA's definition. A canary that catches a bad deploy before it reaches real users is a win, not a failure — counting it penalises the very controls you want teams to invest in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I track AI-assisted code for CFR purposes?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The simplest approach is a PR label or commit trailer that engineers apply at authoring or review time. More sophisticated options include PR templates with a checkbox, bot-applied labels based on known AI-tool user accounts, and tools like the Git AI open standard that record AI-authored diff ranges in Git Notes. Perfect attribution is not required — a consistent convention used by the team is enough to split the metric meaningfully.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How long should my CFR measurement window be?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ninety days is the usual default. Shorter windows (two to four weeks) are too noisy for most teams — a single rough week swings the number. Longer windows (six months or more) smooth out recent changes in your delivery practices and are slow to react to regressions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;A week of work, most of which is data plumbing you probably already have, gets you an honest CFR number and a split between AI-assisted and human-authored changes. That's a better starting point than any aggregate benchmark from any vendor report.&lt;/p&gt;

&lt;p&gt;I'm building &lt;a href="https://riftmap.dev" rel="noopener noreferrer"&gt;Riftmap&lt;/a&gt; to solve the other half of this — giving teams visibility into the cross-repo blast radius of a change &lt;em&gt;before&lt;/em&gt; the CFR number moves. Auto-discovery across Terraform, Docker, CI templates, Helm, Go, npm, Python, Ansible, Kubernetes, and Kustomize. One read-only token. No YAML to maintain.&lt;/p&gt;

&lt;p&gt;If this is familiar territory, reach me at &lt;a href="mailto:daniel@riftmap.dev"&gt;daniel@riftmap.dev&lt;/a&gt;, or try a free scan at &lt;a href="https://app.riftmap.dev" rel="noopener noreferrer"&gt;app.riftmap.dev&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources referenced
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;DORA, &lt;em&gt;Software delivery performance metrics&lt;/em&gt; — &lt;a href="https://dora.dev/guides/dora-metrics-four-keys/" rel="noopener noreferrer"&gt;dora.dev&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Google Cloud / DORA, &lt;em&gt;2025 State of AI-assisted Software Development Report&lt;/em&gt; — &lt;a href="https://dora.dev/research/2025/dora-report/" rel="noopener noreferrer"&gt;dora.dev&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Cortex, &lt;em&gt;Engineering in the Age of AI: 2026 Benchmark Report&lt;/em&gt; — &lt;a href="https://www.cortex.io/report/engineering-in-the-age-of-ai-2026-benchmark-report" rel="noopener noreferrer"&gt;cortex.io&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Swarmia, &lt;em&gt;DORA change failure rate — what, why, and how&lt;/em&gt; — &lt;a href="https://www.swarmia.com/blog/dora-change-failure-rate/" rel="noopener noreferrer"&gt;swarmia.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Git AI open standard for AI authorship attribution via Git Notes — &lt;a href="https://github.com/git-ai-project/git-ai" rel="noopener noreferrer"&gt;github.com/git-ai-project&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Related reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://riftmap.dev/blog/ai-doesnt-understand-blast-radius/" rel="noopener noreferrer"&gt;AI Doesn't Understand Blast Radius&lt;/a&gt; — Why change failure rates are up 30% and what's structurally driving it.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://riftmap.dev/blog/ai-coding-agents-need-cross-repo-context/" rel="noopener noreferrer"&gt;AI coding agents need cross-repo context&lt;/a&gt; — What teams running AI coding agents at scale are publishing about the missing dependency substrate.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://riftmap.dev/blog/meta-tribal-knowledge-engine-build-the-graph-first/" rel="noopener noreferrer"&gt;Meta needed 50+ AI agents to map their tribal knowledge&lt;/a&gt; — How a 50-agent system quietly rests on a single graph index that does the heavy lifting.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dorametrics</category>
      <category>changefailurerate</category>
      <category>platformengineering</category>
      <category>aicoding</category>
    </item>
    <item>
      <title>How to Find Every Consumer of Your Internal Python Package</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Mon, 29 Jun 2026 12:18:12 +0000</pubDate>
      <link>https://dev.to/danielwe/how-to-find-every-consumer-of-your-internal-python-package-3egp</link>
      <guid>https://dev.to/danielwe/how-to-find-every-consumer-of-your-internal-python-package-3egp</guid>
      <description>&lt;p&gt;&lt;em&gt;You maintain an internal Python package on a private index. You need to change its API. Which repos across the org depend on it, and at which version? The public Python ecosystem has an answer to that question. The moment you move the package onto your own index, everything that knows the answer is looking somewhere your package never appears.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;npm puts a Dependents tab on the registry page for every public package. PyPI has nothing of the sort. Open the project page for &lt;code&gt;requests&lt;/code&gt; or &lt;code&gt;flask&lt;/code&gt; and there is no reverse-dependency view, no list of what builds on top of it, no count. What answers the question for public packages is third-party and sits beside the index rather than inside it: Google's &lt;a href="https://deps.dev/" rel="noopener noreferrer"&gt;deps.dev&lt;/a&gt; and &lt;a href="https://libraries.io/" rel="noopener noreferrer"&gt;libraries.io&lt;/a&gt;, both of which crawl the public index and will show you who depends on a given package.&lt;/p&gt;

&lt;p&gt;Now make the package yours. Rename it from &lt;code&gt;confparse&lt;/code&gt; to &lt;code&gt;yourco-config&lt;/code&gt;, set it to private, and publish it to AWS CodeArtifact or the GitLab PyPI registry instead of &lt;code&gt;pypi.org&lt;/code&gt;. deps.dev and libraries.io go dark immediately, because they crawl the public index and your package is not on it. pip has nothing to offer either. &lt;code&gt;pip show yourco-config&lt;/code&gt; lists a "Required-by" field, but it only reflects what is installed in the environment you happen to run it in, and &lt;a href="https://github.com/pypa/pip/issues/4968" rel="noopener noreferrer"&gt;pip has had an open request for a real reverse-dependency command for years&lt;/a&gt;. Dependabot and Renovate know implicitly who depends on what, because they are configured per repo, but they are updaters, not mappers, and only where they are switched on.&lt;/p&gt;

&lt;p&gt;There is a second gap underneath the first, and it is worth sitting with. Even for a public package, the dependents that deps.dev and libraries.io can show you are mostly &lt;em&gt;other published packages&lt;/em&gt;, because a published package is what an index crawler can see. The things consuming your internal library are overwhelmingly applications. Services, pipelines, DAG repos, batch jobs. None of those are published to any index, so they would not appear as dependents even if your package were public. So the answer exists for the packages that cannot hurt you, and is missing for the one that can. The shared client every service imports, the config package forty repos pull in, the feature library the whole ML platform builds on. The one whose breaking change is your problem is precisely the one with no consumer view at all. This post is about getting that view back, and about why Python makes it genuinely harder than the other ecosystems in &lt;a href="https://riftmap.dev/blog/series/find-every-consumer/" rel="noopener noreferrer"&gt;this series&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The scenario
&lt;/h2&gt;

&lt;p&gt;Your platform team, or your ML-platform team, publishes a package. Maybe it is &lt;code&gt;yourco-clients&lt;/code&gt;, a generated client for your internal APIs that half the services import. Maybe it is &lt;code&gt;yourco-observability&lt;/code&gt;, the structured-logging and trace-propagation library every service is supposed to use. Maybe it is &lt;code&gt;yourco-config&lt;/code&gt;, a thin package that standardises settings loading so nobody hand-rolls it. Maybe it is &lt;code&gt;yourco-features&lt;/code&gt;, a shared feature-store and data-access layer the whole ML org builds on.&lt;/p&gt;

&lt;p&gt;It started as a way to stop copy-pasting. A few repos adopted it. Then more. And here is where Python diverges from every other post in this series, immediately, before we even get to the hard part. There is no single place a consumer declares the dependency. There is barely a single &lt;em&gt;format&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;One service declares it in &lt;code&gt;requirements.txt&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;yourco-clients==2.4.1
yourco-observability~=1.7
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Another uses the modern standard, &lt;a href="https://peps.python.org/pep-0621/" rel="noopener noreferrer"&gt;PEP 621&lt;/a&gt; dependencies in &lt;code&gt;pyproject.toml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[project]&lt;/span&gt;
&lt;span class="py"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"checkout-service"&lt;/span&gt;
&lt;span class="py"&gt;dependencies&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="py"&gt;"yourco-clients&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"yourco-observability~&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.7&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nn"&gt;[project.optional-dependencies]&lt;/span&gt;
&lt;span class="py"&gt;dev&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="py"&gt;["yourco-testtools&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="s"&gt;"]&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A third is on Poetry, which until recently used its own table with its own syntax. The caret is not a PEP 440 operator; its resolution semantics are Poetry's own:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[tool.poetry.dependencies]&lt;/span&gt;
&lt;span class="py"&gt;python&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"^3.11"&lt;/span&gt;
&lt;span class="py"&gt;yourco-clients&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"^2.4.1"&lt;/span&gt;
&lt;span class="py"&gt;yourco-observability&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"~1.7"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A fourth predates all of that and declares its dependencies in &lt;code&gt;setup.py&lt;/code&gt;, in arbitrary Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;setup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reporting-service&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;install_requires&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;yourco-clients&amp;gt;=2.4,&amp;lt;3.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;yourco-observability~=1.7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A fifth never stood up a private index at all and pulls your code straight from git:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-e git+https://gitlab.yourco.com/platform/yourco-clients.git@v2.4.1#egg=yourco-clients
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And because the package is private, every consumer that does use the index carries routing config that points at it, the way an &lt;code&gt;.npmrc&lt;/code&gt; does for a scoped npm package. In Python that lives in &lt;code&gt;pip.conf&lt;/code&gt; (often with credentials in a &lt;code&gt;.netrc&lt;/code&gt;), or a Poetry source, or a uv index table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="c"&gt;# pip.conf
&lt;/span&gt;&lt;span class="nn"&gt;[global]&lt;/span&gt;
&lt;span class="py"&gt;index-url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;https://pypi.org/simple&lt;/span&gt;
&lt;span class="py"&gt;extra-index-url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;https://gitlab.yourco.com/api/v4/projects/42/packages/pypi/simple&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Twenty repos adopted the package, across four or five of these mechanisms. Then you stopped counting, because nothing in your toolchain counts for you. Now you need to change it. Drop a parameter, rename an export, cut a major. The question is the one that runs through &lt;a href="https://riftmap.dev/blog/series/find-every-consumer/" rel="noopener noreferrer"&gt;every post in this series&lt;/a&gt;: &lt;strong&gt;which repos across the org depend on this package, at which version, and which of them break when I publish?&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The change you ship without shipping it
&lt;/h2&gt;

&lt;p&gt;Before the tooling, here is the part that makes this sharper than it first looks. It is more acute in Python than in the npm version of &lt;a href="https://riftmap.dev/blog/how-to-find-every-consumer-of-your-internal-npm-package/" rel="noopener noreferrer"&gt;this same argument&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A loose constraint is a standing instruction to adopt your next release. A consumer on &lt;code&gt;yourco-clients&amp;gt;=2.4&lt;/code&gt; is not pinned. They will take whatever the newest version is the next time their environment is resolved fresh. The &lt;a href="https://peps.python.org/pep-0440/" rel="noopener noreferrer"&gt;PEP 440 compatible-release operator&lt;/a&gt;, &lt;code&gt;~=1.7&lt;/code&gt;, is the same thing inside a band: it means &lt;code&gt;&amp;gt;=1.7, &amp;lt;2.0&lt;/code&gt;, so every 1.x you publish is a candidate. Poetry's &lt;code&gt;^2.4.1&lt;/code&gt; resolves to &lt;code&gt;&amp;gt;=2.4.1, &amp;lt;3.0.0&lt;/code&gt;, which is a subscription to every minor and patch you ship in the 2.x line.&lt;/p&gt;
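
&lt;p&gt;To make those bands concrete, here is a rough sketch that expands a compatible-release pin and a Poetry caret into lower and upper bounds, then checks whether a version you are about to publish falls inside. It handles plain dotted release numbers only (no pre-releases, epochs, or wildcards), and every function name is hypothetical. A real scanner would more likely lean on the &lt;code&gt;packaging&lt;/code&gt; library for the PEP 440 side.&lt;/p&gt;

```python
from operator import le, lt


def parse(version):
    """Turn "2.4.1" into (2, 4, 1); pre-releases and epochs are not handled."""
    return tuple(int(part) for part in version.split("."))


def compatible_release_bounds(version):
    """PEP 440 "~=": drop the last segment, bump the one before it."""
    parts = parse(version)
    if len(parts) == 1:
        raise ValueError("~= needs at least two release segments")
    return parts, parts[:-2] + (parts[-2] + 1,)


def caret_bounds(version):
    """Poetry "^": bump the leftmost non-zero segment."""
    parts = parse(version)
    for index, segment in enumerate(parts):
        if segment != 0:
            return parts, parts[:index] + (segment + 1,)
    return parts, parts[:-1] + (parts[-1] + 1,)


def admits(bounds, candidate):
    """True when candidate is at or above the lower bound and below the upper."""
    lower, upper = bounds
    version = parse(candidate)
    width = max(len(lower), len(upper), len(version))

    def pad(parts):
        # Treat 1.7 and 1.7.0 as the same release.
        return parts + (0,) * (width - len(parts))

    return le(pad(lower), pad(version)) and lt(pad(version), pad(upper))


print(admits(compatible_release_bounds("1.7"), "1.9.3"))  # True
print(admits(caret_bounds("2.4.1"), "2.5.0"))             # True
print(admits(caret_bounds("2.4.1"), "3.0.0"))             # False
```

&lt;p&gt;The first two results are the subscription described above: a consumer on either constraint picks up the new release the next time its environment resolves.&lt;/p&gt;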

&lt;p&gt;Python makes this land more easily than npm does, for one structural reason. A very large number of Python repos have no committed lockfile. They have a loose &lt;code&gt;requirements.txt&lt;/code&gt; that gets &lt;code&gt;pip install&lt;/code&gt;-ed during a Docker build, on every build, against the live index. There is no &lt;code&gt;poetry.lock&lt;/code&gt; or &lt;code&gt;uv.lock&lt;/code&gt; holding the line. So the resolution is not a one-time event that someone reviews in a pull request. It happens every time the image is rebuilt, silently, on the consumer's schedule rather than yours. You did not roll out your 2.5.0. They did, the next time CI ran, and the first you hear of a regression is somebody else's red pipeline.&lt;/p&gt;

&lt;p&gt;This is not a fringe worry. It is being actively argued about right now. uv, the fast-rising resolver, &lt;a href="https://www.loopwerk.io/articles/2026/uv-ux-mess/" rel="noopener noreferrer"&gt;defaults to constraints with no upper bound&lt;/a&gt;, which means &lt;code&gt;uv lock --upgrade&lt;/code&gt; will happily pull a breaking major across every transitive dependency. The friction of that has pushed uv to add a &lt;code&gt;--bounds&lt;/code&gt; option so &lt;code&gt;uv add&lt;/code&gt; can produce a safer &lt;code&gt;&amp;gt;=2.13.4,&amp;lt;3.0.0&lt;/code&gt;. The community has not settled on how tight constraints should be. While that argument runs, your consumers are scattered across every position on the spectrum, and you cannot see which.&lt;/p&gt;

&lt;p&gt;The reverse case is just as bad in the other direction. When you do the honest thing and cut a genuine breaking change as a major, &lt;code&gt;2.x&lt;/code&gt; to &lt;code&gt;3.0.0&lt;/code&gt;, a &lt;code&gt;&amp;lt;3.0&lt;/code&gt; pin or a Poetry caret correctly refuses to follow. That is the right behaviour. It also leaves you with a long tail of repos stranded on the old major, indefinitely, with no list of who they are. You cannot deprecate 2.x because you cannot see who is still on it.&lt;/p&gt;

&lt;p&gt;Either way the constraint is the mechanism, and the constraint is exactly what a quick search across your repos cannot evaluate. You need to know who consumes the package and how their constraint relates to what you are about to publish. Both halves of that live in files most audits never open, in formats that few scripts parse in full.&lt;/p&gt;

&lt;h2&gt;
  
  
  What existing tools give you (and where they stop)
&lt;/h2&gt;

&lt;p&gt;I want to be fair to the options, because several are genuinely useful for the slice they cover, and I reach for some of them myself.&lt;/p&gt;

&lt;h3&gt;
  
  
  PyPI, deps.dev, libraries.io, GitHub dependents
&lt;/h3&gt;

&lt;p&gt;For public packages, deps.dev and libraries.io are the right tools, and I would point you straight at them. GitHub's dependency graph adds a "Used by" panel for repositories that publish a package, though &lt;a href="https://docs.github.com/en/code-security/supply-chain-security/understanding-your-software-supply-chain/exploring-the-dependencies-of-a-repository" rel="noopener noreferrer"&gt;its own documentation calls the dependent counts approximate&lt;/a&gt;. The structural problem is not that any of these are bad. It is that they are properties of the &lt;em&gt;public&lt;/em&gt; index. A private package is access-controlled by design, served from your own registry behind a token, and never indexed by anything that crawls &lt;code&gt;pypi.org&lt;/code&gt;. The same access control that keeps your code off the public internet keeps it off every public consumer graph. There is nothing to fix here. The data is unreachable, on purpose. And as above, even for public packages these views count published packages far better than they count the unpublished services that are usually your real consumers.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;pip show&lt;/code&gt;, &lt;code&gt;pipdeptree&lt;/code&gt;, and the reverse-dependency tools
&lt;/h3&gt;

&lt;p&gt;These do answer a reverse question, and people reach for them first. &lt;code&gt;pip show yourco-clients&lt;/code&gt; lists a "Required-by" field. &lt;a href="https://pypi.org/project/pipdeptree/" rel="noopener noreferrer"&gt;pipdeptree&lt;/a&gt; and &lt;a href="https://pypi.org/project/deptree/" rel="noopener noreferrer"&gt;deptree&lt;/a&gt; will invert the tree and show you dependents with &lt;code&gt;-r&lt;/code&gt;. They are the right tools for "what in &lt;em&gt;this environment&lt;/em&gt; depends on this."&lt;/p&gt;

&lt;p&gt;But they operate on one installed environment at a time, outward from whatever happens to be in that virtualenv. They cannot tell you which &lt;em&gt;other repos&lt;/em&gt; in the org depend on your package. There is no index-side reverse query to ask, either. &lt;a href="https://github.com/pypa/pip/issues/4968" rel="noopener noreferrer"&gt;pip has had an open request for a reverse-dependency command for years&lt;/a&gt;, and the standing workaround is a script that walks installed distributions. To build the org-wide view you would clone every repo, create a clean environment in each, install, run &lt;code&gt;pipdeptree -r&lt;/code&gt;, and aggregate the output yourself. By the time you finished, the resolutions you installed from would have moved.&lt;/p&gt;
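
&lt;p&gt;For reference, a script that walks installed distributions looks roughly like the sketch below. It uses only the standard library's &lt;code&gt;importlib.metadata&lt;/code&gt;, and the helper names are made up for this example. It has the limit described above: it sees the one environment it runs in and nothing else.&lt;/p&gt;

```python
import re
from importlib import metadata

# A requirement string starts with the project name, followed by extras,
# a version specifier, or an environment marker.
NAME_AT_START = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def normalise(name):
    """PEP 503 normalisation: runs of -, _ and . collapse to one hyphen."""
    return re.sub(r"[-_.]+", "-", name).lower()


def required_by(package):
    """Installed distributions in this environment that require the package."""
    target = normalise(package)
    dependents = set()
    for dist in metadata.distributions():
        dist_name = dist.metadata.get("Name")
        if not dist_name:
            continue  # skip distributions with broken metadata
        for requirement in dist.requires or []:
            match = NAME_AT_START.match(requirement)
            if match and normalise(match.group(0)) == target:
                dependents.add(dist_name)
    return sorted(dependents)


print(required_by("yourco-clients"))
```

&lt;p&gt;Run inside a service's virtualenv, this lists the installed distributions that require &lt;code&gt;yourco-clients&lt;/code&gt;. It says nothing about the repo next door.&lt;/p&gt;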

&lt;h3&gt;
  
  
  The private index itself
&lt;/h3&gt;

&lt;p&gt;This is the one people assume covers them, because the index is the thing all the packages flow through. AWS CodeArtifact, JFrog Artifactory, Sonatype Nexus, the GitLab PyPI registry, &lt;a href="https://pypi.org/project/devpi-server/" rel="noopener noreferrer"&gt;devpi&lt;/a&gt;, GemFury. They host your private packages, cache the public ones, and serve both from one endpoint behind auth.&lt;/p&gt;

&lt;p&gt;They are very good at it. What none of them model is consumption at the source level. The index records that some authenticated client downloaded &lt;code&gt;yourco-clients 2.4.1&lt;/code&gt;. It does not record which repo's &lt;code&gt;pyproject.toml&lt;/code&gt; declared the dependency, which team's CI pipeline the install ran in, or whether the thing that pulled it was a service you care about or a throwaway branch. It is a distribution and caching layer, not a consumption graph. This is the same gap I described for &lt;a href="https://riftmap.dev/blog/how-to-find-every-consumer-of-your-go-module/" rel="noopener noreferrer"&gt;internal Go module proxies in the Go edition&lt;/a&gt;: a proxy logs fetches, not the manifest that triggered them. The download event is not the dependency edge.&lt;/p&gt;

&lt;p&gt;There is one fair exception worth naming. Some registry products, &lt;a href="https://blog.inedo.com/python/package-dependencies" rel="noopener noreferrer"&gt;ProGet among them&lt;/a&gt;, do surface a consumer view, listing applications by name and version against a package. That is closer than most, and if you run one, use it. But it sees consumption that flows through &lt;em&gt;that&lt;/em&gt; registry, of packages &lt;em&gt;it&lt;/em&gt; hosts. It does not read the source manifest in every repo regardless of which index they use, and it does not see the git-ref consumption that never touches a registry at all. The next section is mostly a list of the consumption that escapes a registry-centred view.&lt;/p&gt;

&lt;h3&gt;
  
  
  Renovate and Dependabot
&lt;/h3&gt;

&lt;p&gt;Both support Python as a first-class ecosystem, including private indexes once you give them credentials, across &lt;code&gt;requirements.txt&lt;/code&gt;, Poetry, Pipfile, and &lt;code&gt;setup.py&lt;/code&gt;. Because they are configured per consumer, they implicitly know which repos depend on what, and they will open pull requests to bump your package when you publish. As with &lt;a href="https://riftmap.dev/blog/how-to-find-every-consumer-of-your-terraform-module/" rel="noopener noreferrer"&gt;Terraform modules&lt;/a&gt; and the rest of the series, the knowledge is in there.&lt;/p&gt;

&lt;p&gt;But they are updaters, not mappers. There is no org-level "show me every repo that depends on &lt;code&gt;yourco-clients&lt;/code&gt;, and what constraint each one declares" view to query. They react to new versions going out. The question you have &lt;em&gt;before&lt;/em&gt; you publish a breaking one, who is currently consuming the old version and how, is not something either tool surfaces. And both only cover repos where they have been switched on for your private index. A team that never configured private-index auth in their Renovate config is simply invisible.&lt;/p&gt;

&lt;h3&gt;
  
  
  Code search, and the script
&lt;/h3&gt;

&lt;p&gt;You can search your GitHub org or GitLab group for the package name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;org:yourco "yourco-clients"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a one-off audit, fine. It finds files that mention the string and gives you a starting list of repos. Then the familiar problems land all at once, and in Python they land harder. It returns the declared constraint, not the installed version. It will not normalise &lt;code&gt;yourco_clients&lt;/code&gt; and &lt;code&gt;yourco-clients&lt;/code&gt; to the same project. It misses a consumer that pulled the package over a bare &lt;code&gt;git+https&lt;/code&gt; URL, which names the repo rather than the package. And the code-search index lags your most recent commits.&lt;/p&gt;

&lt;p&gt;So someone writes the script. Enumerate every repo, fetch every &lt;code&gt;requirements*.txt&lt;/code&gt;, &lt;code&gt;pyproject.toml&lt;/code&gt;, &lt;code&gt;setup.py&lt;/code&gt;, &lt;code&gt;setup.cfg&lt;/code&gt;, &lt;code&gt;Pipfile&lt;/code&gt;, and &lt;code&gt;environment.yml&lt;/code&gt;, parse all of them, handle three &lt;code&gt;pyproject.toml&lt;/code&gt; dialects and arbitrary &lt;code&gt;setup.py&lt;/code&gt; code, normalise names, evaluate PEP 440 specifiers, and run it on a schedule. People build exactly this. The clearest example is &lt;a href="https://pypi.org/project/all-repos-depends/" rel="noopener noreferrer"&gt;&lt;code&gt;all-repos-depends&lt;/code&gt;&lt;/a&gt;, a real org-scanner whose providers read the &lt;code&gt;setup.py&lt;/code&gt; AST for the package name and &lt;code&gt;install_requires&lt;/code&gt; and parse the requirements-file conventions. That this keeps getting independently rebuilt is the strongest evidence the question matters. The tool is also, tellingly, honest about its own limit: it can only read a &lt;code&gt;setup.py&lt;/code&gt; that sets its name literally. That is one of the corner cases below, and the tool ran straight into it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is harder than it looks
&lt;/h2&gt;

&lt;p&gt;A naive search for the package name both overcounts and undercounts, because Python dependency consumption is not one fact in one place. It is spread across constructs that each behave differently, and Python has more of them than any other ecosystem in this series.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;There is no single manifest, and the formats disagree.&lt;/strong&gt; Go has one canonical &lt;code&gt;go.mod&lt;/code&gt;. npm has one canonical &lt;code&gt;package.json&lt;/code&gt;. Python has at least six families, and once you count the dialects, closer to nine distinct shapes a scanner has to handle: &lt;code&gt;requirements.txt&lt;/code&gt; and its &lt;code&gt;requirements-*.txt&lt;/code&gt; siblings, split &lt;code&gt;requirements/&lt;/code&gt; trees and pip-tools &lt;code&gt;.in&lt;/code&gt; inputs, &lt;code&gt;setup.py&lt;/code&gt; &lt;code&gt;install_requires&lt;/code&gt;, declarative &lt;code&gt;setup.cfg&lt;/code&gt;, &lt;code&gt;pyproject.toml&lt;/code&gt; in three different dialects (PEP 621 &lt;code&gt;[project]&lt;/code&gt;, Poetry's &lt;code&gt;[tool.poetry]&lt;/code&gt; with its groups, and the &lt;a href="https://peps.python.org/pep-0735/" rel="noopener noreferrer"&gt;PEP 735&lt;/a&gt; &lt;code&gt;[dependency-groups]&lt;/code&gt; that uv and recent pip understand), &lt;code&gt;Pipfile&lt;/code&gt;, and conda's &lt;code&gt;environment.yml&lt;/code&gt; with its nested &lt;code&gt;pip:&lt;/code&gt; block. &lt;a href="https://peps.python.org/pep-0723/" rel="noopener noreferrer"&gt;PEP 723 even added inline dependencies inside a single &lt;code&gt;.py&lt;/code&gt; script&lt;/a&gt;, so the surface is still growing. The version grammar is not even shared: a Poetry caret is not a PEP 440 operator. And the declared constraint and the installed version are different facts living in different places, with the resolved version buried in whichever of &lt;code&gt;poetry.lock&lt;/code&gt;, &lt;code&gt;Pipfile.lock&lt;/code&gt;, &lt;code&gt;uv.lock&lt;/code&gt;, or a pip-compiled &lt;code&gt;requirements.txt&lt;/code&gt; the repo happens to use. Real orgs mix all of this. To find every consumer you have to read all of it, and reconcile it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The distribution name and the import name are different facts.&lt;/strong&gt; This one is pure Python. You &lt;code&gt;pip install scikit-learn&lt;/code&gt; and &lt;code&gt;import sklearn&lt;/code&gt;. &lt;code&gt;PyYAML&lt;/code&gt; imports as &lt;code&gt;yaml&lt;/code&gt;, &lt;code&gt;beautifulsoup4&lt;/code&gt; as &lt;code&gt;bs4&lt;/code&gt;, &lt;code&gt;opencv-python&lt;/code&gt; as &lt;code&gt;cv2&lt;/code&gt;. Your &lt;code&gt;yourco-data-clients&lt;/code&gt; might &lt;code&gt;import yourco_data&lt;/code&gt;. To bind a &lt;em&gt;declared&lt;/em&gt; dependency you match the &lt;a href="https://peps.python.org/pep-0503/" rel="noopener noreferrer"&gt;PEP 503 normalised name&lt;/a&gt;, which lowercases and collapses any run of &lt;code&gt;.&lt;/code&gt;, &lt;code&gt;-&lt;/code&gt;, or &lt;code&gt;_&lt;/code&gt; to a single &lt;code&gt;-&lt;/code&gt;, so &lt;code&gt;Yourco.Clients&lt;/code&gt;, &lt;code&gt;yourco_clients&lt;/code&gt;, and &lt;code&gt;yourco-clients&lt;/code&gt; are one project. A grep treats them as three. The confusion is real enough that scikit-learn ships a &lt;a href="https://github.com/scikit-learn/sklearn-pypi-package" rel="noopener noreferrer"&gt;defensive &lt;code&gt;sklearn&lt;/code&gt; shim on PyPI purely to stop people and tools getting it wrong&lt;/a&gt;, and that shim's own remediation advice is, word for word, a find-every-consumer task: track down which packages declare &lt;code&gt;sklearn&lt;/code&gt; instead of &lt;code&gt;scikit-learn&lt;/code&gt;. It is enough of a trap that &lt;a href="https://gitlab.com/gitlab-org/gitlab/-/issues/440391" rel="noopener noreferrer"&gt;GitLab's own SBOM scanner normalised names by the wrong rule and produced incorrect dependency results&lt;/a&gt;. That normalised &lt;em&gt;distribution&lt;/em&gt; name is what a manifest declares and what you match on. 
The &lt;em&gt;import&lt;/em&gt; name, the thing that actually appears in &lt;code&gt;import&lt;/code&gt; statements, is a different layer again, and it is the one a symbol graph lives in. More on that distinction when we get to the limits.&lt;/p&gt;
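
&lt;p&gt;To make the normalisation rule concrete, here is a minimal sketch using only the standard library. The regular expression is the one PEP 503 itself gives. The list of spellings is made up for illustration.&lt;/p&gt;

```python
import re


def normalise(name):
    """PEP 503: lowercase, and collapse any run of '-', '_' or '.' to one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Hypothetical spellings of one project, as they might appear across manifests.
spellings = ["Yourco.Clients", "yourco_clients", "yourco-clients", "YOURCO__clients"]

# A plain string match sees four distinct names...
distinct_raw = set(spellings)

# ...while the normalised form binds all of them to one project.
distinct_normalised = {normalise(s) for s in spellings}

print(len(distinct_raw), distinct_normalised)
```

&lt;p&gt;Note what the function does not do: it will never map &lt;code&gt;sklearn&lt;/code&gt; to &lt;code&gt;scikit-learn&lt;/code&gt;. Normalisation reconciles spellings of a distribution name. It says nothing about import names.&lt;/p&gt;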

&lt;p&gt;&lt;strong&gt;&lt;code&gt;setup.py&lt;/code&gt; is code, not data.&lt;/strong&gt; &lt;code&gt;package.json&lt;/code&gt; is JSON and &lt;code&gt;go.mod&lt;/code&gt; has a defined grammar, so a parser can trust them. &lt;code&gt;setup.py&lt;/code&gt; is a Python script, and &lt;code&gt;install_requires&lt;/code&gt; can be a literal list, or read from a file, or assembled in a loop, or gated on markers computed at runtime. The literal case is statically parseable from the AST. The dynamically constructed case is not knowable without executing untrusted code, which no scanner should do. &lt;code&gt;setup.cfg&lt;/code&gt; and &lt;code&gt;pyproject.toml&lt;/code&gt; are declarative and parse cleanly, so the holdout is specifically the older &lt;code&gt;setup.py&lt;/code&gt; repos, and even the literal case is a best-effort heuristic rather than a guarantee. This is not a footnote you have to take on faith: it shows up in the consumer view as a lower confidence score on &lt;code&gt;setup.py&lt;/code&gt; rows than on the declarative ones. &lt;code&gt;all-repos-depends&lt;/code&gt; hit exactly this wall and drew exactly this line.&lt;/p&gt;
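
&lt;p&gt;A rough sketch of where that line falls, using the standard-library &lt;code&gt;ast&lt;/code&gt; module. The two &lt;code&gt;setup.py&lt;/code&gt; bodies and the function name are invented for illustration, and this is not how Riftmap or &lt;code&gt;all-repos-depends&lt;/code&gt; is implemented. It only shows that the literal case can be read without running anything, and the dynamic case cannot.&lt;/p&gt;

```python
import ast

# Hypothetical setup.py with a literal dependency list.
LITERAL_SETUP_PY = '''
from setuptools import setup

setup(
    name="billing-worker",
    install_requires=["yourco-clients>=2.4", "requests"],
)
'''

# Hypothetical setup.py that builds the list at runtime.
DYNAMIC_SETUP_PY = '''
from setuptools import setup

setup(
    name="legacy-etl",
    install_requires=open("requirements.txt").read().splitlines(),
)
'''


def literal_install_requires(source):
    """Return install_requires if it is a literal list, else None.

    The source is parsed into an AST and never executed.
    """
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Matches both setup(...) and setuptools.setup(...)
        called = getattr(node.func, "id", None) or getattr(node.func, "attr", None)
        if called != "setup":
            continue
        for keyword in node.keywords:
            if keyword.arg == "install_requires":
                try:
                    return ast.literal_eval(keyword.value)
                except (ValueError, TypeError, SyntaxError):
                    return None  # assembled at runtime: not statically knowable
    return None


print(literal_install_requires(LITERAL_SETUP_PY))
print(literal_install_requires(DYNAMIC_SETUP_PY))
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; instead of guessing is the design choice that matters here. A scanner that reports "unknown" for the dynamic case can score that row lower. One that executes the file to find out is running untrusted code.&lt;/p&gt;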

&lt;p&gt;&lt;strong&gt;The same name is not always your package.&lt;/strong&gt; A repo whose index routing is wrong, or missing, can resolve a &lt;em&gt;public&lt;/em&gt; package that happens to share your internal name, which looks identical in the manifest and is not your code. This is not hypothetical: &lt;a href="https://pip.pypa.io/en/stable/cli/pip_install/" rel="noopener noreferrer"&gt;pip's own documentation warns that &lt;code&gt;--extra-index-url&lt;/code&gt; is unsafe precisely because a public index can serve a package with the same name as your private one&lt;/a&gt;, the dependency-confusion problem. So the name in a manifest is a claim, not proof. Binding the name to your repo safely means resolving it to the in-org repo that actually produces that package, not assuming every matching string is yours. A name nothing in your org produces is an external dependency, not a consumer of your code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The value is not always a version, and git references travel without an index.&lt;/strong&gt; A dependency's value can carry the real target, and for internal Python this is how a great deal of code travels without anyone standing up a private index at all. &lt;a href="https://peps.python.org/pep-0508/" rel="noopener noreferrer"&gt;PEP 508 direct references&lt;/a&gt;, &lt;code&gt;yourco-clients @ git+https://github.com/yourco/clients.git@v2.4.1&lt;/code&gt;, resolve straight from a repo. So does &lt;code&gt;-e git+https://...#egg=yourco-clients&lt;/code&gt;, a bare &lt;code&gt;git+...&lt;/code&gt; line, a Poetry &lt;code&gt;{ git = ... }&lt;/code&gt; source, a Pipfile &lt;code&gt;{ git = ..., ref = ... }&lt;/code&gt;, and a &lt;code&gt;git+&lt;/code&gt; line inside a conda &lt;code&gt;pip:&lt;/code&gt; block. Every one of those points at a repo, with the committish standing in for the version. A purely local &lt;code&gt;./libs/shared&lt;/code&gt; or a &lt;code&gt;file:&lt;/code&gt; install carries no cross-repo signal, and a plain wheel or sdist URL is not a repo edge either. So the honest split is: git references are first-class consumers and resolve to a repo, while local paths and non-git URLs are not cross-repo edges at all. A scanner that reads only registry-style names misses the entire git-sourced half of how internal Python is consumed.&lt;/p&gt;
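
&lt;p&gt;The split above can be sketched as a small classifier over requirement lines. This is a simplification under stated assumptions: it handles only pip-style lines (not the Poetry or Pipfile table forms), it expects the spaced &lt;code&gt;name @ url&lt;/code&gt; form of a direct reference, and the function name and return shape are invented for illustration.&lt;/p&gt;

```python
from urllib.parse import urlsplit


def classify(line):
    """Classify one pip-style requirement line.

    Returns (kind, host, repo, ref). Only kind == "git" is a cross-repo edge.
    """
    spec = line.strip()
    if spec.startswith("-e "):
        spec = spec[3:].strip()
    # PEP 508 direct reference: "name @ url" (spaced form only in this sketch)
    if " @ " in spec:
        spec = spec.split(" @ ", 1)[1].strip()
    if spec.startswith("git+"):
        url = urlsplit(spec[len("git+"):])
        # The committish, if any, follows an "@" at the end of the path.
        path, _, ref = url.path.partition("@")
        repo = path.strip("/")
        if repo.endswith(".git"):
            repo = repo[: -len(".git")]
        return ("git", url.hostname, repo, ref or None)
    if spec.startswith(("./", "../", "/", "file:")):
        return ("local", None, None, None)
    if spec.startswith(("http://", "https://")):
        return ("archive-url", None, None, None)
    return ("registry", None, None, None)


for example in [
    "yourco-clients @ git+https://github.com/yourco/clients.git@v2.4.1",
    "-e git+https://gitlab.yourco.com/platform/yourco-clients.git#egg=yourco-clients",
    "./libs/shared",
    "yourco-clients>=2.4",
]:
    print(classify(example))
```

&lt;p&gt;The host and repo path are what a scanner would then match against the repos it actually sees in the org. A git URL pointing outside the org is an external dependency, the same as an unmatched registry name.&lt;/p&gt;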

&lt;p&gt;&lt;strong&gt;&lt;code&gt;constraints.txt&lt;/code&gt; and includes mean the line you grep is not always the effective version.&lt;/strong&gt; A repo can declare &lt;code&gt;yourco-clients&amp;gt;=2.4&lt;/code&gt; in &lt;code&gt;requirements.in&lt;/code&gt; and then pin it hard via a global &lt;code&gt;-c constraints.txt&lt;/code&gt; that says &lt;code&gt;yourco-clients==2.4.1&lt;/code&gt;. Or chain &lt;code&gt;-r requirements/base.txt&lt;/code&gt; so the real dependency list is assembled across several files. The line you grep is the declared constraint. The effective version, after a constraints overlay, is closer to a lock. For the question this post is about, who adopts your next release, the declared constraint in the dependency line is the load-bearing fact, and the constraints-pinned version is the adjacent, lockfile-shaped question. They are different facts, and a search that reads one file in isolation cannot tell which it is looking at.&lt;/p&gt;
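
&lt;p&gt;Here is a minimal sketch of keeping those two facts apart while following &lt;code&gt;-r&lt;/code&gt; and &lt;code&gt;-c&lt;/code&gt; lines. The file contents are hypothetical and held in a dict so the example is self-contained. It handles only the spaced forms (&lt;code&gt;-r file&lt;/code&gt;, &lt;code&gt;-c file&lt;/code&gt;) and ignores every other pip option.&lt;/p&gt;

```python
import posixpath
import re

# Hypothetical repo contents, keyed by path.
FILES = {
    "requirements.txt": "-r requirements/base.txt\n-c constraints.txt\nflask\n",
    "requirements/base.txt": "yourco-clients>=2.4  # internal client\nrequests\n",
    "constraints.txt": "yourco-clients==2.4.1\n",
}


def normalise(name):
    """PEP 503 name normalisation."""
    return re.sub(r"[-_.]+", "-", name).lower()


def collect(path, files, declared, pinned, as_constraint=False):
    """Walk a requirements file, recording declared and constraint-pinned specs."""
    for raw in files[path].splitlines():
        line = raw.split(" #")[0].strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in ("-r", "--requirement", "-c", "--constraint"):
            # pip resolves nested files relative to the file that includes them
            target = posixpath.normpath(posixpath.join(posixpath.dirname(path), parts[1]))
            is_constraint = as_constraint or parts[0] in ("-c", "--constraint")
            collect(target, files, declared, pinned, is_constraint)
            continue
        match = re.match(r"([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*)", line)
        if match is None:
            continue  # options, URLs and paths are out of scope for this sketch
        bucket = pinned if as_constraint else declared
        bucket[normalise(match.group(1))] = (match.group(2), path)


declared, pinned = {}, {}
collect("requirements.txt", FILES, declared, pinned)
print("declared:", declared["yourco-clients"])
print("pinned:  ", pinned["yourco-clients"])
```

&lt;p&gt;Keeping two separate maps, each entry tagged with the file it came from, is what lets a consumer view report the declared constraint while still knowing that an overlay exists.&lt;/p&gt;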

&lt;p&gt;&lt;strong&gt;Extras and dependency groups change the blast radius, and markers make the edge conditional.&lt;/strong&gt; A consumer that needs your package only as a PEP 621 &lt;code&gt;[project.optional-dependencies]&lt;/code&gt; extra, a Poetry &lt;code&gt;group.test&lt;/code&gt;, or a PEP 735 dev group is a weaker consumer than one that imports it at runtime in production. And a &lt;a href="https://peps.python.org/pep-0508/" rel="noopener noreferrer"&gt;PEP 508 marker&lt;/a&gt; makes the dependency conditional outright: &lt;code&gt;yourco-clients&amp;gt;=2.4; python_version &amp;gt;= "3.11"&lt;/code&gt;, or &lt;code&gt;; sys_platform == "linux"&lt;/code&gt;. So a consumer may depend on you only inside an extra nobody installs in production, or only on a platform they do not ship. Flattening every declaration into one undifferentiated "depends on" both overstates and understates the blast radius, depending on which way you are wrong. The scope of each declaration, runtime against dev against optional, is part of the answer, not noise to discard.&lt;/p&gt;
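
&lt;p&gt;One way to keep scope and markers attached to each edge, sketched for the PEP 621 and PEP 735 tables only. The dict below stands in for what &lt;code&gt;tomllib.load&lt;/code&gt; would return for a hypothetical &lt;code&gt;pyproject.toml&lt;/code&gt;. The package names, the scope labels, and the function name are all invented, and Poetry groups are left out to keep it short.&lt;/p&gt;

```python
# Stand-in for a parsed pyproject.toml (hypothetical contents).
PYPROJECT = {
    "project": {
        "dependencies": ['yourco-clients>=2.4; python_version >= "3.11"'],
        "optional-dependencies": {"reports": ["yourco-reports"]},
    },
    "dependency-groups": {
        "dev": ["yourco-testkit", {"include-group": "lint"}],
        "lint": ["ruff"],
    },
}


def scoped_edges(pyproject):
    """Return (scope, requirement, marker) for each declared dependency."""
    project = pyproject.get("project", {})
    found = [("runtime", req) for req in project.get("dependencies", [])]
    for extra, reqs in project.get("optional-dependencies", {}).items():
        found += [("optional:" + extra, req) for req in reqs]
    for group, reqs in pyproject.get("dependency-groups", {}).items():
        # PEP 735 entries can be include-group tables; only strings are requirements
        found += [("group:" + group, req) for req in reqs if isinstance(req, str)]
    edges = []
    for scope, req in found:
        # A PEP 508 marker, when present, follows the first semicolon
        requirement, _, marker = req.partition(";")
        edges.append((scope, requirement.strip(), marker.strip() or None))
    return edges


for edge in scoped_edges(PYPROJECT):
    print(edge)
```

&lt;p&gt;The marker is carried as an opaque string here. Evaluating it against a target environment is a separate step, and whether to do so depends on whether you are asking "who could be affected" or "who is affected on the platforms we ship".&lt;/p&gt;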

&lt;p&gt;&lt;strong&gt;Namespace packages mean the import namespace is not the unit of dependency.&lt;/strong&gt; &lt;a href="https://peps.python.org/pep-0420/" rel="noopener noreferrer"&gt;PEP 420 implicit namespace packages&lt;/a&gt; let &lt;code&gt;yourco.clients&lt;/code&gt; and &lt;code&gt;yourco.auth&lt;/code&gt; live in separate distributions, in separate repos, under one shared &lt;code&gt;yourco&lt;/code&gt; namespace. The distributions ship and version independently, so the unit of dependency is the distribution, &lt;code&gt;yourco-clients&lt;/code&gt; or &lt;code&gt;yourco-auth&lt;/code&gt;, not the &lt;code&gt;yourco&lt;/code&gt; namespace they share. A tool that treats the top-level import namespace as one package conflates things that release on different schedules. This is the same crack the distribution-versus-import-name beat opened: the manifest layer is about which distribution you declared, and "who imports &lt;code&gt;yourco.auth&lt;/code&gt;" is a question one level down, at the symbol layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the full answer requires
&lt;/h2&gt;

&lt;p&gt;To reliably answer "who consumes this internal Python package," you need a system that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Scans every repo in the org&lt;/strong&gt;, parsing each manifest family (&lt;code&gt;requirements.txt&lt;/code&gt; and its &lt;code&gt;requirements-*.txt&lt;/code&gt; and split &lt;code&gt;requirements/&lt;/code&gt; and pip-tools &lt;code&gt;.in&lt;/code&gt; variants, &lt;code&gt;pyproject.toml&lt;/code&gt; across the PEP 621, Poetry, and PEP 735 dialects, &lt;code&gt;setup.cfg&lt;/code&gt;, &lt;code&gt;setup.py&lt;/code&gt;, &lt;code&gt;Pipfile&lt;/code&gt;, and conda &lt;code&gt;environment.yml&lt;/code&gt;), without requiring each team to opt in or register&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Normalises distribution names per PEP 503&lt;/strong&gt;, so &lt;code&gt;-&lt;/code&gt;, &lt;code&gt;_&lt;/code&gt;, &lt;code&gt;.&lt;/code&gt;, and case variants bind to one project, and resolves each declared dependency to the repo that actually produces the package, so a public package sharing your internal name resolves as external rather than binding to the wrong repo&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reads the value, not just the name&lt;/strong&gt;, resolving PEP 508 direct references, &lt;code&gt;git+https&lt;/code&gt; and &lt;code&gt;-e&lt;/code&gt; git installs, Poetry and Pipfile git sources, and conda &lt;code&gt;pip:&lt;/code&gt; git lines to the right in-org repo, with the committish recorded as the version&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keeps real cross-repo references&lt;/strong&gt; while dropping purely local &lt;code&gt;file:&lt;/code&gt; and &lt;code&gt;./path&lt;/code&gt; installs and plain wheel or sdist URLs that carry no cross-repo signal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Records the scope of each declaration&lt;/strong&gt;, runtime against dev against optional or extra, so a test-only or extras-only consumer is not weighed the same as a runtime one&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leaves test, example, and fixture trees out of the consumer count&lt;/strong&gt;, so a repo that imports your package only in a test harness does not read as a production consumer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reports the constraint each repo declares&lt;/strong&gt;, which is the fact that governs who adopts your next release, rather than the exact version a lockfile resolved this minute&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stays current through rescans&lt;/strong&gt;, rather than a one-time snapshot that is stale the moment a manifest changes&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is one of the specific problems &lt;a href="https://riftmap.dev" rel="noopener noreferrer"&gt;Riftmap&lt;/a&gt; is built to solve. It connects to your GitHub or GitLab organisation with one read-only token, scans every repo, and parses &lt;code&gt;requirements.txt&lt;/code&gt;, split &lt;code&gt;requirements/&lt;/code&gt; trees and pip-tools &lt;code&gt;.in&lt;/code&gt; inputs, &lt;code&gt;pyproject.toml&lt;/code&gt; across PEP 621, Poetry groups and PEP 735 dependency groups, &lt;code&gt;setup.cfg&lt;/code&gt;, &lt;code&gt;setup.py&lt;/code&gt;, &lt;code&gt;Pipfile&lt;/code&gt;, and conda &lt;code&gt;environment.yml&lt;/code&gt; including the nested &lt;code&gt;pip:&lt;/code&gt; block. It normalises names per PEP 503 and resolves each declared dependency to the repo that produces the package, so a name nothing in your org produces is treated as external rather than a consumer. It reads the value rather than just the name, so a &lt;code&gt;pip install git+https://gitlab.yourco.com/platform/yourco-clients.git&lt;/code&gt; resolves to the &lt;code&gt;platform/yourco-clients&lt;/code&gt; repo and is recorded as a consumer with no private index required, while purely local &lt;code&gt;file:&lt;/code&gt; installs and plain wheel URLs are skipped. Each edge carries the constraint the consumer declares and the manifest line where it lives. Parsed from what each repo declares, not inferred from what the index happened to serve.&lt;/p&gt;

&lt;p&gt;A few honest limits, in the spirit of the rest of this series. Riftmap reads the declared dependency in the manifest, not the resolved lockfile tree, so it shows the constraint each repo declares, which governs who adopts your next release, rather than the exact version each one has installed right now. It binds at the distribution layer (who declares a dependency on the package), not the import-symbol layer (which repos &lt;code&gt;import&lt;/code&gt; the specific function or class you are changing). For that symbol-level question a symbol graph like &lt;a href="https://sourcegraph.com/" rel="noopener noreferrer"&gt;Sourcegraph&lt;/a&gt; is the right tool, and a complementary one, because symbol graphs and artifact dependency graphs are different categories. Riftmap records a runtime, dev, or optional scope for each declaration, though surfacing that distinction in the consumer view is still on the near-term roadmap, so the panel today shows the manifest, the line, and the constraint rather than a scope label. And because Python has no &lt;code&gt;@scope&lt;/code&gt; convention the way npm does, Riftmap recognises an internal package by the fact that some repo in the scanned org produces it, not by a name prefix. A &lt;code&gt;yourco-&lt;/code&gt; prefix is a useful convention, not a guarantee.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ffw91o7h8iw5frnm7m48k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ffw91o7h8iw5frnm7m48k.png" alt="Riftmap Dependents panel for the internal Python package python-auth, produced by the repo polaris-python-auth, listing the repositories in the org that declare a dependency on it. The Dependents tab shows 17 consumers, the first nine visible. Each row carries a Python source badge, a confidence score, the consumer repository, the producing repository, and on the right the manifest file and line where the dependency is declared plus the version constraint. The rows span the range of Python manifest formats for one package: analytics-api declares it in both pyproject.toml and setup.cfg; migration-scripts, payment-gateway-adapter and etl-pipelines in pyproject.toml; ml-models in a conda environment.yml; payment-worker with an exact pin ==2.4.1; crm-integration in a legacy setup.py scored at 90% — lower than the 100% declarative rows because the setup.py parse is an AST heuristic; and route-optimizer with a Poetry caret ^2.0.0. Exact pins, a Poetry caret, a conda block and a lower-confidence setup.py row side by side: the divergent-constraint picture a single grep cannot assemble." width="800" height="978"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The result is the view the rest of this series describes. Before you drop that parameter, rename that export, or cut &lt;code&gt;3.0.0&lt;/code&gt;, you open the graph, click the package, and read the consumer list: every repo that declares a dependency on it, the constraint each one carries, the manifest and line where the dependency lives, whether it came from a registry name or a git reference, and which team owns it. You know who breaks. You know who is riding a loose constraint and will pull your next minor on their next Docker rebuild whether you meant them to or not. You know who is stranded on the old major and needs a migration before you can &lt;a href="https://riftmap.dev/blog/deprecate-internal-library-find-consumers/" rel="noopener noreferrer"&gt;deprecate it&lt;/a&gt;. No clean-installing thirty repos to run &lt;code&gt;pipdeptree&lt;/code&gt;. No script juggling nine manifest shapes. No waiting to see whose build goes red.&lt;/p&gt;

&lt;h2&gt;
  
  
  The dependency was never written down in one place
&lt;/h2&gt;

&lt;p&gt;Here is the closing thought. With the other ecosystems in this series, the reverse question is hard because the consumer graph lives behind access control, or in a proxy that logs the wrong event. That is true for Python too. But Python adds a second, deeper reason, and it is the one worth sitting with.&lt;/p&gt;

&lt;p&gt;Python gives you nine or so honest ways to declare a dependency, two different names for every package, and no single place that reconciles them. The dependency on your library exists, but it was never written down in one canonical form. It is a constraint in a &lt;code&gt;requirements.txt&lt;/code&gt; here, a PEP 621 entry there, a Poetry caret somewhere else, a &lt;code&gt;git+https&lt;/code&gt; reference in a fourth repo, all under a distribution name that does not match the import name, half of them rerouted by a constraints file you have to read separately. The reverse question is not hard because Python is messy. It is hard because the thing you are trying to find was never recorded as one thing. It lives spread across nine manifest shapes and two namespaces, in the relationship between repos that no single checkout contains. That was never a property of pip, or of PyPI, or of your index. It was a property of asking the question from inside one repo, when the answer was always somewhere between them.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is the eighth post in the &lt;a href="https://riftmap.dev/blog/series/find-every-consumer/" rel="noopener noreferrer"&gt;Find Every Consumer&lt;/a&gt; series. Previous posts cover &lt;a href="https://riftmap.dev/blog/how-to-find-every-consumer-of-your-docker-base-image/" rel="noopener noreferrer"&gt;Docker base images&lt;/a&gt;, &lt;a href="https://riftmap.dev/blog/how-to-find-every-consumer-of-your-terraform-module/" rel="noopener noreferrer"&gt;Terraform modules&lt;/a&gt;, &lt;a href="https://riftmap.dev/blog/how-to-find-every-consumer-of-your-github-actions-workflow/" rel="noopener noreferrer"&gt;GitHub Actions workflows&lt;/a&gt;, &lt;a href="https://riftmap.dev/blog/how-to-find-every-consumer-of-your-helm-chart/" rel="noopener noreferrer"&gt;Helm charts&lt;/a&gt;, &lt;a href="https://riftmap.dev/blog/how-to-find-every-consumer-of-your-go-module/" rel="noopener noreferrer"&gt;Go modules&lt;/a&gt;, &lt;a href="https://riftmap.dev/blog/how-to-find-every-consumer-of-your-gitlab-ci-template/" rel="noopener noreferrer"&gt;GitLab CI templates&lt;/a&gt; and &lt;a href="https://riftmap.dev/blog/how-to-find-every-consumer-of-your-internal-npm-package/" rel="noopener noreferrer"&gt;internal npm packages&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If this is a problem your platform team deals with, I would be interested to hear how you are solving it today. You can find more at &lt;a href="https://riftmap.dev" rel="noopener noreferrer"&gt;riftmap.dev&lt;/a&gt; or reach me at the address on the &lt;a href="https://riftmap.dev/about/" rel="noopener noreferrer"&gt;about page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;About Riftmap&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Riftmap maps cross-repo dependencies across your entire GitLab or GitHub organisation — Terraform, Docker, CI templates, Helm, Python, Go, npm, and more. One read-only token. No YAML to maintain.&lt;/p&gt;

</description>
      <category>python</category>
      <category>platformengineering</category>
      <category>devops</category>
      <category>dependencymanagement</category>
    </item>
    <item>
      <title>Backstage alternatives in 2026: first ask why you wanted Backstage</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Mon, 29 Jun 2026 12:16:23 +0000</pubDate>
      <link>https://dev.to/danielwe/backstage-alternatives-in-2026-first-ask-why-you-wanted-backstage-56l</link>
      <guid>https://dev.to/danielwe/backstage-alternatives-in-2026-first-ask-why-you-wanted-backstage-56l</guid>
      <description>&lt;p&gt;&lt;em&gt;Every "Backstage alternatives" roundup lists the same five portals. None of them asks the question that decides which alternative is right: what job sent you looking in the first place?&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;A senior platform engineer at a Nordic consultancy summarised his Backstage evaluation to me in one sentence: the cost of setting it up and keeping it maintained was bigger than what they got back. He is not an outlier. I have heard the same verdict, in nearly the same words, from engineers across r/devops threads, client engagements, and direct conversations. The team evaluates Backstage seriously, sometimes runs a proof of concept, and walks away. Then they type "Backstage alternatives" into a search box, and the search results take over.&lt;/p&gt;

&lt;p&gt;Go read those results. As of mid-2026, every page that ranks is a vendor roundup, and every roundup follows the same script. &lt;a href="https://www.port.io/blog/top-backstage-alternatives" rel="noopener noreferrer"&gt;Port lists alternatives&lt;/a&gt; and Port is the best one. &lt;a href="https://www.cortex.io/post/backstage-alternatives-what-engineering-leaders-need-to-know-in-2026" rel="noopener noreferrer"&gt;Cortex lists alternatives&lt;/a&gt; and Cortex is the most comprehensive. &lt;a href="https://www.opslevel.com/resources/backstage-io-alternatives-4-top-tools-to-use-instead" rel="noopener noreferrer"&gt;OpsLevel lists alternatives&lt;/a&gt; and OpsLevel is the fully managed answer. The supporting cast rotates between Roadie, Mia-Platform, Configure8, Rely.io, and Atlassian Compass, but the structure never changes. Backstage is hard, here are five portals that are easier, ours is first.&lt;/p&gt;

&lt;p&gt;Here is the thing none of those pages will tell you, because their business depends on not telling you. "Backstage alternatives" is not one search. It is at least three different searches wearing the same query, and the right alternative depends entirely on which one is yours. Two of the three are well served by the portal vendors in those roundups. The third is not served by any of them, because the portals inherit the exact property that made you walk away from Backstage.&lt;/p&gt;

&lt;p&gt;This post is the triage the roundups skip. I will try to be fair to every tool in it, including Backstage, because the engineers reading this can smell a strawman from the next time zone. And I will be upfront that I build a tool that fits exactly one of the three jobs, and explicitly does not fit the other two.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Backstage actually is, honestly
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://backstage.io/" rel="noopener noreferrer"&gt;Backstage&lt;/a&gt; is an open-source framework for building internal developer portals, created at Spotify and open-sourced in March 2020. It remains a &lt;a href="https://www.cncf.io/projects/backstage/" rel="noopener noreferrer"&gt;CNCF Incubating project&lt;/a&gt; with one of the largest contributor communities in the foundation. It pioneered the developer-portal category, and most of the commercial portals in those roundups exist because Backstage proved the demand first.&lt;/p&gt;

&lt;p&gt;The origin story matters more than people give it credit for. Backstage began as an internal Spotify project called System Z, built so that engineers in a fast-growing organisation could understand ownership, dependencies, and versions across an exploding service landscape. Hold onto that word "dependencies". It comes back later.&lt;/p&gt;

&lt;p&gt;The criticisms are equally well established, and I will not pretend they are mine. Backstage is a framework, not a product. You clone it, stand up a PostgreSQL database, configure authentication, and start writing or installing plugins, most of which are community-maintained without vendor support. The estimates for what this costs are public and not in dispute. The community site internaldeveloperplatform.org puts the true cost of ownership at &lt;a href="https://internaldeveloperplatform.org/developer-portals/backstage/" rel="noopener noreferrer"&gt;around $150,000 per 20 developers&lt;/a&gt;, a figure that Port and OpsLevel both cite in their own marketing. Cortex's roundup says most organisations need two or three full-time engineers for six months or more just to stand up a basic service catalog. Other practitioners put production-readiness at six to twelve months. Gartner has noted that organisations mistakenly believe Backstage is a ready-to-use portal, and that the rude awakening during implementation leads to projects being put on hold or abandoned.&lt;/p&gt;

&lt;p&gt;So far, the roundups and I agree. Backstage is genuinely expensive to run. Where we part ways is on what that means. The roundup logic is: Backstage is expensive, therefore buy a cheaper portal. The actual logic should be: Backstage is expensive, therefore figure out which part of it you wanted, because you might be able to buy just that part, and for one specific part, no portal sells it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three searches hiding inside one query
&lt;/h2&gt;

&lt;p&gt;When a team types "Backstage alternatives", they arrived there from one of three places. The triage question is which one.&lt;/p&gt;

&lt;h3&gt;
  
  
  Job one: you want what a portal does
&lt;/h3&gt;

&lt;p&gt;Some teams want the portal itself. Golden-path templates for scaffolding new services. Scorecards that track whether services have runbooks, SLOs, and passing security scans. A single pane of glass for ownership, on-call, and documentation. Self-service actions that let a developer spin up an environment without filing a ticket.&lt;/p&gt;

&lt;p&gt;If this is your job, the roundups are right and I have nothing contrarian to offer. The commercial portals are real products built by serious teams, and the honest comparison between them comes down to taste and scale. &lt;a href="https://www.port.io/" rel="noopener noreferrer"&gt;Port&lt;/a&gt; gives you a flexible data model you configure visually rather than in code, which suits organisations whose workflows do not fit standard patterns. &lt;a href="https://www.cortex.io/" rel="noopener noreferrer"&gt;Cortex&lt;/a&gt; leans hardest into scorecards and engineering standards, which suits organisations whose pain is "we have 400 services and no idea which ones meet our bar". &lt;a href="https://www.opslevel.com/" rel="noopener noreferrer"&gt;OpsLevel&lt;/a&gt; is deliberately opinionated, which suits teams that want the vendor to have made the workflow decisions already. All three will get you to a working portal in weeks instead of quarters, and all three cost real money at scale, which is the trade you are making.&lt;/p&gt;

&lt;p&gt;What I want you to notice is what these products have in common with Backstage underneath the better onboarding. They are all catalog-model systems. Each one maintains a registry of entities, services, teams, resources, and the relationships between them, and that registry is populated by some mix of integrations and humans registering things. That is the right architecture for the portal job. Ownership is something a human decides. A runbook link is something a human writes down. Scorecards evaluate criteria a human defined. The catalog model fits because the data genuinely originates with people.&lt;/p&gt;

&lt;h3&gt;
  
  
  Job two: you want Backstage itself, without operating it
&lt;/h3&gt;

&lt;p&gt;Some teams evaluated Backstage and concluded the product was right but the operational burden was not. They want the open-source ecosystem, the plugin library, the CNCF governance, and they want someone else to run it.&lt;/p&gt;

&lt;p&gt;This path matured significantly in the last year. &lt;a href="https://backstage.spotify.com/" rel="noopener noreferrer"&gt;Spotify Portal for Backstage&lt;/a&gt; went GA in October 2025 as a fully managed, no-code SaaS version of Backstage operated by Spotify itself, with setup wizards in place of the configuration work that used to consume the first quarter. &lt;a href="https://roadie.io/" rel="noopener noreferrer"&gt;Roadie&lt;/a&gt; has offered managed Backstage for years and remains the established independent option, handling hosting, upgrades, and the GitHub rate-limit problems that bite self-hosters.&lt;/p&gt;

&lt;p&gt;If your evaluation said yes to Backstage's model and no to its operations, this is your category, and it is a perfectly defensible choice. You keep the ecosystem and shed the toil. I have no quarrel with it.&lt;/p&gt;

&lt;p&gt;But notice, again, what does not change. Managed Backstage is still Backstage. The Software Catalog is still populated by &lt;code&gt;catalog-info.yaml&lt;/code&gt; files in your repos, and the relationships in it, including the &lt;code&gt;dependsOn&lt;/code&gt; entries, are still whatever a human last wrote there. Spotify operating the infrastructure does not update your YAML when an engineer changes a Terraform module source. The hosting was never the part that went stale.&lt;/p&gt;

&lt;h3&gt;
  
  
  Job three: you wanted to see what depends on what
&lt;/h3&gt;

&lt;p&gt;Now the third search, the one I keep meeting in the wild.&lt;/p&gt;

&lt;p&gt;A meaningful fraction of teams never wanted golden paths or scorecards. They reached for Backstage because of the dependency graph. They wanted the answer to "what breaks if I change this", or "which repos consume this base image", or "the engineer who understood how these sixty repos fit together is leaving in three weeks". They saw the Software Catalog's dependency view, recognised the thing they were missing, and adopted a developer portal to get it. That is not a misreading of Backstage. It is the original System Z brief: ownership, dependencies, versions. It is also the kind of question a parsed graph answers concretely rather than in the abstract: when I &lt;a href="https://riftmap.dev/blog/what-208-kubernetes-sigs-repos-actually-depend-on/" rel="noopener noreferrer"&gt;scanned all 208 kubernetes-sigs repos&lt;/a&gt;, 153 of them turned out to import a single shared module — the sort of fan-in a stale catalog never shows you.&lt;/p&gt;

&lt;p&gt;For this job, the catalog model is not the solution with some maintenance cost attached. The maintenance cost &lt;em&gt;is&lt;/em&gt; the failure mode. I wrote about this pattern at length in &lt;a href="https://riftmap.dev/blog/the-catalog-maintenance-trap/" rel="noopener noreferrer"&gt;the catalog maintenance trap&lt;/a&gt;, but the short version goes like this. A dependency entry in &lt;code&gt;catalog-info.yaml&lt;/code&gt; is a second, hand-registered copy of a fact your repos already declare. The original declaration is the Terraform &lt;code&gt;source&lt;/code&gt; block, the Dockerfile &lt;code&gt;FROM&lt;/code&gt; line, the &lt;code&gt;go.mod&lt;/code&gt; &lt;code&gt;require&lt;/code&gt;, the &lt;code&gt;.gitlab-ci.yml&lt;/code&gt; &lt;code&gt;include&lt;/code&gt;, the Helm &lt;code&gt;Chart.yaml&lt;/code&gt; dependency. Engineers must edit those files to ship. Nothing forces them to edit the catalog YAML to match, so within weeks the manifest and the catalog entry diverge, and the graph in the portal becomes documentation that was supposed to be authoritative. Which is worse than no graph, because people make blast-radius decisions on the assumption it is current. This is the structural weakness of a &lt;a href="https://riftmap.dev/blog/declared-inferred-registered/" rel="noopener noreferrer"&gt;registered dependency&lt;/a&gt;: unlike a declared edge the build re-reads on every run, nothing executes a catalog, so it drifts at exactly the rate attention wanders.&lt;/p&gt;

&lt;p&gt;Here is the part the roundups structurally cannot say. Switching portal vendors does not escape this. Port's marketing makes the point against its rivals better than I could: it criticises YAML-based catalogs for creating developer overhead and not updating in real time from the source of truth, eroding trust and adoption. That criticism is correct, and it applies to the entire category whenever the data in question is the dependency graph, because dependencies are facts about source files, and source files change with every commit. A portal can ingest from integrations, and the good ones do for cloud resources and Kubernetes objects. But the cross-repo dependency edges your infrastructure actually runs on (module sources, image references, CI includes, chart dependencies) live in manifests that no portal in those roundups parses.&lt;/p&gt;

&lt;p&gt;So if job three is your job, the honest answer to "what is the best Backstage alternative" is: not a portal. Any portal. The alternative is a different architecture entirely, one where the graph is parsed from the declarations that already exist instead of modelled from entries you ask humans to register. I went deep on that architectural distinction in &lt;a href="https://riftmap.dev/blog/modeled-graphs-and-parsed-graphs/" rel="noopener noreferrer"&gt;modeled graphs and parsed graphs&lt;/a&gt;; the one-line version is that a parsed graph cannot go stale relative to the source, because the source is the input.&lt;/p&gt;

&lt;h2&gt;
  
  
  The triage, in one table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Why you wanted Backstage&lt;/th&gt;
&lt;th&gt;Right category&lt;/th&gt;
&lt;th&gt;Representative options&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Golden paths, scaffolding, scorecards, ownership, self-service&lt;/td&gt;
&lt;td&gt;Commercial developer portal&lt;/td&gt;
&lt;td&gt;Port, Cortex, OpsLevel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backstage's model and ecosystem, minus the operations&lt;/td&gt;
&lt;td&gt;Managed Backstage&lt;/td&gt;
&lt;td&gt;Spotify Portal, Roadie&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dependency visibility and blast radius across repos&lt;/td&gt;
&lt;td&gt;Parsed dependency graph&lt;/td&gt;
&lt;td&gt;Riftmap, or build your own parser&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Keeping third-party dependencies up to date&lt;/td&gt;
&lt;td&gt;Automated update tooling&lt;/td&gt;
&lt;td&gt;Renovate, Dependabot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code search and symbol navigation across repos&lt;/td&gt;
&lt;td&gt;Code intelligence&lt;/td&gt;
&lt;td&gt;Sourcegraph&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I added the last two rows because they are the other jobs I see mislabelled as portal problems. &lt;a href="https://riftmap.dev/blog/the-state-of-infrastructure-dependency-tooling-2026/" rel="noopener noreferrer"&gt;Renovate and Dependabot&lt;/a&gt; keep versions current but tell you nothing about who consumes what. Sourcegraph's symbol graph is genuinely excellent at code-level navigation and stops at the infrastructure boundary, a distinction I unpacked in &lt;a href="https://riftmap.dev/blog/symbol-graphs-and-artifact-graphs/" rel="noopener noreferrer"&gt;symbol graphs and artifact graphs&lt;/a&gt;. Neither is a Backstage alternative, but both get evaluated as one, which tells you how muddled this category's vocabulary is. The muddle gained its biggest entrant in June, when GitLab shipped Orbit, a group-wide symbol-and-SDLC graph. &lt;a href="https://riftmap.dev/blog/gitlab-orbit-and-the-artifact-layer/" rel="noopener noreferrer"&gt;I read everything GitLab Orbit actually shipped&lt;/a&gt;, and it stops at the same infrastructure boundary as Sourcegraph: no HCL, no Dockerfiles, no artifact edges.&lt;/p&gt;

&lt;p&gt;And a row I deliberately left out: "build your own portal from scratch". Teams do it. Canva did, then migrated off it, and the engineer who ran that migration described the homegrown portal as something they got value from while using it, not wasted work. That is the right way to think about sunk platform investment generally, including a Backstage proof of concept that taught you which job you actually have.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Backstage genuinely wins
&lt;/h2&gt;

&lt;p&gt;I want to be precise about when the answer to "Backstage alternatives" is "none, use Backstage", because that answer is real.&lt;/p&gt;

&lt;p&gt;If you have a platform team with frontend capacity, a genuine need to own and extend the portal, and an organisation large enough that the per-developer cost of the framework amortises, Backstage is a defensible choice that thousands of organisations have made work. The plugin ecosystem is unmatched. The CNCF governance means it will outlive any single vendor's funding cycle. And the things humans should register on purpose (ownership, on-call, runbooks, tech docs) are things Backstage handles well precisely because the catalog model fits them.&lt;/p&gt;

&lt;p&gt;The mistake is not adopting Backstage. The mistake is adopting any catalog-model system, Backstage or its commercial successors, &lt;em&gt;for the dependency graph&lt;/em&gt;, and then spending organisational willpower trying to keep humans updating a hand-registered copy of facts the repos already state. That spend is the maintenance cost everyone complains about, and it does not buy accuracy. It buys a graph that is accurate to within whenever someone last cared.&lt;/p&gt;

&lt;h2&gt;
  
  
  The question underneath the query
&lt;/h2&gt;

&lt;p&gt;The roundups argue about which portal. After two years of conversations with teams who walked away from Backstage, I think the better argument is about which job. The portal jobs are well served, by the portals and by managed Backstage, and the vendors fighting over that SERP have earned their places in it. The dependency-visibility job is the one that query quietly smuggles in, and it is the one place where every option in every roundup shares Backstage's actual weakness rather than fixing it.&lt;/p&gt;

&lt;p&gt;If the sentence that sent you searching was some version of "we wanted to know what breaks when we change things, and the catalog could not keep up", then you were never shopping for a portal. You were shopping for a graph, and the graph already exists, written across your Terraform sources, Dockerfiles, CI includes, chart dependencies, and module files. The work is parsing it, not registering it again by hand.&lt;/p&gt;

&lt;p&gt;That parsing is what I build. &lt;a href="https://riftmap.dev/" rel="noopener noreferrer"&gt;Riftmap&lt;/a&gt; connects to a GitLab or GitHub org with a read-only token, parses the dependency declarations across twelve ecosystems (Terraform, Docker, Helm, Kubernetes, CI templates, Go, npm, Python, Ansible, and more), and serves the resulting graph two ways: a blast-radius UI for engineers, and a &lt;a href="https://riftmap.dev/for-agents/" rel="noopener noreferrer"&gt;JSON API for coding agents&lt;/a&gt; that need cross-repo context at planning time. There is no catalog to maintain because there is no catalog. If your job is one of the other two, use the table above with my blessing; Riftmap is not a portal and will not become one. If your job is the third one, &lt;a href="https://app.riftmap.dev" rel="noopener noreferrer"&gt;the free tier covers 15 repos&lt;/a&gt; and the first scan takes about ninety seconds, which is less time than reading one more roundup.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is the best alternative to Backstage?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It depends on which job sent you looking. If you want golden paths, scaffolding, scorecards, and self-service, a commercial developer portal like Port, Cortex, or OpsLevel is the right category. If you want Backstage's model and plugin ecosystem without operating it, managed Backstage — Spotify Portal or Roadie — fits. If you wanted dependency visibility and blast radius across repos, no portal solves that well, because the dependency graph is a fact about source files; a parsed dependency graph like Riftmap reads it from the manifests instead of asking humans to register it again.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do teams abandon Backstage?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The most common reason is that the cost of standing it up and keeping it maintained exceeds the value returned. Backstage is a framework, not a finished product: the community site internaldeveloperplatform.org estimates a total cost of ownership around $150,000 per 20 developers, and roundups note most organisations need two to three full-time engineers for six months or more just to stand up a basic service catalog. The dependency graph in particular goes stale, because catalog YAML is a hand-registered copy of facts the repos already state, and nothing forces engineers to keep it in sync.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is Backstage still worth using in 2026?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes, for the right team. If you have a platform team with frontend capacity, a genuine need to own and extend the portal, and an organisation large enough that the per-developer cost amortises, Backstage is a defensible choice with an unmatched plugin ecosystem and CNCF governance. The mistake is not adopting Backstage — it is adopting any catalog-model system, Backstage or its commercial successors, as the dependency graph, and then spending organisational willpower keeping humans updating facts the repos already declare. The deep-dive on this question, with Spotify's own external-adoption figure of around 10% and a diagnostic for predicting your own outcome before you commit, is in &lt;a href="https://riftmap.dev/blog/is-backstage-worth-it/" rel="noopener noreferrer"&gt;is Backstage worth it?&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>backstage</category>
      <category>platformengineering</category>
      <category>developerportals</category>
      <category>servicecatalogs</category>
    </item>
    <item>
      <title>Auto-Discovering Infrastructure Dependencies Across 10 Ecosystems</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Mon, 29 Jun 2026 12:16:17 +0000</pubDate>
      <link>https://dev.to/danielwe/auto-discovering-infrastructure-dependencies-across-10-ecosystems-2bha</link>
      <guid>https://dev.to/danielwe/auto-discovering-infrastructure-dependencies-across-10-ecosystems-2bha</guid>
      <description>&lt;p&gt;&lt;em&gt;It sounds simple: clone every repo, parse the files, build a graph. Here's why each ecosystem fights back, and what it actually takes to map cross-repo dependencies automatically.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;In &lt;a href="https://riftmap.dev/blog/infrastructure-dependency-problem" rel="noopener noreferrer"&gt;the last post&lt;/a&gt;, I wrote about the infrastructure dependency visibility gap: the fact that most platform teams have no way to answer "if I change this, what else breaks?" across their repos. The community response confirmed what I'd seen at every client: people are building brittle grep scripts, maintaining stale spreadsheets, or just relying on whoever has been around the longest.&lt;/p&gt;

&lt;p&gt;The obvious next question is: why doesn't anyone just &lt;em&gt;parse the repos and build a graph&lt;/em&gt;?&lt;/p&gt;

&lt;p&gt;The answer is that people do. Multiple engineers I've spoken with have built their own versions: a nightly cron job, some shell scripts, a SQLite database. And those solutions work, for a while, for one org, for the file types they remembered to handle. Then they hit an edge case, go stale, or the person who built it moves on.&lt;/p&gt;

&lt;p&gt;The core approach is right: scan every repo, parse the files that declare dependencies, resolve them to actual repos, build a directed graph. But the devil is in the details, and each ecosystem has its own set of devils. This post walks through what it actually takes to auto-discover cross-repo dependencies across Terraform, Docker, CI pipelines, Python, Go, npm, Ansible, Helm, and Kubernetes, and why the cross-ecosystem problem is harder than any individual one.&lt;/p&gt;

&lt;h2&gt;
  
  
  The approach: parse what's already there
&lt;/h2&gt;

&lt;p&gt;The principle behind auto-discovery is simple: &lt;strong&gt;the dependencies are already declared in the source files.&lt;/strong&gt; A Terraform module has a &lt;code&gt;source&lt;/code&gt; attribute pointing at a git URL. A Dockerfile has a &lt;code&gt;FROM&lt;/code&gt; statement naming a base image. A GitLab CI config has &lt;code&gt;include:&lt;/code&gt; directives referencing templates in other repos.&lt;/p&gt;

&lt;p&gt;You don't need humans to fill in a catalog. You don't need a YAML manifest per repo. The dependency information exists. It's just scattered across a dozen file formats in hundreds of repos, with no unified view.&lt;/p&gt;

&lt;p&gt;So the pipeline looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Enumerate&lt;/strong&gt; — list every repo in the GitLab group or GitHub org via API&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clone&lt;/strong&gt; — shallow-clone each repo (depth 1, just the default branch)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parse&lt;/strong&gt; — walk the file tree, dispatch each file to the right parser based on filename and path&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detect artifacts&lt;/strong&gt; — identify what each repo &lt;em&gt;produces&lt;/em&gt; (a Terraform module, a Docker image, a Python package, a Helm chart)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resolve&lt;/strong&gt; — match parsed dependency references to known repos or artifacts in the org&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Store&lt;/strong&gt; — persist the graph as queryable relationships&lt;/li&gt;
&lt;/ol&gt;
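&lt;p&gt;As a rough illustration of step 3, the dispatch can be a small table from filename patterns to parser names. This is a minimal Python sketch, not Riftmap's implementation: the &lt;code&gt;DISPATCH&lt;/code&gt; table, the parser labels, and &lt;code&gt;pick_parser&lt;/code&gt; are hypothetical names, and the pattern list is deliberately incomplete.&lt;/p&gt;

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical dispatch table: filename pattern, then parser label.
# Order matters only where patterns could overlap.
DISPATCH = [
    ("*.tf", "terraform"),
    ("Dockerfile*", "docker"),
    ("docker-compose*.y*ml", "compose"),
    (".gitlab-ci.yml", "gitlab_ci"),
    ("go.mod", "go"),
    ("package.json", "npm"),
    ("requirements*.txt", "python"),
    ("pyproject.toml", "python"),
    ("Chart.yaml", "helm"),
]


def pick_parser(repo_relative_path):
    """Return a parser label for a file, or None if no parser applies."""
    path = PurePosixPath(repo_relative_path)

    # GitHub Actions workflows are recognised by directory, not by filename.
    in_workflows = path.parts[-3:-1] == (".github", "workflows")
    if in_workflows and path.suffix in (".yml", ".yaml"):
        return "github_actions"

    for pattern, parser in DISPATCH:
        if fnmatchcase(path.name, pattern):
            return parser
    return None
```

&lt;p&gt;With this table, &lt;code&gt;pick_parser("infra/main.tf")&lt;/code&gt; returns &lt;code&gt;"terraform"&lt;/code&gt; and &lt;code&gt;pick_parser("README.md")&lt;/code&gt; returns &lt;code&gt;None&lt;/code&gt;. Matching on path as well as filename is what lets workflow files be picked up even though their names are arbitrary.&lt;/p&gt;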

&lt;p&gt;Steps 1 and 2 are straightforward. Steps 3 through 5 are where every ecosystem has opinions about how to make your life difficult.&lt;/p&gt;

&lt;h2&gt;
  
  
  Terraform: where version refs hide in query strings
&lt;/h2&gt;

&lt;p&gt;Terraform is usually the first ecosystem people think of for cross-repo dependencies, and on the surface it looks easy. A module block has a &lt;code&gt;source&lt;/code&gt; attribute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"git::https://gitlab.com/infra/modules/vpc.git?ref=v2.1.0"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You parse the &lt;code&gt;source&lt;/code&gt; string, extract the git URL and the ref, normalize the path to &lt;code&gt;infra/modules/vpc&lt;/code&gt;, match it against known repos in the org — done.&lt;/p&gt;

&lt;p&gt;Except it's never that clean. Here's what you actually encounter in the wild:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multiple URL formats.&lt;/strong&gt; The same module might be sourced as &lt;code&gt;git::https://...&lt;/code&gt;, &lt;code&gt;git@gitlab.com:...&lt;/code&gt; (SSH), a bare HTTPS URL without the &lt;code&gt;git::&lt;/code&gt; prefix, or a Terraform registry address like &lt;code&gt;app.terraform.io/org/module/provider&lt;/code&gt;. Each format needs different parsing logic to extract the same canonical repo path.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Subdirectory references.&lt;/strong&gt; Terraform supports the double-slash convention: &lt;code&gt;git::https://gitlab.com/infra/modules.git//networking/vpc?ref=v1.0&lt;/code&gt;. The repo is &lt;code&gt;infra/modules&lt;/code&gt;, but the module is in a subdirectory. This means one repo can &lt;em&gt;produce&lt;/em&gt; multiple distinct modules, and your parser needs to handle that relationship.&lt;/p&gt;
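&lt;p&gt;To make the normalisation concrete, here is a minimal Python sketch that reduces the git-style forms above to one canonical tuple. The function name and the return shape are my own assumptions for illustration. It handles &lt;code&gt;git::&lt;/code&gt;-prefixed URLs, bare HTTPS URLs, and SCP-style SSH addresses; registry addresses, local paths, and SSH URLs with an explicit port are out of scope here.&lt;/p&gt;

```python
import re
from urllib.parse import parse_qs

# Either scheme://[user@]host/path or scp-style user@host:path.
_GIT_URL = re.compile(r"^(?:[a-z+]+://(?:[^@/]+@)?|[^@/]+@)([^/:]+)[/:](.+)$")


def normalize_module_source(source):
    """Return (host, repo_path, subdir, ref), or None if not a git-style source."""
    if source.startswith("git::"):
        source = source[len("git::"):]

    # The version ref, when present, lives in the query string.
    ref = None
    if "?" in source:
        source, query = source.split("?", 1)
        ref = parse_qs(query).get("ref", [None])[0]

    match = _GIT_URL.match(source)
    if match is None:
        return None
    host, rest = match.group(1), match.group(2)

    # A double slash separates the repo from a module subdirectory.
    subdir = None
    if "//" in rest:
        rest, subdir = rest.split("//", 1)

    if rest.endswith(".git"):
        rest = rest[: -len(".git")]
    return host, rest.strip("/"), subdir, ref
```

&lt;p&gt;Both &lt;code&gt;git::https://gitlab.com/infra/modules.git//networking/vpc?ref=v1.0&lt;/code&gt; and &lt;code&gt;git@gitlab.com:infra/modules.git//networking/vpc?ref=v1.0&lt;/code&gt; come out as &lt;code&gt;("gitlab.com", "infra/modules", "networking/vpc", "v1.0")&lt;/code&gt;, which is the property the resolver needs: one repo key, regardless of how the source was spelled.&lt;/p&gt;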

&lt;p&gt;&lt;strong&gt;Variable interpolation.&lt;/strong&gt; You'll find modules like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"service"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"git::https://gitlab.com/${var.infra_group}/modules/service.git"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



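&lt;p&gt;You can't resolve &lt;code&gt;${var.infra_group}&lt;/code&gt; without evaluating the configuration, and the whole point of static analysis is that you &lt;em&gt;don't&lt;/em&gt; run Terraform. (Stock Terraform rejects variables in a module &lt;code&gt;source&lt;/code&gt;, so in practice this pattern arrives via OpenTofu's static evaluation, Terragrunt, or a templating step that renders the HCL.) The practical choice is to flag these as lower-confidence dependencies and extract what you can from the static portion of the string.&lt;/p&gt;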

&lt;p&gt;&lt;strong&gt;Public registry vs. internal modules.&lt;/strong&gt; A source like &lt;code&gt;hashicorp/consul/aws&lt;/code&gt; points to the public Terraform Registry — it's not an internal dependency and should be skipped. But &lt;code&gt;app.terraform.io/your-org/vpc/aws&lt;/code&gt; &lt;em&gt;is&lt;/em&gt; internal. The parser needs to distinguish between public and private registries.&lt;/p&gt;
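&lt;p&gt;A hedged sketch of that distinction in Python follows. It leans on the registry address shape: three slash-separated parts with no hostname mean the public registry, four parts with a leading hostname mean a hosted registry. The function name and return labels are hypothetical, and treating every non-public registry host as internal is an assumption you may want to tighten with an allowlist.&lt;/p&gt;

```python
PUBLIC_REGISTRY_HOST = "registry.terraform.io"
# Hosts Terraform treats as git shorthand rather than registry hostnames.
GIT_SHORTHAND_HOSTS = ("github.com", "bitbucket.org")


def classify_registry_source(source):
    """Return 'public', 'private', or None if source is not a registry address."""
    # Explicit URLs, forced getters, SSH addresses and local paths are not registry addresses.
    if "://" in source or source.startswith((".", "/", "git::", "git@")):
        return None

    # Ignore an optional //subdir suffix when counting address parts.
    address = source.split("//", 1)[0]
    parts = address.split("/")
    if parts[0] in GIT_SHORTHAND_HOSTS:
        return None

    if len(parts) == 3:
        # namespace/name/provider with no hostname: the public registry.
        return "public"
    if len(parts) == 4:
        # hostname/namespace/name/provider
        return "public" if parts[0] == PUBLIC_REGISTRY_HOST else "private"
    return None
```

&lt;p&gt;Here &lt;code&gt;hashicorp/consul/aws&lt;/code&gt; classifies as &lt;code&gt;"public"&lt;/code&gt; and &lt;code&gt;app.terraform.io/your-org/vpc/aws&lt;/code&gt; as &lt;code&gt;"private"&lt;/code&gt;. Only the private ones go on to the resolve step.&lt;/p&gt;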

&lt;p&gt;&lt;strong&gt;The &lt;code&gt;modules.json&lt;/code&gt; trap.&lt;/strong&gt; Some people suggest parsing &lt;code&gt;.terraform/modules/modules.json&lt;/code&gt;, which contains the resolved module tree. The problem: this file only exists if someone has run &lt;code&gt;terraform init&lt;/code&gt;, it's usually in &lt;code&gt;.gitignore&lt;/code&gt;, and it reflects one person's local state — not the repo's declared dependencies. It's not a reliable source for org-wide discovery.&lt;/p&gt;

&lt;p&gt;The meta-lesson from Terraform: even within a single ecosystem, the same logical relationship ("repo A depends on repo B") can be expressed in half a dozen syntactically different ways, and your parser needs to normalize all of them to the same canonical form.&lt;/p&gt;

&lt;h2&gt;
  
  
  Docker: the base image puzzle
&lt;/h2&gt;

&lt;p&gt;Dockerfiles look deceptively simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; node:18-alpine&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's a public image — skip it. But:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; registry.company.com/platform/base-image:v2.1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's an internal base image, and it means this repo depends on whatever repo builds and publishes &lt;code&gt;platform/base-image&lt;/code&gt;. Tracking these relationships across an org is one of the highest-value things a dependency graph can do. Docker base image updates are a constant source of surprise breakage.&lt;/p&gt;

&lt;p&gt;Here's where it gets complicated:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build arguments and variable substitution.&lt;/strong&gt; Real-world Dockerfiles frequently use &lt;code&gt;ARG&lt;/code&gt; to parameterize the base image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;ARG&lt;/span&gt;&lt;span class="s"&gt; REGISTRY=registry.company.com&lt;/span&gt;
&lt;span class="k"&gt;ARG&lt;/span&gt;&lt;span class="s"&gt; BASE_VERSION=latest&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; ${REGISTRY}/platform/base-image:${BASE_VERSION}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can resolve &lt;code&gt;ARG&lt;/code&gt; defaults by parsing the Dockerfile top-to-bottom, substituting the default values into the &lt;code&gt;FROM&lt;/code&gt; statement. But if the &lt;code&gt;ARG&lt;/code&gt; is overridden at build time via &lt;code&gt;--build-arg&lt;/code&gt;, the static default might be wrong. Again: lower confidence, but still useful signal.&lt;/p&gt;
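&lt;p&gt;The substitution can be sketched in a few lines of Python. This is an illustrative parser under stated assumptions, not a full Dockerfile grammar: it only honours &lt;code&gt;ARG&lt;/code&gt; defaults declared before the first &lt;code&gt;FROM&lt;/code&gt; (the ones Docker makes available to &lt;code&gt;FROM&lt;/code&gt; lines), it ignores line continuations and the &lt;code&gt;${VAR:-default}&lt;/code&gt; form, and the function name and the boolean confidence flag are hypothetical.&lt;/p&gt;

```python
import re

_VAR = re.compile(r"\$\{(\w+)\}|\$(\w+)")


def resolve_from_images(dockerfile_text):
    """Return (image, parameterised) pairs, one per FROM line.

    parameterised=True marks a lower-confidence edge: the image came from
    ARG defaults that a --build-arg could override at build time.
    """
    defaults = {}
    images = []
    seen_from = False
    for raw in dockerfile_text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        keyword = tokens[0].upper()
        if keyword == "ARG" and not seen_from:
            name, sep, value = raw.strip()[3:].strip().partition("=")
            if sep:
                defaults[name] = value.strip("\"'")
        elif keyword == "FROM":
            seen_from = True
            # Skip flags such as --platform=...
            args = [t for t in tokens[1:] if not t.startswith("--")]
            if not args:
                continue
            image = args[0]
            resolved = _VAR.sub(
                lambda m: defaults.get(m.group(1) or m.group(2), m.group(0)),
                image,
            )
            images.append((resolved, "$" in image))
    return images
```

&lt;p&gt;Run against the Dockerfile above, this yields &lt;code&gt;registry.company.com/platform/base-image:latest&lt;/code&gt; with the flag set to &lt;code&gt;True&lt;/code&gt;. A variable with no default is left in place rather than guessed.&lt;/p&gt;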

&lt;p&gt;&lt;strong&gt;Multi-stage builds.&lt;/strong&gt; A modern Dockerfile might have several &lt;code&gt;FROM&lt;/code&gt; statements:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;node:18&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;builder&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;npm ci &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm run build

&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; registry.company.com/platform/nginx:1.25&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /app/dist /usr/share/nginx/html&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first &lt;code&gt;FROM&lt;/code&gt; is a public image. The second is an internal dependency. The &lt;code&gt;COPY --from=builder&lt;/code&gt; is a reference to an earlier &lt;em&gt;stage&lt;/em&gt;, not an external image — the parser needs to track named stages and skip internal references. If you naively treat every &lt;code&gt;FROM&lt;/code&gt; and &lt;code&gt;COPY --from&lt;/code&gt; as a dependency, you'll generate false edges in the graph.&lt;/p&gt;
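&lt;p&gt;One way to avoid those false edges is to record each stage as it is declared and filter references against that set. The Python below is a minimal sketch with a hypothetical function name; it assumes one instruction per line and treats numeric &lt;code&gt;--from&lt;/code&gt; values as stage indexes.&lt;/p&gt;

```python
def external_images(dockerfile_text):
    """List image references that point outside the Dockerfile itself."""
    stages = set()
    images = []
    for raw in dockerfile_text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        keyword = tokens[0].upper()

        if keyword == "FROM":
            args = [t for t in tokens[1:] if not t.startswith("--")]
            if not args:
                continue
            image = args[0]
            # FROM can itself name an earlier stage; scratch is not an image.
            if image.lower() not in stages and image != "scratch":
                images.append(image)
            if len(args) == 3 and args[1].upper() == "AS":
                stages.add(args[2].lower())

        elif keyword == "COPY":
            for token in tokens[1:]:
                if token.startswith("--from="):
                    source = token[len("--from="):]
                    # Stage names and numeric stage indexes are internal references.
                    if source.lower() not in stages and not source.isdigit():
                        images.append(source)
    return images
```

&lt;p&gt;For the multi-stage example above, the result is &lt;code&gt;["node:18", "registry.company.com/platform/nginx:1.25"]&lt;/code&gt;: the &lt;code&gt;builder&lt;/code&gt; reference is dropped, while a &lt;code&gt;COPY --from=&lt;/code&gt; that names a real image would still be kept as a dependency.&lt;/p&gt;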

&lt;p&gt;&lt;strong&gt;Docker Compose adds another layer.&lt;/strong&gt; A &lt;code&gt;docker-compose.yml&lt;/code&gt; might reference images directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;api&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;registry.company.com/backend/api-service:latest&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a dependency declaration in a completely different file format (YAML vs. Dockerfile syntax), but it represents the same kind of relationship.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The consumer-side problem.&lt;/strong&gt; Knowing that repo X &lt;em&gt;uses&lt;/em&gt; a Docker image is only half the story. You also need to know which repo &lt;em&gt;builds&lt;/em&gt; that image. This isn't declared in the Dockerfile. It's usually in the CI pipeline config (&lt;code&gt;docker build -t&lt;/code&gt; and &lt;code&gt;docker push&lt;/code&gt; commands). Connecting the consumer side ("this repo uses image Y") to the producer side ("this repo builds image Y") requires cross-referencing information from different file types within the same repo.&lt;/p&gt;

&lt;h2&gt;
  
  
  CI templates: the invisible dependency layer
&lt;/h2&gt;

&lt;p&gt;CI pipeline configs are arguably the most important dependency surface to track, and the most neglected.&lt;/p&gt;

&lt;h3&gt;
  
  
  GitLab CI
&lt;/h3&gt;

&lt;p&gt;GitLab CI supports several forms of cross-repo inclusion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;project&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;platform/ci-templates'&lt;/span&gt;
    &lt;span class="na"&gt;ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;v2.0'&lt;/span&gt;
    &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/terraform-plan.yml'&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;remote&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://gitlab.com/platform/ci-templates/-/raw/main/deploy.yml'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;project:&lt;/code&gt; form gives you a clean repo reference and an optional version ref. The &lt;code&gt;remote:&lt;/code&gt; form gives you a URL that needs to be parsed back into a repo path.&lt;/p&gt;
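&lt;p&gt;For the &lt;code&gt;remote:&lt;/code&gt; form, GitLab raw-file URLs carry a &lt;code&gt;/-/raw/&lt;/code&gt; separator between the project path and the ref, which makes the reverse mapping mostly mechanical. A small Python sketch, with a hypothetical function name, and with the caveat that a ref containing a slash (for example a &lt;code&gt;feature/x&lt;/code&gt; branch) cannot be told apart from the file path by string parsing alone:&lt;/p&gt;

```python
from urllib.parse import urlparse


def repo_from_remote_include(url):
    """Return (host, project_path, ref, file_path) for a GitLab raw URL, else None."""
    parsed = urlparse(url)
    path = parsed.path.lstrip("/")
    if "/-/raw/" not in path:
        return None

    project, tail = path.split("/-/raw/", 1)
    # Assumption: the ref is a single path segment (a tag, SHA, or simple branch name).
    ref, _, file_path = tail.partition("/")
    return parsed.netloc, project, ref, file_path
```

&lt;p&gt;The &lt;code&gt;remote:&lt;/code&gt; URL in the example above maps to &lt;code&gt;("gitlab.com", "platform/ci-templates", "main", "deploy.yml")&lt;/code&gt;, the same project key the &lt;code&gt;project:&lt;/code&gt; form gives you directly.&lt;/p&gt;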

&lt;p&gt;There's also &lt;code&gt;trigger:&lt;/code&gt;, where one pipeline triggers another project's pipeline, and &lt;code&gt;image:&lt;/code&gt;, where a job specifies a Docker image. Both are further dependency surfaces hiding in the CI config.&lt;/p&gt;

&lt;p&gt;A subtle gotcha: GitLab CI supports &lt;code&gt;!reference&lt;/code&gt; tags for reusing configuration fragments. These are syntactically valid YAML, but they are custom tags that no standard YAML schema defines, so a strict loader that only knows the built-in tags will reject the file. Your parser needs to handle this gracefully, either by pre-processing the file or by configuring the YAML loader to ignore unknown tags.&lt;/p&gt;

&lt;h3&gt;
  
  
  GitHub Actions
&lt;/h3&gt;

&lt;p&gt;GitHub Actions reusable workflows have their own syntax:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;org/shared-workflows/.github/workflows/deploy.yml@v1.2.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And composite or JavaScript actions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;org/custom-action@main&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;uses:&lt;/code&gt; string encodes the org, repo, path, and ref in a single string. You need to parse it, separate the org/repo from the workflow path, handle the &lt;code&gt;@ref&lt;/code&gt; suffix, and skip public actions (anything under &lt;code&gt;actions/*&lt;/code&gt;, and more generally anything owned by an org that is not yours).&lt;/p&gt;
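&lt;p&gt;A minimal Python sketch of that parsing is below. The function name, the &lt;code&gt;own_org&lt;/code&gt; parameter, and the returned dictionary shape are assumptions for illustration; the filter here keeps only references owned by your own org, which is one reasonable way to drop public actions.&lt;/p&gt;

```python
def parse_uses(uses, own_org):
    """Split a GitHub Actions uses: value into repo, path and ref.

    Returns None for local actions, docker:// references, and anything
    not owned by own_org.
    """
    if uses.startswith(("./", "docker://")):
        return None

    target, _, ref = uses.partition("@")
    parts = target.split("/")
    if len(parts) == 1:
        # Not in owner/repo form.
        return None

    org, repo = parts[0], parts[1]
    if org.lower() != own_org.lower():
        return None

    # Anything after owner/repo is a path inside that repo.
    path = "/".join(parts[2:]) or None
    return {"repo": f"{org}/{repo}", "path": path, "ref": ref or None}
```

&lt;p&gt;Called with &lt;code&gt;own_org="org"&lt;/code&gt;, the reusable workflow above yields repo &lt;code&gt;org/shared-workflows&lt;/code&gt;, path &lt;code&gt;.github/workflows/deploy.yml&lt;/code&gt;, and ref &lt;code&gt;v1.2.0&lt;/code&gt;, while &lt;code&gt;actions/checkout@v4&lt;/code&gt; returns &lt;code&gt;None&lt;/code&gt;.&lt;/p&gt;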

&lt;p&gt;The real challenge with CI templates is that they're often pinned to a &lt;em&gt;branch&lt;/em&gt; rather than a tag — &lt;code&gt;@main&lt;/code&gt; instead of &lt;code&gt;@v2.0&lt;/code&gt;. This means version tracking is inherently fuzzy. You can tell that repo A depends on repo B's CI template, but "which version" is just "whatever's on main right now." This is exactly the kind of implicit, hard-to-track dependency that causes surprise breakage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Python, Go, and npm: package ecosystems with their own quirks
&lt;/h2&gt;

&lt;p&gt;These three share a common pattern — they have declared dependency manifests — but each has its own flavour of complexity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python
&lt;/h3&gt;

&lt;p&gt;Python dependencies can be declared in at least four places: &lt;code&gt;requirements.txt&lt;/code&gt;, &lt;code&gt;pyproject.toml&lt;/code&gt;, &lt;code&gt;setup.cfg&lt;/code&gt;, and &lt;code&gt;setup.py&lt;/code&gt;. Each has a different syntax. And only some of those declarations point at internal packages — most are public PyPI packages that you should skip.&lt;/p&gt;

&lt;p&gt;The interesting ones for cross-repo tracking are &lt;em&gt;editable git installs&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-e git+https://gitlab.com/org/internal-utils.git@v1.2#egg=internal-utils
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The other interesting case is &lt;code&gt;pyproject.toml&lt;/code&gt; dependencies that point at internal packages published to a private PyPI registry. Matching "package name in a requirements file" to "the repo that builds that package" requires knowing what each repo &lt;em&gt;produces&lt;/em&gt;, which is why the artifact detection step (identifying that a given repo is the source of a Python package, based on its &lt;code&gt;pyproject.toml&lt;/code&gt; or &lt;code&gt;setup.py&lt;/code&gt;) is essential.&lt;/p&gt;
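
&lt;p&gt;For the git install form shown above, the repo can be read straight off the line. A rough sketch, assuming only the &lt;code&gt;git+https://&lt;/code&gt; form and a fragment that carries nothing but &lt;code&gt;egg=&lt;/code&gt;. The function name &lt;code&gt;parse_git_requirement&lt;/code&gt; is made up for this example:&lt;/p&gt;

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def parse_git_requirement(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (host/org/repo, ref, egg name) for a 'git+https://' requirement line."""
    spec = line.strip()
    if spec.startswith("-e "):
        spec = spec[3:].strip()  # editable flag does not change the target repo
    if not spec.startswith("git+"):
        return None  # plain package requirement; resolved by package name instead
    parts = urlsplit(spec[len("git+"):])
    path, _, ref = parts.path.partition("@")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    egg = parts.fragment.partition("egg=")[2]
    return (parts.netloc + path, ref, egg)


print(parse_git_requirement(
    "-e git+https://gitlab.com/org/internal-utils.git@v1.2#egg=internal-utils"
))
```

&lt;p&gt;Lines that return &lt;code&gt;None&lt;/code&gt; here are the ones that need the producer lookup described above.&lt;/p&gt;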

&lt;h3&gt;
  
  
  Go
&lt;/h3&gt;

&lt;p&gt;Go modules are cleaner than most. The &lt;code&gt;go.mod&lt;/code&gt; file is authoritative:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;require&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;gitlab&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;shared&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;lib&lt;/span&gt; &lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="m"&gt;.3.0&lt;/span&gt;
    &lt;span class="n"&gt;github&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;internal&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sdk&lt;/span&gt; &lt;span class="n"&gt;v0&lt;/span&gt;&lt;span class="m"&gt;.9.2&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The module path &lt;em&gt;is&lt;/em&gt; the repo path (more or less). The version is explicit. The main challenge is filtering: most &lt;code&gt;require&lt;/code&gt; entries are public modules (&lt;code&gt;github.com/stretchr/testify&lt;/code&gt;, &lt;code&gt;golang.org/x/net&lt;/code&gt;). You need a way to identify which entries point to repos within your org and skip the rest.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;replace&lt;/code&gt; directives add a wrinkle — they can redirect a module path to a local directory or a different remote path, which changes the effective dependency.&lt;/p&gt;
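
&lt;p&gt;The filtering step might look something like the sketch below. It assumes a hypothetical &lt;code&gt;INTERNAL_PREFIXES&lt;/code&gt; tuple that identifies your org's module paths, it reads only &lt;code&gt;require&lt;/code&gt; directives, and it deliberately ignores &lt;code&gt;replace&lt;/code&gt;, which would need a second pass:&lt;/p&gt;

```python
from typing import List, Tuple

# Hypothetical prefixes that identify modules owned by your org.
INTERNAL_PREFIXES = ("gitlab.com/org/", "github.com/org/")


def internal_requires(go_mod: str) -> List[Tuple[str, str]]:
    """Return (module path, version) pairs from require directives that match the org."""
    found = []
    in_block = False
    for raw in go_mod.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop trailing comments such as // indirect
        if line == "require (":
            in_block = True
            continue
        if in_block and line == ")":
            in_block = False
            continue
        if line.startswith("require "):
            fields = line.split()[1:]         # single-line form
        elif in_block:
            fields = line.split()
        else:
            continue
        if len(fields) == 2 and fields[0].startswith(INTERNAL_PREFIXES):
            found.append((fields[0], fields[1]))
    return found
```

&lt;p&gt;A prefix list is the simplest possible filter. Matching against the set of module paths that repos in the org actually declare would be stricter.&lt;/p&gt;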

&lt;h3&gt;
  
  
  npm
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;package.json&lt;/code&gt; is the source of truth. For internal cross-repo dependencies, you're looking for scoped packages or git URL references:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dependencies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@company/ui-components"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^2.1.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@company/shared-utils"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"git+https://github.com/org/shared-utils.git#v1.0"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Scoped packages (&lt;code&gt;@company/...&lt;/code&gt;) from a private registry need to be matched to the repo that publishes them. Git URL references can be parsed directly. Public npm packages are filtered out.&lt;/p&gt;

&lt;p&gt;One thing all three ecosystems share: the link between "this repo &lt;em&gt;consumes&lt;/em&gt; package X" and "this repo &lt;em&gt;produces&lt;/em&gt; package X" is not always obvious from a single file. You need to first discover what each repo publishes (by reading its &lt;code&gt;pyproject.toml&lt;/code&gt;, &lt;code&gt;go.mod&lt;/code&gt; module declaration, or &lt;code&gt;package.json&lt;/code&gt; name field), then match consumers to producers across the org.&lt;/p&gt;
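
&lt;p&gt;That two-pass shape can be sketched as follows. The input dictionaries stand in for whatever the per-ecosystem parsers emit, and the &lt;code&gt;(ecosystem, package name)&lt;/code&gt; key is an assumption made for this example, not a fixed schema:&lt;/p&gt;

```python
from typing import Dict, List, Tuple

Artifact = Tuple[str, str]  # (ecosystem, package name)


def match_consumers(
    produces: Dict[str, List[Artifact]],
    consumes: Dict[str, List[Artifact]],
) -> List[Tuple[str, str, Artifact]]:
    """Return (consumer repo, producer repo, artifact) edges."""
    # Pass 1: index every published artifact by the repo that produces it.
    producer_of = {}
    for repo, artifacts in produces.items():
        for artifact in artifacts:
            producer_of[artifact] = repo
    # Pass 2: keep only consumed artifacts that some repo in the org produces.
    edges = []
    for repo, artifacts in consumes.items():
        for artifact in artifacts:
            producer = producer_of.get(artifact)
            if producer is not None and producer != repo:
                edges.append((repo, producer, artifact))
    return edges
```

&lt;p&gt;Anything with no producer in the index, such as a public package, simply produces no edge.&lt;/p&gt;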

&lt;h2&gt;
  
  
  Ansible and Helm: infrastructure-specific dependency patterns
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Ansible
&lt;/h3&gt;

&lt;p&gt;Ansible dependencies appear in &lt;code&gt;requirements.yml&lt;/code&gt; (roles and collections), &lt;code&gt;galaxy.yml&lt;/code&gt; (collection metadata and dependencies), and &lt;code&gt;meta/main.yml&lt;/code&gt; (role dependencies).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# requirements.yml&lt;/span&gt;
&lt;span class="na"&gt;roles&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;src&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;git+https://gitlab.com/org/ansible-roles/nginx.git&lt;/span&gt;
    &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v2.0&lt;/span&gt;

&lt;span class="na"&gt;collections&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;company.shared_collection&lt;/span&gt;
    &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;=1.0"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The complexity here is similar to Python: matching a Galaxy-style name (&lt;code&gt;company.shared_collection&lt;/code&gt;) to the repo that publishes it requires artifact detection. Git URL sources are more straightforward but still need normalisation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Helm
&lt;/h3&gt;

&lt;p&gt;Helm chart dependencies are declared in &lt;code&gt;Chart.yaml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;dependencies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis&lt;/span&gt;
    &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;17.x"&lt;/span&gt;
    &lt;span class="na"&gt;repository&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://charts.bitnami.com/bitnami"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;auth-service&lt;/span&gt;
    &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.2.0"&lt;/span&gt;
    &lt;span class="na"&gt;repository&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://helm.internal.company.com"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Public chart repositories (Bitnami, stable, etc.) are filtered out. Internal repository references need to be matched to the repos that build those charts. And &lt;code&gt;file://&lt;/code&gt; references to local charts in the same repo are internal — not cross-repo dependencies.&lt;/p&gt;
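
&lt;p&gt;As a rough illustration, the three-way split could be written like this. It assumes the &lt;code&gt;dependencies&lt;/code&gt; entries have already been loaded from &lt;code&gt;Chart.yaml&lt;/code&gt; into dictionaries, and &lt;code&gt;INTERNAL_CHART_HOSTS&lt;/code&gt; is a hypothetical allow-list you would maintain yourself. Repository aliases are not handled here:&lt;/p&gt;

```python
from urllib.parse import urlsplit

# Hypothetical allow-list of chart repository hosts run by your org.
INTERNAL_CHART_HOSTS = {"helm.internal.company.com"}


def classify_chart_dependency(dep: dict) -> str:
    """Label one Chart.yaml dependency entry as 'local', 'internal', or 'public'."""
    repository = dep.get("repository", "")
    if repository == "" or repository.startswith("file://"):
        return "local"     # chart lives in the same repo, so no cross-repo edge
    host = urlsplit(repository).hostname
    if host in INTERNAL_CHART_HOSTS:
        return "internal"  # still needs matching to the repo that builds the chart
    return "public"


deps = [
    {"name": "redis", "version": "17.x", "repository": "https://charts.bitnami.com/bitnami"},
    {"name": "auth-service", "version": "1.2.0", "repository": "https://helm.internal.company.com"},
    {"name": "common", "version": "0.1.0", "repository": "file://../common"},
]
print([classify_chart_dependency(d) for d in deps])
```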

&lt;h2&gt;
  
  
  Kubernetes and Kustomize: the deployment layer
&lt;/h2&gt;

&lt;p&gt;Kubernetes manifests and Kustomize configurations add another dependency surface — one that's often overlooked because it's at the "deployment" end of the pipeline rather than the "build" end.&lt;/p&gt;

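&lt;p&gt;Kustomize's &lt;code&gt;kustomization.yaml&lt;/code&gt; can reference resources and bases from other repos (the &lt;code&gt;bases&lt;/code&gt; field is deprecated in favour of &lt;code&gt;resources&lt;/code&gt;, but it still turns up in older repos):&lt;br&gt;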
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;https://github.com/org/k8s-base//manifests/monitoring?ref=v1.0&lt;/span&gt;

&lt;span class="na"&gt;bases&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;github.com/org/shared-platform//overlays/production&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These are cross-repo references with the same double-slash subdirectory convention as Terraform. They need the same kind of URL normalisation and repo matching.&lt;/p&gt;
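
&lt;p&gt;A small sketch of that normalisation for the two reference styles shown above. &lt;code&gt;RemoteRef&lt;/code&gt; and &lt;code&gt;parse_remote_resource&lt;/code&gt; are illustrative names, and the caller is assumed to have already set aside local paths such as &lt;code&gt;../base&lt;/code&gt;:&lt;/p&gt;

```python
from typing import NamedTuple
from urllib.parse import parse_qs


class RemoteRef(NamedTuple):
    repo: str     # host/org/repo, with scheme and .git suffix removed
    subdir: str   # path after the double slash, may be empty
    ref: str      # value of ?ref=, empty when unpinned


def parse_remote_resource(value: str) -> RemoteRef:
    """Split a remote reference into repo, subdirectory, and ref."""
    target, _, query = value.partition("?")
    if "://" in target:
        target = target.split("://", 1)[1]   # drop https:// and similar schemes
    repo, _, subdir = target.partition("//")
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    ref = parse_qs(query).get("ref", [""])[0]
    return RemoteRef(repo, subdir, ref)


print(parse_remote_resource("https://github.com/org/k8s-base//manifests/monitoring?ref=v1.0"))
print(parse_remote_resource("github.com/org/shared-platform//overlays/production"))
```

&lt;p&gt;An empty &lt;code&gt;ref&lt;/code&gt;, as in the second example, is worth surfacing on its own: it has the same "whatever is on the default branch" problem as a CI template pinned to a branch.&lt;/p&gt;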

&lt;p&gt;Kubernetes manifests themselves reference Docker images in container specs, Helm charts in &lt;code&gt;HelmRelease&lt;/code&gt; custom resources, and ConfigMaps or Secrets by name. The image references connect back to the Docker dependency surface — another example of how the graph crosses ecosystem boundaries.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real challenge: resolution
&lt;/h2&gt;

&lt;p&gt;Parsing is the visible work. But the step that makes or breaks the dependency graph is &lt;strong&gt;resolution&lt;/strong&gt;: taking a parsed reference and matching it to an actual repo or artifact in your org.&lt;/p&gt;

&lt;p&gt;A Dockerfile says &lt;code&gt;FROM registry.company.com/platform/base-image:v2&lt;/code&gt;. Which repo builds that image? The registry path might not match the repo path. The image might be built by a CI pipeline in a repo named &lt;code&gt;docker-base-images&lt;/code&gt;, pushed to a registry path of &lt;code&gt;platform/base-image&lt;/code&gt;. Connecting those requires understanding the &lt;em&gt;producing&lt;/em&gt; side, not just the &lt;em&gt;consuming&lt;/em&gt; side.&lt;/p&gt;

&lt;p&gt;This is why artifact detection matters so much. Before you can resolve "this repo uses artifact X," you need to know "that repo produces artifact X." For Docker images, this means scanning CI configs for &lt;code&gt;docker push&lt;/code&gt; commands. For Python packages, it means reading &lt;code&gt;pyproject.toml&lt;/code&gt; to find the package name. For Helm charts, it means reading &lt;code&gt;Chart.yaml&lt;/code&gt;. Each ecosystem has a different way of declaring what a repo produces, and the resolver needs to cross-reference all of it.&lt;/p&gt;

&lt;p&gt;Resolution also has to deal with ambiguity. A Docker image reference like &lt;code&gt;base-image:v2&lt;/code&gt;, without a full registry prefix, could match multiple repos. A Python package name might be normalised differently (&lt;code&gt;my_package&lt;/code&gt; vs. &lt;code&gt;my-package&lt;/code&gt;). Terraform module paths might use SSH vs. HTTPS URLs for the same repo. The resolver needs normalisation rules, fuzzy matching strategies, and a confidence model — because some matches are certain and others are best-effort.&lt;/p&gt;
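
&lt;p&gt;Two of those normalisation rules are simple enough to sketch. The package-name rule follows PEP 503. The URL rule is an approximation that assumes repo paths are case-insensitive on your Git host and that any &lt;code&gt;?ref=&lt;/code&gt; or &lt;code&gt;//subdir&lt;/code&gt; suffix has already been split off. Both function names are invented for this example:&lt;/p&gt;

```python
import re


def normalise_package_name(name: str) -> str:
    """PEP 503 normalisation: runs of '-', '_' and '.' collapse to a single '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def normalise_git_url(url: str) -> str:
    """Reduce SSH and HTTPS clone URLs for the same repo to 'host/org/repo'."""
    url = url.strip()
    if url.startswith("git::"):
        url = url[len("git::"):]             # Terraform forced-git prefix
    had_scheme = "://" in url
    if had_scheme:
        url = url.split("://", 1)[1]         # https://host/..., ssh://git@host/...
    first_segment = url.split("/", 1)[0]
    if "@" in first_segment:
        url = url.split("@", 1)[1]           # drop the user part, e.g. git@
    if not had_scheme and ":" in url.split("/", 1)[0]:
        url = url.replace(":", "/", 1)       # scp-style host:org/repo
    if url.endswith(".git"):
        url = url[: -len(".git")]
    return url.lower()


print(normalise_package_name("my_package"), normalise_package_name("my-package"))
print(normalise_git_url("git@github.com:org/repo.git"))
print(normalise_git_url("https://github.com/org/repo.git"))
```

&lt;p&gt;Exact matches on normalised keys can be treated as certain. Anything that only matches after looser rules belongs in the best-effort bucket.&lt;/p&gt;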

&lt;p&gt;Getting this right is the difference between a dependency graph that people trust and one they abandon after finding three false edges.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why cross-ecosystem matters
&lt;/h2&gt;

&lt;p&gt;Any individual ecosystem's parsing problem is tractable. The Python community could build a Python dependency tracker. The Terraform community could build a module graph tool. And some have.&lt;/p&gt;

&lt;p&gt;But the actual dependency graph in a real organisation doesn't respect ecosystem boundaries. A Terraform module produces infrastructure that a Docker image is built on, which a CI pipeline deploys, which references a Helm chart, which pulls a shared Ansible role.&lt;/p&gt;

&lt;p&gt;If you only see the Terraform slice, you miss the Docker dependency that's about to break your deployment pipeline. If you only see Docker, you miss the Terraform module change that will change the infrastructure your image runs on.&lt;/p&gt;

&lt;p&gt;The value of cross-ecosystem discovery isn't additive — it's multiplicative. Each new ecosystem you add doesn't just give you more nodes in the graph. It reveals &lt;em&gt;connections between ecosystems&lt;/em&gt; that were previously invisible. Those connections are exactly where surprise breakage lives.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned building this
&lt;/h2&gt;

&lt;p&gt;I've been building &lt;a href="https://riftmap.dev" rel="noopener noreferrer"&gt;Riftmap&lt;/a&gt; to solve this problem — auto-discovering cross-repo dependencies across all the ecosystems described above and presenting them as a queryable, visual graph with blast radius analysis.&lt;/p&gt;

&lt;p&gt;A few things I've learned along the way:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The parser is the easy part.&lt;/strong&gt; Extracting &lt;code&gt;source = "..."&lt;/code&gt; from a Terraform file is straightforward. The hard parts are resolution (matching references to actual repos), freshness (keeping the graph current as repos change), and staleness detection (knowing when a previously-discovered dependency no longer exists because a repo was renamed or archived).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Confidence matters.&lt;/strong&gt; Not all discovered dependencies are equally certain. A Terraform &lt;code&gt;source&lt;/code&gt; with a full git URL and a pinned ref is high confidence. A Docker &lt;code&gt;FROM&lt;/code&gt; with variable substitution and no default value is low confidence. Exposing this confidence to users — rather than pretending everything is equally certain — is critical for trust.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The graph is the product, not the report.&lt;/strong&gt; Every DIY solution I've seen generates a static output — a CSV, a SQLite dump, a rendered image. The real value comes when the graph is interactive, queryable, and always current. "Show me every repo affected if I change this module" should be a click, not a pipeline run.&lt;/p&gt;
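
&lt;p&gt;To make that query concrete: once the edges exist, "every repo affected" is a breadth-first walk over the reversed graph. This is a generic sketch over an in-memory edge list, not a description of how Riftmap stores or queries its graph:&lt;/p&gt;

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def blast_radius(edges: Iterable[Tuple[str, str]], changed: str) -> Set[str]:
    """Return every repo that depends on 'changed', directly or transitively.

    Each edge is (consumer, producer): the consumer depends on the producer.
    """
    consumers_of: Dict[str, List[str]] = {}
    for consumer, producer in edges:
        consumers_of.setdefault(producer, []).append(consumer)
    affected: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in consumers_of.get(current, []):
            if consumer not in affected and consumer != changed:
                affected.add(consumer)   # visited check also guards against cycles
                queue.append(consumer)
    return affected


edges = [("app", "base-image"), ("worker", "base-image"), ("gateway", "app")]
print(sorted(blast_radius(edges, "base-image")))
```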

&lt;p&gt;If you're building infrastructure for a platform team and this problem resonates, I'd love to hear how it shows up in your stack. The edge cases are different for every org, and understanding them is how the tooling gets better.&lt;/p&gt;

&lt;p&gt;You can see more at &lt;a href="https://riftmap.dev" rel="noopener noreferrer"&gt;riftmap.dev&lt;/a&gt;, or reach me at &lt;a href="mailto:hello@riftmap.dev"&gt;hello@riftmap.dev&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>platformengineering</category>
      <category>infrastructure</category>
      <category>terraform</category>
      <category>docker</category>
    </item>
    <item>
      <title>The State of Infrastructure Dependency Tooling in 2026</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Mon, 29 Jun 2026 12:08:46 +0000</pubDate>
      <link>https://dev.to/danielwe/the-state-of-infrastructure-dependency-tooling-in-2026-1fib</link>
      <guid>https://dev.to/danielwe/the-state-of-infrastructure-dependency-tooling-in-2026-1fib</guid>
      <description>&lt;p&gt;&lt;em&gt;An honest survey of what exists, what each tool actually solves, and where the gap is widest.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;When I started researching cross-repo infrastructure dependency management, first as a practitioner hitting the problem at client sites, then as someone building a tool to solve it, I expected to find a crowded space. Dependency management is a well-understood problem in software engineering. Package managers have solved it for application code. Surely someone had solved it for infrastructure.&lt;/p&gt;

&lt;p&gt;They hadn't. Not because nobody tried, but because the tools that exist were built to solve &lt;em&gt;adjacent&lt;/em&gt; problems. Each one is good at what it does. None of them answer the specific question that platform teams keep asking: &lt;strong&gt;if I change this shared module, image, or template, which repos break and who do I need to notify?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This post is a genuine attempt to map the landscape: what exists today, where each tool shines, and where it stops. I've used or evaluated most of these tools, talked with engineers who rely on them daily, and spent time in community discussions where people describe what's working and what isn't. I've tried to be fair. Every tool here solves a real problem for real teams. The point is not that they're bad. It's that the &lt;em&gt;specific&lt;/em&gt; problem of cross-repo infrastructure dependency visibility sits in a gap between them.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to think about the landscape
&lt;/h2&gt;

&lt;p&gt;The tools people reach for when they encounter the dependency visibility problem fall into six categories. Understanding the categories matters more than evaluating individual tools, because the gap isn't a missing feature in any one product. It's a missing &lt;em&gt;category&lt;/em&gt;.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Service catalogs and developer portals&lt;/strong&gt; — "what exists and who owns it"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dependency update automation&lt;/strong&gt; — "keep everything on the latest version"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform-specific dependency explorers&lt;/strong&gt; — "see relationships within one ecosystem"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monorepo build tools&lt;/strong&gt; — "solve the problem by putting everything in one repo"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security and compliance scanners&lt;/strong&gt; — "find vulnerabilities and license issues across repos"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DIY scripts and manual approaches&lt;/strong&gt; — "build exactly what you need, maintain it forever"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each category addresses a real need. Each one gets mistaken for a solution to the visibility problem. Let's look at why.&lt;/p&gt;

&lt;h2&gt;
  
  
  Service catalogs: Backstage, Port, OpsLevel
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What they do well
&lt;/h3&gt;

&lt;p&gt;Service catalogs, with Backstage being the most prominent and Port and OpsLevel as managed alternatives, give engineering organisations a central place to answer "what services do we run, who owns them, and where do they live?" Backstage in particular has become the standard for developer portals at larger companies. It integrates with CI/CD, on-call systems, documentation, and cloud resources. It gives teams a single pane of glass for service ownership.&lt;/p&gt;

&lt;p&gt;Port and OpsLevel take a similar approach with less self-hosting overhead. They provide scorecards, maturity tracking, and integrations with cloud providers. For organisations that need a developer portal, especially those with hundreds of services and unclear ownership, these tools are genuinely valuable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where they stop
&lt;/h3&gt;

&lt;p&gt;The dependency model in service catalogs is &lt;strong&gt;registered, not discovered.&lt;/strong&gt; In Backstage, you define a &lt;code&gt;catalog-info.yaml&lt;/code&gt; per repo that describes the service, its owner, and its dependencies. This information is exactly as accurate as the last time someone updated that file.&lt;/p&gt;

&lt;p&gt;In practice, catalog YAML goes stale quickly. Engineers update code. They don't update the catalog. The dependency that got added three months ago isn't in the YAML. The one that got removed six months ago is still listed. As one engineer described it to me, it's "documentation with extra steps."&lt;/p&gt;

&lt;p&gt;This isn't a flaw in Backstage; it's a design choice. Service catalogs are built for &lt;em&gt;service metadata&lt;/em&gt; (ownership, documentation links, runbook URLs), not for &lt;em&gt;dependency graph accuracy&lt;/em&gt;. They're excellent at answering "who owns this service?" They're not designed to answer "which 40 repos consume this Terraform module at which version?" If you're weighing the catalog category against the alternatives for this specific job, I triage that decision in &lt;a href="https://riftmap.dev/blog/backstage-alternatives/" rel="noopener noreferrer"&gt;Backstage alternatives in 2026&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Port has invested in auto-discovery features that pull some metadata from cloud resources and Git, which narrows the gap. But the cross-repo infrastructure dependency graph (Terraform modules sourcing other modules, Docker base images consumed by dozens of repos, CI templates included across the org) still requires either manual registration or a separate discovery mechanism.&lt;/p&gt;

&lt;h3&gt;
  
  
  Who should use them
&lt;/h3&gt;

&lt;p&gt;Any organisation over ~50 services that needs a central developer portal with ownership tracking, documentation, and service maturity scoring. They're complementary to dependency visibility tooling, not a replacement for it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dependency update automation: Renovate and Dependabot
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What they do well
&lt;/h3&gt;

&lt;p&gt;Renovate and Dependabot solve a specific, well-defined problem: when a dependency has a new version available, automatically open a pull request to update it. Renovate in particular is remarkably flexible. It supports dozens of package ecosystems, custom versioning schemes, grouping strategies, and scheduling controls. For teams that want to stay current on their dependencies, Renovate is close to best-in-class.&lt;/p&gt;

&lt;p&gt;Dependabot, integrated directly into GitHub, offers a lower-configuration path to the same outcome. It's particularly strong for npm, pip, Go modules, and GitHub Actions workflows.&lt;/p&gt;

&lt;p&gt;Both tools are excellent at what they do, and many teams should be using one of them regardless of what else they adopt for dependency visibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where they stop
&lt;/h3&gt;

&lt;p&gt;The core distinction is between &lt;strong&gt;prevention&lt;/strong&gt; and &lt;strong&gt;visibility&lt;/strong&gt;. Renovate and Dependabot solve prevention: they keep consumers updated so that version drift doesn't accumulate. They react &lt;em&gt;after&lt;/em&gt; a new version is published by opening PRs across consuming repos.&lt;/p&gt;

&lt;p&gt;What they don't provide is the &lt;em&gt;pre-release&lt;/em&gt; view. Before you publish version 2.0 of your shared Terraform module, before you even cut the release, you want to know: who is consuming the current version? How many repos will be affected? Which teams own them? Are any of them pinned to a version that's already three releases behind?&lt;/p&gt;

&lt;p&gt;Renovate can tell you that a new version is available. It cannot tell you the blast radius of a change you haven't published yet. These are different problems, and confusing them leads teams to believe they have dependency visibility when they actually have dependency &lt;em&gt;automation&lt;/em&gt;. This is the same gap that bites when you go to deprecate an internal library: the automation can perform the bump in every repo it's been pointed at, but it presupposes the list of consumers, and the repos pinned to the old version that fell out of the automation are exactly the ones that break on removal — &lt;a href="https://riftmap.dev/blog/deprecate-internal-library-find-consumers/" rel="noopener noreferrer"&gt;I've written up that angle separately&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;There's a subtler gap too: Renovate and Dependabot work at the &lt;em&gt;package&lt;/em&gt; level, not the &lt;em&gt;repo&lt;/em&gt; level. They know that &lt;code&gt;package.json&lt;/code&gt; depends on &lt;code&gt;@company/utils@1.2.0&lt;/code&gt;. They don't know that the repo containing that &lt;code&gt;package.json&lt;/code&gt; also has a Dockerfile that pulls an internal base image, a CI config that includes a shared template, and a Terraform module that sources three other internal modules. The cross-ecosystem picture is outside their scope.&lt;/p&gt;

&lt;h3&gt;
  
  
  Who should use them
&lt;/h3&gt;

&lt;p&gt;Every team that consumes shared dependencies and wants to avoid version drift. Use them &lt;em&gt;alongside&lt;/em&gt; dependency visibility tooling, not instead of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Platform-specific explorers: HCP Terraform, Artifactory, GitHub Dependency Graph
&lt;/h2&gt;

&lt;h3&gt;
  
  
  HCP Terraform Explorer
&lt;/h3&gt;

&lt;p&gt;HCP Terraform (formerly Terraform Cloud) includes a module explorer that shows which workspaces use which modules. If your entire Terraform workflow runs through HCP Terraform (plan, apply, state management, all of it), this gives you a reasonable view of Terraform-specific module relationships.&lt;/p&gt;

&lt;p&gt;The limitation is scope. It only works for Terraform. It only works for workspaces managed by HCP Terraform. If your org has a mix of HCP Terraform, self-hosted Terraform runners, CI-driven applies, and Atlantis (and many do), you get a partial view. The Docker images, CI templates, Helm charts, and Ansible roles that are part of the same dependency graph are invisible.&lt;/p&gt;

&lt;p&gt;For teams that are fully committed to HCP Terraform and only need Terraform module visibility, this is a solid native solution. For everyone else, it's a slice of the picture.&lt;/p&gt;

&lt;h3&gt;
  
  
  JFrog Artifactory
&lt;/h3&gt;

&lt;p&gt;Artifactory is a universal artifact repository. For Terraform, it can serve as a private module registry with metadata about published modules: versions, download counts, and some dependency information.&lt;/p&gt;

&lt;p&gt;The gap is on the consumer side. Artifactory knows what modules have been &lt;em&gt;published&lt;/em&gt;. It doesn't reliably tell you which repos are &lt;em&gt;consuming&lt;/em&gt; a given module at which version right now. The publishing-side view and the consuming-side view are different, and Artifactory was built for the former.&lt;/p&gt;

&lt;h3&gt;
  
  
  GitHub Dependency Graph and Dependabot Alerts
&lt;/h3&gt;

&lt;p&gt;GitHub's built-in dependency graph parses lock files and manifests to show what each repo depends on. It's tightly integrated with Dependabot alerts for security vulnerabilities.&lt;/p&gt;

&lt;p&gt;It's good for application dependencies: npm, pip, Go modules, Ruby gems. It's limited for infrastructure dependencies: Terraform module sources, Docker base image relationships, CI template includes, and Helm chart dependencies aren't part of GitHub's dependency graph model. And the view is per-repo, not org-wide: you can see what &lt;em&gt;one repo&lt;/em&gt; depends on, but you can't easily ask "who across my org depends on &lt;em&gt;this&lt;/em&gt; artifact?"&lt;/p&gt;

&lt;h3&gt;
  
  
  Who should use them
&lt;/h3&gt;

&lt;p&gt;Teams that operate primarily within a single ecosystem and a single platform. If you're a pure HCP Terraform shop, its explorer gives you module visibility. If you're managing artifact publishing, Artifactory is the right tool for that. If you need per-repo application dependency visibility on GitHub, the built-in graph works. None of them provide the cross-ecosystem, org-wide view.&lt;/p&gt;

&lt;h2&gt;
  
  
  Monorepo build tools: Nx, Turborepo, Bazel
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What they do well
&lt;/h3&gt;

&lt;p&gt;This is the category that genuinely &lt;em&gt;solves&lt;/em&gt; the dependency visibility problem, within its context. Monorepo build tools like Nx, Turborepo, and Bazel are built around a dependency graph. They know which packages depend on which. They can tell you exactly what's affected by a change. They run only the tests and builds that are actually impacted.&lt;/p&gt;

&lt;p&gt;Nx in particular is impressive here. Its &lt;code&gt;nx affected&lt;/code&gt; command does precisely what platform teams wish they could do across a polyrepo: given a set of changed files, determine which projects in the repo are affected and need to be rebuilt or retested. It has a visual graph explorer, dependency analysis, and caching that skips unaffected work. Turborepo provides similar affected-project detection with a focus on speed and simplicity. Bazel takes it further with hermetic builds and fine-grained dependency tracking at the file level.&lt;/p&gt;

&lt;p&gt;If your organisation operates as a monorepo, these tools give you dependency visibility almost for free. The graph is implicit in the repo structure, and the build tool maintains it automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where they stop
&lt;/h3&gt;

&lt;p&gt;The limitation is the prerequisite: you have to be in a monorepo.&lt;/p&gt;

&lt;p&gt;Most infrastructure teams dealing with the cross-repo dependency problem are not in a monorepo, and migrating to one is not a realistic option. An organisation with three hundred existing repos, multiple teams with separate access controls, compliance boundaries between business units, and repos inherited through acquisitions cannot consolidate into a single repository as a quarter-long project. It's a multi-year architectural migration with significant operational risk.&lt;/p&gt;

&lt;p&gt;Even organisations that &lt;em&gt;want&lt;/em&gt; to move toward a monorepo often can't do it all at once. They might consolidate application code but keep infrastructure repos separate. Or they might run a monorepo for one team while other teams stay on polyrepo. The dependency graph still crosses that boundary, and monorepo build tools only see what's inside the monorepo.&lt;/p&gt;

&lt;p&gt;There's also a scope question. Nx and Turborepo are primarily designed for application code (TypeScript/JavaScript, with growing support for other languages). The infrastructure dependency graph, which includes Terraform modules, Docker base images, CI templates, Helm charts, and Ansible roles, doesn't map cleanly onto a monorepo build tool's model. Bazel is more general-purpose, but the overhead of Bazel adoption is substantial and rarely justified purely for dependency visibility.&lt;/p&gt;

&lt;p&gt;One more nuance worth noting: even inside a monorepo, the build tool's dependency graph tracks &lt;em&gt;build-time&lt;/em&gt; relationships. It knows that package A imports package B. It doesn't necessarily know that a Dockerfile in one directory pulls a base image that's built by a CI job defined in another directory, or that a Terraform module sources another module via a git URL pointing at a subdirectory. The infrastructure dependency surface is different from the application dependency surface, even within a single repo.&lt;/p&gt;

&lt;h3&gt;
  
  
  Who should use them
&lt;/h3&gt;

&lt;p&gt;Any team already operating in a monorepo, or actively planning a migration to one. Nx and Turborepo are excellent tools. The point is not that monorepos are wrong. It's that "switch to a monorepo" is not actionable advice for the organisations that feel the cross-repo dependency problem most acutely.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security and compliance scanners: Wiz, Snyk, SBOM tools
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What they do well
&lt;/h3&gt;

&lt;p&gt;Security scanners like Wiz and Snyk, along with the broader SBOM (Software Bill of Materials) ecosystem, solve an important and well-funded problem: finding known vulnerabilities (CVEs) and license compliance issues across an organisation's software supply chain.&lt;/p&gt;

&lt;p&gt;Wiz in particular has an attractive deployment model for this discussion: org-level installation via webhook or cloud connector, with no per-repo configuration needed. It scans broadly and automatically. The &lt;em&gt;install pattern&lt;/em&gt; is exactly right for dependency discovery.&lt;/p&gt;

&lt;p&gt;Snyk provides deep analysis of application dependencies, container images, and infrastructure-as-code configurations. It's strong on identifying &lt;em&gt;security risks&lt;/em&gt; within dependency trees.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where they stop
&lt;/h3&gt;

&lt;p&gt;These tools are built around a security and compliance model, not a dependency visibility model. The question they answer is "do any of my dependencies have known vulnerabilities?", not "which repos depend on this internal module and what's the blast radius if I change it?"&lt;/p&gt;

&lt;p&gt;The distinction matters in practice. Wiz can tell you that a container image has a CVE. It doesn't tell you that 30 repos across your org use that image as a base, that 12 of them are pinned to an outdated version, and that updating the image will require coordinated changes across four teams. Snyk can tell you that a Python package has a license issue. It doesn't show you the cross-repo dependency graph between your internal Terraform modules, CI templates, and Helm charts.&lt;/p&gt;

&lt;p&gt;There's also a scope gap: most security scanners focus on application dependencies (npm packages, Python libraries, container images) and cloud configuration. Terraform module-to-module relationships, CI template includes, and Ansible role dependencies are generally outside their scanning model.&lt;/p&gt;

&lt;p&gt;The pricing model is also relevant. Enterprise security platforms are priced for security budgets, not platform engineering budgets. Buying one purely to map dependencies would be expensive overkill.&lt;/p&gt;

&lt;h3&gt;
  
  
  Who should use them
&lt;/h3&gt;

&lt;p&gt;Any organisation that takes software supply chain security seriously, which should be everyone. These tools are essential for what they do. They just don't do dependency visibility, and shouldn't be expected to.&lt;/p&gt;

&lt;h2&gt;
  
  
  DIY approaches: grep, scripts, and custom crawlers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What they do well
&lt;/h3&gt;

&lt;p&gt;This is the category that tells you the most about the problem. When multiple independent engineers, with no coordination between them, build nearly identical tools, you're looking at a genuine unmet need.&lt;/p&gt;

&lt;p&gt;The common pattern: a scheduled job that shallow-clones every repo in the org (or uses the GitLab/GitHub API), greps or parses Dockerfiles, Terraform source blocks, and CI include directives, dumps the results to SQLite or a spreadsheet, and maybe renders a graph with something like Observable Framework or a simple web page.&lt;/p&gt;
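
&lt;p&gt;A minimal sketch of that pattern, to make the shape concrete. It assumes the file contents have already been fetched (by shallow clone or API) into a dictionary, and the three regexes, the &lt;code&gt;edges&lt;/code&gt; table, and the function names are illustrative rather than taken from any particular script.&lt;/p&gt;

```python
import re
import sqlite3

# Illustrative patterns only. Real files have more variants, and the Terraform
# pattern also matches provider "source" lines, which a real tool would filter.
PATTERNS = {
    "docker": re.compile(r"^[ \t]*FROM[ \t]+(?:--\S+[ \t]+)*(\S+)", re.MULTILINE | re.IGNORECASE),
    "terraform": re.compile(r'^[ \t]*source[ \t]*=[ \t]*"([^"]+)"', re.MULTILINE),
    "ci_include": re.compile(r"^[ \t]*-?[ \t]*project:[ \t]*(\S+)", re.MULTILINE),
}


def kind_for(path):
    """Map a file path to the pattern that applies to it, or None."""
    name = path.rsplit("/", 1)[-1]
    if name == "Dockerfile":
        return "docker"
    if name.endswith(".tf"):
        return "terraform"
    if name == ".gitlab-ci.yml":
        return "ci_include"
    return None


def scan(repos):
    """repos: {repo_name: {path: file_text}}. Returns a list of edge tuples."""
    edges = []
    for repo, files in repos.items():
        for path, text in files.items():
            kind = kind_for(path)
            if kind is None:
                continue
            for target in PATTERNS[kind].findall(text):
                # YAML values may be quoted; drop surrounding quotes.
                edges.append((repo, kind, target.strip("'\""), path))
    return edges


def store(edges, db_path=":memory:"):
    """Write edges to SQLite so they can be queried later."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS edges "
        "(consumer TEXT, kind TEXT, target TEXT, path TEXT)"
    )
    conn.executemany("INSERT INTO edges VALUES (?, ?, ?, ?)", edges)
    conn.commit()
    return conn
```

&lt;p&gt;Calling &lt;code&gt;store(scan(repos))&lt;/code&gt; gives you a queryable table of raw references. Everything that makes the weekend version fragile (renamed repos, unusual file layouts, references that need resolving to an actual repo) sits outside this sketch.&lt;/p&gt;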

&lt;p&gt;These solutions &lt;em&gt;work&lt;/em&gt;. They prove the approach is sound. Some of them are impressively capable, handling hundreds or even thousands of repos, tracking multiple file types, producing usable output. The engineers who build them solve their team's immediate problem, often in a day or two of effort.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where they stop
&lt;/h3&gt;

&lt;p&gt;Bespoke solutions have a consistent set of failure modes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They go stale.&lt;/strong&gt; A nightly cron job means the graph is up to 24 hours out of date. A manually triggered script means it's as fresh as whoever last remembered to run it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They're fragile.&lt;/strong&gt; When a repo is renamed, archived, or changes its structure, the script breaks. Handling these edge cases, and the dozens of others that emerge over time, requires ongoing maintenance that nobody budgeted for.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They're not generalised.&lt;/strong&gt; Each one handles the specific file types and repo structures of one org. Moving to a new org, or adding support for a new ecosystem, means rewriting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They don't survive their creator.&lt;/strong&gt; The engineer who built the script in a weekend moves to another team or another company. Nobody else understands how it works. Six months later, it's running but nobody trusts the output. Or it's silently broken. Or it's been turned off.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They lack a query interface.&lt;/strong&gt; Most DIY solutions produce a static output. Asking "what's the blast radius of changing module X?" means writing a new query or script, not clicking a button.&lt;/p&gt;

&lt;p&gt;The meta-point: the fact that people keep building these, and that the solutions converge on the same architecture, validates both the problem and the general approach. What's missing is a product that does it well, keeps doing it, and handles the long tail of edge cases that a weekend hack can't.&lt;/p&gt;

&lt;h3&gt;
  
  
  Who should use them
&lt;/h3&gt;

&lt;p&gt;Teams with a specific, narrow need (maybe just Terraform modules across 30 repos) and an engineer willing to maintain a custom script. For anything broader or longer-lived, the maintenance cost exceeds the build cost within months.&lt;/p&gt;

&lt;h2&gt;
  
  
  The gap in the landscape
&lt;/h2&gt;

&lt;p&gt;Line up the categories side by side and the gap becomes clear:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Service catalogs&lt;/strong&gt; know what exists and who owns it, but dependencies are registered, not discovered, and go stale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dependency update tools&lt;/strong&gt; keep consumers current, but only react after a version is published and don't show blast radius beforehand.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Platform-specific explorers&lt;/strong&gt; show relationships within one ecosystem, but are blind to everything outside it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monorepo build tools&lt;/strong&gt; solve the problem structurally, but only if you're already in a monorepo. Most infrastructure teams aren't.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security scanners&lt;/strong&gt; find vulnerabilities across repos, but don't map infrastructure dependency graphs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DIY scripts&lt;/strong&gt; prove the approach works, but are bespoke, brittle, and don't survive their creator.&lt;/p&gt;

&lt;p&gt;The missing category is &lt;strong&gt;cross-ecosystem infrastructure dependency visibility&lt;/strong&gt;: a tool that automatically discovers how repos depend on each other across &lt;em&gt;all&lt;/em&gt; the ecosystems a platform team uses (Terraform, Docker, CI templates, Helm, Ansible, Python, Go, npm, Kubernetes), keeps that graph current without manual maintenance, and makes it queryable.&lt;/p&gt;

&lt;p&gt;Specifically, the gap has these characteristics:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auto-discovered, not registered.&lt;/strong&gt; The dependency graph must be built from what's actually in the repos: the Terraform source blocks, the Dockerfile FROM statements, the CI include directives. Any approach that requires humans to maintain a catalog will go stale. The community has been clear and consistent on this point.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-ecosystem.&lt;/strong&gt; Real infrastructure dependency graphs cross tool boundaries. A Terraform module change affects Docker builds which affect CI pipelines which affect Helm deployments. Single-ecosystem tools miss exactly the connections where surprise breakage happens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consumer-side visibility.&lt;/strong&gt; Most existing tools show the &lt;em&gt;producer&lt;/em&gt; side: what modules exist, what images are published, what templates are available. The question platform teams actually need answered is the &lt;em&gt;consumer&lt;/em&gt; side: who is pulling this artifact, at which version, right now?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Org-level installation.&lt;/strong&gt; At a hundred or more repos, anything that requires per-repo setup is a non-starter. The installation model should be a single read-only access token at the org level, similar to how security scanners deploy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Freshness without manual effort.&lt;/strong&gt; The graph must stay current through scheduled or event-driven rescans. Staleness is the number one reason DIY solutions get abandoned. Any productised solution that doesn't solve freshness will follow the same path.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Blast radius before you push.&lt;/strong&gt; The highest-value query is prospective, not retrospective: before I publish a breaking change, show me every repo and team that will be affected. This requires the full transitive graph, not just direct dependencies.&lt;/p&gt;
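
&lt;p&gt;To illustrate the transitive part, here is a sketch of a blast radius query as a recursive SQL query run through Python's &lt;code&gt;sqlite3&lt;/code&gt; module. The &lt;code&gt;edges&lt;/code&gt; and &lt;code&gt;owners&lt;/code&gt; tables and the function names are assumptions made for the example, not a description of any particular tool's schema.&lt;/p&gt;

```python
import sqlite3


def build_graph(edges, owners):
    """edges: (consumer, producer) pairs; owners: (repo, team) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (consumer TEXT, producer TEXT)")
    conn.execute("CREATE TABLE owners (repo TEXT PRIMARY KEY, team TEXT)")
    conn.executemany("INSERT INTO edges VALUES (?, ?)", edges)
    conn.executemany("INSERT INTO owners VALUES (?, ?)", owners)
    return conn


BLAST_RADIUS_SQL = """
WITH RECURSIVE affected(repo) AS (
    -- Direct consumers of the artifact being changed.
    SELECT consumer FROM edges WHERE producer = ?
    UNION
    -- Consumers of anything already affected. UNION drops duplicates,
    -- so the walk also terminates if the graph contains a cycle.
    SELECT e.consumer FROM edges AS e JOIN affected AS a ON e.producer = a.repo
)
SELECT a.repo, o.team
FROM affected AS a LEFT JOIN owners AS o ON o.repo = a.repo
ORDER BY a.repo
"""


def blast_radius(conn, artifact):
    """Return (repo, team) for every direct or transitive consumer."""
    return conn.execute(BLAST_RADIUS_SQL, (artifact,)).fetchall()
```

&lt;p&gt;A repo with no registered owner comes back with a team of &lt;code&gt;None&lt;/code&gt;, which is itself useful to know before a breaking change goes out.&lt;/p&gt;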

&lt;h2&gt;
  
  
  Where I think this is heading
&lt;/h2&gt;

&lt;p&gt;The dependency visibility gap isn't going to close on its own. If anything, it's widening. Organisations are adopting more infrastructure tools, not fewer. The shift toward platform engineering means more shared components consumed across more repos. AI-assisted development is accelerating code output without any corresponding improvement in dependency awareness.&lt;/p&gt;

&lt;p&gt;I expect we'll see this category emerge properly over the next couple of years. The building blocks exist: Git platform APIs for repo enumeration, well-understood file formats for parsing, graph databases and recursive queries for traversal, and clear community demand, validated by the fact that people keep building their own versions.&lt;/p&gt;

&lt;p&gt;Some existing tools may expand into this space. Backstage's plugin ecosystem could support auto-discovery. Renovate's deep parser library could be extended for visibility queries. Security platforms like Wiz, which already have the org-level deployment model, could add infrastructure dependency graphs alongside their vulnerability scanning.&lt;/p&gt;

&lt;p&gt;What I think is more likely is that purpose-built tools will emerge: tools designed specifically for cross-repo infrastructure dependency visibility, with auto-discovery, cross-ecosystem support, and blast radius analysis as core capabilities rather than bolted-on features.&lt;/p&gt;

&lt;p&gt;That's what I'm building with &lt;a href="https://riftmap.dev" rel="noopener noreferrer"&gt;Riftmap&lt;/a&gt;. It scans a GitLab or GitHub org, auto-discovers cross-repo dependencies across Terraform, Docker, CI pipelines, Python, Go, npm, Ansible, Helm, Kubernetes, and Kustomize, and presents the graph with interactive blast radius analysis. Org-level install, no per-repo config, no YAML to maintain.&lt;/p&gt;

&lt;p&gt;It's currently in early access, and it's far from finished. But the foundation is live: twelve ecosystem parsers, a resolver that matches consumers to producers across the org, incremental scanning to keep the graph fresh, and a visual graph with impact mode that shows exactly which repos are affected by a given change.&lt;/p&gt;

&lt;p&gt;If you're working in this space, whether as an engineer dealing with the problem, someone building tooling, or someone evaluating solutions, I'd welcome the conversation. The more perspectives I hear, the better the tooling gets for everyone.&lt;/p&gt;

&lt;p&gt;You can see more at &lt;a href="https://riftmap.dev" rel="noopener noreferrer"&gt;riftmap.dev&lt;/a&gt;, or reach me at &lt;a href="mailto:hello@riftmap.dev"&gt;hello@riftmap.dev&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you're interested in following along as Riftmap develops, sign up for early access at &lt;a href="https://riftmap.dev" rel="noopener noreferrer"&gt;riftmap.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>platformengineering</category>
      <category>infrastructure</category>
      <category>dependencymanagement</category>
      <category>devops</category>
    </item>
    <item>
      <title>The catalog maintenance trap: why service catalogs go stale</title>
      <dc:creator>Daniel Westgaard</dc:creator>
      <pubDate>Mon, 29 Jun 2026 12:08:38 +0000</pubDate>
      <link>https://dev.to/danielwe/the-catalog-maintenance-trap-why-service-catalogs-go-stale-2a71</link>
      <guid>https://dev.to/danielwe/the-catalog-maintenance-trap-why-service-catalogs-go-stale-2a71</guid>
      <description>&lt;p&gt;&lt;em&gt;Backstage and the developer-portal category solve a real problem. The reason platform teams quietly abandon them is something different, and it points at the shape of what actually works.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Every few weeks I run into the same observation from someone in platform engineering: their team looked at Backstage, evaluated it seriously, maybe even ran a proof of concept, and walked away. The reason is rarely that it doesn't do what it advertises. The reason is that the work of keeping it running turned out to be bigger than the value it returned.&lt;/p&gt;

&lt;p&gt;I've now heard this pattern in r/devops threads, in conversations with engineers who built their own internal alternatives, and most recently from a platform engineer who summarised his evaluation in a single sentence: the cost of maintaining it was bigger than what we got back.&lt;/p&gt;

&lt;p&gt;That sentence has been bouncing around my head, because it names something I haven't seen named clearly. I'm calling it &lt;strong&gt;the catalog maintenance trap&lt;/strong&gt;: the gap between what a service catalog promises and what it costs to keep it true.&lt;/p&gt;

&lt;h2&gt;
  
  
  What catalogs actually require
&lt;/h2&gt;

&lt;p&gt;Backstage, Port, OpsLevel, and the rest of the developer-portal category are built around a simple model. Each service is described by a YAML file (in Backstage's case, &lt;code&gt;catalog-info.yaml&lt;/code&gt;) that lists the service, its owner, its dependencies, links to runbooks, on-call rotations, and so on. The portal aggregates these files and gives the whole organisation a single pane of glass.&lt;/p&gt;

&lt;p&gt;The model is clean. The trouble is the verb tense. The catalog describes the world &lt;em&gt;as of the last time someone updated it&lt;/em&gt;. Every dependency added, removed, or renamed since then is invisible until a human notices and edits the YAML. Multiply that across a hundred repos and a few dozen engineers, and the catalog drifts faster than anyone wants to admit.&lt;/p&gt;

&lt;p&gt;This isn't a Backstage flaw. It's a property of any system where the source of truth is a manually maintained catalog. The same trap exists in any "register your dependencies in this file" approach.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why teams stop maintaining it
&lt;/h2&gt;

&lt;p&gt;The maintenance gradient is brutal. On day one, the catalog is shiny and motivating, and the platform team writes the first batch of YAML by hand. Over the next few weeks, two or three engineers add their services. Then onboarding starts to slow down. Then someone changes a dependency without updating the catalog. Then the dashboard shows an outdated graph. Then a new hire asks "is this accurate?" and the honest answer is "kind of, in places."&lt;/p&gt;

&lt;p&gt;At that point the catalog has stopped being an authoritative graph and started being documentation that was &lt;em&gt;supposed to be&lt;/em&gt; authoritative. Which is worse than not having it, because people make decisions on the assumption that it's accurate.&lt;/p&gt;

&lt;p&gt;The platform team has three options at this stage:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Mandate catalog updates as part of every PR review. An organisational tax that nobody enforces consistently.&lt;/li&gt;
&lt;li&gt;Build automation to keep the catalog current. Which is the actual problem, only now there's also a YAML schema in the loop.&lt;/li&gt;
&lt;li&gt;Quietly let it rot and rely on tribal knowledge again.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Option three is what most teams converge on. Not because the platform engineers are lazy, but because the cost of options one and two exceeds the value of having a catalog that's accurate to within a week.&lt;/p&gt;

&lt;h2&gt;
  
  
  The two escape routes
&lt;/h2&gt;

&lt;p&gt;When teams give up on the catalog, they tend to take one of two paths.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Path one: monorepo.&lt;/strong&gt; Consolidate everything into one repository and let a build tool like Nx or Turborepo maintain the dependency graph implicitly. This works, but it isn't a tooling decision. It's an architecture decision that takes years to execute, often isn't feasible across business units, and doesn't help with the infrastructure dependency surface (Terraform, Docker, Helm, CI templates) the way it helps with application code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Path two: do nothing.&lt;/strong&gt; Accept that the dependency graph lives in the heads of the senior engineers. Ask around when you need to make a breaking change. Hope nobody on the relevant team is on holiday. This is what most organisations actually do, and it's also what creates the conditions for the three-hour incident thread in Slack when a base image changes and six teams break in sequence.&lt;/p&gt;

&lt;p&gt;Neither escape route is satisfying. The catalog promised to solve this problem, and the alternatives are either an architectural migration or institutional amnesia.&lt;/p&gt;

&lt;h2&gt;
  
  
  The thing the catalog model gets wrong
&lt;/h2&gt;

&lt;p&gt;The catalog model assumes humans should be the source of truth about dependencies. They shouldn't be. The actual source of truth already exists, written down in machine-readable form, in the repositories themselves: Terraform &lt;code&gt;source = "..."&lt;/code&gt; blocks, Dockerfile &lt;code&gt;FROM&lt;/code&gt; statements, &lt;code&gt;go.mod&lt;/code&gt; &lt;code&gt;require&lt;/code&gt; directives, &lt;code&gt;.gitlab-ci.yml&lt;/code&gt; &lt;code&gt;include&lt;/code&gt; references, Helm &lt;code&gt;Chart.yaml&lt;/code&gt; dependencies. The dependency graph is &lt;em&gt;already declared&lt;/em&gt;. It's just declared across thousands of files in dozens of formats.&lt;/p&gt;

&lt;p&gt;The work isn't getting humans to register a second copy in YAML. The work is parsing the declarations that already exist and stitching them into one queryable graph.&lt;/p&gt;
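
&lt;p&gt;As a rough sketch of what "parsing and stitching" can look like for two of those formats: the code below reads Terraform git sources (including the &lt;code&gt;//subdir&lt;/code&gt; and &lt;code&gt;?ref=&lt;/code&gt; parts) and &lt;code&gt;go.mod&lt;/code&gt; require lines, then keeps only the references that resolve to a repo in the org. The host &lt;code&gt;git.example.com&lt;/code&gt;, the repo names, and every function name here are hypothetical.&lt;/p&gt;

```python
import re
from urllib.parse import parse_qs, urlsplit

TF_SOURCE = re.compile(r'^[ \t]*source[ \t]*=[ \t]*"([^"]+)"', re.MULTILINE)
GO_REQUIRE = re.compile(r"^[ \t]*(?:require[ \t]+)?(\S+)[ \t]+(v\d\S*)", re.MULTILINE)


def parse_terraform_git_source(source):
    """Split a 'git::' module source into host, repo path, subdirectory, ref."""
    if not source.startswith("git::"):
        return None  # registry and local-path sources are out of scope here
    parts = urlsplit(source[len("git::"):])
    # Terraform marks a subdirectory with a double slash after the repo URL.
    repo_path, _, subdir = parts.path.partition("//")
    if repo_path.endswith(".git"):
        repo_path = repo_path[: -len(".git")]
    ref = parse_qs(parts.query).get("ref", [None])[0]
    return {
        "host": parts.netloc,
        "repo": repo_path.strip("/"),
        "subdir": subdir or None,
        "ref": ref,
    }


def build_edges(repos, known_repos, host="git.example.com"):
    """repos: {repo: {path: text}}. Returns (consumer, producer, version)
    tuples for references that resolve to a repo inside the org."""
    edges = []
    for consumer, files in repos.items():
        for path, text in files.items():
            if path.endswith(".tf"):
                for source in TF_SOURCE.findall(text):
                    parsed = parse_terraform_git_source(source)
                    if parsed and parsed["host"] == host and parsed["repo"] in known_repos:
                        edges.append((consumer, parsed["repo"], parsed["ref"]))
            elif path.endswith("go.mod"):
                for module, version in GO_REQUIRE.findall(text):
                    repo = module.removeprefix(host + "/")  # Python 3.9+
                    if repo in known_repos:
                        edges.append((consumer, repo, version))
    return edges
```

&lt;p&gt;The resolving step is the part that turns a pile of strings into a graph: a reference only becomes an edge once it matches a producer repo that actually exists in the org.&lt;/p&gt;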

&lt;p&gt;This is what auto-discovery means in practice. No catalog. No annotations. No PR template reminding people to update the manifest. The graph is read from the source files, and the source files are the ones engineers are already editing because they have to in order to ship code.&lt;/p&gt;

&lt;p&gt;This is the difference between documentation that engineers are asked to maintain and a graph that maintains itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this actually changes
&lt;/h2&gt;

&lt;p&gt;Once discovery is automatic, the value proposition shifts. You stop selling "a place to write down what you have" and start selling "a query interface for the truth your repos already encode." The questions change too:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Not "is the catalog up to date?" but "what's the blast radius of changing this module?"&lt;/li&gt;
&lt;li&gt;Not "did everyone fill in their YAML?" but "which repos still pin to the old major version?"&lt;/li&gt;
&lt;li&gt;Not "who owns this service?" but &lt;a href="https://riftmap.dev/blog/deprecate-internal-library-find-consumers/" rel="noopener noreferrer"&gt;"if I deprecate this artifact, which teams need to be in the room?"&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
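
&lt;p&gt;The second question in that list reduces to a small filter once the edges exist. A sketch, assuming edges are stored as (consumer, producer, pinned version) tuples and that versions start with an optional &lt;code&gt;v&lt;/code&gt; followed by the major number; both are assumptions for illustration.&lt;/p&gt;

```python
import re


def major(version):
    """Extract the leading major number from strings like 'v2.3.1' or '2.3'."""
    match = re.match(r"v?(\d+)", version or "")
    return int(match.group(1)) if match else None


def consumers_behind(edges, artifact, current_major):
    """Return (consumer, version) for consumers of `artifact` that are
    pinned to a major version older than `current_major`."""
    behind = []
    for consumer, producer, version in edges:
        if producer != artifact:
            continue
        pinned = major(version)
        # range(current_major) holds exactly the older majors: 0 .. current - 1.
        if pinned is not None and pinned in range(current_major):
            behind.append((consumer, version))
    return sorted(behind)
```

&lt;p&gt;Unpinned consumers (no version at all) are skipped by this filter. In practice they probably deserve their own report, since they pick up a breaking change immediately.&lt;/p&gt;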

&lt;p&gt;These are the questions platform teams actually ask. The catalog model could only answer them if the catalog were perfect, which it never was. Auto-discovery answers them by skipping the catalog step entirely.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Backstage still fits
&lt;/h2&gt;

&lt;p&gt;I want to be careful not to overclaim. Backstage genuinely solves problems that auto-discovery doesn't: service ownership across hundreds of services, documentation aggregation, templated scaffolding for new services, a unified frontend for on-call and runbooks. Those are real jobs, and a service catalog is a reasonable tool for them.&lt;/p&gt;

&lt;p&gt;The mistake is using the catalog as the dependency graph. The catalog is good at things humans &lt;em&gt;want&lt;/em&gt; to register on purpose (this service is owned by team X, on-call is via PagerDuty, the runbook is in Confluence). It is bad at things that change underneath every commit (the actual graph of what consumes what).&lt;/p&gt;

&lt;p&gt;For dependency visibility specifically, the catalog model isn't the right shape. Something that reads the source and rebuilds the graph on every scan is.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we built
&lt;/h2&gt;

&lt;p&gt;I'm working on &lt;a href="https://riftmap.dev/" rel="noopener noreferrer"&gt;Riftmap&lt;/a&gt; because this is the gap I kept hitting at client engagements as a consultant, and it's the gap the engineers I talk to keep describing. Riftmap connects to a GitLab or GitHub org with a read-only token, parses the dependency declarations that already exist across Terraform, Docker, Helm, CI pipelines, Go, npm, Python, Ansible, Kubernetes, and Kustomize, and presents the graph with &lt;a href="https://riftmap.dev/blog/ai-doesnt-understand-blast-radius/" rel="noopener noreferrer"&gt;blast radius analysis&lt;/a&gt;. There is no catalog YAML to maintain. The graph rebuilds itself when repos change.&lt;/p&gt;

&lt;p&gt;If your team has evaluated Backstage and walked away because the maintenance cost was bigger than the value, or if you're staring at that same trade-off right now, I'd be curious to hear about it. The pattern I described in this post is built from a small number of conversations and the consistent signal in r/devops and r/terraform threads. The more I hear from teams in this position, the better the tool gets.&lt;/p&gt;

&lt;p&gt;If you're earlier in the decision and the question is still "is Backstage worth adopting at all?", the companion to this piece reframes that debate around adoption rather than cost. Spotify's own head of Backstage engineering puts external adoption at around 10% and explains why: &lt;a href="https://riftmap.dev/blog/is-backstage-worth-it/" rel="noopener noreferrer"&gt;is Backstage worth it?&lt;/a&gt;. And if you're actively evaluating alternatives, I wrote a separate triage of the "Backstage alternatives" question by the job that sent you looking (portals, managed Backstage, or a parsed dependency graph): &lt;a href="https://riftmap.dev/blog/backstage-alternatives/" rel="noopener noreferrer"&gt;Backstage alternatives in 2026&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You can try Riftmap free at &lt;a href="https://app.riftmap.dev" rel="noopener noreferrer"&gt;app.riftmap.dev&lt;/a&gt;, or reach me directly at &lt;a href="mailto:daniel@riftmap.dev"&gt;daniel@riftmap.dev&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you've hit the catalog maintenance trap and want to compare notes, I'd genuinely like to hear how it went. Honest accounts of what worked and what didn't are the most valuable input I get.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>platformengineering</category>
      <category>backstage</category>
      <category>servicecatalogs</category>
      <category>developerportals</category>
    </item>
  </channel>
</rss>
