David Aronchick

Originally published at distributedthoughts.org

The Upstream Problem: Why Context Graphs Are Starving

Foundation Capital just published what they're calling AI's trillion-dollar opportunity: context graphs. They argue that enterprise value is shifting from systems of record (Salesforce, Workday, SAP) to systems of agents. The new crown jewel isn't the data itself. It's the context graph: a living record of decision traces stitched across entities and time, where precedent becomes searchable.

They're right about the destination. But Greg Ceccarelli's response on LinkedIn caught something important that their framing misses. Foundation Capital focuses on capturing decisions at execution time. That matters, but it's the last mile. The first mile is still bleeding out.
The Telephone Game (But With Developers)
Decisions don't originate at execution time. They originate in conversations.

A PM pattern-matches across customer interviews. Engineering debates constraints in Slack. A VP makes a call on a Zoom that nobody documents. By the time any of this hits a system of record, the context has been compressed, lossy-encoded, and re-interpreted three times. It's a game of telephone where the prize is a barely articulated card in your Kanban roadmap.

Recording meetings is table stakes now. The raw material exists. But most of it vanishes. It's searchable in theory and useless in practice. You can find that the decision was made, but you can't find why it made sense given everything else that was happening at the time.

Jamin Ball's piece "Long Live Systems of Record" pushed back on the "agents kill everything" narrative, arguing that agents don't replace systems of record; they raise the bar for what a good one looks like. I think he's right, but the problem is that nobody speaks for the downstream consumers. The reasoning, the exceptions, the context that justified a decision in the moment: none of it exists in a form that a human (let alone an agent) can find or consume. That's what's missing.

Context graphs need to be fed. The feed is conversations, and, right now, conversations evaporate.
The Same Problem, Worse, in Data
Greg's framing focuses on software development, and that's where his company SpecStory and their new Intent product are building. I think they're awesome and deserve a lot of attention, so much so that I want to take the framing further. Software development is just one domain where decisions get lost upstream.

Data pipelines, our world, are another, and arguably a worse one.

When a data engineer decides which fields to drop during transformation, how to handle null values in a critical column, why a particular join strategy was chosen over another, what "clean" means for this specific dataset... where does that reasoning live? In a PR comment that gets archived. A Slack thread that disappears. Someone's head who leaves the company.
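One way to imagine fixing this: record the reasoning at the exact point in the pipeline where the choice is made, so it travels with the code. The sketch below is purely hypothetical; the `Decision` record, field names, and the PII example are illustrative assumptions, not any existing tool's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical sketch: a lightweight decision record attached to the
# transformation itself, so the "why" lives next to the code instead of
# dying in a PR comment or a Slack thread.
@dataclass
class Decision:
    what: str                                          # the choice that was made
    why: str                                           # the reasoning behind it
    alternatives: list = field(default_factory=list)   # options considered and rejected
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def drop_pii_fields(row: dict) -> dict:
    """Transformation plus the reasoning that shaped it."""
    return {k: v for k, v in row.items() if k not in {"ssn", "email"}}

# Captured at the moment the call was made, not reconstructed later:
drop_pii_fields.decision = Decision(
    what="Drop ssn and email during transformation",
    why="Downstream warehouse is not approved to hold PII under our retention policy",
    alternatives=["hash the fields", "tokenize via a vault service"],
)

print(drop_pii_fields.decision.why)
```

The point isn't this particular mechanism; it's that the record gets created where the locality is, at the moment and place the decision happens.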

The data observability market has exploded. Gartner estimates data observability will be a $2.5B+ market by 2027. But all of it focuses on detecting problems after they happen. The upstream intent (why the pipeline was designed this way, what tradeoffs were considered, what the original constraints were) remains uncaptured.

Another favorite company of mine, Great Expectations, does a great job capturing what should be true. dbt moves documentation closer to the code. And we have standards that capture the what of transformations. But almost nothing captures the why.

When an ML model makes a bad prediction, you can trace back to the training data. But can you trace back to why the training data was prepared that way? Who decided to impute missing values with medians instead of dropping rows? What was the conversation that led to that feature engineering choice? What did the team know at the time that isn't written down anywhere?
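Here's a minimal sketch of what answering those questions could look like if the reasoning had been logged when it happened: a tiny in-memory decision log, searchable by dataset and column. The dataset name, schema, and rationale text are all illustrative assumptions.

```python
# Hypothetical sketch: a searchable log of data-preparation decisions,
# so "why were missing values imputed with medians?" still has an answer
# months later. Nothing here is a real product's API.
decision_log = []

def log_decision(dataset: str, column: str, what: str, why: str) -> None:
    """Record a preparation choice at the moment it's made."""
    decision_log.append(
        {"dataset": dataset, "column": column, "what": what, "why": why}
    )

def why(dataset: str, column: str) -> list:
    """Trace back the reasoning behind a preparation choice."""
    return [d["why"] for d in decision_log
            if d["dataset"] == dataset and d["column"] == column]

# Captured during the original feature-engineering conversation:
log_decision(
    dataset="churn_training_v3",
    column="account_age_days",
    what="Impute missing values with the median",
    why="Dropping rows would remove too much of the minority class; "
        "median chosen over mean because the column is heavily skewed",
)

print(why("churn_training_v3", "account_age_days")[0])
```

With something like this, the trace back from a bad prediction doesn't stop at the training data; it reaches the conversation that shaped it.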

The decision trace doesn't exist because nobody captured it when it happened.
Intent Has Locality
This connects to something I've been thinking about for years. Intent has locality, just like data.

The richest context about a decision exists at the moment it's made, in the place it's made. Move it somewhere else, a different system, a different time, a summary written later, and you lose fidelity. This is true whether you're moving bytes across a network or moving reasoning into documentation.

Think about what happens when you try to document a decision after the fact. You're reconstructing. You remember the outcome but not the three alternatives you considered. You remember the constraint that mattered most but not the secondary factors that shaped the final call. You remember that someone raised an objection but not exactly what shifted the conversation.

The further you get from the moment of decision, the more context you lose. And unlike data, you can't just store a copy closer to where it's needed. The moment passes. The reasoning evaporates. What remains is the artifact without the intent.
What SpecStory Is Building
This is why what Greg and the SpecStory team are building with Intent matters. They started where decisions turn into code: the conversation between developers and coding agents. Intent records every exchange with Claude Code, Cursor, GitHub Copilot, Codex, Gemini. The transcript of how software actually gets built.

But as they asked where the intent came from, the answer kept pointing upstream. Team calls. Architecture discussions. Pairing sessions. The decisions that happen before anyone opens an IDE.

Their solution has three layers:

Capture: Every agent prompt and IDE session, recorded automatically. Not just the code that got written, but the back-and-forth that produced it.

Arena: Real-time collaboration with automatic decision extraction. Not verbose summaries nobody will read. The actual decision linked to the exact moment in the conversation.

Repo: Decisions versioned alongside your source code. Consumable by humans and agents. Searchable forever.

Full context lineage: Team discusses → Decision extracted → Agent builds → Session reasoning preserved → Code ships. Every line of code traceable back to the exchange where the decision was made.

That's the upstream feed layer context graphs need. The bridge from conversation to context to code.
The Parallel Problem Nobody's Solving for Data
The same pattern applies to data infrastructure, and the gap is even wider.

Here's an example I come back to constantly. You're looking at point-of-sale data from a retail chain, and one store shows zero transactions for six hours. What happened?

Maybe the system wasn't connected. Maybe there was a hurricane. Maybe it was midnight and the store was closed. Maybe there was a police action in the area. Maybe the pipeline is connected but stopped running. Maybe someone unplugged the wrong cable during a renovation.

The data looks identical in every case: zeros. But the appropriate response is completely different. If it's a hurricane, you adjust your forecasts and check on your employees. If it's a pipeline failure, you fix the pipeline and backfill the data. If it's midnight, you do nothing because everything is working correctly.

The "what" is the same. The "why" determines everything that matters.
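The branching above can be made concrete in a few lines. This is a toy sketch: the cause labels and responses are illustrative assumptions, and the hard part in practice is that the contextual cause usually isn't recorded anywhere.

```python
# Hypothetical sketch: identical data (six hours of zeros) routed to
# completely different responses depending on the contextual cause.
def respond_to_zero_window(cause: str) -> str:
    responses = {
        "store_closed": "no action; system working as intended",
        "hurricane": "adjust forecasts; check on employees",
        "pipeline_failure": "fix pipeline; backfill missing data",
        "pos_disconnected": "dispatch field support to reconnect the terminal",
    }
    return responses.get(cause, "escalate: unknown cause, investigate manually")

# Same zeros, divergent actions:
for cause in ("store_closed", "hurricane", "pipeline_failure"):
    print(cause, "->", respond_to_zero_window(cause))
```

The function is trivial; the missing input isn't. Without captured context, every zero window falls into the "unknown cause" branch.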

This is the context gap in data infrastructure. Data lineage tools tell you what transformations happened. They don't tell you why someone chose that approach over alternatives. Data catalogs describe what datasets contain. They don't capture the discussions that shaped how those datasets were structured. Data quality tools flag when something looks wrong. They can't explain what "right" was supposed to mean based on the original requirements conversation.

Every data team has experienced this: you inherit a pipeline, something breaks, and you spend days reverse-engineering decisions that took the original author five minutes to make. The 2024 Stack Overflow survey found developers spend 30%+ of their time understanding existing code. For data engineers working with inherited pipelines, I'd bet that number is higher.

A few teams are starting to explore how to capture intent at the data layer, not just the code layer. The ones who figure out how to preserve decision context where data actually lives, at the edge, in pipelines, across distributed infrastructure, might be building something important. But right now, the tooling barely exists.
Why This Matters for AI Agents
Foundation Capital is right that agents need decision traces to exercise judgment. But consider what happens when we only capture traces at execution time.

An agent can follow a rule. It can look up a policy. But it can't understand why an exception was made last quarter unless someone captured the reasoning when it happened. It can see that a certain transformation was applied to a dataset but not why that approach was chosen over three alternatives that were discussed and rejected.

Research on AI decision-making keeps surfacing the same challenge: agents struggle with edge cases because they lack the contextual reasoning that humans use to navigate ambiguity. We've been trying to solve this with better prompts, more examples, refined guardrails. But the fundamental problem is upstream. The reasoning that would help agents handle edge cases was never captured in the first place.

Agents inherit our documentation debt. Every undocumented decision, every lost conversation, every piece of reasoning that exists only in someone's memory becomes a gap in the context graph. And agents can't exercise judgment across gaps.
The Compounding Problem
Context loss compounds in ways that aren't obvious until you're deep in a system you didn't build.

Every undocumented decision becomes a landmine for the next person (or agent) who encounters that code, that pipeline, that system. They see what was built but not why. So they either preserve it blindly (accumulating technical debt they don't understand) or change it without understanding the original constraints (breaking things the original author anticipated but never wrote down).

I've seen this pattern repeatedly. A team inherits a data pipeline with a seemingly arbitrary filter. They remove it because it doesn't match current requirements. Three months later, they discover it was preventing a subtle data quality issue that only surfaces under specific conditions. The original author knew about this. They even discussed it extensively with the team. But that conversation happened on a Zoom call that was never transcribed, and the person who made the decision left the company two years ago.

Multiply this across every team, every pipeline, every codebase. The DORA research shows that elite teams ship faster partly because they spend less time reverse-engineering past decisions. They've somehow preserved more context. Usually through heroic documentation efforts that don't scale.
The Path Forward
Foundation Capital's context graph thesis is right about the destination. Greg Ceccarelli and the SpecStory team are right about the first mile.

The platforms that win won't just capture decisions at execution time. They'll capture intent upstream, in the conversations, the debates, the reasoning that happens before anyone writes a line of code or builds a pipeline.

And they'll keep that intent close to where it matters. Versioned with the code. Traveling with the data. Available when someone (or something) needs to understand not just what happened, but why it was allowed to happen.

We're good at storing what happened. We're terrible at capturing why. The next trillion-dollar platforms will be the ones that figure out how to close that gap, not at execution time, but upstream, where the decisions actually get made.

Want to learn how intelligent data pipelines can reduce your AI costs? Check out Expanso. Or don't. Who am I to tell you what to do.

NOTE: I'm currently writing a book based on what I've seen of the real-world challenges of data preparation for machine learning, focusing on operations, compliance, and cost. I'd love to hear your thoughts!

