<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Angela Zhao</title>
    <description>The latest articles on DEV Community by Angela Zhao (@hui_zhao_405dc49731688b0c).</description>
    <link>https://dev.to/hui_zhao_405dc49731688b0c</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3814341%2F83861253-2554-4e62-b6d1-3e650c58867e.png</url>
      <title>DEV Community: Angela Zhao</title>
      <link>https://dev.to/hui_zhao_405dc49731688b0c</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hui_zhao_405dc49731688b0c"/>
    <language>en</language>
    <item>
      <title>The Modern Data Stack Has a Coherence Problem</title>
      <dc:creator>Angela Zhao</dc:creator>
      <pubDate>Tue, 10 Mar 2026 04:06:42 +0000</pubDate>
      <link>https://dev.to/hui_zhao_405dc49731688b0c/the-modern-data-stack-has-a-coherence-problem-3eji</link>
      <guid>https://dev.to/hui_zhao_405dc49731688b0c/the-modern-data-stack-has-a-coherence-problem-3eji</guid>
      <description>&lt;p&gt;The modern data stack is an engineering achievement. Teams can ingest petabytes from dozens of sources, transform them with dbt, warehouse them in Snowflake or BigQuery, and serve them through semantic layers to dashboards that refresh in seconds. The tooling has never been better.&lt;/p&gt;

&lt;p&gt;And yet the decisions that matter most — the ones made by automated systems in the moment a customer buys, a transaction clears, an agent acts — keep going wrong. Not because the data is missing. Not because the pipeline is slow. Because the stack was never designed for &lt;em&gt;coherence&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;What Coherence Means — and Why It's Different from Freshness&lt;/h2&gt;

&lt;p&gt;When engineers talk about data quality, they usually mean freshness: how recently was this data updated? That's a real problem, and the modern stack has made genuine progress on it. Streaming pipelines, near-real-time warehouses, and aggressive materialization schedules have compressed lag from hours to minutes, or even seconds.&lt;/p&gt;

&lt;p&gt;But freshness is a per-table property. Coherence is a cross-system property.&lt;/p&gt;

&lt;p&gt;A decision is coherent when every piece of context it consumes reflects the &lt;em&gt;same version of reality&lt;/em&gt; at the same moment in time. A fraud check is coherent when the velocity counter, the account balance, the session signal, and the device fingerprint all describe the same instant — not a patchwork of states from four different systems, each updated at different times, each read at a different point in the query.&lt;/p&gt;

&lt;p&gt;Most modern data stacks are very good at keeping individual tables fresh. They are almost universally bad at ensuring coherence &lt;em&gt;across&lt;/em&gt; tables and systems when a decision has to read all of them simultaneously.&lt;/p&gt;

&lt;h2&gt;The Stack Was Built for Analysis, Not for Decisions&lt;/h2&gt;

&lt;p&gt;The modern data stack's architecture reflects its original purpose: analytical reporting. You have a source of truth (the warehouse), a transformation layer (dbt or similar), and a consumption layer (BI tools, notebooks, dashboards). The flow is append-only, batch-friendly, and eventually consistent.&lt;/p&gt;

&lt;p&gt;That model works fine when a human is the decision-maker. A dashboard showing last night's revenue is coherent enough for a Monday morning review. Stale by a few seconds? Nobody cares.&lt;/p&gt;

&lt;p&gt;The problem is that this architecture has been retrofitted — often ad hoc — into serving automated decisions that need &lt;em&gt;correct context now&lt;/em&gt;. Product teams building recommendation engines, risk teams building fraud models, and AI teams building autonomous agents all end up pulling from the same warehouse or the same derived tables that were designed for dashboards. The tooling was built for analysis. The decisions need something different.&lt;/p&gt;

&lt;h2&gt;Three Ways the Modern Stack Loses Coherence&lt;/h2&gt;

&lt;h3&gt;1. Preparation Delay&lt;/h3&gt;

&lt;p&gt;Derived state — aggregates, velocity counters, feature vectors, materialized views — is computed from raw events. That computation takes time. Even if your raw event pipeline is near-real-time, your derived state lags behind it by the length of your transformation cycle: minutes in a well-tuned dbt flow, tens of minutes in a typical warehouse setup.&lt;/p&gt;

&lt;p&gt;During that window, the raw facts and the derived state describe different realities. An AI agent reading a user's "current session context" from a feature store may be reading a summary computed from events that stopped at T-5 minutes. The events since then exist somewhere in the pipeline, but they haven't made it into the context the agent reads.&lt;/p&gt;

&lt;p&gt;This is not a pipeline speed problem. It's a structural gap between event arrival and context availability — and it persists even in sophisticated real-time stacks because &lt;em&gt;preparation itself takes time&lt;/em&gt;.&lt;/p&gt;
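&lt;p&gt;The gap is easy to make concrete. A minimal sketch, with illustrative timestamps and a hypothetical batch-rebuilt feature store (none of these names come from a real system):&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class FeatureStore:
    """Derived context, rebuilt on a fixed transformation cycle."""
    materialized_through: float = 0.0  # event-time watermark of the last rebuild
    session_count: int = 0

def rebuild(store, events, now, cycle=300.0):
    """Batch rebuild: only events from completed cycles become visible."""
    cutoff = (now // cycle) * cycle  # end of the last completed cycle
    store.session_count = sum(1 for t in events if cutoff >= t)
    store.materialized_through = cutoff

events = [10.0, 120.0, 250.0, 290.0]  # raw event arrival times (seconds)
store = FeatureStore()
now = 299.0
rebuild(store, events, now)

# An agent reading "current" context at t=299 sees none of the four
# events: the first rebuild that covers them completes at t=300.
print(store.session_count)               # 0
print(now - store.materialized_through)  # 299.0 seconds of context lag
```

&lt;p&gt;The raw events all exist in the pipeline; the context the agent reads simply does not include them yet.&lt;/p&gt;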

&lt;h3&gt;2. Cross-System Retrieval Inconsistency&lt;/h3&gt;

&lt;p&gt;Modern automated decisions rarely pull all their context from a single system. A real-time fraud check might read:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Account balance from a transactional database&lt;/li&gt;
&lt;li&gt;Velocity counters from a Redis cache&lt;/li&gt;
&lt;li&gt;Device reputation from a third-party service&lt;/li&gt;
&lt;li&gt;Session signals from a streaming platform&lt;/li&gt;
&lt;li&gt;Feature vectors from a feature store&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these systems has its own consistency model, its own replication lag, and its own definition of "current." There's no transaction boundary spanning all five reads. The fraud engine assembles a composite context from five different points in time and treats it as if it represents a single, coherent moment.&lt;/p&gt;

&lt;p&gt;Under normal load, the differences are small enough to ignore. Under high concurrency — exactly the conditions when fraud and limit breaches actually occur — the gaps widen. Events that changed account state 800ms ago may not have propagated to the cache yet. Two concurrent transactions may both read a pre-update balance.&lt;/p&gt;

&lt;p&gt;The modern data stack has no native mechanism to provide a consistent snapshot across heterogeneous systems. Each tool guarantees consistency within itself. Cross-system coherence is left to the application.&lt;/p&gt;
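&lt;p&gt;The failure mode can be demonstrated in a few lines. A sketch, with hypothetical per-source reads and made-up timestamps, that tags each source with its own notion of "current" and measures how far apart they are:&lt;/p&gt;

```python
# Hypothetical per-source reads, each returning (value, as_of_timestamp).
# In a real stack these would hit a transactional DB, Redis, a feature
# store, and so on; the values and timestamps here are illustrative.
def read_balance():   return 4200.00, 1000.000    # committed at t=1000.000
def read_velocity():  return 3, 999.150           # cache refreshed 850ms earlier
def read_features():  return [0.1, 0.7], 995.000  # 5s materialization lag

def assemble_context(reads, max_spread=0.250):
    """Compose a decision context and flag it incoherent when the sources
    describe moments further apart than max_spread seconds."""
    values, stamps = zip(*(r() for r in reads))
    spread = max(stamps) - min(stamps)
    return {"values": values, "spread": spread, "coherent": max_spread >= spread}

ctx = assemble_context([read_balance, read_velocity, read_features])
print(ctx["spread"])    # 5.0 seconds between the oldest and newest source
print(ctx["coherent"])  # False
```

&lt;p&gt;Today's stacks assemble the same composite without the spread check, and treat it as one moment in time.&lt;/p&gt;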

&lt;h3&gt;3. Snapshot Incoherence Under Concurrency&lt;/h3&gt;

&lt;p&gt;The third failure mode is the subtlest. Even in a single system, reads and writes interleave under concurrent load. A velocity counter incremented by transaction A may not be visible to the read performed by transaction B if B reads before A's write commits. Depending on isolation level, B may see a partially updated state — or a state that will be rolled back.&lt;/p&gt;

&lt;p&gt;In analytical workloads, this is tolerable. Slightly stale aggregates don't change business outcomes. In automated decision workloads, particularly in financial services, a velocity counter that misses the last N concurrent increments is the mechanism by which fraud rings exploit systems. The counter says "3 transactions in the last minute." The reality is 12.&lt;/p&gt;

&lt;p&gt;The modern data stack's read-optimized, eventually consistent architecture — designed for analytical correctness — provides insufficient isolation for decision-time correctness under real concurrency.&lt;/p&gt;
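&lt;p&gt;The classic lost-update interleaving behind that counter gap can be replayed deterministically. This is a simulation of the interleaving itself, not the behavior of any particular database:&lt;/p&gt;

```python
# Deterministic replay of the lost-update interleaving: two concurrent
# transactions both read the velocity counter before either commits.
counter = {"last_minute": 3}

def begin_txn(state):
    """Each transaction works on its own snapshot of the counter."""
    return dict(state)

txn_a = begin_txn(counter)   # A reads: 3
txn_b = begin_txn(counter)   # B reads: 3, before A commits

txn_a["last_minute"] += 1
counter.update(txn_a)        # A commits: 4

txn_b["last_minute"] += 1
counter.update(txn_b)        # B commits: 4 -- A's increment is lost

print(counter["last_minute"])  # 4, not the correct 5
```

&lt;p&gt;Scale the same interleaving across a burst of concurrent transactions and the counter undercounts by exactly the margin a fraud ring needs.&lt;/p&gt;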

&lt;h2&gt;Why Teams Don't See This as a Stack Problem&lt;/h2&gt;

&lt;p&gt;Here's the frustrating part: teams usually diagnose coherence failures as model problems, feature problems, or freshness problems — not as architectural problems.&lt;/p&gt;

&lt;p&gt;When a fraud model approves a transaction it should have blocked, the first instinct is to retrain the model, adjust the threshold, or improve the features. When an AI agent acts on wrong context, the first instinct is to improve the prompt, add memory, or switch models.&lt;/p&gt;

&lt;p&gt;The interventions that should work — more data, better models, lower latency — don't fix coherence failures, because the problem isn't in any single layer. It's in the gap between layers: the seam where independently consistent systems have to be read together and their outputs treated as a unified picture of reality.&lt;/p&gt;

&lt;p&gt;Coherence failures are invisible in the tooling. Your data observability platform will show green. Your feature store's freshness metrics will look fine. Your latency dashboards will show sub-100ms reads. Everything looks healthy because every individual component &lt;em&gt;is&lt;/em&gt; healthy. The incoherence only exists at the moment a decision assembles context from all of them simultaneously.&lt;/p&gt;

&lt;h2&gt;What Would a Coherence-Aware Stack Look Like?&lt;/h2&gt;

&lt;p&gt;A stack designed for decision coherence has three properties that the modern analytical stack lacks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single snapshot semantics across systems.&lt;/strong&gt; A decision should be able to read all of its required context — transactional state, derived aggregates, streaming signals, vector representations — as of the same logical point in time. This is different from reading each system at "the latest." It means the stack maintains a consistent snapshot that spans systems, so a decision sees a coherent view of reality rather than a patchwork of independently current values.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Incremental materialization with bounded lag.&lt;/strong&gt; Derived state — aggregates, features, rollups — should be maintained incrementally as events arrive, not recomputed on a batch schedule. The goal is not zero-lag (which is impossible for non-trivial transformations) but &lt;em&gt;bounded&lt;/em&gt; lag: a guarantee that the context available at decision time is at most N milliseconds behind raw event arrival, where N is small enough to be within the validity window of the decision.&lt;/p&gt;
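&lt;p&gt;What bounded lag might look like in miniature: a counter that applies each event on arrival and refuses to serve a read whose context has drifted past the declared bound. A sketch only; the class, names, and 100ms bound are assumptions, not a real API:&lt;/p&gt;

```python
import collections

class IncrementalCounter:
    """Maintain a sliding-window count incrementally, applying each event
    on arrival instead of recomputing on a batch schedule."""
    def __init__(self, window=60.0):
        self.window = window
        self.events = collections.deque()
        self.watermark = 0.0  # event time through which context is current

    def apply(self, event_time):
        self.events.append(event_time)
        self.watermark = event_time
        # Evict events that have aged out of the window.
        while self.events and event_time - self.window > self.events[0]:
            self.events.popleft()

    def read(self, decision_time, max_lag=0.100):
        """Serve the count only when context is within the declared bound."""
        lag = decision_time - self.watermark
        if lag > max_lag:
            raise RuntimeError("context lag exceeds the declared bound")
        return len(self.events)

c = IncrementalCounter()
for t in (100.0, 100.2, 100.4):
    c.apply(t)
print(c.read(decision_time=100.45))  # 3, and provably at most 100ms behind
```

&lt;p&gt;The point of the refusal path is that staleness becomes an enforced contract rather than a silent property of the pipeline.&lt;/p&gt;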

&lt;p&gt;&lt;strong&gt;Concurrent write isolation that doesn't sacrifice read performance.&lt;/strong&gt; Under high concurrency, reads and writes must be isolated such that a decision sees either a fully committed write or no write — not a partial state. This is a standard database guarantee that most analytical systems relax for throughput. A decision-coherent stack restores it for the specific reads that feed automated decisions.&lt;/p&gt;

&lt;p&gt;These properties are not exotic. They exist in database systems, though usually only within a single system boundary. The architectural challenge — and the reason the modern data stack hasn't solved this — is providing them &lt;em&gt;across&lt;/em&gt; the heterogeneous sources that real automated decisions consume.&lt;/p&gt;

&lt;h2&gt;The Coherence Problem Is Getting Harder&lt;/h2&gt;

&lt;p&gt;Three trends are making this worse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI agents read more context, from more systems, under tighter time constraints.&lt;/strong&gt; A traditional fraud model might read five features from one system. A multi-agent orchestration system might read dozens of signals from a dozen systems, synthesize them, and act — all within a second. Each additional source multiplies the opportunity for incoherence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Automated decisions are taking on higher-stakes actions.&lt;/strong&gt; AI agents are increasingly being given the ability to take real-world actions: approving transactions, extending credit, executing trades, modifying customer state. The cost of acting on incoherent context is no longer a misfired recommendation — it's a financial loss, a compliance violation, or a cascading error that's hard to reverse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Concurrency is increasing.&lt;/strong&gt; As more decisions are automated and as systems scale, the window during which concurrent state changes can cause coherence failures grows. Fraud rings exploit exactly this: high-concurrency bursts designed to exploit the gap between when state changes and when derived context reflects it.&lt;/p&gt;

&lt;p&gt;The modern data stack was not designed for this world. It was designed for a world where decisions are made by humans, who can tolerate staleness, who can recognize and correct inconsistencies, and who operate at a cadence that makes analytical eventual consistency acceptable.&lt;/p&gt;

&lt;h2&gt;Coherence Is Not a Feature. It's Infrastructure.&lt;/h2&gt;

&lt;p&gt;The instinct, when faced with a coherence problem, is to solve it in the application: add more aggressive cache invalidation, tighten replication lag, build a custom state synchronization layer. Teams do this, and it works — until it doesn't, which is usually at the worst possible moment, under the highest possible load.&lt;/p&gt;

&lt;p&gt;Coherence cannot be reliably provided by application logic on top of an architecture that was never designed to support it. It requires infrastructure that was designed for it from the start: a system that maintains a consistent, multi-modal view of state across sources, keeps derived context within bounded lag of events, and guarantees snapshot isolation for the reads that feed automated decisions.&lt;/p&gt;

&lt;p&gt;This is what the modern data stack is missing. Not more speed. Not more data. Not better models. Coherence — the guarantee that when a decision is made, the context it reads describes the same world at the same moment in time.&lt;/p&gt;

&lt;p&gt;Until the stack provides that guarantee, automated systems will keep making decisions on a world that doesn't quite exist anymore.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://tacnode.io/blog/modern-data-stack-coherence-problem" rel="noopener noreferrer"&gt;tacnode.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>ai</category>
      <category>architecture</category>
      <category>database</category>
    </item>
    <item>
      <title>Why Software Becomes More Valuable As AI Makes It Free</title>
      <dc:creator>Angela Zhao</dc:creator>
      <pubDate>Mon, 09 Mar 2026 10:07:40 +0000</pubDate>
      <link>https://dev.to/hui_zhao_405dc49731688b0c/why-software-becomes-more-valuable-as-ai-makes-it-free-11oj</link>
      <guid>https://dev.to/hui_zhao_405dc49731688b0c/why-software-becomes-more-valuable-as-ai-makes-it-free-11oj</guid>
      <description>&lt;p&gt;Generative AI is driving the cost of producing software toward zero. The common conclusion is that software itself is becoming less valuable.&lt;/p&gt;

&lt;p&gt;This is wrong. It confuses production cost with economic value.&lt;/p&gt;

&lt;p&gt;There is a simple formula for software value in the age of AI. It explains why some software becomes worthless while other software becomes more valuable than ever.&lt;/p&gt;

&lt;h2&gt;The Floor On Replacement Cost&lt;/h2&gt;

&lt;p&gt;When people say "AI makes software free," they mean AI reduces the tokens required to generate a working program. But "toward zero" does not mean "to zero." There is a floor.&lt;/p&gt;

&lt;p&gt;In information theory, the true complexity of any object is the length of its shortest possible complete description — its Kolmogorov complexity. For software, this is the total information consumed in generating it correctly: the spec, the output and all the intermediate reasoning, debugging and verification along the way.&lt;/p&gt;

&lt;p&gt;As AI models improve, they internalize more patterns. The number of tokens needed to produce a given program decreases. But they cannot decrease below the program's Kolmogorov complexity.&lt;/p&gt;

&lt;p&gt;That is the floor. The replacement cost of any software is converging toward a well-defined quantity: the irreducible token count times the price per token.&lt;/p&gt;

&lt;h2&gt;The Formula: K x P x N&lt;/h2&gt;

&lt;p&gt;Three variables determine software value:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;K (Kolmogorov Complexity):&lt;/strong&gt; The minimum number of tokens required to correctly generate the software&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;P (Price Per Token):&lt;/strong&gt; The cost of compute for generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;N (Reuse Number):&lt;/strong&gt; How many independent systems need this functionality&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If N systems each need software with complexity K, and it doesn't exist, each pays K x P to generate it. Total cost to the economy: K x P x N.&lt;/p&gt;

&lt;p&gt;If the software exists, that cost is eliminated. The economic value is the total cost saved:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Value = K x P x N&lt;/strong&gt;&lt;/p&gt;
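&lt;p&gt;A worked example with illustrative numbers (the formula is the article's; the figures below are invented for arithmetic only):&lt;/p&gt;

```python
# Illustrative numbers only -- the article does not supply any.
K = 50_000_000   # tokens to correctly generate the software once
P = 0.000002     # dollars per token
N = 10_000       # independent systems that need the functionality

per_system_cost = K * P       # what one team pays to regenerate it
value = per_system_cost * N   # total cost the existing software saves

print(round(per_system_cost, 2))  # 100.0 dollars per regeneration
print(round(value, 2))            # 1000000.0 dollars of economic value
```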

&lt;h2&gt;The Race Between P And N&lt;/h2&gt;

&lt;p&gt;K is converging to a floor; it cannot shrink below the irreducible minimum. The interesting dynamics are between P and N.&lt;/p&gt;

&lt;p&gt;P is dropping fast. Token prices have fallen by orders of magnitude and will continue falling toward marginal compute cost.&lt;/p&gt;

&lt;p&gt;For software value to grow, N must grow faster than P shrinks. This is the key insight: Value concentrates in software where reuse is expanding faster than generation cost is collapsing.&lt;/p&gt;
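&lt;p&gt;The race can be stated as a one-line condition. With K fixed at its floor and exponential rates for P and N (rates invented purely for illustration), value grows over time exactly when the reuse growth factor beats the price decay factor:&lt;/p&gt;

```python
def value_after(years, K=1.0, P0=1.0, N0=1.0, p_decline=0.5, n_growth=1.2):
    """K x P x N after `years`, with P declining and N growing at fixed
    annual rates. The rates here are made up for illustration."""
    P = P0 * (1 - p_decline) ** years   # token price halves each year
    N = N0 * (1 + n_growth) ** years    # reuse grows 120% each year
    return K * P * N

# Value grows iff (1 + n_growth) * (1 - p_decline) exceeds 1.
print(value_after(3, n_growth=1.2) > value_after(0))  # True: reuse outruns price
print(value_after(3, n_growth=0.5) > value_after(0))  # False: commoditization wins
```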

&lt;h2&gt;What This Predicts&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Low K, Any N (Commoditized):&lt;/strong&gt; CRUD apps, standard UI patterns, basic integrations. Low complexity means low replacement cost, even at today's token prices. As P drops further, the value approaches zero.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High K, Low N (Niche):&lt;/strong&gt; Bespoke simulation tools, specialized compliance logic. High replacement cost, but few systems need it. Value is real but limited.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High K, High N (Most Valuable):&lt;/strong&gt; Operating systems. Database engines. Irreducibly complex, and N keeps growing as more systems depend on them.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Venture investors are increasingly betting that infrastructure layers will capture disproportionate value in the AI cycle. The formula explains why: high K, along with N that grows faster than P shrinks.&lt;/p&gt;

&lt;h2&gt;What Makes K High&lt;/h2&gt;

&lt;p&gt;It's not the length of the spec. It's not the lines of code. It's the length of the generation process.&lt;/p&gt;

&lt;p&gt;Two programs, each 10,000 lines. The first: REST endpoints for 200 database tables. The model reads a short spec, generates the code in one pass and it's done.&lt;/p&gt;

&lt;p&gt;The second: a distributed consensus protocol. The model generates an attempt, tests it, discovers a race condition, reasons through the failure and tries again. Another edge case appears. It debugs, refactors and generates again.&lt;/p&gt;

&lt;p&gt;Same output length, but vastly different total token consumption. The complexity lives in the generation path, not in the output.&lt;/p&gt;

&lt;p&gt;This distinction doesn't shrink as models improve. For genuinely complex software, there is a minimum computation required regardless of how intelligent the solver is.&lt;/p&gt;

&lt;h2&gt;The Agent Economy Multiplier&lt;/h2&gt;

&lt;p&gt;As autonomous agents proliferate, N explodes. Every agent needing shared context or coordinated decision-making is a consumer of infrastructure.&lt;/p&gt;

&lt;p&gt;Meanwhile, K for that infrastructure is irreducibly high. Providing temporally consistent snapshots across analytical, transactional and semantic queries is genuinely hard.&lt;/p&gt;

&lt;p&gt;The result: N is growing faster than P is shrinking. Agent infrastructure gains value even as token prices collapse.&lt;/p&gt;

&lt;h2&gt;Strategic Implications&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For Founders:&lt;/strong&gt; Ask two questions. First, is your K irreducibly high? Second, is your N growing faster than P is falling? If both answers are yes, you have a durable business. If either is no, you're in a race against commoditization.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For Investors:&lt;/strong&gt; K x P x N is a valuation heuristic. The key metric isn't current N; it's the growth rate of N relative to the decline rate of P.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For Technical Leaders:&lt;/strong&gt; Build-versus-buy has a precise answer. Estimate K x P for regeneration. Multiply by how often you'd need to do it. If buying is cheaper, buy, and expect the calculus to shift as P drops.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;The Bottom Line&lt;/h2&gt;

&lt;p&gt;Generative AI changes how software is produced. It does not eliminate the need for structure, correctness or coordination.&lt;/p&gt;

&lt;p&gt;Value is K x P x N. K has a floor. P is falling. The winners are those whose N grows faster than P shrinks.&lt;/p&gt;

&lt;p&gt;The strategic question isn't "what can we automate?" It's "what irreducible complexity should we own, and can reuse grow faster than generation costs fall?"&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.forbes.com/councils/forbestechcouncil/2026/03/04/why-software-becomes-more-valuable-as-ai-makes-it-free/" rel="noopener noreferrer"&gt;Forbes Technology Council&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>software</category>
      <category>startup</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Decision Coherence: A Formal Correctness Requirement for Multi-Agent Systems</title>
      <dc:creator>Angela Zhao</dc:creator>
      <pubDate>Mon, 09 Mar 2026 09:49:48 +0000</pubDate>
      <link>https://dev.to/hui_zhao_405dc49731688b0c/decision-coherence-a-formal-correctness-requirement-for-multi-agent-systems-2hj0</link>
      <guid>https://dev.to/hui_zhao_405dc49731688b0c/decision-coherence-a-formal-correctness-requirement-for-multi-agent-systems-2hj0</guid>
      <description>&lt;p&gt;As AI agents move from demos into production, a class of correctness bugs is emerging that existing system design vocabulary doesn't fully describe. The bugs look like race conditions, but they aren't races in the traditional sense. They look like stale reads, but the individual systems involved are internally consistent. They look like pipeline lag, but faster pipelines don't fix them.&lt;/p&gt;

&lt;p&gt;A paper published in January 2026 (arXiv:2601.17019) formalizes the underlying problem as a single correctness requirement: &lt;strong&gt;Decision Coherence&lt;/strong&gt;. This article walks through the definition, explains why existing architectures violate it structurally, and examines what the requirement implies for system design.&lt;/p&gt;

&lt;h2&gt;The Setting&lt;/h2&gt;

&lt;p&gt;The paper's analysis applies specifically to &lt;em&gt;collective AI systems&lt;/em&gt;: deployments where multiple agents operate continuously and concurrently, sharing state, and making irreversible decisions.&lt;/p&gt;

&lt;p&gt;The defining characteristics of the relevant setting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Semantic understanding at decision time&lt;/strong&gt; — agents interpret unstructured content directly, not through pre-enumerated features&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Continuous operation&lt;/strong&gt; — agents act without batch boundaries; there is no quiescent period between updates and decisions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shared state&lt;/strong&gt; — multiple agents read and write overlapping portions of context simultaneously&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Irreversibility&lt;/strong&gt; — decisions commit before correction is possible (payment approvals, fraud blocks, credit decisions)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;The Decision Coherence Law&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;A decision is coherent if and only if it is evaluated against a context that constitutes a consistent, semantically complete, and temporally bounded representation of reality at the time of decision.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;From this law, three categories of operational requirements are derived:&lt;/p&gt;

&lt;h3&gt;1. Semantic Operations&lt;/h3&gt;

&lt;p&gt;Raw data records are not sufficient context for agent decisions. Agents require derived interpretations: aggregated signals, similarity relations, inferred intent, entity profiles.&lt;/p&gt;

&lt;p&gt;The critical invariant: &lt;strong&gt;semantic transformations must occur inside the system boundary&lt;/strong&gt;. If a vector embedding is computed outside the transactional scope, the relationship between that embedding and the raw data it was derived from is not covered by any consistency guarantee.&lt;/p&gt;

&lt;h3&gt;2. Transactional Consistency&lt;/h3&gt;

&lt;p&gt;An agent must not observe state that corresponds to no valid configuration of reality — partial writes, mixed pre- and post-update views, or snapshots assembled from multiple independent commit points.&lt;/p&gt;

&lt;p&gt;This is a stronger requirement than what most ACID databases provide, because it must hold across heterogeneous retrieval patterns (point lookups, range scans, similarity search, aggregations) issued within a single decision context.&lt;/p&gt;

&lt;h3&gt;3. Temporal and Concurrency Envelopes&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Temporal envelope:&lt;/strong&gt; The maximum staleness Δ of context at decision time must be declared and enforced. Derived context (aggregates, embeddings) must reflect reality within Δ of the decision timestamp.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Concurrency envelope:&lt;/strong&gt; The transactional and temporal guarantees must hold at a declared concurrency level C under sustained load.&lt;/p&gt;
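&lt;p&gt;An enforcement sketch for the temporal envelope. The paper defines the requirement, not this API; the function, watermark names, and the 200ms bound are all assumptions for illustration:&lt;/p&gt;

```python
class EnvelopeViolation(Exception):
    """Raised when context falls outside the declared temporal envelope."""

def admit_decision(context_watermarks, decision_time, delta=0.200):
    """Enforce the temporal envelope: every piece of derived context must
    reflect reality within `delta` seconds of the decision timestamp."""
    for name, watermark in context_watermarks.items():
        staleness = decision_time - watermark
        if staleness > delta:
            raise EnvelopeViolation(name)
    return True

# Aggregates, embeddings, and transactional state each carry a watermark.
watermarks = {"aggregates": 100.95, "embeddings": 100.85, "balance": 101.00}
print(admit_decision(watermarks, decision_time=101.0))  # True: all within 200ms
```

&lt;p&gt;Declaring Δ in code like this makes staleness a checkable admission condition rather than an unstated hope.&lt;/p&gt;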

&lt;h2&gt;The Composition Impossibility Result&lt;/h2&gt;

&lt;p&gt;Section 6 of the paper proves that:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;No composition of existing system classes can satisfy Decision Coherence. The requirement can only be enforced within a single system boundary.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The proof: fraud detection requires exact aggregations over dynamically defined predicates, similarity search over recent behavior, and transactionally consistent reads of current state. Each primitive maps to a different system class. No distributed join across these systems can provide a consistent snapshot without a coordination protocol that reintroduces latency and failure modes at every seam.&lt;/p&gt;

&lt;p&gt;This is a structural result, not a performance result. It cannot be addressed by faster replication or tighter cache invalidation.&lt;/p&gt;

&lt;h2&gt;The Four Agent Decision Admissibility Conditions&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;No private decision premises.&lt;/strong&gt; All context used to evaluate a decision must reside in shared, authoritative infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No deferred correctness.&lt;/strong&gt; The decision must be correct at the time it is made, not correctable after the fact.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No mixed causal cuts.&lt;/strong&gt; All observations composing a decision context must derive from the same causal snapshot.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No implicit semantics.&lt;/strong&gt; The semantic meaning of context must be explicit and managed within the system boundary.&lt;/p&gt;

&lt;p&gt;These conditions are testable. Engineers can audit an existing multi-agent architecture against each one.&lt;/p&gt;

&lt;h2&gt;The Context Lake System Class&lt;/h2&gt;

&lt;p&gt;The paper defines a &lt;strong&gt;Context Lake&lt;/strong&gt; as the system class that enforces Decision Coherence:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A Context Lake is a system that enforces the Decision Coherence Law at the boundary of agent interaction with shared context.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Its architectural scope: it organizes experience into decision-ready context and retrieves that context under the Decision Coherence guarantee; all decision logic remains external.&lt;/p&gt;

&lt;h2&gt;Architectural Implications&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Audit your retrieval boundary.&lt;/strong&gt; If a single agent decision assembles context across more than one independently consistent store, you have a mixed causal cut.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Locate your semantic operations.&lt;/strong&gt; Where are embeddings computed? Where are aggregates materialized? If the answer is "in a pipeline before the decision" without a mechanism to tie that computation to the same snapshot as the decision, you have implicit semantics outside your consistency boundary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Define your temporal envelope explicitly.&lt;/strong&gt; What is the maximum staleness your decision logic can tolerate? Is that bound declared, monitored, and enforced?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Assess irreversibility.&lt;/strong&gt; Decision Coherence is most critical for workloads where decisions cannot be recalled.&lt;/p&gt;

&lt;p&gt;The full formal treatment is in arXiv:2601.17019. The canonical definition and reading guide are at contextlake.org/canonical.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;"Context Lake: A System Class Defined by Decision Coherence" — Xiaowei Jiang, January 2026 (arXiv:2601.17019, cs.DB)&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>distributedsystems</category>
      <category>architecture</category>
    </item>
    <item>
      <title>AI's Memory Crisis: We're Building a Digital Dark Age</title>
      <dc:creator>Angela Zhao</dc:creator>
      <pubDate>Mon, 09 Mar 2026 09:49:01 +0000</pubDate>
      <link>https://dev.to/hui_zhao_405dc49731688b0c/ais-memory-crisis-were-building-a-digital-dark-age-11jf</link>
      <guid>https://dev.to/hui_zhao_405dc49731688b0c/ais-memory-crisis-were-building-a-digital-dark-age-11jf</guid>
      <description>&lt;p&gt;Millions of AI agents are entering production systems. Almost none can share operational experience. This is why that architectural choice matters—and what changes if we get it right.&lt;/p&gt;

&lt;p&gt;At 2:06 PM, a customer places an online order for a laptop.&lt;/p&gt;

&lt;p&gt;The checkout agent queries its operational database: clean purchase history, amount within normal range, shipping address previously used, device and location consistent with recent successful orders. Everything looks normal. The agent approves the order.&lt;/p&gt;

&lt;p&gt;At the same time, a behavior agent processes clickstream data in the company's data lakehouse. From the session, it derives a pattern: the user arrived directly on a deep checkout URL with no browsing or comparison behavior. This signal is weak on its own, but it is a known precursor in account takeover scenarios when combined with otherwise normal purchases.&lt;/p&gt;

&lt;p&gt;The behavior agent records this interpretation as derived knowledge for later analysis and model training.&lt;/p&gt;

&lt;p&gt;The checkout agent never sees it. Not because the signal wasn't computed, and not because it was ignored—but because the knowledge lives inside a system the checkout agent does not consult during authorization.&lt;/p&gt;

&lt;p&gt;Each agent behaves correctly given what it can see. Each writes to the system it owns. But the insight derived by one agent is invisible to the other at decision time.&lt;/p&gt;

&lt;p&gt;The laptop ships.&lt;/p&gt;

&lt;p&gt;Thirty-six hours later, the charge is disputed. Investigation confirms the account was compromised earlier that day. The attacker kept the transaction within normal bounds, relying on the fact that the only early warning existed as behavioral knowledge trapped outside the checkout agent's decision context.&lt;/p&gt;

&lt;p&gt;The failure was not missing data, slow processing, or a bad model. It was an agent silo: knowledge was formed, but not shared.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem the Printing Press Solved
&lt;/h2&gt;

&lt;p&gt;Before the printing press, knowledge was fragile. When a scholar died, much of what they had learned died with them. A mathematician in London might spend decades discovering principles that a mathematician in Paris would independently rediscover fifty years later. Progress was real, but it was local, slow, and repeatedly reset.&lt;/p&gt;

&lt;p&gt;The printing press didn't make individuals smarter. It externalized memory. Knowledge stopped being bound to a single mind and began to persist beyond the life of its creator. Insights could be shared, revisited, and built upon across generations. That is what allowed progress to compound.&lt;/p&gt;

&lt;p&gt;We are at risk of repeating the pre-printing-press mistake with AI.&lt;/p&gt;

&lt;p&gt;Most organizations are now deploying AI agents across production systems. These agents are typically deployed as independent services aligned with modern microservice architectures, each with its own data and operational boundary. Even inside the same organization, agents derive insight from their own production experience but rarely share the knowledge they produce with other agents making related decisions.&lt;/p&gt;

&lt;p&gt;As a result, operational insight remains fragmented. Local decisions may improve, but experience does not accumulate across the system. Every breakthrough that stays trapped inside a single agent is a breakthrough that cannot compound.&lt;/p&gt;

&lt;p&gt;This time, the limiting factor is not intelligence or speed. It is memory. Without a way for AI systems to externalize and share what they discover, progress resets more often than it builds.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Shared Memory Actually Looks Like
&lt;/h2&gt;

&lt;p&gt;Shared memory changes outcomes not by improving models, but by changing what agents can see at decision time.&lt;/p&gt;

&lt;p&gt;In a siloed system, each agent reasons correctly within its own boundary. The checkout agent evaluates transactional risk. The behavior agent analyzes clickstream patterns. Each writes its conclusions to the system it owns, and those conclusions remain invisible to other agents operating in parallel.&lt;/p&gt;

&lt;p&gt;With a shared memory layer, that boundary disappears.&lt;/p&gt;

&lt;p&gt;As the behavior agent processes a session, it derives a weak but meaningful signal: a navigation pattern associated with early account takeover attempts. Instead of storing that insight only for offline analysis, it writes the signal to shared memory, linked to the active session.&lt;/p&gt;

&lt;p&gt;Moments later, when the checkout agent evaluates the purchase, it queries that same memory. The transaction still looks normal. But it now sees additional context: a behavioral warning that would otherwise be absent. Neither signal is decisive on its own. Together, they cross the threshold for further verification.&lt;/p&gt;

&lt;p&gt;Nothing about the agents themselves has changed. No models are retrained. No centralized controller intervenes. The difference is visibility: an insight formed by one agent becomes available to another while it still matters.&lt;/p&gt;

&lt;p&gt;Crucially, that insight persists. When the outcome is later known—fraud or legitimate—the association between the signal and the result is recorded. Over time, the system accumulates an empirical record of which weak indicators tend to matter, and under what conditions.&lt;/p&gt;

&lt;p&gt;Shared memory is not a data warehouse and not an operational database. It is a low-latency substrate for derived context: signals, interpretations, and associations that survive the interaction that produced them and remain queryable by other agents making related decisions.&lt;/p&gt;
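&lt;p&gt;To make the shape of that substrate concrete, here is a minimal in-process sketch. The class, signal names, weights, and threshold are illustrative assumptions, not an implementation from any particular system; a production layer would sit on a low-latency shared store rather than a Python dict.&lt;/p&gt;

```python
import time
from collections import defaultdict

class SharedMemory:
    """Minimal sketch of a shared memory layer for derived context.

    A dict keyed by session ID stands in for what would really be a
    low-latency shared store. Names and numbers are illustrative.
    """

    def __init__(self):
        self._signals = defaultdict(list)   # session_id -> derived signals
        self._outcomes = []                 # (signals, outcome) pairs

    def publish(self, session_id, signal, weight):
        """A producer (e.g. the behavior agent) writes a derived signal."""
        self._signals[session_id].append(
            {"signal": signal, "weight": weight, "ts": time.time()}
        )

    def context_for(self, session_id):
        """A consumer (e.g. the checkout agent) reads context at decision time."""
        return list(self._signals[session_id])

    def record_outcome(self, session_id, outcome):
        """Once the truth is known, associate the signals with the result."""
        self._outcomes.append((self.context_for(session_id), outcome))

# The behavior agent derives a weak signal and shares it immediately.
memory = SharedMemory()
memory.publish("sess-42", "direct_deep_checkout_url", weight=0.3)

# Moments later, the checkout agent combines it with its own assessment.
transactional_risk = 0.4    # looks normal in isolation
behavioral_risk = sum(s["weight"] for s in memory.context_for("sess-42"))
needs_verification = transactional_risk + behavioral_risk > 0.5

# Thirty-six hours later, the dispute closes the loop.
memory.record_outcome("sess-42", "fraud")
```

&lt;p&gt;The point of the sketch is the interaction pattern, not the data structure: the producer writes during the session, the consumer reads at decision time, and the outcome call turns a one-off signal into accumulated evidence.&lt;/p&gt;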

&lt;h2&gt;
  
  
  The Missing Discipline: Context Engineering
&lt;/h2&gt;

&lt;p&gt;Shared memory introduces a problem most teams are not prepared to solve: deciding what experience should persist.&lt;/p&gt;

&lt;p&gt;AI systems generate vast amounts of raw experience—transactions, clicks, messages, actions, outcomes. Persisting all of it is neither practical nor useful. Without deliberate selection, shared memory becomes noise. The challenge is not collecting more data, but shaping experience into context that other agents can use.&lt;/p&gt;

&lt;p&gt;This is the role of context engineering: deciding which observations become durable signals, how those signals are represented, and when they should be exposed to other agents. It sits between raw events and agent reasoning, transforming transient activity into shared, decision-relevant understanding.&lt;/p&gt;

&lt;p&gt;Context engineering determines whether shared memory merely stores experience—or enables it to compound.&lt;/p&gt;
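&lt;p&gt;One way to picture the discipline is as a small gate between raw events and the memory layer. Everything below (the rule table, field names, thresholds, TTLs) is a hypothetical sketch of the selection step, not a prescribed design.&lt;/p&gt;

```python
RULES = {
    # event type -> (signal name, minimum strength to persist, TTL in seconds)
    "deep_link_checkout": ("direct_deep_checkout_url", 0.2, 3600),
    "page_view":          (None, 1.1, 0),   # never persisted: pure noise
}

def shape(event):
    """Turn a raw event into a shared signal, or drop it.

    Returns a signal dict destined for the memory layer, or None when
    the observation is not worth other agents' attention.
    """
    signal, min_strength, ttl = RULES.get(event["type"], (None, 1.1, 0))
    if signal is None or event["strength"] < min_strength:
        return None
    return {
        "signal": signal,
        "strength": event["strength"],
        "session_id": event["session_id"],
        "ttl": ttl,   # derived context expires; it is not an archive
    }

# A meaningful observation survives; routine activity does not.
kept = shape({"type": "deep_link_checkout", "strength": 0.3,
              "session_id": "sess-42"})
dropped = shape({"type": "page_view", "strength": 0.9,
                 "session_id": "sess-42"})
```

&lt;p&gt;The TTL is the design choice worth noticing: shared memory holds decision-relevant context, so signals that have gone stale should age out rather than accumulate as noise.&lt;/p&gt;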

&lt;h2&gt;
  
  
  What Happens If We Get This Right
&lt;/h2&gt;

&lt;p&gt;The default path is isolation. AI agents act independently, drawing only on their own experience. Each makes fast, locally correct decisions, but intelligence plateaus.&lt;/p&gt;

&lt;p&gt;The alternative is a shared memory layer. When derived context persists and is visible at decision time, experience stops evaporating. Insights discovered once remain available. Weak signals gain meaning through accumulation. Decisions improve not because models change, but because agents no longer reason in isolation.&lt;/p&gt;

&lt;p&gt;Architectural defaults harden quickly. Systems built without shared memory become increasingly difficult to retrofit as agents proliferate. The choice is simple: build systems that accumulate experience—or systems that endlessly reset.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on Unite.AI.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
