A decade ago we built a maze of "information silos" and spent five years cleaning up the mess. If we deploy AI Agents one by one today, we'll repeat the same detour — only this time, the pitfalls go deeper.
An old story with new characters
Anyone who's been through enterprise IT modernization knows the picture: ERP handles finance and supply chain, CRM handles sales, MES handles the shop floor, OA handles approvals — each system humming along nicely on its own, but pulling data across them is a nightmare. The same order number in three systems, three different formats. Production scheduling needs the sales forecast? Export to Excel first, then import manually.
Companies later spent three to five years building "data middle platforms" to connect these silos — essentially paying down the technical debt from years of "deploy first, integrate later" decisions.
Today, AI Agents are replaying the exact same script.

Every department is doing its own thing: legal deploys a contract review Agent, customer service sets up an after-sales knowledge base Agent, procurement builds a supplier intelligence Agent, HR rolls out a policy Copilot… Each project tells a compelling story on its own, but zoom out and you see the old problems coming right back.
Look closer, though, and this time it's actually worse. The silos of the digital transformation era were about "business data not connecting." The Agent era has three layers of challenge stacked on top of each other.
It's far more than just "connecting data"
Think about it: a contract review Agent needs to read contract text (unstructured), but also check the client's payment history (structured) and reference prior clause review notes (data the Agent itself produced). Three types of data, from three different places. And that's just one Agent.
When you have five, ten Agents running simultaneously, each making cross-calls to multiple structured and unstructured data sources, the call graph isn't a straight line — it's a web. Structured data at least has some lineage from the previous generation of data platforms: you know which table, what the fields mean, who has access. But unstructured data has never been centrally managed. Which contracts are in which folder, which drawings are the latest version, whose inbox has the email attachment — once Agents start making mesh-like calls into this chaos, a few more pipelines in and the whole thing becomes almost impossible to untangle. It's not "increased complexity." It's a loss of control.
What goes wrong when Agents are built in isolation
Over the past two years, working on Agent deployments across dozens of client engagements, we've seen the same failures come up again and again; they boil down to six recurring patterns.
Build the foundation and the intelligence together
When it comes to the relationship between data platforms and business applications, the last decade of "data middle platform" projects left two classic anti-patterns — and both are reappearing in the Agent era:
Anti-pattern #1: the IT-first "build the road before any traffic." The belief that you must build the data platform to completion first — master data governance done, data fully cleansed, metrics framework locked in — and only then let the business use it. Sounds rigorous. In practice, these projects often spend two years "governing" while the business can't wait. By the time the platform is ready, every department has already cobbled together their own tools and workflows. Migrating back costs more than starting over.
Anti-pattern #2: the business-first "just ship it." No thought given to a data platform at all. Whichever department needs an Agent gets one, each with its own data plumbing. The first three use cases do go live fast — but by the fourth and fifth, the cracks show: inconsistent data, ungovernable access, no shared learning between Agents. The more you build, the slower it gets.
Both anti-patterns rest on the same false assumption: that building the foundation and building the applications are two separate activities, and you either do one before the other, or skip one entirely.
Our experience says: the foundation and the intelligence must be built in parallel. Especially for enterprises whose data maturity isn't ideal, the right approach is to pick a high-value Pilot Agent scenario and get it running — and in that process, stand up the data foundation alongside it. When the Agent delivers, the business value loop closes, and the data pipelines and access policies close their loops too. Then when the second scenario plugs in, there's already a reusable layer underneath, and things naturally go faster. It's not "build the road then drive" or "drive on dirt and hope" — it's building the road and driving on it at the same time, each reinforcing the other.
As early as 2024, we started seeing Agent silo patterns emerge at client sites, which drove a product decision: we launched MatrixOne Intelligence (MOI) — not another standalone Agent tool, but an AI data intelligence platform with a data foundation at its core. It integrates the runtime capabilities Agents need with the underlying data storage, integration, and processing layers, so that from the very first scenario a team is simultaneously building business value and data assets.
Notice the layering: business scenarios sit on top; everything below is MOI. Inside MOI there are three layers:
Agent runtime layer
NL2SQL, RAG retrieval, memory management, runtime analytics — these are shared capabilities every Agent needs. Build once, reuse everywhere. New scenarios plug in without rebuilding infrastructure.
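To make "build once, reuse everywhere" concrete, here is a deliberately tiny sketch of a shared runtime: one retrieval index (a naive keyword-overlap stand-in for real RAG) and one memory store that multiple Agents use instead of each rebuilding their own. All names here (`SharedRuntime`, `contract_agent`, `c-001`) are illustrative, not part of any product API.

```python
class SharedRuntime:
    """Toy stand-in for the shared runtime layer: one document index and
    one memory store reused by every Agent, instead of per-Agent copies."""

    def __init__(self):
        self.docs = {}      # doc_id -> text: the shared retrieval corpus
        self.memory = {}    # agent_name -> list of remembered notes

    def add_doc(self, doc_id, text):
        self.docs[doc_id] = text

    def retrieve(self, query):
        """Naive keyword-overlap scoring standing in for RAG retrieval."""
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(text.lower().split())), doc_id)
            for doc_id, text in self.docs.items()
        ]
        best = max(scored, default=(0, None))
        return best[1] if best[0] > 0 else None

    def remember(self, agent, note):
        self.memory.setdefault(agent, []).append(note)


runtime = SharedRuntime()  # stood up once, alongside the first Pilot Agent
runtime.add_doc("c-001", "payment terms net 30 penalty clause")

# Two different Agents plug into the same retrieval and memory — no
# per-department rebuild of the infrastructure.
runtime.remember("contract_agent", "flagged net-30 penalty clause")
runtime.remember("procurement_agent", "supplier prefers net 60")
```

When the second scenario arrives, it calls `retrieve` and `remember` on the same instance; the marginal cost of Agent number two is the business logic, not the plumbing.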
Data integration & processing layer
Business system data flows in via ETL/CDC, documents are vectorized through embedding pipelines, structured metrics become queryable in natural language through a semantic layer. Agent runtime logs are also centrally collected here. Data is processed once and shared by all Agents.
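The "process once, share with all Agents" idea can be sketched as a minimal ingestion pipeline: chunk a document, embed each chunk, and write the result to one shared store. The embedding here is a deterministic hash-derived vector — a placeholder for a real embedding model — and `shared_store` stands in for the platform's vector storage; both are assumptions for illustration only.

```python
import hashlib


def chunk(text, size=5):
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def toy_embed(chunk_text, dim=4):
    """Stand-in for a real embedding model: a deterministic
    hash-derived vector, so the sketch stays self-contained."""
    digest = hashlib.sha256(chunk_text.encode()).digest()
    return [digest[i] / 255 for i in range(dim)]


shared_store = []  # processed once here, then read by every Agent


def ingest(doc_id, text):
    """One pipeline run per document — not one per Agent."""
    for i, c in enumerate(chunk(text)):
        shared_store.append(
            {"doc": doc_id, "chunk": i, "vec": toy_embed(c), "text": c}
        )


ingest("contract-42", "the supplier shall deliver goods within thirty days of order")
```

A second Agent querying `contract-42` reads the same vectors; the document is never re-chunked or re-embedded per consumer.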
Data storage & management layer (MatrixOne)
At the bottom sits the MatrixOne cloud-native database, supporting HTAP (hybrid transactional/analytical processing), vector search, and full-text search in a single engine. Structured and unstructured data are managed together, and Agent runtime logs are stored here too. Access control is enforced at this layer — row-level and column-level permissions plus full audit trails are defined in the data engine, not left for each Agent to implement independently. Ten Agents share one security policy, not ten.
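The "ten Agents, one security policy" point is easiest to see in miniature. The sketch below — with hypothetical roles, columns, and a tiny in-memory table, none of it MatrixOne syntax — enforces row-level (region) and column-level (margin) permissions in one place, before any Agent sees a row:

```python
# Hypothetical policy table: which columns and which rows each role may see.
POLICIES = {
    "sales_agent":   {"columns": {"client", "amount"}, "region": "EMEA"},
    "finance_agent": {"columns": {"client", "amount", "margin"}, "region": None},
}

ORDERS = [
    {"client": "Acme", "amount": 100, "margin": 0.2, "region": "EMEA"},
    {"client": "Byte", "amount": 250, "margin": 0.3, "region": "APAC"},
]


def query(role, rows=ORDERS):
    """Enforce row- and column-level permissions once, in the data layer,
    so every Agent receives only compliant, pre-filtered data."""
    policy = POLICIES[role]
    # Row-level filter: None means "all regions".
    visible = [r for r in rows if policy["region"] in (None, r["region"])]
    # Column-level filter: strip everything the role may not see.
    return [
        {k: v for k, v in r.items() if k in policy["columns"]}
        for r in visible
    ]
```

Adding an eleventh Agent means adding one entry to `POLICIES` — not re-implementing, and re-auditing, permission logic inside yet another Agent.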
Side by side: two paths, very different outcomes
Proven in the field: cross-industry validation
This architecture isn't a slide deck concept. Since 2024, we've validated this "foundation + intelligence together" approach across AI Agent projects in multiple industries. The examples below each have a different focus, but share one takeaway: the data platform is the prerequisite for Agents that actually work, not a nice-to-have.
Five hard-won lessons from the field
Start the foundation and the Agent together — avoid both extremes
Don't spend two years building a "perfect data foundation" before touching Agents — that's the IT playbook from the last era, and the business won't wait. Don't skip the foundation entirely and let every department ship its own Agent either — the first few go fast but it gets slower with every one after. The right cadence: run the first Pilot scenario on an extensible foundation from day one, and let the business value loop drive the data loop.
One engine for structured and unstructured — don't stitch
Agents inherently need both data types. If structured goes in one database, documents in another, and vectors in yet another — the data platform itself becomes the new silo. Choose an engine that natively supports HTAP + vector search + full-text search. It saves a lot of pain down the road.
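As a toy analogue of the single-engine point — using SQLite purely for illustration, not as a substitute for an HTAP-plus-vector-plus-full-text engine — one connection can hold both a structured table and contract text, and answer a mixed question in one query instead of stitching results across a warehouse and a separate document store:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # one engine, one connection
conn.executescript("""
    CREATE TABLE clients(id INTEGER PRIMARY KEY, name TEXT, overdue_days INTEGER);
    CREATE TABLE contracts(client_id INTEGER, body TEXT);
    INSERT INTO clients VALUES (1, 'Acme', 45), (2, 'Byte', 0);
    INSERT INTO contracts VALUES (1, 'late payment penalty of 2 percent'),
                                 (2, 'standard net 30 terms');
""")

# Structured filter (payment history) and text search (clause content)
# answered together — no cross-store glue code for the Agent to maintain.
rows = conn.execute("""
    SELECT c.name, ct.body
    FROM clients c JOIN contracts ct ON ct.client_id = c.id
    WHERE c.overdue_days > 30 AND ct.body LIKE '%penalty%'
""").fetchall()
```

With separate stores, that one question becomes two queries, an application-side join, and two access-control systems to keep consistent — which is exactly the stitching the lesson warns against.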
Collect Agent-generated data from day one
Conversation logs, reasoning chains, decision traces, user ratings — this data is the fuel for Agent evolution. Don't wait six months after launch to realize "maybe we should analyze how it's performing." Route this data into the unified platform from day one, so performance evaluation and strategy tuning have something to work with.
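A minimal sketch of what "collect from day one" means in practice: every conversation turn, with its reasoning trace and user rating, lands in one shared telemetry store, so evaluation is a query rather than a retrofit. The record shape and function names here are illustrative assumptions, not a prescribed schema.

```python
import time

telemetry = []  # routed into the unified platform, not a per-Agent log file


def log_turn(agent, user_msg, reply, trace, rating=None):
    """Record one conversation turn with its reasoning trace and rating."""
    telemetry.append({
        "ts": time.time(), "agent": agent, "user": user_msg,
        "reply": reply, "trace": trace, "rating": rating,
    })


def avg_rating(agent):
    """Because the data is there from day one, performance evaluation
    is a simple aggregation instead of a six-month retrofit."""
    scores = [t["rating"] for t in telemetry
              if t["agent"] == agent and t["rating"] is not None]
    return sum(scores) / len(scores) if scores else None


log_turn("contract_agent", "review clause 7", "clause 7 caps liability",
         trace=["retrieved c-001", "compared to template"], rating=5)
```

The same store feeds strategy tuning: low-rated turns can be joined back to their reasoning traces to see where retrieval or prompting went wrong.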
Define access control at the data layer, not per-Agent
The more Agents you have, the more critical governance becomes. Enforce row-level and column-level permissions plus operation audit at the foundation, so every Agent receives only compliant, pre-filtered data. Ten Agents, one security policy — not ten. This isn't optional; it's a compliance baseline.
Nail one scenario end-to-end, then replicate horizontally
Don't try to build an "all-in-one Agent platform" from scratch. Pick a scenario with clean data and visible business impact — after-sales Q&A, contract clause review — and run it through the full stack. The second scenario will go much faster because pipelines, access policies, and memory frameworks all carry over.
Final thoughts
Nobody questions the value of Agents. But whether that value gets realized comes down to infrastructure choices.
The lesson from the last decade of digital transformation was simple: applications are easy to build; foundations are hard to retrofit. The Agent era is even more demanding — Agents don't just consume data, they continuously produce it; and the data they need isn't just structured tables and metrics, it's contracts, drawings, documents, and other unstructured content that was never governed by last-generation platforms. Stack these three dimensions together, and Agents without a foundation only get messier the more you build.
The good news: this time, we can get the foundation right from the very first scenario. No waiting, no retrofitting — foundation and intelligence growing together is the way.