
Over the past year, I’ve spent a lot of time building with agents.
Like many engineers, I started with the usual assumptions: better reasoning, better prompts, better tools, better RAG, better orchestration. And yes, all of that matters. Models have improved fast, and it’s now possible to build agents that look very impressive in demos.
But once I started pushing agents into real enterprise environments, I kept running into the same problem.
It wasn’t that the model failed to understand the task.
It wasn’t that the prompt was weak.
It wasn’t even that the APIs weren’t connected.
The real issue was simpler:
the agent didn’t actually understand how enterprise data was connected.
One of the first hard failures I saw came from a seemingly simple request:
Find orders that have already shipped but still haven’t been invoiced after 48 hours.
At the business level, that sounds straightforward.
At the data level, it’s a mess.
The order lives in one system. Shipment status lives in another. Invoice status lives in finance. Customer context sits somewhere in CRM. And across those systems, names, schemas, and keys are rarely consistent.
A field might be called order_no in one system and source_id in another.
Sometimes the relationship is indirect and requires intermediate tables.
Sometimes the documentation is incomplete.
Sometimes the column names look similar but mean different things.
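To make the mess concrete, here is a toy sketch of what that query looks like once someone has stitched the systems together. Every table and column name here is hypothetical, including the order_no / source_id mismatch; the point is that the business question only becomes answerable after a human has decided which keys line up.

```python
import sqlite3

# Toy stand-ins for three separate systems, loaded into one in-memory
# SQLite database purely for illustration. All names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders    (order_no  TEXT PRIMARY KEY, customer TEXT);  -- order system
CREATE TABLE shipments (source_id TEXT, shipped_at  TEXT);           -- same key, different name
CREATE TABLE invoices  (order_ref TEXT, invoiced_at TEXT);           -- finance system

INSERT INTO orders VALUES ('A-100', 'Acme'), ('A-101', 'Globex');
-- A-100 shipped 72 hours ago and was never invoiced; A-101 was invoiced.
INSERT INTO shipments VALUES ('A-100', datetime('now', '-72 hours')),
                             ('A-101', datetime('now', '-72 hours'));
INSERT INTO invoices VALUES ('A-101', datetime('now', '-70 hours'));
""")

# "Shipped but not invoiced after 48 hours" only works because we
# already know order_no == source_id == order_ref.
rows = conn.execute("""
SELECT o.order_no, o.customer, s.shipped_at
FROM orders o
JOIN shipments s ON s.source_id = o.order_no
LEFT JOIN invoices i ON i.order_ref = o.order_no
WHERE i.order_ref IS NULL
  AND s.shipped_at <= datetime('now', '-48 hours')
""").fetchall()

print(rows)  # only the uninvoiced order survives the filter
```

The SQL itself is trivial. The hard part is the three ON clauses: nothing in the schemas says those columns are the same key, and an agent that guesses wrong still returns a plausible-looking result.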
That’s when I realized where agents become risky in enterprise settings. They can generate SQL and call tools, but they often don’t know how the underlying data objects should actually connect.
And in enterprise systems, the most dangerous failure is not an exception. It’s a result that looks plausible and quietly gets trusted.
After hitting this wall a few times, I started looking for solutions more systematically. I reviewed the usual categories: text-to-SQL systems, semantic layers, metadata catalogs, lineage tools, observability products, and structured-data agent frameworks.
A lot of them solve useful parts of the problem.
But I kept feeling that something essential was missing.
Most people are focused on helping agents understand the question better. Much less attention is being paid to helping agents understand how enterprise data is actually connected.
That difference matters more than it might sound.
Because real enterprise data is not clean or unified. Naming is inconsistent. Legacy systems never fully disappear. Documentation gets stale. Business meaning often lives in people, not schemas. And the real relationship between systems is often hidden in the data itself, not in the field name.
That’s why Arisyn caught my attention.
What I found interesting was its angle: instead of relying mainly on naming conventions or metadata labels, it focuses on the characteristics of the data itself. It identifies inclusion, equivalence, and hierarchical relationships based on actual value patterns, and it can generate executable SQL JOIN paths across heterogeneous systems.
That stood out to me immediately, because if you’ve worked on enterprise data long enough, you learn that names are often the least reliable layer.
The other thing I found important was that this isn’t just about relationship visualization. Arisyn can return relationship results in a structured, machine-consumable form, such as JSON-style edges between tables and columns. That matters because once relationship discovery becomes machine-readable, it stops being just an analyst convenience and starts looking like infrastructure for agents.
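I don't know Arisyn's internals, so the following is only a minimal sketch of the general idea: classify relationships by comparing actual column values rather than names, then emit the result as machine-consumable JSON edges. The column samples, system names, and classification rules are all my own illustrative assumptions.

```python
import json

def classify(col_a, col_b):
    """Classify the relationship between two columns by value overlap,
    not by name. Returns 'equivalence', 'inclusion', or None."""
    a, b = set(col_a), set(col_b)
    if not a or not b:
        return None
    if a == b:
        return "equivalence"
    if a <= b or b <= a:
        return "inclusion"  # one column's values are a subset of the other's
    return None

# Hypothetical column samples from systems whose names share nothing.
columns = {
    ("erp", "orders.order_no"):     ["A-100", "A-101", "A-102"],
    ("wms", "shipments.source_id"): ["A-100", "A-101"],
    ("crm", "accounts.region"):     ["EU", "US"],
}

# Compare every pair of columns and keep only detected relationships
# as JSON-style edges an agent could consume.
edges = []
keys = list(columns)
for i, ka in enumerate(keys):
    for kb in keys[i + 1:]:
        rel = classify(columns[ka], columns[kb])
        if rel:
            edges.append({"from": ".".join(ka), "to": ".".join(kb), "type": rel})

print(json.dumps(edges, indent=2))
```

Even this naive version finds the order_no / source_id link that the names hide, and correctly ignores the region column. A real system would need sampling, thresholds, and type handling, but the output shape is the interesting part: edges a program can act on.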
The deeper insight for me was this:
this is not just a data problem. It’s an action problem.
An agent that answers questions is useful.
An agent that can safely operate across multiple enterprise systems is much harder to build.
Because action requires more than language understanding. It requires connection certainty.
If an agent is going to reconcile records, diagnose delayed operations, or trigger business workflows, it needs to know how the data world underneath those tasks is structured. Without that layer, the agent can talk, suggest, and generate plausible outputs — but it cannot reliably operate across real enterprise complexity.
That’s why I’ve started thinking of this missing piece as a data relationship intelligence layer.

Not a BI tool.
Not just metadata.
Not just lineage.
Not exactly a semantic layer either.
Something more operational:
· where should the agent get the data?
· how do these tables actually connect?
· which path is trustworthy?
· which relationships should be excluded?
· what can safely enter an execution workflow?
In that sense, this layer looks a lot like a navigation system for agents operating inside messy enterprise environments.
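To picture that navigation role, here is a small sketch of what routing over machine-readable relationship edges could look like: a search that only traverses edges above a confidence threshold, and refuses to return a path rather than guess. The edge list, confidence scores, and threshold are all hypothetical.

```python
from collections import deque

# Hypothetical relationship edges, as a discovery layer might emit them.
# 'confidence' values and the 0.9 threshold below are illustrative.
edges = [
    {"from": "erp.orders",    "to": "wms.shipments", "confidence": 0.97},
    {"from": "wms.shipments", "to": "fin.invoices",  "confidence": 0.92},
    {"from": "erp.orders",    "to": "crm.accounts",  "confidence": 0.40},  # too weak to trust
]

def join_path(start, goal, min_confidence=0.9):
    """Breadth-first search over trusted edges only: an answer to
    'which path is trustworthy?' before anything gets executed."""
    graph = {}
    for e in edges:
        if e["confidence"] >= min_confidence:
            graph.setdefault(e["from"], []).append(e["to"])
            graph.setdefault(e["to"], []).append(e["from"])
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no trustworthy path: safer to refuse than to guess

print(join_path("erp.orders", "fin.invoices"))
```

The refusal case matters as much as the success case: asked to reach crm.accounts, this returns None instead of routing through a low-confidence edge, which is exactly the behavior you want before a join enters an execution workflow.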
My current take is simple:
enterprise agents do not just need better language models.
They need a continuously maintained, executable, and governable understanding of how data connects.
That’s the part I think many teams are still missing.
If we keep focusing only on making agents better at reasoning, while ignoring whether they can reliably navigate real enterprise data structures, we’ll keep building agents that look strong in demos but stay fragile in production.
So if someone asked me what’s still undervalued in the agent stack, beyond models, RAG, and tool use, my answer would be:
data relationship intelligence.
Because before an agent can truly act, it has to understand the map of the data world it operates in.