By Kartik Anand, Cloud & AI Architect | Agentic Systems on Azure
— -
Every enterprise AI team is solving the same problem independently right now: how do we connect our AI agents to our data systems?
Each team figures out the MCP connection. Each team handles auth. Each team builds their own retry logic. Each team manages their own observability. And when something breaks at 2am, nobody knows which agent called which system or why.
There’s a better pattern. And most organizations already own the infrastructure to build it.
— -
The Problem with Point-to-Point MCP Connections
Model Context Protocol is the right abstraction for connecting AI agents to data systems. Snowflake exposes a managed MCP server. Databricks is building one. ServiceNow, SAP, and a dozen other enterprise platforms are following. The protocol is winning.
But most implementations today look like this:
ClaimsAgent    → Snowflake MCP (custom auth, custom retry)
PharmacyAgent  → Databricks MCP (different auth, different retry)
OpsAgent       → Fabric Data Agent (different auth again)
TicketingAgent → ServiceNow MCP (yet another auth pattern)
Point-to-point. No consistency. No central observability. No governance. Every team reinventing the same wheel with slightly different spokes.
This is the microservices sprawl problem, arriving in the MCP era right on schedule.
— -
APIM as the Enterprise MCP Registry
Azure API Management is already the right answer to this problem. It’s what you put in front of APIs when you need consistent auth, retry, rate limiting, versioning, and observability across a heterogeneous set of backends. MCP servers are APIs. The same governance logic applies.
The pattern:
Any AI Agent
  → APIM (Enterprise MCP Gateway)
      ├── /mcp/snowflake-claims    → Snowflake CLAIMS_MCP
      ├── /mcp/databricks-pharmacy → Databricks Genie
      ├── /mcp/fabric-hospitalops  → Fabric Data Agent
      ├── /mcp/sap-finance         → SAP MCP server
      └── /mcp/servicenow-tickets  → ServiceNow MCP server
Every MCP server in the enterprise gets a consistent, governed endpoint. Any AI agent calls /mcp/snowflake-claims — it doesn’t know or care what’s behind it. The agent’s job is reasoning. APIM’s job is everything else.
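To make the routing idea concrete, here is a minimal Python sketch of the route table behind the gateway. It is an illustration of the lookup logic only, not APIM configuration: in APIM the same mapping would live in API and backend definitions. The `McpRoute` class, the `resolve` function, and the backend labels are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class McpRoute:
    path: str     # governed path the agent calls
    backend: str  # label for the system behind it (illustrative only)


# Route table mirroring the diagram above. Backend labels are placeholders,
# not real connection strings.
ROUTES = {
    route.path: route
    for route in [
        McpRoute("/mcp/snowflake-claims", "Snowflake CLAIMS_MCP"),
        McpRoute("/mcp/databricks-pharmacy", "Databricks Genie"),
        McpRoute("/mcp/fabric-hospitalops", "Fabric Data Agent"),
        McpRoute("/mcp/sap-finance", "SAP MCP server"),
        McpRoute("/mcp/servicenow-tickets", "ServiceNow MCP server"),
    ]
}


def resolve(path: str) -> McpRoute:
    """Return the registered route for a governed path, or raise LookupError."""
    try:
        return ROUTES[path]
    except KeyError:
        raise LookupError(f"no MCP route registered for {path!r}") from None
```

The agent only ever sees the key of this table. Swapping the backend behind `/mcp/snowflake-claims` changes one entry and no agent code.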
— -
What You Get Per MCP Endpoint
Auth normalization. Every backend has its own authentication requirement. Snowflake wants one thing. ServiceNow wants another. APIM normalizes it — the agent sends its Entra ID token once, APIM handles the translation to whatever the backend expects. No agent ever holds a backend credential directly.
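A rough Python sketch of the translation step described above, under stated assumptions: `normalize_auth`, `BACKEND_AUTH`, and the placeholder credential strings are all hypothetical, and token validation is passed in as a function rather than implemented. In APIM this logic would be expressed as inbound policy, not Python.

```python
from typing import Callable, Dict

# Hypothetical per-backend credential providers, held only by the gateway.
# Real values would come from a secret store, never from literals like these.
BACKEND_AUTH: Dict[str, Callable[[], Dict[str, str]]] = {
    "/mcp/snowflake-claims": lambda: {"Authorization": "Bearer <snowflake-token>"},
    "/mcp/servicenow-tickets": lambda: {"Authorization": "Basic <servicenow-credential>"},
}


def normalize_auth(
    path: str,
    inbound_headers: Dict[str, str],
    validate_caller_token: Callable[[str], bool],
) -> Dict[str, str]:
    """Check the caller's bearer token, then swap it for the backend credential."""
    lowered = {k.lower(): v for k, v in inbound_headers.items()}
    auth = lowered.get("authorization", "")
    if not auth.startswith("Bearer ") or not validate_caller_token(auth[len("Bearer "):]):
        raise PermissionError("caller token missing or invalid")

    # Drop the caller's token so it is never forwarded to the backend.
    outbound = {k: v for k, v in inbound_headers.items() if k.lower() != "authorization"}
    outbound.update(BACKEND_AUTH[path]())
    return outbound
```

The point of the shape is that the backend credential appears only on the outbound side. The agent never receives it and cannot leak it.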
Retry per endpoint. Snowflake’s Cortex Agent cold start behavior is different from Databricks Genie’s. Point-to-point connections handle this inconsistently or not at all. APIM lets you define the right retry policy for each backend independently — exponential backoff, max attempts, timeout — without touching agent code.
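As an illustration of what "the right retry policy for each backend" means, the sketch below keeps one policy per endpoint and applies capped exponential backoff. The numbers are invented for the example and are not measured cold-start figures; `RetryPolicy`, `POLICIES`, and `call_with_retry` are hypothetical names, and timeouts are left out for brevity. APIM would express this declaratively in policy rather than in Python.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int
    base_delay_s: float
    max_delay_s: float


# Illustrative values only: a slower-starting backend gets more patience.
POLICIES: Dict[str, RetryPolicy] = {
    "/mcp/snowflake-claims": RetryPolicy(max_attempts=5, base_delay_s=2.0, max_delay_s=30.0),
    "/mcp/databricks-pharmacy": RetryPolicy(max_attempts=3, base_delay_s=0.5, max_delay_s=5.0),
}


def backoff_delays(policy: RetryPolicy) -> List[float]:
    """Delays between attempts: base * 2**n, capped at max_delay_s."""
    return [
        min(policy.base_delay_s * 2 ** n, policy.max_delay_s)
        for n in range(policy.max_attempts - 1)
    ]


def call_with_retry(
    call: Callable[[], T],
    policy: RetryPolicy,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on ConnectionError according to the policy."""
    delays = backoff_delays(policy)
    for attempt in range(policy.max_attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == policy.max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(delays[attempt])
    raise ValueError("max_attempts must be at least 1")
```

Because the policy is looked up by endpoint, tuning Snowflake's retry behavior never touches the Databricks route or any agent.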
Rate limiting. One rogue agent shouldn’t be able to hammer a shared Snowflake instance. APIM enforces per-caller, per-endpoint limits. A caller that exceeds its limit gets a clean 429 Too Many Requests response. The backend stays healthy.
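The per-caller, per-endpoint idea can be sketched as a fixed-window counter keyed on both values. This is a simplified Python illustration, not how APIM implements its limits internally; `FixedWindowLimiter` is a hypothetical class, and the sketch never evicts old windows, which a real limiter would need to do.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` calls per caller and path in each time window."""

    def __init__(
        self,
        limit: int,
        window_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        # (caller, path, window index) -> calls seen so far in that window
        self._counts: DefaultDict[Tuple[str, str, int], int] = defaultdict(int)

    def check(self, caller: str, path: str) -> int:
        """Return 200 if the call is allowed, 429 if the caller is over its limit."""
        window = int(self.clock() // self.window_s)
        key = (caller, path, window)
        if self._counts[key] >= self.limit:
            return 429
        self._counts[key] += 1
        return 200
```

Keying on the caller as well as the path is what keeps one noisy agent from consuming another agent's budget on the same backend.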
Versioning. /mcp/snowflake-claims/v1 and /v2 coexist while teams migrate. Agents that haven’t updated yet keep working. No big-bang cutover.
Unified observability. Every MCP call across every backend flows through one logging layer. When something breaks at 2am, you have one place to look — which agent called which MCP server, what it sent, what came back, how long it took, whether it retried. This is the operational capability that makes the difference between a demo and a system.
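As a rough picture of what "one logging layer" records, the sketch below wraps any MCP call and emits one JSON line per call. It captures only a subset of the fields listed above (caller, route, outcome, duration); request and response bodies and retry counts are omitted. `logged_call` and its field names are hypothetical, and a real deployment would rely on the gateway's own diagnostics rather than Python code.

```python
import json
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def logged_call(
    caller: str,
    path: str,
    call: Callable[[], T],
    sink: Callable[[str], None],
    clock: Callable[[], float] = time.perf_counter,
) -> T:
    """Run call() and write one structured log line, whether it succeeds or fails."""
    start = clock()
    record = {"caller": caller, "path": path}
    try:
        result = call()
        record["outcome"] = "ok"
        return result
    except Exception as exc:
        record["outcome"] = "error"
        record["error"] = type(exc).__name__
        raise
    finally:
        # Runs on both paths, so failed calls are logged too.
        record["duration_ms"] = round((clock() - start) * 1000, 1)
        sink(json.dumps(record, sort_keys=True))
```

Since every route passes through the same wrapper, the 2am question "which agent called which MCP server" becomes a filter on two fields.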
Discovery. APIM’s developer portal becomes your internal MCP catalog. Teams browse available MCP servers like an API marketplace — here’s what data sources your AI agents can reach, here’s how to connect, here’s the schema. New agents don’t start from scratch. They pick from the catalog.
— -
The Organizational Story
Right now, connecting an AI agent to Snowflake requires knowing Snowflake’s MCP endpoint, handling their auth pattern, building retry logic, and figuring out observability. Every team does this independently.
With APIM as the registry, that’s solved once, centrally, by the platform team. A new specialist agent doesn’t think about infrastructure. It thinks about its domain:
“I need claims data → call /mcp/snowflake-claims”
“I need hospital ops → call /mcp/fabric-hospitalops”
The platform team owns the registry. Domain teams own their agents. Clear separation. Clean ownership. This is how enterprise AI infrastructure actually scales.
— -
Why This Isn’t Being Built Yet
Most organizations are still on their first or second MCP connection. The governance problem doesn’t feel urgent when you have two endpoints. It feels very urgent when you have twenty, owned by five different teams, called by a dozen agents, with no central visibility.
The teams building MCP connections today are AI engineers. The teams that would naturally own a registry like this — platform engineering, API governance, cloud architecture — aren’t in the MCP conversation yet. That gap is where the mess accumulates.
The organizations that build this registry now, before the sprawl sets in, will have a meaningfully better operational posture when the number of MCP connections scales.
— -
What This Looks Like in Practice
In the architecture I’ve been building — an A2A orchestration platform across Snowflake and Microsoft Fabric — APIM already sits in front of the escalation action layer. The natural next step is extending that same APIM instance to front every MCP connection, not just the action tools.
One APIM instance. Multiple MCP backends. Consistent governance across all of them. Any new data source the organization wants to make available to AI agents gets added as a new route — auth configured once, retry policy set, logging enabled, added to the catalog.
That’s the enterprise MCP registry. It’s not a new product. It’s a new use of infrastructure you probably already have.
— -
The Three Things to Build First
If you’re starting today:
1. Route your first MCP connection through APIM. Pick your highest-value MCP server — probably Snowflake or Databricks if you have them — and put APIM in front of it. Add the inbound auth policy and a basic retry policy. That’s your registry with one entry.
2. Standardize the URL pattern. Establish /mcp/{platform}-{domain} as your naming convention before you have ten endpoints. Renaming later is painful.
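3. Enable logging to Application Insights. It is a small, one-time configuration: attach an Application Insights logger to the APIM instance and turn on diagnostics for the MCP APIs. You’ll thank yourself the first time something fails and you need to know why.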
Everything else — versioning, rate limiting, the developer portal catalog — follows naturally once the first connection is clean.
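The naming convention from step 2 is easy to enforce mechanically, for example in a CI check that runs before a new route is registered. The Python sketch below assumes lowercase alphanumeric platform and domain names and an optional `/vN` version suffix; those character rules are an assumption of this example, not part of the convention as stated, and `parse_mcp_path` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# /mcp/{platform}-{domain} with an optional /vN suffix.
# Lowercase letters and digits only: an assumed rule for this sketch.
MCP_PATH = re.compile(
    r"/mcp/(?P<platform>[a-z0-9]+)-(?P<domain>[a-z0-9]+)(?:/v(?P<version>[0-9]+))?"
)


def parse_mcp_path(path: str) -> Tuple[str, str, Optional[int]]:
    """Split a governed MCP path into (platform, domain, version or None)."""
    match = MCP_PATH.fullmatch(path)
    if match is None:
        raise ValueError(f"path does not follow /mcp/{{platform}}-{{domain}}: {path!r}")
    version = match.group("version")
    return (
        match.group("platform"),
        match.group("domain"),
        int(version) if version is not None else None,
    )
```

Rejecting a non-conforming path at registration time is far cheaper than renaming an endpoint that agents already depend on.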
— -
The MCP ecosystem is moving fast. The governance layer is moving slower. The organizations that close that gap early will have the operational foundation to scale agentic AI without the chaos that usually follows platform adoption at speed.
— -
Part of a series on enterprise agentic AI architecture. Full reference implementation: HealthIQ Multi-Agent A2A Architecture · Deep dive: Building a Multi-Agent A2A Architecture on Snowflake and Microsoft Fabric