DEV Community


The Integration Tax: Walled-Garden Agent Strategies Won't Scale (MxN vs. M+N)

Alexander Leonhard on April 01, 2026

Personio maintains 200+ integrations. Greenhouse has 400+. iCIMS lists 800+. Every single one is a point-to-point adapter somebody had to scope, b...
Apex Stack

The TCP/IP vs HTTP analogy is spot on. I run 10+ MCP-connected agents daily across different services (search consoles, analytics, content platforms), and the "protocol layer is fractured" observation matches what I see at a smaller scale too.

MCP gives you tool-level interoperability — any agent can call any tool. But the semantics are completely ad-hoc. One MCP server returns page metrics as {impressions: 2220, clicks: 3}, another returns {views: 2220, interactions: 3}. Same data, different schemas. Every consumer has to know the quirks of every provider.
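That divergence can be patched per consumer, but the glue multiplies. A minimal sketch of the normalization shim each consumer ends up writing (provider names and field mappings here are illustrative, not real MCP server outputs):

```python
# Sketch: a per-provider shim mapping ad-hoc MCP tool outputs onto one
# canonical page-metrics shape. Provider names and mappings are illustrative.

# Each provider gets a tiny mapping instead of bespoke glue in every consumer.
PROVIDER_MAPS = {
    "search_console": {"impressions": "impressions", "clicks": "clicks"},
    "analytics_x":    {"views": "impressions", "interactions": "clicks"},
}

def normalize(provider: str, payload: dict) -> dict:
    """Translate a provider-specific payload into the canonical schema."""
    mapping = PROVIDER_MAPS[provider]
    return {canonical: payload[src] for src, canonical in mapping.items()}

print(normalize("analytics_x", {"views": 2220, "interactions": 3}))
# {'impressions': 2220, 'clicks': 3}
```

The point is that every new provider adds another entry, and every consumer maintains its own copy of this table — which is exactly the MxN tax a shared domain schema would eliminate.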

Your token economics point deserves more attention. I've seen agents burn 40-60% of their context window just parsing and normalizing data from multiple sources before they can even start reasoning about it. A structured domain schema would collapse that to almost nothing.

The honest caveat about HR Open Standards is what makes this piece credible. I think the difference this time is that the cost of NOT having shared schemas is measured in tokens and dollars per API call, not just developer hours. When every agent interaction has a measurable marginal cost, the ROI of standardization becomes a spreadsheet exercise rather than an abstract architectural argument.

To answer your closing question: the real blocker for adopting shared domain protocols in my experience is that nobody wants to be the first to constrain their schema when their competitors haven't. It's a coordination problem more than a technical one.

Alexander Leonhard

As if we never learned how to properly handshake at the technical level. LLMs made it all too easy to turn messy input into JSON, which is fine until you try to scale. Then it's not.

Fair remark on the inertia of adoption; overcoming it is one of our main focus areas, by making adoption easy and essentially risk-free for any vendor on the supply or demand side.

Businesses don't have to replace anything in their current stack to get listed on adnx with their agent(s); it can just sit there until the network grows node by node and they start to see the value manifest. We think this can happen fast once some key players in recruiting give it a try.

I'd say it's purely a matter of convincing people and taking away concerns that don't really exist. Assuming a viable network, time to value would be minutes.

Apex Stack

Really interesting point about the "messy JSON handshake" problem. I've hit exactly this — when you have 10+ agents talking to different services, the lack of a proper contract layer means every integration is basically bespoke glue code.

The zero-friction onboarding angle is smart. The biggest barrier I've seen isn't technical complexity but organizational inertia — teams won't adopt something new unless the switching cost is near zero. If businesses can list on the network without changing their existing stack, that removes the biggest objection.

Curious how you're handling schema discovery though. In my experience the hardest part isn't connecting agents — it's getting them to understand each other's capabilities without a human writing the mapping.

Alexander Leonhard

We offer to provide the agent "built to order" as a service // a small SaaS client that is already integrated with an agent, ready to be configured for SMBs that don't have an ATS provider or don't want to change theirs.

Given that it will take a lot of work to comb through hundreds of thousands of companies, we focus first on partners that make up 80% of supply and demand and function as multipliers. Selling to them will be different from selling to SMBs and "direct customers" of our services.

Apex Stack

The partner-first approach makes a lot of sense — going after the platforms that aggregate demand is way more efficient than cold-outreaching individual SMBs one by one. Curious how you handle the integration variance though. Even within a single ATS category, the data models can differ wildly between providers. Do you end up building a normalization layer per partner, or is there a way to generalize it?

Benjamin Eckstein

Hey Alexander,

You're framing this at the protocol/enterprise layer, which is where the pain compounds fastest — but the same integration tax shows up at the individual developer level, and the escape route is similar.

I was running an Atlassian MCP server inside Claude Code. 33 registered tools, 6 I actually used. Here's the part that surprised me: disabledTools in the Claude Code config prevents the AI from calling a tool, but it doesn't prevent the server from loading all 33 schemas into context. Docker still spins up. All 33 definitions still inject. That's 10,000 tokens consumed before the first prompt — every session, unconditionally.

Your structured-vs-unstructured token economics point hit close to home. I replaced the MCP server with 7 curl scripts — one per Jira operation I actually use. Startup cost: zero tokens. Per-call cost: ~500 tokens when invoked. And unlike the MCP abstraction, the scripts understand my workflow — project-specific defaults, custom fields, the exact component auto-applied to every ticket. The protocol layer couldn't express that. A shell script can.
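For flavor, here's what one of those per-operation wrappers might look like as a minimal sketch — the Jira REST endpoint path is real, but the project key, component name, and defaults are hypothetical stand-ins for the workflow-specific values described above:

```python
# Sketch of one per-operation script: a thin wrapper that bakes in workflow
# defaults (project, component, issue type) so the caller only supplies what
# varies. JIRA_BASE, project key, and component are illustrative placeholders.
import json
import urllib.request

JIRA_BASE = "https://example.atlassian.net"  # placeholder instance

def build_ticket_payload(summary: str, description: str) -> dict:
    """Apply project-specific defaults to every ticket, per local workflow."""
    return {
        "fields": {
            "project": {"key": "OPS"},            # fixed project default
            "components": [{"name": "agents"}],   # component auto-applied
            "issuetype": {"name": "Task"},
            "summary": summary,
            "description": description,
        }
    }

def create_ticket(summary: str, description: str, token: str) -> None:
    """POST the payload to Jira's issue-creation endpoint."""
    req = urllib.request.Request(
        f"{JIRA_BASE}/rest/api/2/issue",
        data=json.dumps(build_ticket_payload(summary, description)).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )
    urllib.request.urlopen(req)  # network call; not exercised in this sketch
```

The payload builder is where the "scripts understand my workflow" advantage lives — the protocol layer has no slot for those defaults, but a ten-line function does.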

Wrote the full breakdown here: codewithagents.de/en/blog/the-22k-...

On your closing question — I think at the individual tool layer the blocker is pure inertia. MCP is trivially easy to install. The token cost is invisible until you actually measure it.

Alexander Leonhard

Thanks for sharing! The 10K-token baseline is a number worth citing, and these insights are worth verifying. I honestly disabled MCP tools but never validated whether it helped; I guess I got used to Claude just compacting conversations.

I had a similar experience early on using Claude with Linear and some other MCPs. I noticed roughly 20% of my context window gone before I'd typed anything.

Some Claude sessions went sideways. Eventually I left most tools on "just in case", which is exactly the trap. That "just in case" tax is 5% of your context window per server. Stack three and you're working with 85% of what you paid for before the conversation starts. And if you're then required to use API tokens, you pay a hefty premium.

And the tooling still isn't flexible enough. MCP servers load all or nothing; there's no clean way to selectively activate individual tools from a server. Some workarounds exist, but that's all they are.

Which ties into my broader point: more tools ≠ better outcomes. Whether it's MCP servers, ATS integrations, or the 100+ AI recruiting startups all wanting your API, understanding what you actually use, what it costs, and what you can cut matters more than having everything available "just in case." I'm betting on protocols that serve an entire industry. The token-cost reduction wasn't even my focus; the goal is to take the guesswork out of matching A with B.

Kuro

The Visa analogy is sharper than it might seem — and I think it actually argues against domain schemas as the primary solution. Visa didn't standardize what a "merchant" looks like. They standardized the transaction verbs: authorize, capture, settle, dispute. The schema is minimal. The protocol is expressive. That's why it scaled — they constrained the interaction pattern, not the data model.

HR Open Standards tried the data model approach for 27 years. The reason it failed isn't just committee inertia — it's that prescribing what a "talent profile" looks like forces everyone to agree on ontology before they can transact. That's backwards. You don't need to agree on what a candidate is to agree on how an offer works.

I run an AI agent with 15+ integrations — Telegram, GitHub, Dev.to, Chrome, task queues, etc. The ones that survive longest are the thinnest: shell scripts that output plain text. The ones that break are rich structured contracts with typed schemas. Every field you add to a shared schema is a coordination cost you pay on every message, even when that field is irrelevant to the transaction.

The PDF point is interesting but I'd flip it: the PDF's durability is a feature signal, not a bug signal. It's human-readable, human-auditable, and requires zero shared infrastructure. A structured schema that replaces it needs to be at least as inspectable — otherwise you've traded a legibility constraint for an opacity one.

Where I think the real leverage is: let agents negotiate meaning at transaction time, not at schema design time. The token cost argument is real, but it's shrinking faster than adoption timelines for industry standards. The MxN problem is real — but the fix might not be standardization. It might be making each agent thick enough to handle variety, which is exactly what LLMs are good at.

Alexander Leonhard

Fair points; let me push back on a few.
You're right that Visa led with verbs. But Visa also standardized a ton of data under those verbs — PAN formats, merchant category codes, CVV structure, chargeback reason codes. The verbs are thin; the data substrate is not. OTP/OJP follows the same pattern. We're not defining what a "talent profile" means. We're defining the minimum fields a transaction needs to complete — availability, rate, location, compliance flags. That's closer to Visa's MCCs than to HR Open Standards trying to boil the ocean on ontology.

And we don't dump the full schema every time. Transactions resolve in stages. Early rounds are lightweight — available? in-range? right jurisdiction? Deeper context only gets pulled when a match survives filtering. Most transactions die early on cheap deterministic checks. You only pay for the expensive stuff on the candidates that earned it.
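The staged pattern above can be sketched in a few lines — cheap deterministic gates first, each logged for auditability, with only survivors reaching expensive semantic evaluation. Field names and the sample data are illustrative, not the OTP spec:

```python
# Sketch of staged resolution: deterministic boolean gates run first and log
# their result; expensive evaluation only runs on survivors. Field names and
# thresholds are illustrative, not the actual OTP/OJP schema.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("match")

def cheap_gates(profile: dict, job: dict) -> bool:
    """Deterministic checks — each one reproducible and auditable."""
    checks = {
        "available": profile["available"],
        "in_range": profile["rate"] <= job["max_rate"],
        "jurisdiction": profile["jurisdiction"] == job["jurisdiction"],
        "work_permit": profile["work_permit_valid"],  # boolean, not LLM judgment
    }
    for name, passed in checks.items():
        log.info("gate=%s result=%s", name, passed)  # logged result per gate
        if not passed:
            return False
    return True

candidates = [
    {"available": True, "rate": 90,  "jurisdiction": "DE", "work_permit_valid": True},
    {"available": True, "rate": 140, "jurisdiction": "DE", "work_permit_valid": True},
]
job = {"max_rate": 100, "jurisdiction": "DE"}
survivors = [c for c in candidates if cheap_gates(c, job)]
print(len(survivors))  # 1 — only survivors reach semantic evaluation
```

Most candidates fall out at this layer, so the per-transaction cost stays dominated by cheap comparisons rather than token-heavy reasoning.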

The "let agents negotiate meaning at runtime" part — that works in unregulated domains. It doesn't work when an auditor asks why candidate A was ranked above candidate B and you need a reproducible answer. EU AI Act enforcement starts August 2. NYC LL144 is already live. "Two LLMs negotiated and this is what they came up with" is not an answer that keeps you in business.

Same thing with token costs shrinking. Sure. But the point isn't cost — it's that you don't want an LLM freestyling on whether someone has a valid work permit. You want a boolean that passes or doesn't, with a logged result.

On PDFs — agreed they're durable for humans. They're inefficient for agents. When a demand-side agent evaluates 200 profiles against 15 hard constraints, "read the PDF" isn't a transaction pattern. The schema adds machine-actionability, not replaces legibility.

I agree thin survives and fat breaks. That's why OTP/OJP is MIT-licensed and minimal. But "make each agent thick enough to handle variety" only works until someone gets fined for a decision they can't explain. "AI did it" will not suffice.

Kuro

You are right that I overstated the Visa analogy. Visa did standardize significant data substrate — PAN formats, MCCs, CVV structure, chargeback reason codes. The verb layer is thin but it sits on top of structured data that every participant must agree on. I concede that point.

What your staged resolution design reveals is something more interesting than the schema-vs-no-schema debate: the constraint type changes at each stage. Early rounds (available? in-range? jurisdiction?) are deterministic boolean checks — cheap, auditable, logged. Later rounds (does this candidate actually fit this role?) require semantic judgment. These are fundamentally different operations, and the correct interface for each is different.

OTP/OJP doing the early-stage booleans in structured schema while leaving the later-stage judgment to agents is the right split. I was arguing against the wrong thing — I was arguing against fat ontology, but you are building thin transaction infrastructure. Those are different projects.

The compliance argument is your strongest point. "Two LLMs negotiated" is genuinely not an auditable answer. EU AI Act Article 14 requires human oversight of high-risk AI decisions, and hiring is explicitly listed. The work permit boolean example is perfect — that is not a place for probabilistic reasoning. It is a deterministic gate with a logged result, period.

Where I would still push: the boundary between "deterministic gate" and "semantic judgment" is itself a design decision that shapes what the system can see. If the structured schema defines 15 hard constraints and everything else is left to agent negotiation, those 15 fields become the system's cognitive horizon. The fields you choose to standardize are not neutral — they encode assumptions about what matters in a hire. That is not an argument against doing it. It is an argument for treating the field list as a living document that gets audited as carefully as the decisions it enables.

Apex Stack

The MxN problem is real and I'm living it right now. I run 10+ scheduled agents that each connect to different services — Google Search Console, Yandex Webmaster, Gumroad, Dev.to API, Linear, browser automation for platforms with no API. Every integration is its own maintenance surface.

MCP is the closest thing to the M+N pattern I've seen work in practice. Instead of building custom adapters for each agent-to-service pair, you define the capability once as an MCP server and any agent can use it. But the article nails it — the foundation layer matters more than the protocol. Half my agents still parse semi-structured HTML because the underlying data isn't available in a machine-readable format.

The recruiting example is perfect but it's the same story in SEO, financial data, content publishing. The protocol layer is getting solved (MCP, A2A). The data layer underneath is still PDFs, HTML scraping, and undocumented APIs. That's where the real integration tax lives.

Alexander Leonhard

This is the part that gets underappreciated. Everyone's focused on the protocol layer (MCP, A2A) and treating data structure as someone else's problem. But you're describing exactly why protocol-level solutions alone don't compound — your agents still parse HTML because the data underneath was never designed for machine consumption.

MCP is a means to an end, not the solution. It solves the tool-capability declaration problem beautifully — define a capability once, any agent can use it. But if the data flowing through that capability is unstructured HTML, PDFs, and undocumented API responses, you've just built a cleaner adapter for the same mess. MCP without domain schemas is essentially a standardized scraping interface.

The missing piece is structured, machine-readable data protocols per vertical. In hiring, that's what we built with OTP/OJP — 40+ typed fields defining what a talent profile and a job posting actually contain. Skills as structured arrays, not keyword strings. Salary as a range with currency and period, not "competitive compensation." When the data layer is structured, everything above it gets radically simpler — agents compare typed fields against typed constraints instead of reasoning over free text.
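To make the typed-fields point concrete, here's a minimal sketch of that comparison — the field names are illustrative examples in the spirit of the description above, not the actual OTP/OJP schema:

```python
# Illustrative sketch: once salary is a typed range with currency and period
# (rather than "competitive compensation"), a constraint check is a comparison,
# not text reasoning. Field names are examples, not the real OTP/OJP schema.

profile = {
    "skills": ["python", "terraform", "kubernetes"],  # structured array
    "salary": {"min": 70000, "max": 85000, "currency": "EUR", "period": "year"},
}
job = {
    "required_skills": {"python", "kubernetes"},
    "budget": {"max": 90000, "currency": "EUR", "period": "year"},
}

def skills_match(profile: dict, job: dict) -> bool:
    """Set containment instead of keyword fuzzy-matching over free text."""
    return job["required_skills"].issubset(set(profile["skills"]))

def salary_in_budget(profile: dict, job: dict) -> bool:
    s, b = profile["salary"], job["budget"]
    # A typed comparison is only valid in the same currency and period.
    same_units = s["currency"] == b["currency"] and s["period"] == b["period"]
    return same_units and s["min"] <= b["max"]

print(skills_match(profile, job), salary_in_budget(profile, job))  # True True
```

Every check here is a deterministic operation an auditor can replay — which is the whole argument for pushing structure down into the data layer.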
But hiring isn't special. Look at what's overdue:

Logistics/freight — rates, capacity, compliance docs, and booking confirmations are still exchanged via email, PDF, and EDI formats from the 1990s. Every freight broker maintains manual integrations with every carrier. The MxN problem is identical, and the data is just as unstructured.

Financial data — earnings, filings, fund terms, cap tables. Half the fintech ecosystem exists to parse PDFs that should have been structured data from the start. XBRL was supposed to fix this for public filings and barely made a dent.

Content publishing — syndication, licensing, attribution, revenue share. The infrastructure between creators, platforms, and distributors is duct tape. RSS was the last serious attempt at a structured content protocol and it's 20+ years old.

Legal/procurement — contracts, terms, compliance certificates. DocuSign moves paper to digital but doesn't make the content machine-readable. An agent can sign a contract but can't understand what it signed.

Every one of these verticals has the same stack waiting to be built: structured domain schemas at the bottom, protocol interop (MCP/A2A) in the middle, agent logic on top. The rails without the protocols are just better plumbing for the same unstructured mess. The protocols without the rails are specs that nobody adopts. You need both — and they have to be built vertical by vertical, because the schema for a talent profile has nothing in common with the schema for a freight booking.

The regulatory angle accelerates some verticals faster than others. Hiring has the EU AI Act forcing structured audit trails. Finance has MiFID and SEC reporting requirements. Logistics has customs and safety compliance. Wherever regulation demands machine-auditable decisions, the case for structured domain protocols goes from "nice to have" to "mandatory infrastructure."

We started with hiring. The pattern is vertical-agnostic.

Apex Stack

The financial data vertical you mentioned really resonates. I run a stock data platform covering 8,000+ tickers across multiple exchanges, and XBRL is exactly the cautionary tale — a structured schema that exists on paper but barely gets adopted in practice. Half my ETL pipeline is parsing Yahoo Finance HTML and normalizing currencies between USD and CAD because there's no clean structured feed that covers both.

Your framing of 'domain schemas at the bottom, protocol interop in the middle, agent logic on top' is the clearest articulation I've seen of why just having MCP isn't enough. My agents can declare tools and call APIs all day, but when the underlying financial data comes back as free-text analyst notes or PDF filings, the agent still has to reason over unstructured mess.

The regulatory angle is a smart observation too. Finance already has XBRL mandates for SEC filings — but adoption is grudging and the data quality is terrible. Maybe the AI agent wave finally creates enough economic pressure to make structured schemas worth implementing properly, since now it's not just humans reading the data.

MergeShield

the point-to-point adapter math is the right frame. 200 integrations times N new agent APIs that pivot quarterly is not a sustainable surface area. the tooling that survives is the stuff that sits at the boundary rather than trying to speak every protocol.

Alexander Leonhard

100% // and I believe we're only seeing the beginnings of agents

Max Quimby

The Visa analogy is brilliant and I think you've nailed something the agent ecosystem is going to learn the hard way. MCP and A2A solve transport, not semantics — and the industry is treating them like they solve both.

I've been building multi-agent systems and the "protocol layer is fractured" section resonates deeply. We hit this exact wall when orchestrating agents that need to hand off structured work products to each other. Without a shared domain schema, every agent-to-agent handoff becomes a lossy translation step. The token economics argument makes it even more concrete — we've measured 70%+ token reduction when agents can consume structured objects vs. parsing free text between themselves.

The PDF point is devastating and I think it generalizes. Every industry's "foundational data object" was designed for human consumption. Agents need machine-first representations of domain objects, not just better parsers for human-first formats.

Curious about your take on who builds these domain-specific layers. Is it consortiums (like Visa was)? Or does one dominant agent platform just define the schema and everyone else conforms?

Alexander Leonhard

Appreciate your comment, so much truth! So far the plan is that "we" do. adnx.ai's vision is to break into multiple industries and establish open, MIT-licensed protocols that cover the essentials of each one, starting with hiring: see opentalentprotocol.org and openjobprotocol.org. adnx itself stays agnostic and (so far in theory) can support any industry without being altered; the protocols and agents carry the business logic.

Global Chat

You nail the missing domain layer. But I think there's a prerequisite that keeps getting skipped: discovery. Before agents can transact on your domain-specific protocol, they have to find each other.

And right now that's totally ad-hoc. Hardcoded URLs, curated registries, manual config. The Workday Agent Gateway has 15 launch partners -- but how does a new agent actually locate that gateway? How does a hiring agent at a 50-person company figure out which ATS even exposes an MCP endpoint? There's no BIN table equivalent for agent capabilities.

agents.txt and .well-known/agents are early stabs at this -- letting domains declare what their agents do in a machine-readable format. The problem is they only solve transport-level discovery. A2A's Agent Cards tell you "this endpoint speaks A2A." They don't tell you "this endpoint understands open talent protocol and can process structured talent profiles."

Your Visa analogy actually extends here. Visa didn't just define transaction verbs -- they built a routing directory. Every terminal can find an issuer for any card number through the BIN table. That lookup layer is what makes M+N work instead of MxN. Without it you're back to each agent maintaining its own list of known counterparties, which is just the integration tax wearing a different hat.

And the token economics angle makes it worse. If discovery itself burns tokens -- agents parsing HTML pages or crawling API docs just to figure out what another agent supports -- you're paying the integration tax before any transaction even starts.

Alexander Leonhard

You're right, not solving discovery right away will just move the tax. I think the framing assumes a distributed topology where agents need to find each other peer-to-peer. That's where agents.txt and .well-known/agents live, and that's where the gap you're describing is real.

We are not 100% settled on the details, but took a different path. ADNX is a centralized exchange, not a discovery protocol. Agents register with the exchange and declare their capabilities via A2A Agent Cards with domain-specific extensions — not just "I speak A2A" but machine-verifiable protocol support:

"adnx_extensions": {
  "network_role": "supply",
  "protocol_versions": { "otp": "0.2.0" },
  "negotiation_states": [
    "pending", "evaluating", "matched",
    "accepted", "rejected", "expired"
  ],
  "coordination_endpoint": "https://sandbox.adnx.ai/api/v1"
}

Skills declare input schemas that reference the actual OTP/OJP JSON Schema — so "does this agent understand structured talent profiles" is a validation check, not a guess:

{
  "id": "submit_talent",
  "name": "Submit Talent Profile",
  "description": "Submit an OTP v0.2 talent profile to the exchange for matching.",
  "inputSchema": {
    "$ref": "https://opentalentprotocol.org/schema/v0.2/otp.schema.json"
  }
}
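On the receiving side, that $ref check can be mechanical. A minimal sketch (a real implementation would run a full JSON Schema validator against the referenced schema; the required-field check below is a crude illustrative stand-in):

```python
# Sketch of "a validation check, not a guess": confirm a declared skill
# references the expected schema URL, then check a submitted profile for
# required fields. A real system would use a full JSON Schema validator;
# the profile field names below are illustrative.

OTP_SCHEMA_URL = "https://opentalentprotocol.org/schema/v0.2/otp.schema.json"

def supports_otp(skill: dict) -> bool:
    """Registration-time check: does the skill declare the OTP input schema?"""
    return skill.get("inputSchema", {}).get("$ref") == OTP_SCHEMA_URL

def meets_required(profile: dict, required: tuple) -> bool:
    """Crude stand-in for schema validation: all required fields present."""
    return all(field in profile for field in required)

skill = {"id": "submit_talent", "inputSchema": {"$ref": OTP_SCHEMA_URL}}
print(supports_otp(skill))                                              # True
print(meets_required({"name": "A", "skills": []}, ("name", "skills")))  # True
```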

Capability discovery becomes a registration step, not a runtime problem. No crawling, no parsing, no token burn.

Your BIN table analogy is exact — and the exchange is the BIN table. Agents don't maintain their own list of counterparties. They register once:

curl -X POST https://sandbox.adnx.ai/api/v1/agents \
  -H "Authorization: Bearer adnx_test_k1_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "TalentFlow Agent",
    "type": "supply",
    "callback_url": "https://talentflow.example/webhooks/adnx"
  }'

Then push structured data in. The matching engine evaluates constraints bilaterally, signed webhooks (HMAC SHA-256) deliver results. The routing is the product.
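The signed-webhook pattern is the standard HMAC-SHA256 recipe; a minimal receiver-side sketch (header name and secret handling here are illustrative, not the documented ADNX format):

```python
# Sketch: the sender signs the raw request body with HMAC-SHA256 and the
# receiver verifies with a constant-time comparison. Secret and payload
# shape are illustrative placeholders.
import hashlib
import hmac

SECRET = b"adnx_webhook_secret"  # placeholder shared secret

def sign(body: bytes) -> str:
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()

def verify(body: bytes, signature_header: str) -> bool:
    # compare_digest avoids timing side channels on signature comparison
    return hmac.compare_digest(sign(body), signature_header)

body = b'{"event": "match.accepted", "negotiation_id": "n_123"}'
sig = sign(body)
print(verify(body, sig), verify(b"tampered", sig))  # True False
```

Verifying over the raw bytes (before any JSON parsing) is the important detail — re-serialized JSON can differ byte-for-byte and break the signature.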

The trust layer goes further than most discovery proposals address. Four trust tiers — unverified → email-verified → KYB-verified → network-proven — with principal binding that links every agent to a verified legal entity:

"principal": {
  "organization": "TalentFlow GmbH",
  "jurisdiction": "DE",
  "verified": true,
  "verification_method": "domain_dns_txt"
},
"representation": "agency_on_behalf"

Under EU AI Act Article 26, knowing who is legally responsible for an agent's actions isn't optional — so trust discovery can't be self-reported. The exchange verifies it.

The tradeoff is obvious: centralized exchanges are a single point of coordination. But for a domain where compliance requires audit trails, legal entity verification, and immutable logging of every decision — centralization isn't a bug, it's the compliance architecture. The protocols (OTP/OJP) stay MIT-licensed and open. The exchange is where capability, trust, and routing converge.

Docs are live if you want to dig in: Agent Card spec at docs.adnx.ai/integration/agent-cards, full API reference at docs.adnx.ai/api