A few days ago, a devops engineer posted on r/devops:
"MCP servers just showed up in our infrastructure and I genuinely have no idea how to secur...
not sure DDD is the right ancestor here. separating blast radius and mixing capabilities is closer to least-privilege than bounded context. DDD cares about what things mean - MCP security cares about what agents can touch. different problems.
Yes, you're right, this can also be viewed from a pure security angle. But I think these approaches aren't mutually exclusive, and in the past I've often seen them go hand in hand. Reasoning about domain boundaries and actor contexts can help understanding security-related aspects, like threat vectors and trust boundaries, and vice versa.
fair point, they do coexist. but the key distinction for me is which question you lead with. DDD asks: where does ownership live? least-privilege asks: how far can this fail? when you anchor blast radius design in domain structure, you end up defending design decisions instead of scoping the actual failure surface. for the security case, that ordering matters.
That's a good point, thanks!
yeah. tends to matter most when you're writing the audit checklist - that's where the ordering difference shows up
You made the point that DDD cares about what things mean. I've been thinking about that since you wrote it, and I think the "what things mean" part actually gets more dangerous with AI agents - because the agent won't ask when a term is ambiguous. I wrote a follow-up exploring that angle through DDD's Ubiquitous Language:
Your agent keeps using that word ...
This is a sharp way to map classic DDD ideas onto modern agent infrastructure. The bounded-context framing for MCP servers feels especially useful because it turns “agent security” from a vague concern into explicit interface and blast-radius design. Good reminder that a lot of AI architecture work is really software architecture discipline coming back into focus.
Regarding "Software architecture discipline coming back into focus" - that's exactly how I'd frame it too. The follow-up explores another DDD pattern that's coming back into focus for the same reasons: Ubiquitous Language. Precise vocabulary matters even more when the consumer of that vocabulary is an agent that never asks for clarification: Your agent keeps using that word ...
Totally agree. Ubiquitous language stops being documentation sugar and becomes operational once agents are in the loop. Humans usually patch over fuzzy terms with context, but agents tend to amplify that fuzziness into repeatable wrong actions. The test I keep coming back to is simple: if two tools or services hear the same term, do they infer the same action without a human in the middle? If not, that term probably belongs inside a tighter bounded context or needs an explicit schema.
Yes, the more I think about it, the more I realize that traditional software architecture and systems design provides so many useful approaches and mental models when integrating AI.
Exactly. A lot of agent failures look novel at first, but once you squint they’re familiar systems problems: unclear ownership, leaky abstractions, and hidden side effects. That’s why the bounded-context lens is so useful for MCP — each server needs a narrow contract and explicit failure semantics, not just a big “smart” surface that does everything. I’ve seen the same pattern in financial data workflows too: the model can reason across domains, but the system still needs boring invariants underneath if you want trustworthy behavior.
Interesting. Do you have any thoughts on using the DDD concepts of Ubiquitous Language(s) and Context Maps to help with reasoning within and across domains?
The bounded context mapping from DDD translates
cleanly to how I think about AI entity capabilities
at the protocol layer. Each entity has a fixed set
of operations it can perform, scoped by its
registered capabilities and active delegations.
The protocol enforces the boundaries, not the
agent's self-restraint.
MCP gets the tool-access interface right. The piece
I keep building toward is what sits underneath:
when agent A calls agent B's MCP server, who
enforces that B had the capability to serve that
request, and who settles the payment between them
with reputation consequences if the service fails.
By the way, the protocol-enforced capability boundaries between agents have an interesting vocabulary dimension. When agent A calls agent B's MCP server, the tool names themselves carry domain semantics. confirm_purchase_intent() produces different reasoning behavior than submit_order() even when the underlying operation is the same. I wrote a follow-up exploring how DDD's Ubiquitous Language applies to that naming layer: Your agent keeps using that word ...
The naming point is sharp. confirm_purchase_intent
and submit_order produce different agent behavior
even when the backend operation is identical. That
is the Ubiquitous Language argument applied to
machine consumers, not just human developers.
This connects to capability enforcement too. If the
tool name carries domain semantics, then the
question becomes: who decides which names an agent
is allowed to call? In most MCP setups today, the
agent sees the full tool list and self-selects.
Protocol-level capability boundaries would let you
restrict which tool names are even visible to a
given agent class, so the naming layer and the
permission layer reinforce each other.
Will check out the follow-up post.
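A minimal sketch of the visibility idea in the comment above, assuming a server that filters its tool list per agent class before the agent ever sees it. The agent class names, the refund_order tool, and the list_tools_for helper are hypothetical and not part of MCP or any SDK; only confirm_purchase_intent and submit_order come from the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


# Hypothetical catalogue of everything one server could expose.
ALL_TOOLS = [
    Tool("confirm_purchase_intent", "Ask the user to confirm a pending purchase."),
    Tool("submit_order", "Place the order with the backend."),
    Tool("refund_order", "Refund a previously placed order."),
]

# Hypothetical policy: agent class -> tool names that class may see.
VISIBLE_TOOLS = {
    "shopping_assistant": {"confirm_purchase_intent", "submit_order"},
    "support_agent": {"refund_order"},
}


def list_tools_for(agent_class: str) -> list[Tool]:
    """Return only the tools this agent class is allowed to see."""
    allowed = VISIBLE_TOOLS.get(agent_class, set())  # unknown classes see nothing
    return [tool for tool in ALL_TOOLS if tool.name in allowed]
```

Because the filter runs before the tool list reaches the agent, a name that is not visible cannot be self-selected, which is one way the naming layer and the permission layer could reinforce each other.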
Right. That sounds like the kind of system I have in mind. Clear capability boundaries and constraints.
That is the design goal. Each entity has a fixed
capability set registered at creation. The protocol
checks capabilities before any transaction
executes. An entity without emit_proposals cannot
publish signals. An entity without
read_memory_objects cannot access on-chain storage.
The enforcement is at the dispatcher, not in
application logic.
The boundary between entities is the same idea as
your bounded contexts. Entity A cannot reach into
Entity B's memory. The only integration surface is
the chain's RPC and signal indexes. If A wants to
use B's service, it discovers B through the
service registry, pays through Native Agent Payments (NAP), and
attests delivery. All protocol-level, no direct coupling.
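A rough sketch of dispatcher-level enforcement as described above, assuming a fixed capability set per entity and a single dispatch function that every operation passes through. The capability names emit_proposals and read_memory_objects come from the comment; the operation names, handlers, and CapabilityError are hypothetical and do not represent the actual protocol.

```python
from dataclasses import dataclass


class CapabilityError(PermissionError):
    """Raised when an entity lacks the capability an operation requires."""


@dataclass(frozen=True)
class Entity:
    entity_id: str
    capabilities: frozenset  # fixed at creation; frozen dataclass prevents reassignment


# Hypothetical table: operation name -> capability it requires.
REQUIRED_CAPABILITY = {
    "publish_signal": "emit_proposals",
    "read_storage": "read_memory_objects",
}


def publish_signal(entity: Entity, payload: str) -> str:
    return f"{entity.entity_id} published {payload}"


def read_storage(entity: Entity, payload: str) -> str:
    return f"{entity.entity_id} read {payload}"


HANDLERS = {"publish_signal": publish_signal, "read_storage": read_storage}


def dispatch(entity: Entity, operation: str, payload: str) -> str:
    """Check the capability first, then run the handler."""
    required = REQUIRED_CAPABILITY[operation]
    if required not in entity.capabilities:
        raise CapabilityError(f"{entity.entity_id} lacks {required}")
    return HANDLERS[operation](entity, payload)
```

The handlers contain no permission logic at all; the check lives only in dispatch, so application code cannot forget it.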
That makes a lot of sense, thank you for sharing!
The microservices parallel is precise. The failure pattern was identical: people adopted the primitives without understanding why the boundaries existed. In DDD, a Bounded Context isn't a technical split — it's a split in the domain model, which reflects a split in organizational responsibility.
For MCP, the equivalent is: server boundaries should reflect agent responsibility, not tool availability. An agent that can read files, execute shell, and query the database isn't one agent with three tools — it's three contexts collapsed into one. The governance problem everyone is improvising solutions to right now is the same one that took microservices teams years to learn: the boundary is where the mental model splits, not where the code splits.
That's exactly the principle: boundaries reflect responsibility, not tool availability. The natural next question is: once you've drawn the boundary, what vocabulary do you use inside it? That's where Ubiquitous Language comes in - and it turns out the AI coding agent benefits from it even more than human team members do. I wrote about it here: Your agent keeps using that word ...
This is a useful framing because it stops MCP from being treated as “just expose the API to the agent.”
The bounded context point makes sense when the MCP server owns a real domain boundary, not just a random set of endpoints. A finance MCP server, a filesystem MCP server, and a shell MCP server should not be bundled together just because the same agent might want all three.
Where I think this gets even more important is security. Once tools become actions an agent can call, the domain boundary also becomes a trust boundary. The question is not only “what does this server mean?” but also “what can this server damage if the agent gets confused?”
That is why the anti-corruption layer idea fits well. The tool layer should translate agent-friendly requests into domain-safe operations, not let the LLM’s string-based view leak directly into business logic.
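As an illustration of that translation step, here is a minimal sketch that assumes a hypothetical money-transfer tool. Money, TransferService, and transfer_tool are invented for this example; the point is only that strings from the model are parsed into validated domain types before any business logic runs.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str

    def __post_init__(self):
        # Domain invariants: finite, positive amount in a supported currency.
        if not self.amount.is_finite() or self.amount <= 0:
            raise ValueError("amount must be a positive, finite number")
        if self.currency not in {"EUR", "USD"}:
            raise ValueError(f"unsupported currency: {self.currency}")


class TransferService:
    """Hypothetical domain service; testable without an LLM in the loop."""

    def __init__(self, balances: dict):
        self.balances = dict(balances)

    def transfer(self, source: str, target: str, money: Money) -> None:
        if source not in self.balances or target not in self.balances:
            raise ValueError("unknown account")
        if self.balances[source] < money.amount:
            raise ValueError("insufficient funds")
        self.balances[source] -= money.amount
        self.balances[target] += money.amount


def parse_money(raw_amount: str, raw_currency: str) -> Money:
    """Turn the model's strings into a validated domain value."""
    try:
        amount = Decimal(raw_amount)
    except InvalidOperation:
        raise ValueError(f"not a number: {raw_amount!r}")
    return Money(amount, raw_currency.strip().upper())


def transfer_tool(service: TransferService, args: dict) -> str:
    """Tool handler acting as the anti-corruption layer."""
    try:
        money = parse_money(str(args.get("amount", "")), str(args.get("currency", "")))
        service.transfer(str(args["from_account"]), str(args["to_account"]), money)
    except (KeyError, ValueError) as exc:
        return f"rejected: {exc}"
    return f"transferred {money.amount} {money.currency}"
```

The domain service never sees a raw string amount, so a confused model can at worst produce a rejected call, not an invalid transfer.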
The question you're raising - what makes a boundary "real" vs arbitrary - connects to something I explored in a follow-up. A domain boundary becomes real when it has its own vocabulary: when "order" means something specific and different from what it means in a different context. I wrote about how to make that explicit for AI agents: Your agent keeps using that word ...
This post crystallized something I've been building toward for months.
The moment you named the anti-pattern — one-to-one REST-to-MCP wrappers that don't respect bounded contexts — I immediately thought of how the same mistake happens one layer deeper: in the data models that feed those MCP servers.
Most AI integrations I see treat the LLM as a general-purpose function that receives a blob of context and returns a blob of text. The domain lives nowhere. There are no invariants, no bounded vocabulary, no ACL between "what the model reasons about" and "what the system actually means."
That's exactly the problem I've been trying to solve with ExoModel AI (exomodel.ai) — a Python framework where your Pydantic models are the bounded context. You define the schema, attach domain documents (the RAG lives inside the model, not outside it), and the object self-populates from natural language while enforcing your validation rules as a hard boundary.
The DDD mapping is almost 1:1:
Your point about ACLs translating between "LLM string-based reasoning and the domain's rich types" is precisely what structured output + schema validation does when you treat the model object itself as the boundary — not just an MCP adapter.
The pattern you're describing at the MCP layer and what exomodel does at the object layer feel like two levels of the same architectural insight: the domain boundary should be explicit, typed, and enforced — not left to prompt engineering.
Would love to hear your take on where schema-driven object design fits into this stack.
I actually wanted to mention Ubiquitous Language in my post as well, but I think - as opposed to Bounded Context and ACLs - it's got a terrible name and is not self-evident without further explanation. For example, despite its name, one of its core assertions is that language is anything but ubiquitous 😅
Well, I actually did write about Ubiquitous Language now, and how it applies when the new "team member" is a coding agent that re-onboards every session. The name is still terrible, but the pattern is more relevant than ever: Your agent keeps using that word ...
It’s fascinating how tech history rhymes. Seeing developers organically reinvent Bounded Contexts and Anti-Corruption Layers just to make MCP servers manageable proves that the core concepts of DDD from 20 years ago were right on the money. Good architectural principles really do survive paradigm shifts. Do you think we'll start seeing DDD terminology formally adopted in agent frameworks soon?
This framing is excellent, and the bounded-context mapping holds better than most DDD-to-X analogies do. But reading it alongside Mykola's least-privilege point, I think the two aren't competing — they're describing two different phases of the same boundary.
DDD gives you the boundary at design time. MCP's one-client-per-server topology even enforces part of it structurally, as you say. But your own caveat is the whole game: "only if you design your servers as bounded contexts." A bounded context that depends on the server author's discipline is a convention, not a control. It holds right up until someone ships the server that crams filesystem, shell, and DB into one boundary — and nothing at runtime stops them.
So the design-time boundary answers "where should the line be." It doesn't answer "who enforces the line when an unpredictable runtime crosses it." That second question is where the least-privilege lens Mykola raised actually bites, and it lives at a different layer: identity on the caller, an allowlist per caller per tool, and a record of every crossing. The ACL protects the domain from the LLM's string-world; the enforcement layer protects the boundary from the caller's intent. Two different anti-corruption jobs.
The chain problem you cite at the end is the sharpest version of this. "Each operation succeeds locally while the system fails globally" is exactly an aggregate invariant violation — but no single server can see the chain, because by design it can't see across boundaries. The only place the chain is observable is the layer the calls pass through. Which is to say: provenance and chain-of-custody can't live inside a bounded context; they have to live above all of them.
This is the problem I spend my time on (governance/audit layer for MCP), so I'm biased — but the DDD vocabulary made me realize the enforcement layer is itself an ACL, just one boundary up: it translates between "the agent decided to call this" and "this call was actually permitted." Curious whether you'd model that as a context map between servers, or as something that sits outside the map entirely.
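One way to picture the enforcement layer described above is a gateway that sits in front of all servers. The sketch below assumes a per-caller allowlist and an append-only audit log; EnforcementLayer, the caller id, and the tool names are hypothetical and not taken from any real governance product.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class EnforcementLayer:
    # Hypothetical policy: caller id -> tool names that caller may invoke.
    allowlist: dict[str, set[str]]
    # Tool name -> callable; stands in for the servers behind the gateway.
    tools: dict[str, Callable[..., Any]]
    # Ordered record of every attempted crossing, permitted or not.
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def call(self, caller_id: str, tool_name: str, arguments: dict) -> Any:
        permitted = tool_name in self.allowlist.get(caller_id, set())
        # Record the attempt before acting, so denied calls are visible too.
        self.audit_log.append(
            {"caller": caller_id, "tool": tool_name, "permitted": permitted}
        )
        if not permitted:
            raise PermissionError(f"{caller_id} may not call {tool_name}")
        return self.tools[tool_name](**arguments)
```

Because every call passes through one place, the log holds the whole chain in order, which no single server behind the gateway can see on its own.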
You're right. And there's a third phase worth considering: the vocabulary phase. What you call things inside the boundary determines what the agent generates. I wrote a follow-up exploring how DDD's Ubiquitous Language applies here: Your agent keeps using that word ...
Spot on! Merging MCP with Domain-Driven Design is architectural gold.
Wrapping REST APIs 1:1 into MCP is just rewriting the "distributed monolith" disaster. Treating MCP servers as Bounded Contexts and tools as Anti-Corruption Layers (ACL) to shield the domain from the LLM’s messy string-world is the only way to stop agentic chaos and security leaks. 20 years of DDD battle scars are exactly what we need to solve modern AI orchestration. Brilliant write-up!
Thanks, Andy!
yeah, the missing piece in every MCP tutorial. everyone shows you how to spin one up, nobody talks about scoping it.
rule i keep coming back to: 1 MCP server = 1 aggregate. not 1 service, not 1 app. moment you expose tools across two aggregates the LLM starts making cross-aggregate writes the domain doesn't actually allow, and it fails silent. agent reports "done", data is just quietly wrong.
have you hit this in prod? would you split the cross-context stuff into a separate orchestrator MCP, or just guardrail the one big server?
That's an interesting point. I think it can make sense to go down to the aggregate level, but that depends a lot on the way cross-aggregate rules and invariants are being enforced. For example, when you're using Sagas inside a bounded context, I'd rather let the MCP server interface with the Saga, not the underlying aggregates. When using CQRS or a similar architecture pattern, I'd expose the commands and views, rather than the aggregates.
The ACL distinction you're making — between what the LLM reasons about in strings and the rich domain types your code actually needs — is where most integrations fail quietly. The common pattern is embedding domain logic in tool handlers and discovering you've scattered business rules across prompt templates, which is unmaintainable for exactly the reasons you describe. Worth noting: if you define domain models precisely enough (typed constraints, field descriptions that encode intent), the translation layer can be mostly automated — the Pydantic model stops being just a validation fence and becomes the specification. That's the bet behind schema-first approaches like exomodel (exomodel.ai): declare the schema, attach an intent, auto-populate from natural language, no hand-written ACL plumbing per type. Your bounded-context point amplifies this — a narrow, tightly-typed schema leaves the LLM far less surface area to produce off-spec outputs than any free-form instruction.
The parallel with the 2010s microservices mess is spot on. One thing worth noting: MCP has a structural advantage over microservices — the protocol itself enforces server boundaries at the topology level (one server = one process = one set of tools). The real problem isn't boundary definition, it's that teams stuff 20 unrelated tools into a single MCP server because it's "easier to deploy." Same distributed monolith anti-pattern, different decade.
Honestly, this hits the nail on the head because we're absolutely repeating the early microservices mess with AI integrations right now. Last week, I was looking at a project where someone basically script-wrapped an entire database API into a single MCP server, and it was a security nightmare trying to figure out how to safely restrict permissions without breaking the tool. Treating each MCP server as a strict, isolated domain context makes total sense; otherwise, you just end up with an unmanageable distributed monolith that an LLM can easily exploit or break with a bad chain of calls. It's definitely worth taking the slight hit on abstraction complexity if it means keeping your core business logic completely insulated from the model's unpredictable outputs.
I feel your pain!
Thanks Dennis. The deliberate-vocabulary point is the right next step. The specific thing that helped us was drawing a Bounded Context line between the user-facing agent and the retrieval tool. Inside the agent we used domain language. The tool returned shape-typed records in its own internal vocabulary. We wrote an anti-corruption layer that translates between them, with explicit handling for the case where the LLM response leaks schema-shaped strings into the wrong context. Without that translation step, our agent kept emitting valid-looking JSON that referenced internal field names users never see.
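A small sketch of what such a translation step could look like, assuming the retrieval tool returns dict records with internal field names. The field names, the mapping, and both helper functions are hypothetical and are not the setup described in the comment above.

```python
# Hypothetical mapping: internal tool field -> term the user-facing agent uses.
FIELD_TO_DOMAIN = {
    "doc_id": "document",
    "chunk_txt": "excerpt",
    "sim_score": "relevance",
}


def to_domain(record: dict) -> dict:
    """Translate a tool record into domain vocabulary; unmapped fields are dropped."""
    return {
        FIELD_TO_DOMAIN[key]: value
        for key, value in record.items()
        if key in FIELD_TO_DOMAIN
    }


def leaked_fields(agent_reply: str) -> list[str]:
    """Return internal field names that appear in the agent's reply.

    Plain substring matching; a real check would likely need to be stricter.
    """
    return sorted(name for name in FIELD_TO_DOMAIN if name in agent_reply)
```

A non-empty result from leaked_fields could be used to reject or regenerate a reply before it reaches the user.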
Yes, that's what I've been observing too and I just wrote a post about the language aspect. I'd love to read your thoughts: Your agent keeps using that word ...
This maps cleanly onto how we ended up structuring the Auto-Assign pipeline at AudioProducer.ai. The two passes (Auto-Assign Characters, owning speaker-to-voice mapping, and Auto-Assign Sounds, owning scene-to-soundscape placement) started life as a single "do everything to this chapter" agent and behaved exactly like the distributed-monolith MCP server you describe: a character-rename edit triggered a full sound-placement re-run, and the user could not tell which pass produced which artifact. Splitting them into two bounded contexts gave the editor a real surface: a per-line speaker chip belongs to one context, a paragraph-level music-bed annotation belongs to the other, and re-running one pass never silently overwrites the other's manual overrides. The ACL part lands the same way for us: the LLM proposes character names and scene categories as strings, the domain holds character objects with assigned voices and intro-pause settings, and the invariants hold regardless of what the model emitted. To Vic Chen's "boring invariants underneath" point, that is exactly what lets us let users edit any single line in place without rebuilding the whole chapter, because the model's output was never load-bearing in the first place.
Boring is great. If it's boring, it's predictable. And predictability allows me to sleep well 😅
MCP servers fit the bounded-context pattern more cleanly than most API surfaces. The constraint of structured tool definitions forces you to make domain boundaries explicit in a way REST endpoints often don't. We refactored a 12-tool agent into 3 MCP servers grouped by aggregate root and the agent's function-call accuracy improved noticeably because the schemas became coherent. The interesting part is that the eval improvement came from the architectural cleanup, not from prompt engineering. Schema design is downstream of domain design.
Your observation that tool-use accuracy improved with schema coherence is exactly the mechanism I've been digging into. The schemas are doing something specific: they're carrying consistent vocabulary into the context window. I wrote a follow-up about how to make that vocabulary deliberate rather than accidental: Your agent keeps using that word ...
Great post Dennis.
Thank you Ricardo!
Good explanation, thanks 👍🏼
Thank you
Sorry but I read the title as "Redis-covering". I was like whaaattt.....
Rediscovering DDD through MCP servers is a sharp observation, and it makes sense: an MCP server forces you to define a bounded context with an explicit contract, which is exactly the DDD discipline that's easy to skip when you're just writing internal functions. When the consumer is an agent, you can't lean on tribal knowledge or "here's how you're supposed to use it". The tool's name, inputs, and semantics have to carry the meaning on their own, so you end up modeling the domain properly almost by force. The ubiquitous-language idea maps cleanly too: a well-designed tool surface is the domain language made executable, and an agent succeeds or fails largely on how well your tool names and boundaries reflect the actual domain. Ambiguous or leaky tool boundaries confuse the model the same way they confuse a new engineer. The deeper point is that agent-readiness and good domain modeling are converging: designing for a literal, non-coping consumer pushes you toward clear bounded contexts, explicit contracts, and no hidden coupling, which is just DDD with a stricter client. Model the domain well and the agent interface falls out, because the agent needs the clarity humans were forgiving about. That tool-surface-as-ubiquitous-language instinct is core to how I think about Moonshift. As you carve MCP servers along domain boundaries, are you finding one-server-per-bounded-context is the right grain, or do you split finer by capability?
This hit hard. The "distributed monoliths all over again" bit is painfully real — I've seen teams wrapping REST APIs 1:1 into MCP servers and calling it done, completely missing the point.
The ACL pattern with that TransferService example is super clean. Keeping the domain logic testable without an LLM in the loop is the kind of thing people skip early on and then pay for later when debugging becomes a nightmare.
Also love that you connected the Reddit thread back to DDD vocabulary. "Separate by blast radius" = bounded contexts. It's wild how engineers keep rediscovering the same patterns independently just because the terminology didn't carry over. We really don't need new words for old problems 😅
Solid read. Bookmarking this for the next time someone on my team says "let's just wrap everything in one server."