Dennis Traub for AWS

Posted on May 18 • Edited on May 21

Rediscovering Domain-Driven Design, one MCP server at a time

#ai #architecture #mcp #systemdesign

Classic patterns for AI agent security

A few days ago, a devops engineer posted on r/devops:

"MCP servers just showed up in our infrastructure and I genuinely have no idea how to secure them, anyone been through this?"

Filesystem access, shell permissions, database connectors - all callable by agents without human approval. At the time I'm writing this, the thread has 76 upvotes and 39 comments from fellow engineers improvising solutions: "separate by blast radius," "don't mix list_files and execute_shell in one server," "three security surfaces, not one."

They're all describing the same thing, and in doing so rediscovering patterns that Eric Evans laid out in Domain-Driven Design (DDD) back in 2003.

In his book, Eric introduced concepts like Bounded Contexts and Anti-Corruption Layers, which gave us the vocabulary we've been using for system boundaries ever since. They helped us survive the microservices transition, and they apply directly to the architectural problems AI systems are creating right now.

We made the same mistakes a decade ago

In the 2010s, many teams adopted microservices without understanding what made them work. They took monoliths, split them apart, and called the pieces "services." More often than not, the result was distributed monoliths - all the operational complexity of distribution with none of the architectural benefits of real, well-defined boundaries.

The correction took years. We learned (painfully) that a microservice boundary isn't where you split the code. It's where you split the mental model you have of the application. A payment service and a user service don't just live in different containers - they have different vocabularies, different invariants, different reasons to change.

And the same mistake is happening again, this time with MCP servers. We wrap existing REST APIs one-to-one and call it AI integration. David Soria Parra, one of the creators of MCP, said "it's a bit cringe, it just results in horrible things" at AI Engineer World's Fair 2025. The Thoughtworks Technology Radar placed "MCP by default" as a Caution. And if you dive into the argument they make, they're both saying the same thing: we're building distributed monoliths again.

But the correction doesn't have to take years this time. The vocabulary already exists - and has been battle-tested for more than 20 years.

Bounded Contexts: one server, one model, one language

A Bounded Context defines where a particular (data or object) model is valid. Inside the boundary, terms have precise meanings: a "transaction" in the finance context means money changing hands, a "transaction" in the booking context means a reservation. Inside each boundary lives one language and one set of rules. Across boundaries, you expect translation.
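To make that concrete, here is a minimal sketch of two contexts that both use the word "transaction" but mean different things by it, plus the explicit translation between them. All class names, fields, and the mapping rule are hypothetical and only serve as illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


# Finance context: a "transaction" is money changing hands.
@dataclass(frozen=True)
class FinanceTransaction:
    payer: str
    payee: str
    amount: Decimal


# Booking context: a "transaction" is a reservation.
@dataclass(frozen=True)
class BookingTransaction:
    guest: str
    room: str
    nights: int
    nightly_rate: Decimal


def booking_to_finance(booking: BookingTransaction, hotel_account: str) -> FinanceTransaction:
    """Explicit translation at the boundary (hypothetical mapping rule)."""
    return FinanceTransaction(
        payer=booking.guest,
        payee=hotel_account,
        amount=booking.nightly_rate * booking.nights,
    )


reservation = BookingTransaction("alice", "room-12", nights=3, nightly_rate=Decimal("80.00"))
payment = booking_to_finance(reservation, hotel_account="hotel-main")
print(payment.amount)  # 240.00
```

Neither type leaks into the other context. The only place that knows about both is the translation function, which is exactly where you want the coupling to be visible.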

And MCP is particularly interesting from this angle: the protocol already enforces bounded contexts at the topology level.

MCP's architecture uses a one-client-per-server model. The host spawns a separate client for each MCP server, and each client talks to exactly one server. An MCP server for your database has no direct channel to an MCP server for your file system, so it cannot accidentally leak data to it. Unlike microservices, where any service can trivially call any other over the network, an MCP server has no protocol-level way to reach another server's tools. Anything that crosses from one server to another has to pass through the host, or through a bridge you deliberately build. Cross-boundary coupling becomes visible and intentional rather than accidental.
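The topology can be modeled in a few lines. This is a simplified sketch, not the MCP SDK: `Server`, `Client`, and `Host` are hypothetical stand-ins that only illustrate the routing, with one client per server and every call going through the host.

```python
class Server:
    """Stand-in for an MCP server: a named set of tools (hypothetical, not the MCP SDK)."""

    def __init__(self, name, tools):
        self.name = name
        self._tools = dict(tools)

    def call(self, tool, **kwargs):
        if tool not in self._tools:
            raise LookupError(f"{self.name} has no tool {tool!r}")
        return self._tools[tool](**kwargs)


class Client:
    """Each client holds a connection to exactly one server."""

    def __init__(self, server):
        self._server = server

    def call_tool(self, tool, **kwargs):
        return self._server.call(tool, **kwargs)


class Host:
    """The host creates one client per server and routes every call explicitly."""

    def __init__(self, servers):
        self._clients = {s.name: Client(s) for s in servers}

    def call(self, server_name, tool, **kwargs):
        return self._clients[server_name].call_tool(tool, **kwargs)


files = Server("files", {"list_files": lambda path: ["a.txt", "b.txt"]})
database = Server("database", {"count_rows": lambda table: 42})
host = Host([files, database])

print(host.call("files", "list_files", path="/tmp"))
print(host.call("database", "count_rows", table="users"))

# The database server's tools are not reachable through the files client.
try:
    host.call("files", "count_rows", table="users")
except LookupError as err:
    print(err)
```

Neither `Server` object holds a reference to the other. In this model the host is the only place where results from one server could end up as input to another.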

But only if you design your servers as bounded contexts.

The failure mode is a single MCP server that exposes everything: filesystem access, shell execution, and database connectors. That's three separate concerns crammed into one boundary - the equivalent of a microservice that owns users, payments, and notifications.

The commenter on Reddit who wrote "don't mix list_files and execute_shell in one server" was actually designing context boundaries, even if they didn't know the term.
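One way to catch this early is a small check over your tool inventory. The sketch below is an assumption on my part, not an established tool: the capability labels and the inventory are made up, and "one capability class per server" is only a rough proxy for a bounded context.

```python
# Hypothetical capability classes; a real inventory would come from your own tool registry.
CAPABILITY = {
    "list_files": "filesystem",
    "read_file": "filesystem",
    "execute_shell": "shell",
    "run_query": "database",
}


def mixed_servers(servers: dict[str, list[str]]) -> dict[str, set[str]]:
    """Return the servers whose tools span more than one capability class."""
    findings = {}
    for server, tools in servers.items():
        classes = {CAPABILITY.get(tool, "unknown") for tool in tools}
        if len(classes) > 1:
            findings[server] = classes
    return findings


inventory = {
    "everything": ["list_files", "execute_shell", "run_query"],
    "files": ["list_files", "read_file"],
    "shell": ["execute_shell"],
}
print(mixed_servers(inventory))
```

Only the "everything" server is flagged. A check like this will not design your boundaries for you, but it makes the "don't mix list_files and execute_shell" rule something a build can fail on.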

Anti-Corruption Layers: separating the tools from domain logic

An Anti-Corruption Layer (ACL) prevents one system's model from contaminating another. It translates between two different worldviews.

In AI systems, two fundamentally different models collide every time an agent calls a tool:

For the LLM, everything is strings, parameters are simple, and context is a token window. It reasons in natural language to generate structured calls.
The domain consists of rich types, configuration, state, complex error handling, and business invariants that must hold regardless of how they're invoked.

The tools layer sits between these two worlds. In Chris Hughes’s words, it “protects your domain from the LLM’s interface requirements - translating between ‘strings the LLM can reason about’ and ‘rich domain objects your code works with’.”

Here's a tool that ignores this principle:

# Everything in one function - LLM interface mixed with domain logic
# (mcp, db, and audit_log are assumed to be defined elsewhere)
from decimal import Decimal

@mcp.tool()
async def transfer_funds(from_account: str, to_account: str, amount: str):
    amount_decimal = Decimal(amount)
    from_acc = await db.get_account(from_account)

    if from_acc.balance < amount_decimal:
        return "Insufficient funds"

    if from_acc.is_frozen:
        return "Account frozen"

    await db.execute_transfer(from_acc, to_account, amount_decimal)
    await audit_log.record(from_account, to_account, amount_decimal)

    return f"Transferred {amount} from {from_account} to {to_account}"

And here's the same operation with a proper separation:

# Tool layer: thin adapter (the ACL)
# (mcp and transfer_service are assumed to be created elsewhere)
from decimal import Decimal

@mcp.tool()
async def transfer_funds(from_account: str, to_account: str, amount: str):
    result = await transfer_service.execute(
        from_account=from_account,
        to_account=to_account,
        amount=Decimal(amount)
    )
    return result.to_agent_summary()

# Service layer: domain logic, testable without the LLM
class TransferService:
    async def execute(self, from_account, to_account, amount) -> TransferResult:
        account = await self.accounts.get(from_account)
        account.validate_transfer(amount)  # raises on invariant violation
        transfer = account.initiate_transfer(to_account, amount)
        await self.transfers.save(transfer)
        await self.audit.record(transfer)
        return TransferResult(transfer)

The second version gives you:

Testability: the service works without an LLM. Run it from tests, CLI, scripts.
Replaceability: change the LLM interface (tool parameters, response format) without touching business logic. Change business rules without touching the tool layer.
Composability: other MCP servers, other agents, or humans can call the same service through their own interface.

The ACL protects both sides. The domain doesn't get contaminated by the LLM's string-based worldview. The LLM doesn't get overwhelmed by domain complexity it can't reason about.
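The separated version above can be sketched end to end without an LLM or an MCP server. This is a simplified illustration under my own assumptions: the in-memory `Account`, the `TransferRejected` exception, the `transfer_funds_tool` adapter, and the wording of the rejection messages are all hypothetical, and persistence is replaced by a dictionary.

```python
import asyncio
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


class TransferRejected(Exception):
    """Raised by the domain when an invariant would be violated."""


@dataclass
class Account:
    account_id: str
    balance: Decimal
    is_frozen: bool = False

    def validate_transfer(self, amount: Decimal) -> None:
        if amount <= 0:
            raise TransferRejected("amount must be positive")
        if self.is_frozen:
            raise TransferRejected("account frozen")
        if self.balance < amount:
            raise TransferRejected("insufficient funds")


class TransferService:
    """Domain logic only: rich types in, no LLM strings, no MCP imports."""

    def __init__(self, accounts: dict[str, Account]):
        self.accounts = accounts
        self.audit: list[tuple[str, str, Decimal]] = []

    async def execute(self, from_account: str, to_account: str, amount: Decimal) -> Decimal:
        source = self.accounts[from_account]
        target = self.accounts[to_account]
        source.validate_transfer(amount)
        source.balance -= amount
        target.balance += amount
        self.audit.append((from_account, to_account, amount))
        return source.balance


async def transfer_funds_tool(service: TransferService, from_account: str,
                              to_account: str, amount: str) -> str:
    """Tool-layer adapter: strings in, strings out, errors translated for the agent."""
    try:
        parsed = Decimal(amount)
    except InvalidOperation:
        parsed = None
    if parsed is None or not parsed.is_finite():
        return f"Rejected: {amount!r} is not a valid amount"
    try:
        remaining = await service.execute(from_account, to_account, parsed)
    except KeyError as err:
        return f"Rejected: unknown account {err.args[0]}"
    except TransferRejected as err:
        return f"Rejected: {err}"
    return f"Transferred {parsed} from {from_account} to {to_account}; remaining balance {remaining}"


def make_service() -> TransferService:
    return TransferService({
        "A": Account("A", Decimal("100.00")),
        "B": Account("B", Decimal("0.00")),
    })


print(asyncio.run(transfer_funds_tool(make_service(), "A", "B", "30.00")))
print(asyncio.run(transfer_funds_tool(make_service(), "A", "B", "lots")))
print(asyncio.run(transfer_funds_tool(make_service(), "A", "B", "500")))
```

The adapter is the only place that parses strings or formats messages, and `TransferService` can be exercised directly from a test with `Decimal` values. In a real server you would put the MCP tool decorator on a thin wrapper around the adapter and leave the service untouched.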

The same vocabulary in a new domain

Back to that Reddit thread.

"Separate MCP servers by blast radius." That's bounded context design. Each server owns one domain. The blast radius is contained because the boundary is real.

"Three security surfaces, not one - tool capability, tool description, and tool call chains." The ACL decomposed into its responsibilities. Tool capability is what the domain allows. Tool description is what the LLM thinks it can do. Tool call chains are cross-boundary interactions that need explicit orchestration.

"The dangerous part is not one tool in isolation. It is the chain." In DDD terms: an aggregate invariant violation. A sequence of operations crossing bounded contexts without coordination. Each operation succeeds locally while the system fails globally.

Same patterns, same structural problem, discovered independently because the problem is real.
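To make the chain problem concrete: because no single server can see the whole sequence, a check like the following has to live in whatever layer all the calls pass through. This is a rough sketch under my own assumptions. The `ChainGuard` class, the tool names, and the rule itself are hypothetical, and which tools count as sensitive sources or external sinks is a policy decision.

```python
from dataclasses import dataclass, field

# Hypothetical labels for illustration only.
SENSITIVE_SOURCES = {("database", "read_customers")}
EXTERNAL_SINKS = {("http", "post")}


class ChainBlocked(Exception):
    """Raised when a call is acceptable alone but not as part of this chain."""


@dataclass
class ChainGuard:
    """Tracks one agent session's calls so rules can apply to the chain, not just one call."""
    history: list = field(default_factory=list)

    def check(self, server: str, tool: str) -> None:
        call = (server, tool)
        touched_sensitive = any(past in SENSITIVE_SOURCES for past in self.history)
        if call in EXTERNAL_SINKS and touched_sensitive:
            raise ChainBlocked(f"{server}.{tool} after a sensitive read needs explicit approval")
        self.history.append(call)


guard = ChainGuard()
guard.check("database", "read_customers")   # fine on its own
guard.check("files", "list_files")          # fine on its own
try:
    guard.check("http", "post")             # fine on its own, blocked as part of this chain
except ChainBlocked as err:
    print(err)
```

Each call passes every local check a single server could make. Only the guard, which sees the history across boundaries, can tell that this particular sequence should not complete without coordination.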

The "abstraction tax" is the ACL doing its job

One fair criticism is that MCP adds a layer. The Thoughtworks Tech Radar calls this the "abstraction tax" - every protocol layer between an agent and an API loses fidelity. Simon Willison notes that "almost everything I might achieve with an MCP can be handled by a CLI tool instead."

This is correct. And it's exactly the same argument people made against microservice boundaries, API gateways, and anti-corruption layers in traditional systems. The translation layer comes with costs: you lose directness.

But this loss is intentional. It's the ACL doing its job. The LLM doesn't need to know about your domain's internal types, retry logic, or state management. The domain doesn't need to accommodate the LLM's string-based reasoning model. The "tax" buys you isolation, replaceability, and, ultimately, peace of mind.

It's only a mistake if we're paying this tax without getting the architectural benefit - which is exactly what REST-to-MCP 1:1 wrappers do. They add the layer without adding the boundary: all cost, no benefit.
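Here is a small sketch of that difference. Everything in it is hypothetical: `FakeOrdersApi` stands in for a REST backend, and the refund rules are invented for illustration. The first tool mirrors an endpoint one-to-one, the second exposes a single domain intent and keeps the rule behind the boundary.

```python
class FakeOrdersApi:
    """In-memory stand-in for a REST backend (hypothetical endpoints)."""

    def __init__(self):
        self.orders = {"o-1": {"status": "delivered", "total": "49.90", "refunded": False}}

    def get_order(self, order_id):
        return self.orders[order_id]

    def patch_order(self, order_id, fields):
        self.orders[order_id].update(fields)
        return self.orders[order_id]


api = FakeOrdersApi()


# 1:1 wrapper: the agent gets the raw endpoint and must know the refund rules itself.
def patch_order_tool(order_id: str, fields: dict) -> dict:
    return api.patch_order(order_id, fields)


# Domain-level tool: one intent, with the rule enforced behind the boundary.
def refund_order_tool(order_id: str) -> str:
    order = api.get_order(order_id)
    if order["status"] != "delivered":
        return "Rejected: only delivered orders can be refunded"
    if order["refunded"]:
        return "Rejected: order was already refunded"
    api.patch_order(order_id, {"refunded": True})
    return f"Refunded {order['total']} for order {order_id}"


print(refund_order_tool("o-1"))
print(refund_order_tool("o-1"))
```

With `patch_order_tool`, nothing stops an agent from setting `refunded` on an order that was never delivered. With `refund_order_tool`, the same extra layer actually buys you an invariant.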

The vocabulary already exists. Let's keep using it.

We don't have to reinvent these patterns - DDD has 20+ years of battle scars. We've learned the hard way where to draw boundaries, how to enforce them, and what happens when we don't. AI or no AI, Eric Evans's Domain-Driven Design is still the canonical reference for complex software systems.

MCP is already designed to establish bounded contexts; the tools layer is already an anti-corruption layer. Name your MCP servers after the domain they own, not the API they wrap, and when someone on your team says "separate by blast radius" - let them know that there are established patterns for what they're describing.

If you're interested in how vocabulary ambiguity gets amplified by AI coding agents - and what you can do about it - I wrote a follow-up: Your agent keeps using that word ...

Top comments (57)

Mykola Kondratiuk • May 20

not sure DDD is the right ancestor here. separating blast radius and mixing capabilities is closer to least-privilege than bounded context. DDD cares about what things mean - MCP security cares about what agents can touch. different problems.

Dennis Traub AWS • May 20

Yes, you're right, this can also be viewed from a pure security angle. But I think these approaches aren't mutually exclusive, and in the past I've often seen them go hand in hand. Reasoning about domain boundaries and actor contexts can help in understanding security-related aspects, like threat vectors and trust boundaries, and vice versa.

Mykola Kondratiuk • May 20

fair point, they do coexist. but the key distinction for me is which question you lead with. DDD asks: where does ownership live? least-privilege asks: how far can this fail? when you anchor blast radius design in domain structure, you end up defending design decisions instead of scoping the actual failure surface. for the security case, that ordering matters.

Dennis Traub AWS • May 20

That's a good point, thanks!

Mykola Kondratiuk • May 20

yeah. tends to matter most when you're writing the audit checklist - that's where the ordering difference shows up

Dennis Traub AWS • May 21

You made the point that DDD cares about what things mean. I've been thinking about that since you wrote it, and I think the "what things mean" part actually gets more dangerous with AI agents - because the agent won't ask when a term is ambiguous. I wrote a follow-up exploring that angle through DDD's Ubiquitous Language:

Your agent keeps using that word ...

Mykola Kondratiuk • May 22

the 'won't ask when ambiguous' angle is exactly the gap - a dev pauses on a loaded term, the agent just picks the nearest token match and runs. going to read the ubiquitous language piece.

Vic Chen • May 18

This is a sharp way to map classic DDD ideas onto modern agent infrastructure. The bounded-context framing for MCP servers feels especially useful because it turns “agent security” from a vague concern into explicit interface and blast-radius design. Good reminder that a lot of AI architecture work is really software architecture discipline coming back into focus.

Dennis Traub AWS • May 21 • Edited

Regarding "Software architecture discipline coming back into focus" - that's exactly how I'd frame it too. The follow-up explores another DDD pattern that's coming back into focus for the same reasons: Ubiquitous Language. Precise vocabulary matters even more when the consumer of that vocabulary is an agent that never asks for clarification: Your agent keeps using that word ...

Vic Chen • May 22

Totally agree. Ubiquitous language stops being documentation sugar and becomes operational once agents are in the loop. Humans usually patch over fuzzy terms with context, but agents tend to amplify that fuzziness into repeatable wrong actions. The test I keep coming back to is simple: if two tools or services hear the same term, do they infer the same action without a human in the middle? If not, that term probably belongs inside a tighter bounded context or needs an explicit schema.

Dennis Traub AWS • May 19

Yes, the more I think about it, the more I realize that traditional software architecture and systems design provides so many useful approaches and mental models when integrating AI.

Vic Chen • May 19

Exactly. A lot of agent failures look novel at first, but once you squint they’re familiar systems problems: unclear ownership, leaky abstractions, and hidden side effects. That’s why the bounded-context lens is so useful for MCP — each server needs a narrow contract and explicit failure semantics, not just a big “smart” surface that does everything. I’ve seen the same pattern in financial data workflows too: the model can reason across domains, but the system still needs boring invariants underneath if you want trustworthy behavior.

Dennis Traub AWS • May 19

Interesting. Do you have any thoughts on using the DDD concepts of Ubiquitous Language(s) and Context Maps to help with reasoning within and across domains?

Andrew Bauer • May 21

I love your application of DDD to AI!

Regarding ubiquitous language, I have found that to properly create one, the overarching domain needs a "framing" lifecycle that every other domain subordinates to.

For instance, I mapped a domain for a handrail manufacturer for an ERP system - when I considered that they fit into the construction lifecycle, I mapped each internal domain's process and workflow's key decision points to metrics within that overarching lifecycle.

If you are interested to see an example of what I'm talking about, I have permission to share it privately (as long as you don't work for another handrail manufacturer)

Dennis Traub AWS • May 21

That's a great example of how UL works in practice - finding the overarching lifecycle that gives each subdomain's vocabulary a shared anchor point. I'd be interested to see the mapping if you're willing to share.

I just published a follow-up that explores this from the AI agent angle: what happens when the coding agent doesn't have access to that framing lifecycle and has to guess what "order" means in each context. The short version: it guesses wrong, confidently. Your agent keeps using that word ...

Andrew Bauer • May 22 • Edited

Sure, I'll send you a Calendly link through LinkedIn.

The follow up sounds intriguing, I'll have a read. The agent guessing wrong doesn't surprise me, as team members do the same if it isn't implicitly mapped out - unless they are good with nuance & following intuition, and infer it through "trial by fire" (it is extremely rare to see it mapped out, as there isn't really a mainstream documentation style that accounts for it)

S M Tahosin • May 24

It’s fascinating how tech history rhymes. Seeing developers organically reinvent Bounded Contexts and Anti-Corruption Layers just to make MCP servers manageable proves that the core concepts of DDD from 20 years ago were right on the money. Good architectural principles really do survive paradigm shifts. Do you think we'll start seeing DDD terminology formally adopted in agent frameworks soon?

NOVAInetwork • May 18

The bounded context mapping from DDD translates cleanly to how I think about AI entity capabilities at the protocol layer. Each entity has a fixed set of operations it can perform, scoped by its registered capabilities and active delegations. The protocol enforces the boundaries, not the agent's self-restraint.

MCP gets the tool-access interface right. The piece I keep building toward is what sits underneath: when agent A calls agent B's MCP server, who enforces that B had the capability to serve that request, and who settles the payment between them with reputation consequences if the service fails.

Dennis Traub AWS • May 21

By the way, the protocol-enforced capability boundaries between agents have an interesting vocabulary dimension. When agent A calls agent B's MCP server, the tool names themselves carry domain semantics. confirm_purchase_intent() produces different reasoning behavior than submit_order() even when the underlying operation is the same. I wrote a follow-up exploring how DDD's Ubiquitous Language applies to that naming layer: Your agent keeps using that word ...

NOVAInetwork • May 22

The naming point is sharp. confirm_purchase_intent and submit_order produce different agent behavior even when the backend operation is identical. That is the Ubiquitous Language argument applied to machine consumers, not just human developers.

This connects to capability enforcement too. If the tool name carries domain semantics, then the question becomes: who decides which names an agent is allowed to call? In most MCP setups today, the agent sees the full tool list and self-selects. Protocol-level capability boundaries would let you restrict which tool names are even visible to a given agent class, so the naming layer and the permission layer reinforce each other.

Will check out the follow-up post.

Dennis Traub AWS • May 18

Right. That sounds like the kind of system I have in mind. Clear capability boundaries and constraints.

NOVAInetwork • May 19

That is the design goal. Each entity has a fixed capability set registered at creation. The protocol checks capabilities before any transaction executes. An entity without emit_proposals cannot publish signals. An entity without read_memory_objects cannot access on-chain storage. The enforcement is at the dispatcher, not in application logic.

The boundary between entities is the same idea as your bounded contexts. Entity A cannot reach into Entity B's memory. The only integration surface is the chain's RPC and signal indexes. If A wants to use B's service, it discovers B through the service registry, pays through Native Agent Payments (NAP), and attests delivery. All protocol-level, no direct coupling.

Dennis Traub AWS • May 19

That makes a lot of sense, thank you for sharing!

Theo Valmis • May 20

The microservices parallel is precise. The failure pattern was identical: people adopted the primitives without understanding why the boundaries existed. In DDD, a Bounded Context isn't a technical split — it's a split in the domain model, which reflects a split in organizational responsibility.

For MCP, the equivalent is: server boundaries should reflect agent responsibility, not tool availability. An agent that can read files, execute shell, and query the database isn't one agent with three tools — it's three contexts collapsed into one. The governance problem everyone is improvising solutions to right now is the same one that took microservices teams years to learn: the boundary is where the mental model splits, not where the code splits.

Dennis Traub AWS • May 21

That's exactly the principle: boundaries reflect responsibility, not tool availability. The natural next question is: once you've drawn the boundary, what vocabulary do you use inside it? That's where Ubiquitous Language comes in - and it turns out the AI coding agent benefits from it even more than human team members do. I wrote about it here: Your agent keeps using that word ...

Leo Pessoa • May 19

This post crystallized something I've been building toward for months.

The moment you named the anti-pattern — one-to-one REST-to-MCP wrappers that don't respect bounded contexts — I immediately thought of how the same mistake happens one layer deeper: in the data models that feed those MCP servers.

Most AI integrations I see treat the LLM as a general-purpose function that receives a blob of context and returns a blob of text. The domain lives nowhere. There are no invariants, no bounded vocabulary, no ACL between "what the model reasons about" and "what the system actually means."

That's exactly the problem I've been trying to solve with ExoModel AI (exomodel.ai) — a Python framework where your Pydantic models are the bounded context. You define the schema, attach domain documents (the RAG lives inside the model, not outside it), and the object self-populates from natural language while enforcing your validation rules as a hard boundary.

The DDD mapping is almost 1:1:

Schema = Ubiquitous Language — the field names and types define the vocabulary the LLM must respect
Pydantic validators = Business Invariants — the ACL between the LLM's string world and your domain's rich types
Attached documents = Domain Knowledge within the Bounded Context — RAG that's scoped to the model, not a global retrieval blob
Model methods = Domain Services — operations that carry domain semantics, not generic tool wrappers

Your point about ACLs translating between "LLM string-based reasoning and the domain's rich types" is precisely what structured output + schema validation does when you treat the model object itself as the boundary — not just an MCP adapter.

The pattern you're describing at the MCP layer and what exomodel does at the object layer feel like two levels of the same architectural insight: the domain boundary should be explicit, typed, and enforced — not left to prompt engineering.

Would love to hear your take on where schema-driven object design fits into this stack.

Dennis Traub AWS • May 19

I actually wanted to mention Ubiquitous Language in my post as well, but I think that, as opposed to Bounded Context and ACLs, it's got a terrible name and is not self-evident without further explanation. For example, despite its name, one of its core assertions is that language is anything but ubiquitous 😅

Dennis Traub AWS • May 21

Well, I actually did write about Ubiquitous Language now, and how it applies when the new "team member" is a coding agent that re-onboards every session. The name is still terrible, but the pattern is more relevant than ever: Your agent keeps using that word ...

Ricardo Rodrigues • May 20

This framing is excellent, and the bounded-context mapping holds better than most DDD-to-X analogies do. But reading it alongside Mykola's least-privilege point, I think the two aren't competing — they're describing two different phases of the same boundary.

DDD gives you the boundary at design time. MCP's one-client-per-server topology even enforces part of it structurally, as you say. But your own caveat is the whole game: "only if you design your servers as bounded contexts." A bounded context that depends on the server author's discipline is a convention, not a control. It holds right up until someone ships the server that crams filesystem, shell, and DB into one boundary — and nothing at runtime stops them.

So the design-time boundary answers "where should the line be." It doesn't answer "who enforces the line when an unpredictable runtime crosses it." That second question is where the least-privilege lens Mykola raised actually bites, and it lives at a different layer: identity on the caller, an allowlist per caller per tool, and a record of every crossing. The ACL protects the domain from the LLM's string-world; the enforcement layer protects the boundary from the caller's intent. Two different anti-corruption jobs.

The chain problem you cite at the end is the sharpest version of this. "Each operation succeeds locally while the system fails globally" is exactly an aggregate invariant violation — but no single server can see the chain, because by design it can't see across boundaries. The only place the chain is observable is the layer the calls pass through. Which is to say: provenance and chain-of-custody can't live inside a bounded context; they have to live above all of them.

This is the problem I spend my time on (governance/audit layer for MCP), so I'm biased — but the DDD vocabulary made me realize the enforcement layer is itself an ACL, just one boundary up: it translates between "the agent decided to call this" and "this call was actually permitted." Curious whether you'd model that as a context map between servers, or as something that sits outside the map entirely.

Dennis Traub AWS • May 21

You're right. And there's a third phase worth considering: the vocabulary phase. What you call things inside the boundary determines what the agent generates. I wrote a follow-up exploring how DDD's Ubiquitous Language applies here: Your agent keeps using that word ...

Andy Stewart • May 20

Spot on! Merging MCP with Domain-Driven Design is architectural gold.

Wrapping REST APIs 1:1 into MCP is just rewriting the "distributed monolith" disaster. Treating MCP servers as Bounded Contexts and tools as Anti-Corruption Layers (ACL) to shield the domain from the LLM’s messy string-world is the only way to stop agentic chaos and security leaks. 20 years of DDD battle scars are exactly what we need to solve modern AI orchestration. Brilliant write-up!

Dennis Traub AWS • May 20

Thanks, Andy!

Mudassir Khan • May 22 • Edited

yeah, the missing piece in every MCP tutorial. everyone shows you how to spin one up, nobody talks about scoping it.

rule i keep coming back to: 1 MCP server = 1 aggregate. not 1 service, not 1 app. moment you expose tools across two aggregates the LLM starts making cross-aggregate writes the domain doesn't actually allow, and it fails silent. agent reports "done", data is just quietly wrong.

have you hit this in prod? would you split the cross-context stuff into a separate orchestrator MCP, or just guardrail the one big server?

Dennis Traub AWS • May 22 • Edited

That's an interesting point. I think it can make sense to go down to the aggregate level, but that depends a lot on the way cross-aggregate rules and invariants are being enforced. For example, when you're using Sagas inside a bounded context, I'd rather let the MCP server interface with the Saga, not the underlying aggregates. When using CQRS or a similar architecture pattern, I'd expose the commands and views, rather than the aggregates.

Leo Pessoa • May 20

The ACL distinction you're making — between what the LLM reasons about in strings and the rich domain types your code actually needs — is where most integrations fail quietly. The common pattern is embedding domain logic in tool handlers and discovering you've scattered business rules across prompt templates, which is unmaintainable for exactly the reasons you describe. Worth noting: if you define domain models precisely enough (typed constraints, field descriptions that encode intent), the translation layer can be mostly automated — the Pydantic model stops being just a validation fence and becomes the specification. That's the bet behind schema-first approaches like exomodel (exomodel.ai): declare the schema, attach an intent, auto-populate from natural language, no hand-written ACL plumbing per type. Your bounded-context point amplifies this — a narrow, tightly-typed schema leaves the LLM far less surface area to produce off-spec outputs than any free-form instruction.
