Duncan Brown
AI Coding Assistants and the Erosion of Ubiquitous Language

In my previous two posts, I’ve written about integrating AI into Spring Boot systems without collapsing architectural boundaries, and about how AI-assisted refactors can quietly erode layering.

There’s another form of drift that’s subtler still:

Semantic drift.

AI coding agents don't just change structure faster. They change language faster.


Ubiquitous Language Is Structural, Not Cosmetic

In domain-driven design, ubiquitous language isn’t just naming preference.

It’s the shared vocabulary across:

  • Developers
  • Domain experts
  • Product owners
  • Stakeholders

When done well, the code mirrors the language of the business. This has the obvious benefit of allowing technical and non-technical people to describe and understand the same concepts with little friction.

The name of an aggregate, entity, value object, or concept is not arbitrary — it encodes meaning.

That meaning becomes part of the system’s architecture.

Meaning is easier to erode than structure.

Let's see how.


A Simple Example of Semantic Drift

Imagine a domain centred around document reviews.

public class Review {
    private Score score;
}

In the business, everyone says:

  • “Review”
  • “Score”

We'll assume for now that the definitions of those terms are implicit: everyone “just knows” what is meant by “Review” and “Score”, and there is no explicitly defined ubiquitous language.

Now imagine asking an AI assistant to “improve naming clarity”, “refactor for readability”, or even “implement a more robust and comprehensive scoring system”.

You might end up with:

public class Assessment {
    private Rating rating;
}

From a purely linguistic perspective, this looks reasonable.

From a domain perspective, however, it most likely is not.

If stakeholders still talk about “reviews” and “scores,” but the code now talks about “assessments” and “ratings,” the shared language has split.

Nothing breaks; tests still pass; layers remain intact.

But meaning has drifted. Developers begin using the new terms, eschewing the “older” ones, even as those new terms take on additional meanings of their own.

Now imagine a meeting between the developers, management, and stakeholders. When one side refers to either set of terms, one of two outcomes is likely:

  1. The other side(s) quietly assume that the term is synonymous with their own, and that their understanding still matches.

  2. The other side(s) question the use of the term, and the side using the term does some vocal hand-waving and says, "They're pretty much the same thing," which may or may not be accurate.

Either way, the door has been opened for future issues that will likely compound over time.

(There's actually a third, correct way to handle the above scenario, and that is to actively question the use of the term, insist on establishing an explicit, mutually understood definition, and add it to a shared ubiquitous language document.

Obvious - and frequently ignored.)


The Real Risk

Layering drift breaks structure.

Semantic drift breaks alignment.

A system can preserve its ports and adapters and still lose its conceptual integrity.

Once a ubiquitous language fragments - or is never formalized in the first place:

  • Discussions become translation exercises.
  • Onboarding slows down.
  • Invariants become harder to reason about.
  • Governance loses its anchor.
  • Refactoring becomes riskier because intent is no longer clear.
  • Assumptions and connotations make their way into code.

And AI accelerates this.

AI assistants frequently:

  • Introduce “more common” synonyms.
  • Replace domain-specific terms with generic ones.
  • Suggest loaded abstractions like Manager, Handler, or Processor.
  • Collapse business vocabulary into framework vocabulary.

As with AI-assisted refactors, none of these changes are malicious.

They might be locally optimal.

But neither structure nor semantics are local.

They are cumulative.


Making Ubiquitous Language Explicit

If structural boundaries need governance, so does language.

Ubiquitous language should live not only:

  • In conversations.
  • In someone’s head.
  • In scattered comments.

It should exist as a canonical artifact.

Although it may sound tempting, don't bury that artifact in AGENTS.md, and don't allow that "artifact" to be implied by class names alone.

Use something like a dedicated spec.md or ubiquitous-language.md file. I prefer to put mine in a dedicated docs/ folder off the project root.

Ideally, it should be structured and machine-parseable. Even a YAML file is a good idea.

For example:

Review:
  definition: An evaluation of a submitted document.
  invariants:
    - Has exactly one Score.
    - References one Document.
  synonyms_disallowed:
    - Assessment
    - Evaluation

Score:
  definition: A normalized integer between 0 and 100 representing review quality.
  invariants:
    - Range: 0..100
  synonyms_disallowed:
    - Rating
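Those invariants don't have to stay in the YAML alone; they can be mirrored directly in the domain model. A minimal sketch in Java, with names taken from the glossary above (the implementation itself is illustrative, not prescriptive):

```java
// Sketch: the Score value object from the glossary, enforcing its
// 0..100 invariant at construction time so invalid scores cannot exist.
record Score(int value) {
    Score {
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException(
                    "Score must be in 0..100, got " + value);
        }
    }
}

class ScoreDemo {
    public static void main(String[] args) {
        System.out.println(new Score(85)); // prints Score[value=85]
    }
}
```

Because the name and the invariant both come from the glossary, a stakeholder reading “a Score is a normalized integer between 0 and 100” and a developer reading this record are describing the same thing.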

This is not documentation for humans alone.

It can be:

  • Referenced during code review.
  • Used to validate naming conventions.
  • Fed into AI prompts as a constraint.
  • Parsed by tooling to generate glossaries.
  • Used by front-end interfaces to maintain consistency.
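As a sketch of the "validate naming conventions" idea: a small check that flags disallowed synonyms when they appear in source text. This is hypothetical tooling, not a real library; in practice the entries would be parsed from the ubiquitous-language file rather than hard-coded:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical naming check: flags disallowed synonyms from the
// ubiquitous-language glossary when they appear in source text.
class VocabularyCheck {

    // In practice, parsed from the YAML glossary; hard-coded here
    // for illustration. Maps disallowed synonym -> canonical term.
    static final Map<String, String> DISALLOWED = new TreeMap<>(Map.of(
            "Assessment", "Review",
            "Evaluation", "Review",
            "Rating", "Score"));

    static List<String> violations(String source) {
        List<String> found = new ArrayList<>();
        for (Map.Entry<String, String> entry : DISALLOWED.entrySet()) {
            // \b matches whole words only: "Rating" but not "Migrating".
            if (Pattern.compile("\\b" + entry.getKey() + "\\b")
                    .matcher(source).find()) {
                found.add(entry.getKey() + " (use " + entry.getValue() + ")");
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(
                violations("public class Assessment { private Rating rating; }"));
        // → [Assessment (use Review), Rating (use Score)]
    }
}
```

Wired into a build-time check or a pre-commit hook, this turns the glossary from documentation into an enforced constraint.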

Once language is explicit, it can be protected.

The language must be versioned. Any proposed change to the language should go through an established workflow and be agreed upon by all required parties. Rolling back should be as simple as rolling back to a previous commit. I cannot stress this enough.

Some additional guidelines for using your ubiquitous language artifact:

  • Ideally, keep it in a single shared repository and treat it as the sole source of truth. I prefer to keep mine in a backend application that has tasks for exposing and/or exporting the language for consumption by other projects.

  • It can (and should) also be updated as necessary.

  • Add a guardrail to your AGENTS.md to let the coding agent know not to introduce any new terms or concepts that aren't already a part of the ubiquitous language definition.

  • Have the agent confer with the developer whenever it believes a new term or concept is needed.

  • Tokenize the terms and verbiage of any user-facing interfaces (e.g. web apps) so they draw from the ubiquitous language. A bonus use: hovering over a domain term can surface its definition, acting as an embedded glossary.
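The AGENTS.md guardrail mentioned above might look something like the following (the wording and the docs/ path are illustrative; adapt them to your own conventions):

```markdown
## Ubiquitous language

- The canonical vocabulary lives in docs/ubiquitous-language.md.
- Do NOT introduce domain terms (class, method, or variable names)
  that are not defined there.
- Do NOT rename existing domain terms, even when a synonym seems
  clearer.
- If you believe a new term or concept is needed, stop and confer
  with the developer before using it.
```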


Governance Protects Meaning

Architectural governance is not only about enforcing layers.

It is about preserving intent.

There are at least two dimensions to protect:

  • Structural boundaries (what depends on what).
  • Semantic boundaries (what things mean).

Structural governance might be enforced through:

  • ArchUnit rules.
  • Module boundaries.
  • Build-time checks.

Semantic governance requires:

  • Canonical definitions.
  • Explicit vocabulary.
  • Deliberate review of term changes.
  • Guardrails in AI-assisted workflows.

If AI is part of your development process, it must be aligned not only with your layering rules, but with your language.

AI cannot respect a domain vocabulary that is not formalized.


Closing Thoughts

Structural drift is quiet but visible.

Semantic drift is quieter.

Both compound over time.

If you care about architectural longevity in AI-assisted systems, governance must protect both structure and meaning.

Without that, speed quickly becomes erosion.
