Leon Pennings

Posted on Jun 3 • Originally published at blog.leonpennings.com

AI and Enterprise Software Development

#ai #java #softwaredevelopment #architecture

AI is the most significant shift in software development since the internet. Not because it changes what software can do — but because it accelerates the consequences of a distinction the industry has been treating as a preference for thirty years.

Some software needs to work today. Some software needs to keep working — correctly, maintainably, through changing requirements and changing teams — for ten or fifteen years. These are not the same engineering problem. They never were. The tools and practices that serve the first actively undermine the second. AI does the first faster and better than any human developer ever has. What it does to the second is the subject of this article.

What Enterprise Software Actually Is

Enterprise software is not defined by its size, its industry, or its technology stack. It is defined by its relationship with time.

An enterprise application must be correct today. It must remain correct as the business domain evolves around it. It must survive the teams that built it. It must adapt to requirements that nobody could fully predict when it was written. Implementation is not the primary challenge. Understanding — correct, durable, continuously updated understanding of the business domain — is the primary challenge. Implementation follows from that, and is the smaller part of the work.

This is why the domain model is more important than the working application.

That statement will make most developers uncomfortable, and it should. The working application is the visible artifact — the thing that gets demonstrated, delivered, and measured. But a working application without a domain model is a disposable item. It works when it leaves the factory. It was not designed to be serviced. When the business changes around it — and it will — the economics of repair exceed the economics of replacement. Except you cannot simply replace it, because without the model, you rebuild the same misunderstandings into the new version, faster, with more confidence.

A domain model without a working application, on the other hand, is a foundation. Getting it working from that foundation is the smaller problem. It will stay working because the mechanism that keeps it correct is still present, still legible, and still honest.

This distinction — between software built for today and software built to remain correct over time — was always there. It was always a structural choice with structural consequences. It was just treated as a preference, because the consequences were invisible. There was never a comparable version of the same application, built differently, to measure against. The cost of not modeling was permanently hidden.

AI does not make that hiding impossible. The unfalsifiability remains intact — there is still no comparable application built the other way to measure against. What AI does is accelerate the accumulation of consequences, while making the codebase look cleaner than procedural development ever did. That combination is more dangerous than what came before, not less.

The Invisible Breakdown

Enterprise software has a failure mode that almost nobody correctly diagnoses, for a simple reason: diagnosing it requires a reference that doesn't exist.

When an enterprise system becomes difficult to maintain — when features take longer than they should, when bugs touch more than they should, when the team grows but delivery doesn't improve — the diagnosis is almost always the same: this is what enterprise development looks like. Complex domain. Large codebase. Accumulated technical debt. The solution offered is more developers, more process, more tooling.

The real diagnosis requires asking: what would this system look like if it had been built around a rich domain model from the start, maintained over the same period? That version was never built. The comparison is not available. So the decay gets attributed to enterprise complexity rather than to the absence of the structure that would have prevented it.

What the industry misread as the natural difficulty of enterprise development is in most cases the consequence of a broken PDCA cycle. Plan, do, check, act. In enterprise software, the Check step requires being able to find what was encoded, verify it against current understanding, and update it. That requires the essential complexity of the system to be visible, owned, and in one place.

Procedural development does not slow this cycle down. It breaks it. Each piece of logic that lives somewhere convenient rather than somewhere correct, each duplicated rule, each behavior scattered across service classes rather than owned by the concept it belongs to — each one removes a piece of the map the Check step needs. Eventually the map is gone. New requirements get added on top of existing ones without anyone being confident what the existing ones actually do. Contradictions accumulate. The system becomes a record of everything that was ever asked for, in chronological order, with no coherent structure underneath.

This is not enterprise complexity. It is the consequence of building for today without encoding understanding in a form that survives tomorrow. And it was always going to happen — because procedural code has no mechanism for keeping the Check step alive.

Two Practices That Keep the Cycle Running

There are two practices that prevent this breakdown. They are not a methodology. They cannot be certified. They are disciplines — each with a precise job, sequential and mutually dependent.

1. The UI: Verifying the Ubiquitous Language

The first practice is building a UI in the first month of any project — not for end users, not for customers, but as the primary instrument for verifying that the developer and the domain expert are actually talking about the same thing.

This requires immediate clarification. The instinct on projects without obvious end-user interfaces — data pipelines, processing engines, integration layers — is to defer or skip the UI entirely. That instinct is wrong in a specific and consequential way. The UI is not a deliverable. It is a yardstick for the ubiquitous language — the shared vocabulary between developer and domain expert that the entire system depends on being correct.

In twenty-five years of building software with business owners and functional application managers, the same sentence appears in nearly every meaningful discussion: "It sounds correct, but I need to see it working." This is not a failure of imagination. It is an honest statement about the limits of language as a medium for domain transfer. Two people can use the same word and mean different things. They can agree on a description and disagree entirely on what it describes. That misalignment is invisible in conversation. It is undeniable on a screen.

The UI forces concepts into a form the domain expert can evaluate directly. The concept the developer calls an order and the business expert calls an order either map to the same thing or they don't — and the screen is where you find out. The flow the developer modeled as a linear sequence and the business expert understands as a set of parallel states either match or they don't — and the screen is where you find out. No whiteboard session, no requirements document, no sprint review produces this verification with the same precision and immediacy as a working interface the domain expert can navigate directly.

Conceptual thinking is genuinely scarce in software development. Developers are trained to implement described behaviour, not to reconstruct the mental models that produced the description. The UI compensates for this structurally. It makes the domain model visible and therefore falsifiable — which is the only condition under which a domain expert can tell you whether you understood them.

It does not need to be polished. It needs to work, built in semantic HTML that will survive the project's lifetime without becoming a maintenance liability of its own. Its purpose is not presentation. It is verification.

The UI is how you learn the domain correctly. It is the input to everything that follows.

2. The Domain Model: Encoding That Learning Durably

The second practice is encoding what you learned in a rich domain model — and this is where the oldest lesson in software development applies in its most consequential form.

Keep what belongs together in the same place, so nobody has to explain where anything is.

That is the difference between a thousand-piece puzzle and a twenty-five-piece puzzle. How orders are treated in the system can be found in the Order domain object. One place. Non-duplicated logic. Non-contradictory logic. A new developer, a new requirement, a compliance audit — all of them go to the same place and find the same answer.

The domain model is a set of objects, each playing a defined role in the business domain, each owning the responsibility that role entails. Not data structures with methods bolted on. Objects that know what they are responsible for, enforce their own rules, and carry their own behavior. An Order that knows what it means to be cancelled. An Interaction that owns the transactional boundary — carrying the current user, the active roles, the deferred consequences that execute at its close. A KYC entity that owns the rules governing its own assessment.
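As a minimal sketch of what "an Order that knows what it means to be cancelled" could look like in plain Java: the class name comes from the text above, but the statuses, the methods, and the rule that a shipped order cannot be cancelled are assumptions invented for illustration, not rules from any real system.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain object: statuses and rules are assumptions for illustration.
final class Order {
    enum Status { OPEN, SHIPPED, CANCELLED }

    private final List<String> lines = new ArrayList<>();
    private Status status = Status.OPEN;

    // Only an open order accepts new lines.
    void addLine(String product) {
        if (status != Status.OPEN) {
            throw new IllegalStateException("Cannot add lines to an order that is " + status);
        }
        lines.add(product);
    }

    // Shipping is only allowed for an open order with at least one line.
    void ship() {
        if (status != Status.OPEN) {
            throw new IllegalStateException("Cannot ship an order that is " + status);
        }
        if (lines.isEmpty()) {
            throw new IllegalStateException("Cannot ship an empty order");
        }
        status = Status.SHIPPED;
    }

    // The order itself decides what cancelling means
    // (assumed rule: a shipped order can no longer be cancelled).
    void cancel() {
        if (status == Status.SHIPPED) {
            throw new IllegalStateException("A shipped order cannot be cancelled");
        }
        status = Status.CANCELLED;
    }

    Status status() {
        return status;
    }
}

final class OrderDemo {
    public static void main(String[] args) {
        Order order = new Order();
        order.addLine("widget");
        order.cancel();
        System.out.println(order.status()); // prints CANCELLED
    }
}
```

The point of the sketch is placement rather than the specific rules: anyone asking "when can an order be cancelled?" has exactly one method to read.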

This is what keeps the PDCA cycle alive. The Check step can still reach what was Done — ten years later, after three teams, through changing requirements. The domain model is the map. As long as the map is honest and current, the cycle runs. New understanding finds its place. The model grows more true over time rather than more obscure.

A consequence of this structure that is rarely discussed is what it provides for free. When the domain model owns its behavior and an Interaction owns the transactional boundary, a failed operation rolls back completely — the database change, the email that hadn't been sent, the downstream consequence that hadn't fired. JDBC transaction rollback is a primitive. Consistency is structural. There is no compensation logic to write, no saga pattern to implement, no consistency verification to run after the fact. The guarantee emerges from the model.
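One way to picture an Interaction that owns the transactional boundary is sketched below. Everything here is an assumption for illustration: the `Transaction` interface is a hypothetical stand-in for a real database transaction (for example a JDBC connection with auto-commit switched off), and `afterCommit` and `run` are invented names, not an existing API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a real database transaction.
interface Transaction {
    void commit();
    void rollback();
}

// Hypothetical Interaction: carries the current user and the deferred consequences.
final class Interaction {
    private final String user;
    private final Transaction tx;
    private final List<Runnable> deferred = new ArrayList<>();

    Interaction(String user, Transaction tx) {
        this.user = user;
        this.tx = tx;
    }

    String user() {
        return user;
    }

    // Register a consequence (an email, a downstream call) that may only
    // happen once the work has been committed.
    void afterCommit(Runnable consequence) {
        deferred.add(consequence);
    }

    // Runs the work. On success: commit, then fire the deferred consequences.
    // On failure: roll back and discard them, so nothing leaks out.
    void run(Consumer<Interaction> work) {
        try {
            work.accept(this);
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            deferred.clear();
            throw e;
        }
        deferred.forEach(Runnable::run);
        deferred.clear();
    }
}

final class InteractionDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Transaction tx = new Transaction() {
            public void commit() { log.add("commit"); }
            public void rollback() { log.add("rollback"); }
        };
        try {
            new Interaction("alice", tx).run(i -> {
                i.afterCommit(() -> log.add("email sent"));
                throw new IllegalStateException("business rule violated");
            });
        } catch (IllegalStateException e) {
            log.add("failed: " + e.getMessage());
        }
        System.out.println(log); // prints [rollback, failed: business rule violated]
    }
}
```

In the failing case the email is never sent, because it only existed as a deferred consequence. That is the sense in which no compensation logic is needed in this sketch.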

The domain model is how you keep doing the correct thing, indefinitely. It is not documentation about the system. It is the system, in its most honest and durable form.

The Relationship Between the Two

These two practices are sequential and mutually dependent in a precise way.

The UI without the domain model produces correct understanding encoded incoherently. The domain expert confirmed the language. The developer understood the domain. And then scattered it across service classes in a way nobody can find or follow three years later. The understanding was correct and it decays anyway — into the codebase, across layers, through framework conventions — until the next developer cannot reconstruct it.

The domain model without the UI produces a cohesive model of something that may be wrong. Elegant, traceable, internally consistent, and externally misaligned. The developer's interpretation was never verified against the person who actually knows.

Together they form a self-correcting cycle. The UI surfaces what the domain expert actually means. The domain model encodes that meaning durably. The UI surfaces the encoded meaning back to the domain expert for verification. The cycle is self-sustaining — not just at the start but throughout the life of the application.

This is also why you can survive on the domain model alone — a slightly wrong model is still fixable, because the wrongness is visible and locatable, and the PDCA cycle is still running — but you cannot survive on the UI alone. Correct understanding that was never durably encoded dies with the people who held it.

One practice optimizes for discovering what the correct thing is, and for doing it.

The other documents that discovery in the most undistortable form possible — for today's team, for tomorrow's maintainers, for the requirements nobody has thought of yet.

What the Industry Built Instead

Without these two practices, enterprise software does not fail silently. Teams feel the friction. The PDCA cycle breaks. And the industry, characteristically, built architectures to manage the symptoms.

Before reaching for those architectures, one question is worth asking honestly: does this system genuinely require the same scale as the organizations that invented these patterns — or is the complexity being solved a complexity that was created by code written without a domain model?

CQRS, event-driven architecture, microservices — each originated as a response to real problems at genuine scale. Each carries a significant integration tax: distributed tracing, eventual consistency management, versioned service contracts, deployment orchestration, network failure handling, and the permanent loss of the one guarantee a single domain model provides for free — transactional consistency. Once operations are distributed across services and event queues, rollback is no longer a database primitive. It becomes an engineering problem, solved with compensation logic and saga patterns, maintained indefinitely, on top of the original modeling problem that the architecture was never asked to fix.

A rich domain model makes most of that complexity structurally unnecessary. Not by being clever — by keeping what belongs together in one place, so the system never generates the problems these architectures were designed to manage.

Frameworks and the Question They Answer Too Early

The same logic applies to the framework ecosystem, with one distinction worth making precisely.

Frameworks like Spring exist to provide implementation convenience for developers who should not need to understand the underlying mechanisms. That is a blunt description, but it is an accurate one. Spring wires things together so you don't have to understand the wiring. It provides a transaction model so you don't have to manage transactions. The value proposition is working software without deep understanding of what produces it.

That value proposition has a structural cost. Spring doesn't just provide convenience — it answers structural questions before you've understood the domain. The controller-service-repository recipe is not a neutral scaffold. It is an answer to where behavior should live, given before the domain had a chance to answer that question itself. Engineers who learned Spring as their foundation did not learn to reason about structure — they learned to apply a structure that was handed to them. When the recipe always fits, the judgment to know when it doesn't is never developed.

The relevant question for any framework is: does removing it force you to split essential complexity? If yes, the framework earns its place. If no, it is providing implementation convenience that increasingly AI can provide — without the framework's architectural opinions, without its version upgrade cycle, and without the recipe it substitutes for structural thinking.

Hibernate passes that test. Without it, the domain object and its persistence representation become two separate things — a DTO, a populator, a translation layer that has to be maintained in sync with the domain object when it changes. Hibernate collapses that into the domain object itself. The annotations are honest declarations of what the object requires from persistence. The object loads as itself. The domain model remains the single source of truth. Hibernate serves the model's integrity rather than substituting for structural thinking.

Spring fails that test. Removing Spring does not split essential complexity. It removes the recipe that was preventing essential complexity from being properly owned. AI can now provide the implementation capability Spring was providing — without the recipe.

AI: The Same Mistake, At Greater Speed

This is where the two threads of this article converge, because AI is not a separate story from the domain modeling story. It is the part of that story that finally makes the stakes undeniable.

AI makes the same mistakes procedural programmers make. It builds what is required today without preparing for tomorrow. It fills in the template, makes it work, ships the feature. The code is correct for the prompt. Whether it is correct for the system — whether it is consistent with what was built six months ago, whether it contradicts a rule established in a different part of the domain, whether it is encoding understanding that will survive the next requirement — these are questions AI cannot ask, because asking them requires a domain model to check against, and AI builds no such model.

For a large category of software, this does not matter. Small applications, internal tools, prototypes, systems with bounded scope and limited lifespans — here, building for today is the correct approach. AI is close to the complete solution. The template gets filled. The application works. The contradictions are manageable because the scope is small enough to hold in working memory. There is no ten-year maintenance horizon. Disposable software benefits from disposable development, and AI is the best disposable development tool ever built.

The schism appears at enterprise software — and it is the same schism that always existed between procedural development and domain-modeled development, now made visible by the speed at which AI can accumulate the consequences.

A procedural programmer building a large system makes their implementation decisions in isolation. Each feature is added to whatever was there before, in whatever shape it happened to be in. Over time the contradictions accumulate. A business rule exists in three places and was updated in two. A concept that should be unified has drifted into five different representations. The PDCA cycle broke quietly, feature by feature, and the system became a record of everything that was ever asked for rather than a model of what the business actually is.
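A small sketch of how such drift can look in code. The rule, the class names, and the thresholds are all invented for illustration. The same free-shipping rule lives in two service classes, and only one copy was updated when the business changed the threshold. A third class shows the same rule stated once by a single owner.

```java
// Hypothetical example: rule, class names, and thresholds are assumptions.

final class CheckoutService {
    // Updated when the business raised the threshold to 75.
    static boolean freeShipping(int orderTotal) {
        return orderTotal >= 75;
    }
}

final class InvoiceService {
    // The same rule, copied earlier and never updated.
    static boolean freeShipping(int orderTotal) {
        return orderTotal >= 50;
    }
}

// Domain-owned alternative: one place states the rule, every caller asks here.
final class ShippingPolicy {
    private static final int FREE_SHIPPING_FROM = 75;

    static boolean isFree(int orderTotal) {
        return orderTotal >= FREE_SHIPPING_FROM;
    }
}

final class DriftDemo {
    public static void main(String[] args) {
        int total = 60;
        // Both calls compile and run without any error or warning,
        // yet they disagree about the same business rule.
        System.out.println(CheckoutService.freeShipping(total)); // prints false
        System.out.println(InvoiceService.freeShipping(total));  // prints true
        System.out.println(ShippingPolicy.isFree(total));        // prints false
    }
}
```

Nothing in the compiler, the tests for each service, or the running application flags the disagreement. It only surfaces when a customer's checkout and invoice contradict each other.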

AI does the same thing, at a velocity no human team could match. The contradiction between prompt 1 and prompt 78 is invisible in working software and produces no error, no warning, no friction. The code works. The Check step has no map to navigate. The cycle was broken before it started.

The critical point — and it is worth stating precisely because the counter-argument will come — is that this is not a memory problem. Expanding context windows do not resolve it. An AI holding millions of tokens of procedural code in its context window can still generate a patch that introduces a subtle business contradiction, because the limitation is not how much the AI remembers. It is that the AI has no model of what the system should be — no canonical truth to check against, no domain concept that owns the rule being contradicted, no structure that would make the inconsistency visible before it becomes a bug.

Rebuilding does not help. Without a domain model as the canonical reference, the rebuild reconstructs from the same conversations, the same scattered understanding, the same implicit contradictions — and produces the same contradicting system, faster, with more confidence.

There is an irony worth naming here. AI does not need Spring. It does not need CQRS, event-driven architecture, or microservices. Those frameworks and architectures exist largely as scaffolding for procedural developers navigating complexity they could not otherwise manage — ways of imposing structure on code that had none, or distributing a system too incoherent to reason about as a whole. AI navigates that complexity effortlessly. It can implement a clean monolith in plain Java without the framework overhead, without the service boundaries, without the integration tax.

So AI arrives at something that looks architecturally healthier than what procedural development typically produced — and then exhibits exactly the same underlying problem, on a larger scale, with the symptoms that used to make the problem visible earlier now absent. At least with microservices the integration pain was visible and attributable. The seams showed. Teams felt the coordination overhead and knew something was wrong. The AI monolith looks coherent on the surface. The contradictions are woven through a large, working codebase with no map and no visible seams — and the unfalsifiability that kept the invisible breakdown invisible gets stronger, not weaker. There is still no comparable version built with a domain model to measure against. The decay will still be attributed to the scale of the codebase, or to the prompts not being precise enough, or to enterprise complexity. The measurement problem that hid the breakdown before AI arrived continues to hide it after. It just hides a larger problem, accumulated faster, in code that looks cleaner than anything procedural development ever produced.

Where AI genuinely excels in enterprise development is as a technology consultant. How do I stream documents to an HTTP multipart post? Can I do this in Java, and if so, how? What is the correct behavior of this Hibernate mapping in this edge case? These are questions with objectively correct answers. AI finds them instantly. The framework knowledge a developer spent years accumulating is now available on demand, to anyone who can ask the right question. That is a real democratization of technical expertise.

The reason AI cannot build a domain model is structural, and it goes deeper than context windows or model size. AI is a pattern matcher. It is trained on vast amounts of code and text that has already been written, and it produces output that matches the patterns of what it has seen. This is genuinely powerful for everything that is a pattern — technology implementation, framework usage, boilerplate, adapter code. These things have been written before, in recognizable forms, and AI finds them reliably.

A domain model is not a pattern. It is a representation of something specific that does not exist anywhere in training data — because it has never existed before. The order lifecycle for this logistics company, the KYC rules for this financial institution, the ownership structure for this regulatory context — these are particular, not general. They must be discovered through conversation with the people who know the domain, verified through a UI that makes the concepts visible, and refined through the friction of encoding them in code that either fits the model or reveals where the model is incomplete. None of that discovery process is available to AI as input. It has text. The domain exists in the heads of domain experts and in the resistance of implementation against a model being built for the first time.

This is why larger models and longer context windows do not resolve the limitation. The constraint is not how much AI can remember or process. It is that domain modeling is a discovery activity, and the thing being discovered has never been written down in the form AI would need to pattern-match against. A model trained on ten times more data may be far better at pattern matching. It is not closer to domain discovery, because domain discovery is a different kind of activity entirely.

And there is a second layer to the limitation that compounds the first. When no domain model exists, AI cannot notice that one is missing. The signal that a model is absent is friction: the requirement that does not fit, the concept that keeps appearing in three different places, the rule that cannot be placed without contradicting something else. These are the signals that teach a human developer where the structure is wrong. AI absorbs that friction. It finds somewhere convenient for the requirement, implements it, and moves to the next prompt. The code works. The structural problem is invisible. The PDCA cycle was broken before it started, and nothing in the process signals that this has happened.

What AI cannot do is find the correct domain concept, assign it its responsibility, and place it honestly in the model. That requires understanding the domain — which requires the UI verification, the whiteboard session, and critically, the friction of implementation. The resistance a new requirement produces against an existing structure is not an inconvenience. It is the domain teaching you something. It is where the modeling judgment gets built. A developer who prompts their way through that resistance never receives the lesson. The code works. The understanding was never deepened. The model drifts from the domain silently, until the drift becomes structural and the consequences become expensive.

AI is the perfect tool for building software for today. Enterprise software requires building for tomorrow. That has always been the distinction. AI accelerates the consequences of ignoring it — invisibly, in code that works, with no signal that anything is wrong.

What This Means

The working application is not the asset. The domain model is the asset.

For software with limited lifespans, this distinction is irrelevant. Build fast, use AI, ship it, replace it when it stops serving its purpose. The disposable approach is correct for disposable software, and AI makes it better than it has ever been.

For enterprise software — applications that must remain correct through changing requirements, changing teams, and changing understanding, over years and decades — the domain model is the mechanism that keeps the PDCA cycle alive. The UI is the mechanism that keeps the domain model honest. Together they produce something AI cannot: software that gets easier to understand as it matures, because the understanding encoded in it compounds rather than decays.

The developer who builds that foundation and uses AI for implementation is extraordinarily powerful. The patterns, the boilerplate, the technology questions, the adapter code surrounding a well-modeled domain — AI handles all of it, precisely and quickly, in service of a structure the developer owns and understands. That combination is more capable than anything the industry has previously had available.

The developer who uses AI to avoid building that foundation is producing disposable software at enterprise scale — and will discover, somewhere around prompt 78, that working software and correct software are not the same thing, and that the gap between them compounds with every prompt that had no model to check against.

The distinction between procedural development and domain-modeled development was never a preference. It was always a structural choice with structural consequences. The measurement problem that kept those consequences invisible — the absence of the comparable application built the other way — is being resolved in real time, at speed, by teams discovering that the working application they built in six months is already contradicting itself, with no map to navigate back to coherence.

AI did not create this problem. It inherited it from procedural development, and it runs it faster than any human team ever could.

The practices that prevent it are the same ones that always prevented it. They are just, finally, undeniably necessary.

This article is part of a series on software engineering craft. Previous pieces examine the rich domain model as a discipline, the properties of enterprise software that lasts, how the software industry mistook its tools for its craft, and why Scrum works only when the people making decisions feel the outcomes.
