Leon Pennings

Posted on • Originally published at blog.leonpennings.com

Your software development approach is too expensive and too brittle

Most software teams are not struggling because software is inherently chaotic.

They are struggling because they are paying enormous amounts of money to keep the wrong machine barely usable.

That sounds dramatic.

It is not.

In fact, it is one of the most normal things in modern software development.

A lot of systems are built in ways that are:

  • more expensive than they need to be,

  • more fragile than they need to be,

  • harder to change than they need to be,

  • and harder to reason about than they need to be.

And yet they still get called “well architected.”

Why?

Because in software, there is usually no comparison case.

No control group.

No alternate implementation.

No tractor parked next to the Ferrari.

So if the thing eventually works, the architecture often gets promoted from merely functional to supposedly good.

That is one of the deepest blind spots in software engineering.

And it is how teams end up trying to plow fields with a Ferrari F40.


The Ferrari and the tractor

Imagine you need to plow a field.

You can choose between:

  • a Ferrari F40, or

  • a tractor.

This should not be a difficult decision.

The tractor is not glamorous, but it is aligned to the work.

It has:

  • the right ground clearance,

  • the right tires,

  • the right torque profile,

  • the right durability characteristics,

  • the right maintenance expectations,

  • and the right operational shape.

The Ferrari has none of that.

It is a remarkable machine.

It is just the wrong one.

And the mismatch does not merely show up once the work starts.

It shows up immediately.

Because before the Ferrari can even begin to perform badly in the field, someone first has to solve a completely absurd problem:

How do we even make this thing usable for field work?

That is where the real cost begins.

Because now you need compensations.

You need:

  • custom adaptations,

  • support structures,

  • protective workarounds,

  • non-native operational handling,

  • specialist maintenance,

  • and constant care to keep the machine functioning in an environment it was never shaped for.

That is the real problem with a mismatch.

Not just that it performs badly.

But that you now have to build an entire support ecosystem around the fact that it is wrong.


And even that is a cheap mismatch compared to software

In the physical world, the mismatch would at least be visible.

A Ferrari F40 is obviously a terrible agricultural investment.

A collector F40 trades for millions, while a capable farm tractor costs a fraction of that, with maintenance profiles to match.

The absurdity would be obvious on a balance sheet.

So yes: in the real world, using a Ferrari to plow a field would already be economically insane.

But in software, the mismatch is often much worse.

Because in software:

  • the cost is less visible,

  • the pain is spread over time,

  • the friction is normalized,

  • and the organization often has no simpler implementation to compare it to.

That means software teams can spend years operating the equivalent of a Ferrari in a muddy field and still call it “engineering maturity.”

That is the danger.


The uniqueness trap

This is one of the hardest structural problems in software development:

most applications are built only once.

Perhaps not once in terms of business purpose.

But only once in terms of implementation.

A team typically does not build:

  • one version with a cohesive domain model,

  • another with CQRS and event choreography,

  • another with five microservices,

and then compare cost, reliability, comprehensibility, and adaptability over five years.

That almost never happens.

So architecture is rarely judged comparatively.

It is judged internally.

And that means if a system eventually “works,” people often conclude that the architecture must have been reasonable.

But that conclusion is deeply unreliable.

Because there may have been a far cheaper, simpler, more robust, and more truthful way to build the same thing.

No one knows.

Because the tractor version was never built.

That is the uniqueness trap.

And it is one of the main reasons accidental complexity survives so easily in software.


Most software architecture is expensive support structure around a mismatch

This is where the Ferrari metaphor becomes useful.

If someone insisted on plowing a field with an F40, they would not simply “start plowing.”

They would first need to invent a whole support system around the mismatch.

They would need to answer questions like:

  • How do we prevent the chassis from bottoming out?

  • How do we maintain traction in mud?

  • How do we protect components from wear profiles they were never designed for?

  • How do we attach the wrong machine to the wrong task?

  • How do we keep it alive under repeated misuse?

In other words:

they would need to build a compensating architecture around the fact that the machine is wrong.

That is exactly what many software teams do.

They choose an architectural shape before they understand the domain, and then spend years building support mechanisms around the mismatch.

That support structure often looks like:

  • CQRS,

  • EDA,

  • orchestration layers,

  • distributed workflows,

  • microservices,

  • command buses,

  • event buses,

  • retries,

  • compensations,

  • synchronization logic,

  • observability scaffolding,

  • deployment choreography,

  • and framework conventions.

And because all of this is technical work, it often feels sophisticated.

But much of it exists only because the software was shaped incorrectly to begin with.

That is the setup tax of accidental complexity.


Back to Brooks: essential versus accidental complexity

Fred Brooks gave us the cleanest possible vocabulary for this problem decades ago.

Essential complexity

Essential complexity is the irreducible complexity of the business domain itself.

This is the complexity that actually belongs.

Examples:

  • pricing rules,

  • eligibility constraints,

  • shipment state transitions,

  • reconciliation logic,

  • metadata rules,

  • legal behavior,

  • catalog semantics,

  • scheduling constraints.

This complexity exists because reality is complex.

You cannot remove it.

You can only understand it, model it, and localize it properly.

Accidental complexity

Accidental complexity is everything introduced by the solution that the problem itself did not require.

Examples:

  • framework conventions,

  • architectural ceremony,

  • messaging choreography,

  • unnecessary distribution,

  • layered indirection,

  • technical orchestration,

  • compensating workflows,

  • integration-driven domain shape,

  • “enterprise” abstraction stacks.

This complexity is not business truth.

It is construction overhead.

And much of modern software architecture is simply accidental complexity with better branding.


The first job of software design is not to choose an architecture

It is to understand the domain.

That should not be controversial.

And yet much of modern software development behaves as if the opposite were true.

Teams routinely begin with questions like:

  • Should we use CQRS?

  • Should we use EDA?

  • Should we split this into microservices?

  • Should this be event-driven?

  • Should we separate reads and writes?

  • Should this be asynchronous?

  • Should we introduce orchestration?

Those are not first questions.

Those are late questions.

The first question is:

What is the business, really?

Until that question is answered properly, every major architectural choice is at risk of being premature.

And premature architecture is usually just accidental complexity entering the system early enough to become permanent.


The real problem is Pattern-Driven Design

The issue is not that CQRS, EDA, or messaging can never appear in a system.

The issue is that many teams no longer design from the domain outward.

They design from patterns inward.

That is how software ends up shaped by:

  • command handlers,

  • event buses,

  • orchestration layers,

  • service templates,

  • and framework conventions

before anyone has actually understood what the business is.

That is not architecture.

That is Pattern-Driven Design.

And Pattern-Driven Design is one of the fastest ways to bury essential complexity under accidental complexity.

Because once the pattern becomes the starting point, the business no longer gets modeled on its own terms.

It gets forced to fit the machinery.

That is not simplification.

That is distortion.


Always start with the domain model

If the goal is to avoid expensive, brittle, overcompensated systems, then the starting point is straightforward:

Always start with the domain model.

Not because every system needs an elaborate object hierarchy.

Not because “DDD” is fashionable.

Not because object orientation is sacred.

But because if you do not start there, something else will define the shape of the software instead.

And that “something else” is usually accidental.

If you do not begin with:

  • what the business concepts are,

  • what they mean,

  • what they are responsible for,

  • what must always be true,

  • how they are allowed to change,

  • and how they interact,

then the system will instead be shaped by:

  • endpoints,

  • persistence structure,

  • framework constraints,

  • service boundaries,

  • message flows,

  • handler conventions,

  • or transport semantics.

And once that happens, the business is no longer being modeled.

It is being adapted to the machinery.

That is where software becomes expensive and brittle.


A user story is not a model

This is one of the most common and costly confusions in software teams.

A user story is not a model.

A ticket is not a model.

A process diagram is not a model.

A request from the business is not yet the business.

These things describe surface behavior.

They do not necessarily describe the actual structure or semantics of the domain.

That means implementation should never start by merely wiring the request into the chosen architecture.

It should start by asking:

  • What actually exists here?

  • What is this concept responsible for?

  • Which rules belong together?

  • Which state transitions are valid?

  • Which interactions are intrinsic?

  • Which behaviors are essential and which are incidental?

That is the real work of software design.

And the clearest place to do that work is the domain model.


A rich domain model is not overengineering

This is where a lot of modern teams have become confused.

There is a recurring assumption that a rich domain model is somehow “too much.”

But in practice, what often happens is not that the logic disappears.

It simply moves elsewhere.

If the business logic is not in the model, it will end up in:

  • services,

  • handlers,

  • orchestrators,

  • subscribers,

  • validators,

  • workflows,

  • pipelines,

  • process managers,

  • or framework glue.

That is not simplification.

That is displacement.

A rich domain model is not about making software “academic.”

It is about ensuring that the unavoidable business complexity lives where it is:

  • explicit,

  • cohesive,

  • inspectable,

  • and semantically meaningful.

In other words:

the model should contain the business.

Not the framework.

Not the message bus.

Not the choreography.

Not the deployment topology.

The business.
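As a concrete illustration (an entirely hypothetical shipping domain, sketched in Python), this is what "the model contains the business" can look like: the rule "an empty shipment cannot be dispatched" lives in one inspectable place, next to the state it protects, instead of being scattered across validators and handlers.

```python
# Hypothetical sketch: business rules live inside the model, not in services.
from dataclasses import dataclass, field


class DomainError(Exception):
    """Raised when a business rule would be violated."""


@dataclass
class Shipment:
    """A shipment owns its own state-transition rules."""
    status: str = "draft"
    items: list = field(default_factory=list)

    # Valid state transitions, stated once, in one place.
    _ALLOWED = {
        "draft": {"dispatched"},
        "dispatched": {"delivered", "returned"},
        "delivered": set(),
        "returned": set(),
    }

    def add_item(self, sku: str) -> None:
        if self.status != "draft":
            raise DomainError("items can only be added while the shipment is a draft")
        self.items.append(sku)

    def dispatch(self) -> None:
        if not self.items:
            raise DomainError("an empty shipment cannot be dispatched")
        self._transition("dispatched")

    def _transition(self, new_status: str) -> None:
        if new_status not in self._ALLOWED[self.status]:
            raise DomainError(f"cannot go from {self.status} to {new_status}")
        self.status = new_status
```

Nothing here requires ceremony. The point is only co-location: anyone asking "when can a shipment be dispatched?" reads one class, not a service layer.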


If the domain is simple, the model will be simple

This is where the usual objection appears:

“But not every system needs a rich domain model.”

Correct.

But that does not weaken the argument at all.

Because the real point is not that every system needs a complex model.

The point is:

every system should begin by discovering whether the domain is simple or complex.

And the correct place to do that is still the model.

If the domain turns out to be simple, then good.

The model will simply remain small and quiet.

That is not failure.

That is successful discovery of simplicity.

But deciding not to start there is a mistake.

Because then simplicity is not being discovered.

It is being assumed.

And assumed simplicity is one of the easiest ways accidental complexity gets invited in.


CQRS and EDA are often compensations for unclear modeling

Here is the part many people will resist.

That is fine.

CQRS and EDA are very often workarounds for bad design, or for not knowing how to model.

That does not mean they can never appear.

It means they should almost never appear as up-front architectural choices.

That distinction matters enormously.

They can absolutely emerge later as observations in retrospect.

But they should not be adopted as predefined frameworks before the domain has been understood.

Because once that happens, the architecture is no longer responding to the domain.

The domain is being forced into the architecture.

That is backwards.


CQRS is usually an observation, not a design starting point

Properly understood, CQRS is not something you “do.”

It is simply the recognition that:

the model used to change business state is not always the same model best suited for retrieving and navigating information.

That is all.

And sometimes that is perfectly valid.

A search engine like Lucene is a very good example.

The write side may simply persist documents or structured domain state.

The read side may support:

  • indexing,

  • tokenization,

  • ranking,

  • full-text search,

  • query optimization.

Those are not the same concern.

That is a natural asymmetry.

That is CQRS as an observation.

But that is very different from deciding on day one that the architecture will have:

  • command handlers,

  • query handlers,

  • buses,

  • mediators,

  • folders,

  • pipelines,

  • and all the associated ceremony.

That is not domain modeling.

That is accidental complexity pretending to be rigor.

Most CQRS implementations are just CRUD with bureaucracy.
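The asymmetry can be shown without any buses or mediators. In this hypothetical Python sketch, the write side stores plain catalog state, and the read side derives an inverted index from that state because search genuinely needs a different shape:

```python
# Hypothetical sketch: one plain write model, plus a read-side index derived
# from it. The asymmetry is observed, not imposed via handlers and buses.

class Catalog:
    """Write side: changes business state directly."""

    def __init__(self):
        self.products = {}  # product_id -> description

    def add_product(self, product_id: str, description: str) -> None:
        self.products[product_id] = description


def build_search_index(catalog: Catalog) -> dict:
    """Read side: a simple inverted index, rebuilt from the write model's state."""
    index = {}
    for product_id, description in catalog.products.items():
        for token in description.lower().split():
            index.setdefault(token, set()).add(product_id)
    return index


def search(index: dict, term: str) -> set:
    return index.get(term.lower(), set())
```

The read model here earns its existence from the retrieval problem itself, not from a pattern chosen on day one.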


EDA is often the same mistake, but with more latency

Event-driven architecture is often sold as if it were inherently sophisticated.

It is not.

Very often, it is simply a sign that direct responsibility was not modeled clearly enough.

There is a major difference between:

  • recognizing a domain fact,

    and

  • externalizing causality into a distributed system.

Those are not the same thing.

A domain event can be a useful modeling concept.

But when every business consequence gets turned into:

  • a message,

  • a subscriber,

  • a consumer,

  • a queue,

  • a retry policy,

  • a dead-letter topic,

  • a compensating process,

then what often happened is not decoupling.

What happened is that one coherent business act was split into multiple technical acts — and the system now needs operational rituals to pretend they are still one thing.

That is not elegance.

That is fragmentation.


If an event is required for correctness, it belongs in the same transaction

This is where a lot of “event-driven” thinking falls apart.

If an event represents something the business considers part of the same completed action, then it should not be externalized into eventual consistency theater.

It should be processed within the same transactional consistency boundary.

Often that means:

  • same model,

  • same process,

  • same database transaction,

  • same JDBC transaction.

Because if correctness depends on:

  • retries,

  • cleanup,

  • compensating actions,

  • dead-letter queues,

  • reconciliation jobs,

  • or support scripts,

then the architecture has usually split apart something the business still considers one coherent act.

That is not decoupling.

That is a modeling failure disguised as scalability.

The simple rule is this:

If the business says these things are one thing, the software should not split them into many things.

Only effects that are genuinely:

  • external,

  • observational,

  • optional,

  • or secondary

should be allowed to escape the core transactional boundary asynchronously.

Everything else belongs together.
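A minimal sketch of that rule, using Python's sqlite3 and a hypothetical ordering domain: the order line and the stock reservation that the business considers one act commit or roll back as a single unit, so no retries, dead-letter queues, or compensations are needed.

```python
# Hypothetical sketch: a consequence the business considers part of the same
# act is applied in the same database transaction, not via a queue and retries.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (sku TEXT PRIMARY KEY, qty INTEGER)")
conn.execute("CREATE TABLE orders (sku TEXT, qty INTEGER)")
conn.execute("INSERT INTO stock VALUES ('SKU-1', 5)")
conn.commit()


def place_order(sku: str, qty: int) -> None:
    """Order line and stock reservation succeed or fail as one unit."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (sku, qty))
        cur = conn.execute(
            "UPDATE stock SET qty = qty - ? WHERE sku = ? AND qty >= ?",
            (qty, sku, qty),
        )
        if cur.rowcount == 0:
            raise ValueError("insufficient stock")
        conn.commit()  # both effects become visible together
    except Exception:
        conn.rollback()  # neither effect survives; no compensation needed
        raise
```

Anything genuinely secondary, such as sending a notification, can then be dispatched after the commit without threatening correctness.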


Microservices are often bad design with Kubernetes

And yes, the same critique applies to microservices.

Microservices are one of the most overprescribed and underjustified architectural choices in modern software.

They are usually discussed in terms of:

  • scaling,

  • team autonomy,

  • resilience,

  • independent deployment,

  • ownership.

But that framing hides the actual cost.

Because microservices are not just a deployment decision.

They are a fragmentation decision.

They force teams to commit to distributed boundaries early — often before anyone has proven those boundaries are semantically real.

And once the split is made, the business has to pretend those boundaries are natural.

That is how teams end up with:

  • cross-service workflows,

  • distributed invariants,

  • duplicated concepts,

  • compensating logic,

  • service orchestration,

  • and “eventual consistency” as a lifestyle.

That is not architecture.

That is often just what happens when one cohesive domain gets cut into pieces because “small services” sounded modern.


Logical cohesion comes before physical scale

This is where the usual counterargument appears:

“Yes, but what about scale?”

Fair question.

But scale does not rescue bad boundaries.

It amplifies them.

If you cannot model a business capability coherently in one process, you are very unlikely to improve it by scattering it across twenty.

That is because logical cohesion is a prerequisite for physical distribution.

A coherent system can sometimes be split later if reality genuinely demands it.

An incoherent system does not become better by being distributed.

It just becomes harder to debug, harder to reason about, and more expensive to keep alive.

So yes, scale matters.

But scale is not an excuse to abandon cohesion before you have even found it.


Small is not the goal. Cohesion is.

The phrase “microservice” already biases the conversation in the wrong direction.

Because it encourages optimization for smallness.

But smallness is not the goal.

Cohesion is the goal.

The real objective is:

  • semantically meaningful boundaries,

  • high internal density of behavior,

  • low cross-boundary coordination.

That is very different.

If one business action routinely requires orchestration across multiple internal services, the split is probably wrong.

That is one of the best architectural tests there is.

Because if the business still experiences something as one coherent operation, but the software requires:

  • service A,

  • then service B,

  • then service C,

  • then retries and compensations if one fails,

then the architecture has not discovered a boundary.

It has manufactured one.

And now it has to manage the damage.


The real cost of framework-first architecture is not implementation. It is drag.

This is where the economics become severe.

Bad architecture is not expensive merely because it takes slightly longer to build.

It is expensive because it creates organizational drag for years.

That drag shows up everywhere.

Slower feature development

Every change now has to move through machinery that was introduced before the business was properly understood.

So even small changes require:

  • coordination,

  • contract changes,

  • handler updates,

  • event flow changes,

  • service touchpoints,

  • deployment sequencing,

  • orchestration review.

That is not domain complexity.

That is architecture tax.

More defects and harder recovery

When one coherent business action has been fragmented across:

  • services,

  • queues,

  • projections,

  • retries,

  • and compensations,

then failure handling becomes vastly more expensive.

The question is no longer:

“Did the business rule execute correctly?”

It becomes:

“Which part of the distributed choreography failed, and what state is the system now in?”

That is a much more expensive problem to solve.

Permanent cognitive overhead

This is one of the biggest hidden costs in software.

A misaligned architecture forces every engineer to carry extra mental load just to understand the system.

Instead of reasoning directly about the business, they must first reason about:

  • the framework,

  • the orchestration model,

  • the service topology,

  • the event timing,

  • the deployment shape,

  • the technical conventions.

That means every change is more mentally expensive than it should be.

And because salaries are the dominant cost in software, cognitive inefficiency is financial inefficiency.

The architecture becomes a second problem

At some point, the software is no longer difficult because the business is difficult.

It is difficult because the architecture has become a second problem layered on top of the first.

The system is now solving:

  1. the business domain, and

  2. the consequences of its own design choices.

That is pure waste.

And because most teams never built the tractor version, they often do not even realize how much of their effort is going into supporting the machine rather than solving the problem.

That is the uniqueness trap again.


The most expensive architecture is not the one that fails immediately

It is the one that:

  • works just enough,

  • survives just long enough,

  • and obscures its own cost just well enough

that nobody ever questions whether the machine was appropriate in the first place.

That is what makes framework-first architecture so dangerous.

It often does not fail loudly.

It succeeds expensively.

And that is much worse.

Because visible failure can trigger redesign.

But expensive success gets institutionalized.

It becomes:

  • “our platform,”

  • “our standard architecture,”

  • “our scalable foundation,”

  • “our engineering maturity.”

When in reality, it may just be a Ferrari that the organization has spent five years trying to teach to plow a field.


The first responsibility of software architecture is not scalability

It is not flexibility.

It is not “future-proofing.”

It is not pattern compliance.

It is not cloud nativeness.

It is not distributed elegance.

It is this:

to make the essential complexity of the business explicit, cohesive, and understandable.

That is the job.

Everything else comes later.

And if the software cannot explain the business clearly through its model, then it is not well architected — no matter how many services, handlers, events, buses, frameworks, or diagrams surround it.

Because at that point, the architecture is no longer serving the business.

The business is serving the architecture.

And that is why so much modern software is too expensive and too brittle.


A much better default

A better architectural instinct is this:

Do not ask what architecture you can build.

Ask what architecture the domain actually justifies.

And if the answer is:

  • smaller,

  • more cohesive,

  • more local,

  • less distributed,

  • less framework-driven,

  • and more explicit in its model

than current fashion prefers, that is not a sign of immaturity.

It is often a sign that the problem is finally being understood.

The next time a team is asked to “choose an architecture,” the first question should not be:

  • Which framework?

  • Which pattern?

  • Which cloud primitive?

  • Which service template?

It should be:

What is the business, and what is the cheapest, most coherent way to represent it truthfully?

Because software does not become expensive and brittle by accident.

It becomes expensive and brittle when teams choose machinery before they understand the work.

And from that point on, they do not just have a domain to solve.

They also have an architecture to survive.

That is not engineering maturity.

That is paying interest on a design mistake.
