Eberhard Wolff recently published "Cohesion, Modules, and Hierarchies" (Internet Archive), a post that captures how software architects think about cohesion, coupling, and software design. It also captures why our field struggles with these concepts: we lack rigorous definitions, which leads to contradictory advice and confused reasoning.
I want to be clear: this isn't a criticism of Wolff personally. He's articulating the mainstream view accurately—the problem is that the mainstream view has been broken for decades. Wolff is a skilled architect and communicator; he's just working with the same incomplete conceptual toolkit the rest of us inherited.
I've formalized what I call the Independent Variation Principle (IVP), and Wolff's post is a good case study in what has been going wrong in our industry for more than five decades: we operate on intuition instead of theory.
IVP (Structural Formulation): Separate elements with different change driver assignments into distinct units; unify elements with the same change driver assignment within a single unit.
Here's my analysis of Wolff's post, claim by claim.
Going Through Wolff's Claims
"The cohesion of a module is good when the individual parts of the module somehow belong together."
This is the core problem. "Somehow belong together" is not a definition—it's a hand-wave. What does "belong together" mean? By what criterion? Without a precise answer, cohesion becomes whatever the architect intuits it to be.
IVP's answer: Elements belong together when they share the same change driver assignments. A change driver is an independently varying source of change to domain knowledge (business rules, infrastructure requirements, UI frameworks, etc.). Typically, a change driver is a team or a person who has the authority to decide that a change must be made to a part of a software system: it's a matter of "decisional authority". Two elements belong in the same module if and only if they share the same set of change drivers.
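As a minimal sketch of this criterion (element and driver names are illustrative, not taken from Wolff's post), the "belong together" test becomes a simple predicate over change driver assignments:

```python
# Hypothetical change drivers, named for illustration only.
GAMMA_BUSINESS = "business_rules"
GAMMA_UI = "ui_framework"

# Gamma(e): the set of change drivers that can cause each element to change.
assignments = {
    "PricingPolicy": frozenset({GAMMA_BUSINESS}),
    "DiscountRules": frozenset({GAMMA_BUSINESS}),
    "OrderFormView": frozenset({GAMMA_UI}),
}

def belong_together(e1, e2):
    """Two elements belong in the same module iff they share the
    same set of change drivers."""
    return assignments[e1] == assignments[e2]

print(belong_together("PricingPolicy", "DiscountRules"))  # True
print(belong_together("PricingPolicy", "OrderFormView"))  # False
```

Note that the criterion never inspects the code itself: it only compares variation sources.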
Wolff's claim that separating related functionalities forces "constant communication to maintain consistency"
This observation correctly identifies a symptom but misdiagnoses the cause. The problem isn't that "related" things were separated—it's that elements with the same change driver assignments were separated. When you split elements that vary for the same reason, you create artificial coupling because changes must now propagate across module boundaries.
But Wolff treats this as evidence that cohesion and coupling are symmetric concerns. They're not.
The Laptop Example: ordering, reserving, and delivering sharing state
Wolff argues these operations "could share a data model in which the state of a product (ordered, reserved, delivered) is stored" and therefore belong together.
This reasoning is backwards. The question isn't whether they could share a data model—many things could share data. The question is: do they share the same change driver assignment?
If orders, reservations, and deliveries are governed by the same business rules and change together when those rules change, they share a change driver and belong together. If they're governed by different policies that evolve independently (order policies vs. delivery logistics vs. inventory management), they have different change drivers and should be separated—even if they reference the same laptop entity.
Sharing data is not the criterion. Sharing variation sources is.
"When functionalities that share data models are separated across modules, they must constantly communicate to maintain consistency, undermining both cohesion and loose coupling."
This conflates data sharing with variation dependence. Two modules can share data structures without sharing change drivers—this is precisely what stable abstractions enable. The problem Wolff describes occurs specifically when elements with shared change drivers are separated, not when elements sharing data are separated.
Treating cohesion and coupling as "interdependent concepts" on equal footing
This might be his biggest mistake. Wolff presents cohesion and coupling as two symmetric properties that must be balanced. This framing appears throughout software architecture literature, and it's wrong.
IVP establishes cohesion primacy: Proper cohesion (aligning module boundaries with change driver assignments) causes minimal coupling as an emergent property. When you correctly group elements that share variation sources, coupling between modules is minimized to what's causally required by the domain.
Coupling is not an independent variable to optimize. It's a symptom of cohesion quality. Accidental coupling (coupling beyond what the domain causally requires) shows up as reduced cohesion—specifically, it introduces foreign change drivers into modules, reducing their purity.
Software design is not a tension between cohesion and coupling, and it is not a balancing act: IVP shows that software design is a matter of realizing optimal cohesion.
Focus on maximizing cohesion. Low coupling follows.
"Code duplication violates cohesion but can create loose coupling"
Wolff writes: "When code is duplicated, the two copies can be evolved independently. For example, if a specific software solution for one customer is to be turned into an industry solution, one can copy the specific solution for each customer and adapt it. This is very easy to implement and enables completely separate development, making customer-specific changes easy."
He then notes the apparent trade-off: "On the other hand, copying code also means that a separate, specific solution emerges for each customer, making it nearly impossible to implement features for all customers."
This framing reveals the confusion. Wolff sees a tension: duplication enables loose coupling but "violates cohesion." Under IVP, there's no tension—the analysis depends entirely on change driver assignments.
Credit where it's due: Wolff's intuition here is correct. Duplicating code for different customers does create loose coupling. He sees the right outcome. The problem is that his definition of cohesion can't explain why it's right. Under IVP, the explanation is clear: duplication creates loose coupling because it achieves proper cohesion—when customer-specific code has customer-specific change drivers, separating them is exactly what IVP prescribes. Cohesion primacy at work. Wolff's practical instinct is sound; what's missing is the theoretical foundation that would let him generalize from this example to a consistent principle.
Case 1: Different change drivers. If each customer's requirements evolve independently (Customer A's business rules change without affecting Customer B), the duplicated code has different change drivers. The code looks identical today, but it varies for different reasons. Duplication is correct—it's not a cohesion violation. Each copy belongs with its customer's change driver. Wolff's observation that "changes are thus completely decoupled" confirms this: independent variation sources should be separated.
Case 2: Shared set of change drivers. If there are features that must apply to all customers (a regulatory requirement, a core platform capability), those features represent a shared set of change drivers. Code implementing shared features should be unified, not duplicated. Wolff correctly notes that duplication "makes it nearly impossible to implement features for all customers"—but this isn't a trade-off against loose coupling. It's a cohesion failure: elements with the same change driver are scattered across customer codebases.
The resolution: identify which parts have customer-specific change drivers (separate them) and which parts have a shared set of change drivers (unify them). There's no inherent tension between cohesion and coupling—there's only the question of correctly identifying change driver assignments.
Wolff even hints at this: "some actually recommend implementing a generic core only after the third similar implementation." This advice exists precisely because early apparent duplication often represents independent variation sources. Premature unification would create false cohesion—grouping elements that will diverge.
The principle: Cohesion is about variation dependence, not textual similarity. Duplicated code with different change drivers is correctly separated. Duplicated code with the same change driver is a cohesion failure and represents increased, accidental coupling.
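Under these definitions, the duplication analysis reduces to partitioning elements by their change drivers. A hedged sketch (all element and driver names are hypothetical):

```python
# Hypothetical element inventory for a multi-customer codebase,
# tagged with change drivers.
elements = {
    "acme_discount_rules":   frozenset({"customer_acme"}),
    "globex_discount_rules": frozenset({"customer_globex"}),
    "tax_reporting":         frozenset({"regulation"}),  # shared driver
    "invoice_export":        frozenset({"regulation"}),  # shared driver
}

# Customer-specific drivers: keep separated (duplication is correct).
customer_specific = {
    e for e, drivers in elements.items()
    if any(d.startswith("customer_") for d in drivers)
}
# Shared driver: unify into a single module.
shared = {e for e, drivers in elements.items() if "regulation" in drivers}

print(sorted(customer_specific))  # ['acme_discount_rules', 'globex_discount_rules']
print(sorted(shared))             # ['invoice_export', 'tax_reporting']
```

The "textually identical" code for Acme and Globex lands in separate modules, while the regulatory features are unified regardless of which customers they touch.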
"Simply following DRY principles doesn't guarantee good cohesion"
Correct, but for the wrong reasons. DRY (Don't Repeat Yourself) addresses knowledge duplication—the idea that every piece of knowledge should have a single authoritative representation. DRY says nothing about where that representation should live or how modules should be organized.
Cohesion is about organizing elements by their variation sources. DRY is about avoiding inconsistent representations of the same knowledge. They're orthogonal concerns. You can have perfect DRY compliance with terrible cohesion (everything in one module) or perfect cohesion with DRY violations (when apparently similar code represents different change drivers).
"Modules form a hierarchy—from microservices down to packages, classes, and methods. Different organizational principles apply at each level."
Wolff claims coarse-grained modules (microservices) should follow "domain concerns" while fine-grained modules (packages, classes) should follow "technical separation."
This is arbitrary. Why would the organizing principle change based on granularity?
IVP's answer: The same principle—align boundaries with change drivers—applies at every level (and in any dimension, but that discussion is out of scope here). What changes is the granularity of change drivers being considered.
At the microservice level, you might have coarse-grained change drivers: γ_orders, γ_inventory, γ_shipping. At the class level within the orders service, you refine γ_orders into finer-grained drivers: γ_pricing, γ_validation, γ_persistence. The principle remains constant; the analysis granularity varies.
There's no magical switch from "domain concerns" to "technical concerns" at some arbitrary boundary. Technical concerns (UI frameworks, persistence mechanisms, infrastructure) simply have their own change drivers like any other concern. They should be separated from business logic because they vary independently—not because of some special rule about technical layers.
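The same analysis at two granularities can be sketched as follows (driver names are illustrative; the `orders/…` refinement notation is mine, not part of IVP's formalism):

```python
# Service level: one coarse-grained driver.
coarse = {"OrderService": frozenset({"orders"})}

# Class level inside the orders service: gamma_orders refined into
# independent sub-drivers.
refined = {
    "PriceCalculator": frozenset({"orders/pricing"}),
    "OrderValidator":  frozenset({"orders/validation"}),
    "OrderRepository": frozenset({"orders/persistence"}),
}

def modules_for(assignments):
    """The organizing principle at every level: group by assignment."""
    groups = {}
    for element, drivers in assignments.items():
        groups.setdefault(drivers, []).append(element)
    return groups

print(len(modules_for(coarse)))   # 1 module at service level
print(len(modules_for(refined)))  # 3 modules at class level
```

`modules_for` is the same function at both levels; only the driver granularity fed into it changes.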
"Grouping everything together ensures nothing is 'torn apart,' but creates incomprehensibly large modules."
This is presented as a paradox requiring judgment. Under IVP, there's no paradox.
A module should contain all elements sharing the same change driver assignments—no more, no less. If a module is "incomprehensibly large," either:
1. You're analyzing at too coarse a granularity (the change driver should be refined into finer-grained independent drivers), or
2. The module genuinely embodies a large, coherent body of domain knowledge that varies together.
Case (2) isn't a problem—it's reality. Some concerns are genuinely large and coherent. The "incomprehensibility" comes from trying to understand code without understanding the domain knowledge it embodies. And... domain knowledge (what is part of the domain, its policies, its ubiquitous language...) is defined by the decisional authority structure that governs it. Here we're back to decisional authority.
"Cohesion demands both grouping related components AND separating unrelated ones to maintain manageability."
Finally, something close to correct—but imprecisely stated.
IVP formalizes this as two complementary axioms:
- IVP-1 (Separation): Elements with different change driver assignments must be in different modules.
- IVP-2 (Unification): Elements with the same change driver assignments must be in the same module.
Together, these establish a biconditional: two elements share a module if and only if they share the same set of change drivers. This isn't a vague call for "grouping and separating"—it's a precise criterion for exactly what goes where.
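The biconditional makes module assignment mechanical: modules are exactly the equivalence classes of elements under "same Γ(e)". A minimal sketch with hypothetical names:

```python
from collections import defaultdict

# Illustrative change driver assignments.
assignments = {
    "PricingPolicy": frozenset({"business"}),
    "DiscountRules": frozenset({"business"}),
    "OrderForm":     frozenset({"ui"}),
    "OrderTable":    frozenset({"ui"}),
}

def partition_into_modules(assignments):
    """Modules are the equivalence classes of elements under
    'same change driver assignment' (IVP-1 and IVP-2 together)."""
    modules = defaultdict(set)
    for element, drivers in assignments.items():
        modules[drivers].add(element)
    return dict(modules)

modules = partition_into_modules(assignments)
print(sorted(modules[frozenset({"business"})]))  # ['DiscountRules', 'PricingPolicy']
print(sorted(modules[frozenset({"ui"})]))        # ['OrderForm', 'OrderTable']
```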
The IVP Framework in Brief
Here's the rigorous foundation Wolff's post lacks.
Core Definitions
A change driver (γ) represents an independently varying source of change to domain knowledge. Change drivers are independent when changes to one don't necessitate changes to another.
Examples: business rules (γ_business), UI framework (γ_UI), persistence mechanism (γ_persistence), external API contracts (γ_API).
An element is an atomic unit of code at whatever granularity you're analyzing (class, function, module).
The change driver assignment Γ(e) maps each element to the set of change drivers that could cause it to change.
Some elements are pure—they have a single change driver (|Γ(e)| = 1). These are your core domain logic, your infrastructure implementations, your UI components. Other elements legitimately have multiple change drivers (|Γ(e)| > 1). These are typically adapters, integration points, or boundaries between domains. An adapter between your business logic and your persistence layer has Γ(adapter) = {γ_business, γ_persistence}—it must change when either domain changes. This is not a design flaw; it represents the legitimate coupling between pure domains. IVP doesn't eliminate such elements; it ensures they're grouped with other elements sharing the same multi-driver assignment, and it makes explicit where the necessary, legitimate coupling lives.
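The pure/multi-driver distinction is a property of the assignment's cardinality, as in this sketch (element names are illustrative):

```python
# A pure element has one driver; an adapter legitimately has several.
assignments = {
    "PricingPolicy":  frozenset({"business"}),
    "SqlStore":       frozenset({"persistence"}),
    "OrderRowMapper": frozenset({"business", "persistence"}),  # adapter
}

def is_pure(element):
    """|Gamma(e)| = 1: the element varies for exactly one reason."""
    return len(assignments[element]) == 1

print(is_pure("PricingPolicy"))   # True
print(is_pure("OrderRowMapper"))  # False: changes when either domain changes
```

Under IVP, `OrderRowMapper` isn't a defect to refactor away; it's where the legitimate coupling between the business and persistence concerns is made explicit, and it belongs with any other elements sharing that same two-driver assignment.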
The Structural Formulation
IVP: Separate elements with different change driver assignments into distinct units; unify elements with the same change driver assignment within a single unit.
Formally:
- IVP-1: ∀e_i, e_j ∈ E: Γ(e_i) ≠ Γ(e_j) ⟹ they're in different modules
- IVP-2: ∀e_i, e_j ∈ E: Γ(e_i) = Γ(e_j) ⟹ they're in the same module
Together: elements share a module ⟺ they share the same set of change drivers.
Causal Cohesion (Properly Defined)
Cohesion has two measurable dimensions:
- Purity: purity(M) = 1/|{Γ(e) : e ∈ M}| ∈ (0, 1], where we count the number of distinct change driver assignments across all elements in the module. Equals 1 when all elements share the same change driver assignment. Note: an adapter module where all elements have Γ(e) = {γ_A, γ_B} has purity = 1—the elements are homogeneous even though each element has multiple drivers.
- Completeness: completeness(M) = min over σ ∈ {Γ(e) : e ∈ M} of |{e ∈ M : Γ(e) = σ}|/|ε(σ)| ∈ [0, 1], where ε(σ) is the set of all elements in the system with assignment σ. For each distinct assignment present in M, we measure what fraction of all elements with that assignment are in M; the minimum gives the worst case. Equals 1 when, for every assignment present in M, all elements with that assignment are unified in M.
A module has maximal cohesion when both equal 1: cohesion(M) = (1, 1)—pure and complete. An adapter module with all elements sharing assignment {γ_A, γ_B} can achieve (1, 1) if it contains all such elements. Note that an impure module (purity < 1) can still have completeness = 1 if, for each distinct assignment it contains, all elements with that assignment are in the module—though this typically indicates the module should be split.
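Both metrics are straightforward to compute once Γ is known. A sketch over a toy system with hypothetical elements, mirroring the definitions above:

```python
from fractions import Fraction

# Toy system: every element's change driver assignment.
system = {
    "PricingPolicy": frozenset({"business"}),
    "DiscountRules": frozenset({"business"}),
    "OrderForm":     frozenset({"ui"}),
}

def purity(module):
    """1 / number of distinct assignments present in the module."""
    return Fraction(1, len({system[e] for e in module}))

def completeness(module):
    """Worst case, over assignments present in the module, of the
    fraction of that assignment's system-wide elements unified here."""
    fractions = []
    for sigma in {system[e] for e in module}:
        everywhere = [e for e in system if system[e] == sigma]
        inside = [e for e in module if system[e] == sigma]
        fractions.append(Fraction(len(inside), len(everywhere)))
    return min(fractions)

pure_complete = {"PricingPolicy", "DiscountRules"}
print(purity(pure_complete), completeness(pure_complete))  # 1 1  (maximal cohesion)

mixed = {"PricingPolicy", "OrderForm"}  # two assignments in one module
print(purity(mixed))  # 1/2
```

The `mixed` module's degraded purity is exactly the signal the corollary below relies on: heterogeneous assignments inside a module are where accidental coupling shows up.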
Cohesion Primacy
Theorem: In an IVP-compliant system, maximal cohesion implies minimal coupling.
Proof sketch: Accidental coupling (dependencies beyond what the domain requires) introduces foreign change drivers into modules—an element that depends on another inherits that element's variation sources. This reduces structural purity (by introducing elements with different assignments into the same module), showing up as degraded cohesion. Accidental coupling is therefore detectable as a cohesion problem.
Conversely, when all modules have purity = 1 (all elements share the same assignment) and completeness = 1 (all elements with that assignment are unified), inter-module dependencies can only exist between modules with different assignments—which represents the legitimate, domain-required coupling between different concerns. No accidental coupling remains.
Corollary: You don't need to measure coupling and cohesion as separate metrics. Measure cohesion. If it's maximal, coupling is minimal. If cohesion is degraded, find the heterogeneous assignments—they reveal the accidental coupling.
This is why coupling and cohesion aren't symmetric: coupling problems are a consequence of cohesion problems. Cohesion has primacy.
What This Means in Practice
What does this mean for architects?
Stop treating cohesion intuitively. "Belongs together" is not a criterion. Ask: "What are the change drivers? Do these elements share them?"
Stop balancing cohesion against coupling. There's no trade-off. Maximize cohesion (by IVP's definition); coupling minimizes automatically.
Code duplication isn't automatically bad. If duplicated code has different change drivers (varies for different reasons), it's correctly separated. Unifying it would reduce cohesion.
The organizing principle doesn't change by granularity. Apply IVP at every level. Change the granularity of your change driver analysis, not your organizing principle.
Technical vs. domain is a false dichotomy. Technical concerns (persistence, UI, infrastructure) are simply change drivers. Separate them from business logic because they're independent variation sources, not because of some layer rule.
Wrapping Up
Wolff's post is representative of mainstream architectural thinking—and that's been the problem for more than five decades. This isn't Wolff's fault; he's faithfully representing what our field has taught for generations. The issue is that what we've been teaching is built on sand. We operate on intuition, vague definitions, and folk wisdom. We treat cohesion and coupling as mysterious forces to be balanced. We offer contradictory advice because we lack the theoretical foundation to resolve contradictions.
IVP provides that foundation. The core insights aren't new—Parnas articulated them in 1972 in "On the Criteria To Be Used in Decomposing Systems into Modules": modularization should be based on design decisions likely to change (information hiding), and module boundaries should align with work assignments. But Parnas left these as separate, loosely articulated concepts. He probably didn't recognize that "likely to change" and "work assignments" are two facets of the same underlying idea—what IVP now calls change drivers. For more than five decades, nobody connected these dots or formalized them into a rigorous, operational framework. IVP does exactly that, offering what might be the first unifying foundational theory of software architecture.
IVP defines cohesion precisely, establishes its primacy over coupling, and gives architects a clear criterion for modularization decisions. When you understand that cohesion is about aligning module boundaries with change driver assignments, the confusion dissolves.
The question isn't whether elements "somehow belong together." The question is: Do they share the same set of change drivers?
The full exposition of IVP is available in The Independent Variation Principle (IVP). Formal mathematical foundations, including proofs of its relationship to SOLID principles, are forthcoming.