We've all heard that "high cohesion" is good. But what does cohesion actually mean?
Most definitions focus on the mechanics: "elements that belong together should be grouped together." But this raises the question: how do we know what "belongs together"?
The answer lies in understanding cohesion as a knowledge organization principle.
The Knowledge Perspective
Here's a more fundamental way to think about cohesion:
Cohesion is about grouping knowledge that changes for the same reason
Knowledge that varies together should be grouped together. Knowledge that varies independently should be separated.
When you have high cohesion, all the knowledge in a unit changes for one reason—driven by one change driver. When you have low cohesion, the knowledge in a unit changes for multiple independent reasons—driven by different, unrelated change drivers.
Let me show you what this means in practice.
A Tale of Scattered Knowledge
Consider this typical e-commerce order system:
import java.util.ArrayList;
import java.util.List;

class Order {
    // Initialized up front so addItem() never hits a null list
    private final List<OrderItem> items = new ArrayList<>();
    private String customerId;
    private String status;
    private double totalAmount;

    public void addItem(Product product, int quantity) {
        items.add(new OrderItem(product, quantity));
    }

    public void calculateTotal() {
        totalAmount = 0;
        for (OrderItem item : items) {
            totalAmount += item.getPrice() * item.getQuantity();
        }
    }

    public void applyDiscount(String discountCode) {
        // Apply discount logic
    }

    public void generateInvoice() {
        // Generate PDF invoice
    }

    public void sendConfirmationEmail() {
        // Email logic
    }

    public void updateInventory() {
        // Inventory update logic
    }

    public void logToAnalytics() {
        // Analytics logging
    }
}
This looks reasonable at first. But let's analyze the knowledge required to understand this class.
What Knowledge Do You Need?
To understand the Order class, you need to know about:
- Order domain logic (items, totals, discounts)
- PDF generation (libraries, templates, formatting)
- Email systems (SMTP, templates, recipients)
- Inventory management (stock levels, reservations)
- Analytics systems (tracking, event formats)
That's five different areas of knowledge—and they change for completely different reasons:
- Order logic changes when business rules change
- Invoice generation changes when document requirements change
- Email changes when notification preferences change
- Inventory changes when warehouse systems change
- Analytics changes when tracking requirements change
A developer working on discount logic shouldn't need to understand PDF generation. A developer fixing email issues shouldn't need to understand inventory systems.
This is low cohesion. The knowledge is scattered across unrelated concerns.
The Principle of Independent Variation
The Principle of Independent Variation (PIV) states:
Separate elements governed by different change drivers into distinct units; unify elements governed by the same change driver within a single unit.
PIV provides a precise way to identify low cohesion:
When a module contains knowledge that varies for independent reasons, cohesion is low
What Is a Change Driver?
A change driver is a reason why code needs to change. The same concept appears in other design principles (Single Responsibility Principle, Common Closure Principle) under different names: "reason to change", "actor", "source of change".
This article treats these terms as interchangeable, since they all point to the same underlying concept.
Change drivers have these characteristics:
- Independent triggers: They activate for different reasons at different times
- Different stakeholders: Different people or teams decide when and how each driver triggers
- Distinct knowledge domains: Each driver requires understanding different areas of expertise
For example:
- Business rule changes are driven by product managers adjusting pricing, discounts, or validation rules
- Infrastructure changes are driven by platform engineers optimizing databases or APIs
- UI changes are driven by designers and frontend teams improving user experience
When multiple independent change drivers affect the same module, that module has low cohesion.
Let's identify the independent change drivers in our Order class:
1. Business Rules
- Discount calculations
- Order validation
- Tax calculations
- Pricing logic
Stakeholder: Product managers, business analysts
2. Document Generation
- PDF formatting
- Invoice templates
- Receipt layouts
Stakeholder: Design team, legal/compliance
3. Communication
- Email templates
- Notification timing
- Delivery channels (email, SMS, push)
Stakeholder: Marketing, customer service
4. Warehouse Operations
- Inventory tracking
- Stock reservations
- Fulfillment systems
Stakeholder: Logistics team
5. Data & Analytics
- Event tracking
- Metrics collection
- Reporting
Stakeholder: Data science team
These are five independent sources of change—different stakeholders, different reasons to change, different timelines.
When they're all mixed in one class, every change forces you to understand all five concerns, even if you're only modifying one.
High Cohesion: Grouping Knowledge by Change Driver
Now let's reorganize so that each module contains only knowledge driven by one change driver:
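import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Change Driver 1: Business Rules
// Contains knowledge that varies when business requirements change
class Order {
    private OrderId id;
    private CustomerId customerId;
    private final List<OrderItem> items = new ArrayList<>();
    private Money discount = Money.ZERO;
    private Money totalAmount = Money.ZERO;
    private OrderStatus status;

    public void addItem(Product product, Quantity quantity) {
        items.add(new OrderItem(product, quantity));
        recalculateTotal();
    }

    public void applyDiscount(DiscountCode code, DiscountCalculator calculator) {
        // Keep the discount so that a later addItem() does not silently discard it
        discount = calculator.calculate(this, code);
        recalculateTotal();
    }

    private void recalculateTotal() {
        Money subtotal = items.stream()
            .map(OrderItem::subtotal)
            .reduce(Money.ZERO, Money::add);
        totalAmount = subtotal.subtract(discount);
    }

    public OrderConfirmed confirm() {
        validateCanConfirm();
        status = OrderStatus.CONFIRMED;
        return new OrderConfirmed(this);
    }

    // Assumed minimal rule for illustration; real validation belongs to the business rules
    private void validateCanConfirm() {
        if (items.isEmpty()) {
            throw new IllegalStateException("Cannot confirm an order without items");
        }
    }
}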
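// Change Driver 2: Document Generation
// Contains knowledge that varies when document requirements change
class InvoiceGenerator {
    private final PdfEngine pdfEngine;
    private final InvoiceTemplate template;

    // Final fields must be assigned, so the collaborators are injected here
    public InvoiceGenerator(PdfEngine pdfEngine, InvoiceTemplate template) {
        this.pdfEngine = pdfEngine;
        this.template = template;
    }

    public Invoice generate(Order order) {
        InvoiceData data = InvoiceData.from(order);
        return pdfEngine.render(template, data);
    }
}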
// Change Driver 3: Communication
// Contains knowledge that varies when notification requirements change
class OrderNotifier {
    private final EmailService emailService;
    private final NotificationTemplates templates;
    private final CustomerRepository customerRepository;

    public OrderNotifier(EmailService emailService,
                         NotificationTemplates templates,
                         CustomerRepository customerRepository) {
        this.emailService = emailService;
        this.templates = templates;
        this.customerRepository = customerRepository;
    }

    public void notifyOrderConfirmed(OrderConfirmed event) {
        Customer customer = customerRepository.find(event.customerId());
        Email email = templates.orderConfirmation(event, customer);
        emailService.send(email);
    }
}
// Change Driver 4: Warehouse Operations
// Contains knowledge that varies when warehouse systems change
class InventoryManager {
    private final InventoryRepository inventory;

    public InventoryManager(InventoryRepository inventory) {
        this.inventory = inventory;
    }

    public void reserveStock(OrderConfirmed event) {
        for (OrderItem item : event.items()) {
            inventory.reserve(item.productId(), item.quantity());
        }
    }
}
// Change Driver 5: Data & Analytics
// Contains knowledge that varies when analytics requirements change
class OrderAnalytics {
    private final AnalyticsClient analytics;

    public OrderAnalytics(AnalyticsClient analytics) {
        this.analytics = analytics;
    }

    public void trackOrderPlaced(OrderConfirmed event) {
        analytics.track("order_placed", Map.of(
            "order_id", event.orderId(),
            "total_amount", event.totalAmount(),
            "item_count", event.itemCount()
        ));
    }
}
Now each class is driven by exactly one change driver:
- Order contains knowledge driven by business rule changes
- InvoiceGenerator contains knowledge driven by document requirement changes
- OrderNotifier contains knowledge driven by communication requirement changes
- InventoryManager contains knowledge driven by warehouse system changes
- OrderAnalytics contains knowledge driven by analytics requirement changes
This is high cohesion: one module, one change driver, one cohesive area of knowledge.
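The classes above react to an OrderConfirmed event, but the article does not show how that event reaches them. Here is one minimal way it could be wired, as a sketch rather than the author's implementation. The OrderEvents dispatcher, the trimmed-down OrderConfirmed record, and the lambda subscribers are all hypothetical stand-ins. Records need Java 16 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, trimmed-down event; the article's OrderConfirmed carries more data.
record OrderConfirmed(String orderId, int itemCount) {}

// Minimal in-process dispatcher, standing in for whatever event mechanism a real system uses.
class OrderEvents {
    private final List<Consumer<OrderConfirmed>> subscribers = new ArrayList<>();

    void subscribe(Consumer<OrderConfirmed> subscriber) {
        subscribers.add(subscriber);
    }

    // Delivers the event to every subscriber in subscription order.
    void publish(OrderConfirmed event) {
        for (Consumer<OrderConfirmed> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}

class WiringDemo {
    // Returns a log of what each subscriber did, in subscription order.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        OrderEvents events = new OrderEvents();

        // Each lambda stands in for one module from the article.
        events.subscribe(e -> log.add("notify customer about " + e.orderId()));
        events.subscribe(e -> log.add("reserve stock for " + e.itemCount() + " items"));
        events.subscribe(e -> log.add("track order_placed " + e.orderId()));

        events.publish(new OrderConfirmed("order-42", 3));
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The publisher knows only the event type, so adding or removing a subscriber leaves the order logic unchanged.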
The Benefits of High Cohesion
Reduced Cognitive Load
When you need to understand discount logic, you only need to understand the Order class and the DiscountCalculator it delegates to. You don't need to know anything about PDFs, emails, or analytics.
When you need to change email templates, you only need to understand OrderNotifier. You don't need to understand order pricing or inventory management.
Each unit requires understanding only one area of knowledge.
Localized Changes
Each change driver affects exactly one module:
- Business rules → Order
- Document requirements → InvoiceGenerator
- Communication needs → OrderNotifier
- Warehouse operations → InventoryManager
- Analytics tracking → OrderAnalytics
Changes are localized to the relevant knowledge area.
Independent Evolution
The PDF library can be upgraded without touching order logic.
The analytics system can be replaced without touching email code.
Order validation can be enhanced without touching inventory management.
Each knowledge area evolves independently.
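To make the PDF upgrade case concrete, here is a small sketch in which the rendering engine is swapped while the invoice code stays untouched. It assumes PdfEngine is an interface, which the article does not specify. The two engine classes, the string-based signatures, and the output format are hypothetical simplifications.

```java
// Hypothetical port; the article names PdfEngine but does not define it.
interface PdfEngine {
    String render(String template, String orderId);
}

// Stand-in for the current PDF library.
class LegacyPdfEngine implements PdfEngine {
    @Override
    public String render(String template, String orderId) {
        return "legacy:" + template + ":" + orderId;
    }
}

// Stand-in for the upgraded PDF library.
class UpgradedPdfEngine implements PdfEngine {
    @Override
    public String render(String template, String orderId) {
        return "v2:" + template + ":" + orderId;
    }
}

// Simplified InvoiceGenerator: it knows the port, not a concrete engine.
class InvoiceGenerator {
    private final PdfEngine pdfEngine;
    private final String template;

    InvoiceGenerator(PdfEngine pdfEngine, String template) {
        this.pdfEngine = pdfEngine;
        this.template = template;
    }

    String generate(String orderId) {
        return pdfEngine.render(template, orderId);
    }
}

class EngineSwapDemo {
    public static void main(String[] args) {
        // Only the constructor argument changes when the engine is upgraded.
        System.out.println(new InvoiceGenerator(new LegacyPdfEngine(), "invoice").generate("order-42"));
        System.out.println(new InvoiceGenerator(new UpgradedPdfEngine(), "invoice").generate("order-42"));
    }
}
```

The upgrade touches only the composition point where the engine is chosen, which sits outside both the order logic and the invoice logic.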
The Relationship Between Cohesion, Knowledge, and Change Drivers
Here's the key insight:
High cohesion = Knowledge driven by one change driver is grouped together
Low cohesion = Knowledge driven by different change drivers is mixed together
Or more precisely:
Cohesion measures how well a module's knowledge aligns with a single change driver
This explains why high cohesion improves maintainability:
- Easier to understand: You only need to understand knowledge relevant to one change driver
- Easier to change: When a change driver triggers, you modify exactly one module
- Easier to test: Each module tests knowledge for one change driver in isolation
- Easier to parallelize work: Different teams own different change drivers and their corresponding modules
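The testing benefit can be illustrated with a hand-written test double. This sketch uses a simplified OrderNotifier that takes only an EmailService. The string-based signatures and the RecordingEmailService fake are assumptions made for brevity, not the article's API. It deliberately avoids a test framework, so the check is a plain comparison.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified port for sending email.
interface EmailService {
    void send(String recipient, String body);
}

// Simplified notifier: it knows about communication and nothing else.
class OrderNotifier {
    private final EmailService emailService;

    OrderNotifier(EmailService emailService) {
        this.emailService = emailService;
    }

    void notifyOrderConfirmed(String customerEmail, String orderId) {
        emailService.send(customerEmail, "Your order " + orderId + " is confirmed.");
    }
}

// Test double that records calls instead of talking to a mail server.
class RecordingEmailService implements EmailService {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String recipient, String body) {
        sent.add(recipient + " <- " + body);
    }
}

class OrderNotifierCheck {
    // Exercises the notifier with no PDF, inventory, or analytics code involved.
    static List<String> run() {
        RecordingEmailService fake = new RecordingEmailService();
        new OrderNotifier(fake).notifyOrderConfirmed("ada@example.com", "order-42");
        return fake.sent;
    }

    public static void main(String[] args) {
        List<String> expected = List.of("ada@example.com <- Your order order-42 is confirmed.");
        if (!run().equals(expected)) {
            throw new AssertionError("unexpected emails: " + run());
        }
        System.out.println("notifier check passed");
    }
}
```

The test needs exactly one fake because the module has exactly one outward dependency tied to its change driver.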
How to Identify Low Cohesion
The key is identifying change drivers. Ask these questions about your classes/modules:
1. What are the change drivers?
What are the different reasons this unit might change?
- Business rule changes?
- Technology/infrastructure changes?
- Performance optimizations?
- UI/UX changes?
- Compliance/regulatory changes?
If there are multiple independent reasons to change, you have multiple change drivers → low cohesion.
2. Who owns each change driver?
For each potential change, who decides when and how it happens?
- Product managers (business rules)?
- Platform team (infrastructure)?
- Frontend team (UI/UX)?
- DevOps (performance)?
- Data team (analytics)?
Multiple stakeholders = multiple change drivers = low cohesion.
3. What knowledge does each change driver require?
For each change driver, what do you need to understand to make that change?
- When business rules change, do you need to understand PDF generation?
- When document templates change, do you need to understand order validation?
- When analytics requirements change, do you need to understand email delivery?
If one change driver forces you to understand knowledge from other change drivers, they're inappropriately coupled → low cohesion.
4. Do change drivers vary independently?
- Can business rules change without affecting document generation?
- Can analytics requirements change without affecting inventory management?
- Can notification preferences change without affecting order validation?
If these vary independently but are in the same module → low cohesion.
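These questions can also be asked of a project's change history. The sketch below assumes each past change has already been labeled by hand with the module it touched and the driver behind it. The ChangeRecord type, the labels, and the sample data are hypothetical, and the "more than one driver" threshold is only a starting heuristic. Records need Java 16 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical, hand-labeled entry from a project's change history.
record ChangeRecord(String module, String driver) {}

class CohesionAudit {
    // Maps each module to the distinct drivers that have forced it to change.
    static Map<String, Set<String>> driversPerModule(List<ChangeRecord> history) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (ChangeRecord change : history) {
            result.computeIfAbsent(change.module(), m -> new TreeSet<>()).add(change.driver());
        }
        return result;
    }

    // Flags modules touched by more than one driver as candidates for a closer look.
    static List<String> lowCohesionCandidates(List<ChangeRecord> history) {
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : driversPerModule(history).entrySet()) {
            if (entry.getValue().size() > 1) {
                flagged.add(entry.getKey());
            }
        }
        return flagged;
    }

    // Made-up sample data, loosely mirroring the original Order class.
    static List<ChangeRecord> sampleHistory() {
        return List.of(
            new ChangeRecord("Order", "business-rules"),
            new ChangeRecord("Order", "document-generation"),
            new ChangeRecord("InvoiceGenerator", "document-generation"),
            new ChangeRecord("Order", "analytics"),
            new ChangeRecord("InvoiceGenerator", "document-generation"),
            new ChangeRecord("OrderNotifier", "communication"));
    }

    public static void main(String[] args) {
        System.out.println(driversPerModule(sampleHistory()));
        System.out.println("Candidates: " + lowCohesionCandidates(sampleHistory()));
    }
}
```

A flagged module is a prompt for the stakeholder and independence questions above, not a verdict, since the labels themselves are a judgment call.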
The Decision Framework
When designing or refactoring code:
1. Identify change drivers: What are the independent reasons this code might change?
2. Map knowledge to change drivers: For each change driver, what knowledge varies when that driver triggers?
- Business rules → domain logic, validation, calculations
- Document requirements → templates, formatting, PDF generation
- Communication needs → email templates, notification channels
- Infrastructure changes → databases, APIs, caching
3. Check independence: Do these change drivers vary independently?
- Different stakeholders?
- Different timelines?
- Different triggering events?
- Can one change without affecting others?
4. If independent → Separate them
- Create one module per change driver
- Each module contains only knowledge for its change driver
- Use interfaces/events to coordinate between modules
- Keep boundaries explicit
5. If dependent → Keep them together
- They represent the same change driver
- They always change together
- The knowledge forms a cohesive unit
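The "keep them together" branch deserves an example too, since the refactoring earlier in the article only shows separation. The sketch below assumes discount and tax rules are decided by the same stakeholders and revised together, so they share one unit. The PricingPolicy class, its percentage-based rules, and the use of integer cents are illustrative assumptions.

```java
// Hypothetical pricing rules, kept in one unit on the assumption that the same
// stakeholders change discount and tax rules together.
class PricingPolicy {
    private final int discountPercent;
    private final int taxPercent;

    PricingPolicy(int discountPercent, int taxPercent) {
        this.discountPercent = discountPercent;
        this.taxPercent = taxPercent;
    }

    // Amounts are in integer cents to avoid floating-point rounding.
    long discountFor(long subtotalCents) {
        return subtotalCents * discountPercent / 100;
    }

    // Tax is charged on the discounted amount in this sketch.
    long taxFor(long discountedCents) {
        return discountedCents * taxPercent / 100;
    }

    long totalFor(long subtotalCents) {
        long discounted = subtotalCents - discountFor(subtotalCents);
        return discounted + taxFor(discounted);
    }
}

class PricingDemo {
    public static void main(String[] args) {
        PricingPolicy policy = new PricingPolicy(10, 20);
        // 10000 - 1000 discount = 9000; plus 1800 tax = 10800
        System.out.println(policy.totalFor(10_000));
    }
}
```

Splitting discount and tax into separate modules here would force every pricing change to touch two places, which is the opposite of what the framework aims for.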
Real-World Application
This knowledge-based view of cohesion helps explain many design principles:
Single Responsibility Principle
"A class should have only one reason to change" = "A class should be driven by exactly one change driver"
Separation of Concerns
"Separate different concerns" = "Separate knowledge driven by different change drivers"
Interface Segregation
"No client should depend on methods it doesn't use" = "Don't force clients to know about irrelevant knowledge"
Bounded Contexts (DDD)
"Separate contexts have separate models" = "Separate knowledge domains have separate representations"
The Takeaway
Stop thinking of cohesion as a vague "things that belong together" principle.
Start thinking of cohesion in terms of change drivers and knowledge:
Group knowledge driven by the same change driver. Separate knowledge driven by different change drivers.
When you do this:
- Understanding becomes easier: each module contains knowledge for one change driver
- Changes become localized: one change driver triggers changes in exactly one module
- Testing becomes focused: each module tests knowledge for one change driver
- Parallel work becomes natural: different teams own different change drivers
The next time you're writing or reviewing code, ask yourself:
"How many independent change drivers affect this module?"
If the answer is more than one, you've found low cohesion.
Identify the change drivers. Separate the knowledge. One module per change driver.
Your future self (and your teammates) will thank you.
Learn More
This article applies insights from:
Loth, Y. (2025). The Principle of Independent Variation. Zenodo.
https://doi.org/10.5281/zenodo.17677316
The PIV paper provides a mathematical framework for understanding when concerns should be separated or unified, offering a principled approach to cohesion.
How do you identify low cohesion in your codebase? What strategies have worked for improving it? Share your experiences in the comments!