DEV Community

Yannick Loth

Error Handling Through the Lens of the Principle of Independent Variation

I've been thinking about error handling lately—not the syntax debates or "which language does it best" arguments, but something more fundamental. What makes one approach architecturally better than another?

There's one principle that gives you an objective basis for these decisions: the Principle of Independent Variation (PIV). It states: things that vary independently should be separated; things that vary together should be grouped. Deceptively simple, but it tells you which design is technically superior—though context may still force a trade-off.

Let me show you what I mean by applying it to four error handling strategies.

The Four Approaches

// Return codes (C-style)
int read_file(const char* path, char** content) {
    if (!path) return -1;          // invalid argument
    if (!exists(path)) return -2;  // file not found
    /* ... read the file into *content ... */
    return 0;                      // success
}

// Checked exceptions (Java)
public String readFile(String path) throws IOException, SecurityException {
    // the compiler forces callers to catch or declare IOException;
    // SecurityException is unchecked, so listing it is documentation only
    return Files.readString(Path.of(path));  // java.nio.file, Java 11+
}

# Unchecked exceptions (Python, C#, Java RuntimeException)
def read_file(path):
    # might raise FileNotFoundError, PermissionError...
    # the signature doesn't tell you
    with open(path) as f:
        return f.read()

// Result monad (Rust, Haskell; available via libraries in Java, C#, Python)
fn read_file(path: &str) -> Result<String, FileError> {
    // the caller can't access the String without handling the Result
    // (assumes FileError implements From<std::io::Error>)
    std::fs::read_to_string(path).map_err(FileError::from)
}

What Are the Change Drivers?

PIV starts by asking: what are the independent reasons this code might change?

For error handling, I see three:

  • Error types evolve (new failure modes appear, old ones get removed)
  • Error handling logic changes (different recovery strategies)
  • Business logic changes (nothing to do with errors)

These are genuinely independent. Adding a new error type shouldn't require touching business logic. Changing how you recover from network failures shouldn't affect which errors exist.

Why Error Handling Strategies Must Vary Independently

This isn't abstract. Consider the same business operation—"fetch user profile"—and the different ways you might handle its failure depending on context:

Recovery strategies:

  • Retry with backoff — transient network failure, try again
  • Use cached/stale data — acceptable degradation
  • Return default value — the show must go on
  • Fail fast — no point continuing if this fails

Observability strategies:

  • Log and continue — non-critical, just record it
  • Alert on-call — this needs human attention now
  • Increment metric — track for SLO dashboards
  • Trace correlation — attach to distributed trace

Propagation strategies:

  • Absorb — don't let callers know
  • Transform — convert to domain-specific error
  • Propagate as-is — let it bubble up
  • Circuit break — stop calling the failing service

In microservices, this gets more interesting. The same downstream failure might require:

  • Service A: retry 3 times, then return cached data (user-facing, latency-sensitive)
  • Service B: fail immediately, dead-letter the message (async job, correctness matters)
  • Service C: log warning, return partial result (aggregator, best-effort)

The business logic—"fetch user profile"—is identical. The error handling varies based on operational context, SLAs, and the role of each service. These concerns change for different reasons and at different times. PIV says: they must be separable.
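To make that concrete, here's a minimal sketch of the three services in Rust. Everything in it is made up for illustration: `Profile`, `FetchError`, `fetch_user_profile`, `cached_profile`, and the `service_*` functions are hypothetical names, and the fetch stub always fails so each policy has something to react to.

```rust
// Hypothetical types and stubs for illustration; none come from a real library.
#[derive(Debug, Clone, PartialEq)]
struct Profile {
    name: String,
    stale: bool,
}

#[derive(Debug, Clone, PartialEq)]
enum FetchError {
    Timeout,
}

// The shared business operation. This stub always fails so the
// policies below have something to react to.
fn fetch_user_profile(_id: u32) -> Result<Profile, FetchError> {
    Err(FetchError::Timeout)
}

// Stand-in for a cache lookup.
fn cached_profile(_id: u32) -> Result<Profile, FetchError> {
    Ok(Profile { name: "cached".to_string(), stale: true })
}

// Service A: first attempt plus three retries, then fall back to cached data.
fn service_a(id: u32) -> Result<Profile, FetchError> {
    fetch_user_profile(id)
        .or_else(|_| fetch_user_profile(id))
        .or_else(|_| fetch_user_profile(id))
        .or_else(|_| fetch_user_profile(id))
        .or_else(|_| cached_profile(id))
}

// Service B: fail immediately; the caller dead-letters the message.
fn service_b(id: u32) -> Result<Profile, FetchError> {
    fetch_user_profile(id)
}

// Service C: best effort; log a warning and return a partial (absent) result.
fn service_c(id: u32) -> Option<Profile> {
    fetch_user_profile(id)
        .map_err(|e| eprintln!("warning: profile unavailable: {e:?}"))
        .ok()
}
```

Note that `fetch_user_profile` is written once and never edited. Each service's policy lives in its own function, so changing Service A's retry count has no way to affect Service B.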

So how do our four approaches fare?

| Change | Return Codes | Checked Exc. | Unchecked Exc. | Result |
|---|---|---|---|---|
| Add error type | Touch all callers | Update every throws clause up the stack | Nothing | Compiler warns on incomplete matches |
| Remove error type | Silent: constants sit unused | Compilation errors (good!) | Nothing | Compilation errors (good!) |
| Change handling strategy | Scattered across callers | Restructure try-catch blocks | Can centralize | Swap one combinator |
| Change business logic | Entangled with error checks | Buried in try blocks | Clean: just change it | Clean: just change it |

Let's examine each change driver in detail.

Change Driver 1: Error Types Evolve

When you add a new error type, what happens?

Checked exceptions create coupling—the throws clause is infectious:

// You add DatabaseException to this method...
void saveUser(User u) throws IOException, DatabaseException

// ...and now this method needs updating...
void processRegistration(Form f) throws IOException, DatabaseException

// ...and this one...
void handleRequest(Request r) throws IOException, DatabaseException

// ...all the way up the stack

I've seen teams respond by catching and wrapping at every layer—defeating the whole purpose. Or worse, declaring throws Exception everywhere.

Return codes fail silently. You add a new constant -3, but existing callers still only check for -1 and -2. The new error propagates as garbage data.

Unchecked exceptions hide the change entirely. Callers don't know a new exception exists until it crashes at runtime.

Result types propagate almost as lightly as unchecked exceptions. When you add a variant to an existing error enum, the ? operator passes it up without signature changes at intermediate layers. But unlike unchecked exceptions, the error is visible in the type. And the compiler flags every decision point (where you match), not the entire call chain. Best of both worlds: lightweight propagation, explicit handling.
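Here's a small sketch of that behavior, assuming a hypothetical `FileError` enum and stubbed layers (`read_file`, `load_config`, `start_app`, and `describe` are illustrative names, not anyone's real API). The `TooLarge` variant plays the role of the newly added error.

```rust
// Hypothetical error enum for illustration.
#[derive(Debug, PartialEq)]
enum FileError {
    NotFound,
    PermissionDenied,
    // Newly added variant: only the `match` in `describe` has to change.
    TooLarge { bytes: u64 },
}

// Lowest layer: where failures originate (stubbed here).
fn read_file(path: &str) -> Result<String, FileError> {
    match path {
        "missing.txt" => Err(FileError::NotFound),
        "secret.txt" => Err(FileError::PermissionDenied),
        "huge.bin" => Err(FileError::TooLarge { bytes: 5_000_000_000 }),
        _ => Ok(format!("contents of {path}")),
    }
}

// Intermediate layers: `?` forwards any FileError, so these signatures
// and bodies are unaffected by the new variant.
fn load_config(path: &str) -> Result<String, FileError> {
    let raw = read_file(path)?;
    Ok(raw.trim().to_string())
}

fn start_app(path: &str) -> Result<usize, FileError> {
    let config = load_config(path)?;
    Ok(config.len())
}

// Decision point: without the `TooLarge` arm, rustc rejects this match
// as non-exhaustive (error E0004).
fn describe(result: Result<usize, FileError>) -> String {
    match result {
        Ok(n) => format!("started with {n} bytes of config"),
        Err(FileError::NotFound) => "config missing, using defaults".to_string(),
        Err(FileError::PermissionDenied) => "cannot read config, aborting".to_string(),
        Err(FileError::TooLarge { bytes }) => format!("config too large ({bytes} bytes)"),
    }
}
```

Deleting the `TooLarge` arm from `describe` is a quick way to see the point: the compiler complains about that one function, while `load_config` and `start_app` compile unchanged.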

Change Driver 2: Handling Strategy Changes

This is where the differences are starkest. Result types give you combinators—small functions that transform Results without unwrapping them. This lets you plug in different strategies at composition time.

Same business logic, different recovery strategies:

// Strategy 1: Fail fast
let user = fetch_user(id)?;

// Strategy 2: Return cached data on failure
let user = fetch_user(id)
    .or_else(|_| get_cached_user(id))?;

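// Strategy 3: Retry with backoff
// (assumes `use std::{thread::sleep, time::Duration};`)
let user = fetch_user(id)
    .or_else(|_| { sleep(Duration::from_millis(100)); fetch_user(id) })
    .or_else(|_| { sleep(Duration::from_millis(500)); fetch_user(id) })?;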

// Strategy 4: Default value
let user = fetch_user(id)
    .unwrap_or_else(|_| User::anonymous());

Same business logic, different observability:

// Log and continue
let user = fetch_user(id)
    .map_err(|e| { log::warn!("fetch failed: {e}"); e })?;

// Increment metric
let user = fetch_user(id)
    .map_err(|e| { metrics::increment("user_fetch_error"); e })?;

// Alert on-call (for critical paths)
let user = fetch_user(id)
    .map_err(|e| { pagerduty::alert("user fetch failed"); e })?;

Same business logic, different propagation:

// Absorb error, return partial result
let profile = fetch_user(id).ok()
    .map(Profile::from)
    .unwrap_or_else(Profile::empty);

// Transform to domain error
let user = fetch_user(id)
    .map_err(DomainError::UserUnavailable)?;

// Circuit breaker pattern
let user = circuit_breaker.call(|| fetch_user(id))?;

The business logic—fetch_user(id)—never changes. The handling strategy is composed around it, not interleaved with it. You can swap strategies without touching the core operation.
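The `circuit_breaker.call(...)` line above leans on a helper I haven't shown. Here's one way it could look: a deliberately small sketch, not a production breaker. `CircuitBreaker`, `BreakerError`, and the stubbed `fetch_user` are hypothetical, and I've left out the half-open state and reset timer a real implementation would need.

```rust
// A deliberately small circuit breaker, written for this example only.
#[derive(Debug, PartialEq)]
enum BreakerError<E> {
    Open,     // the breaker rejected the call without running it
    Inner(E), // the wrapped operation ran and failed
}

struct CircuitBreaker {
    consecutive_failures: u32,
    threshold: u32,
}

impl CircuitBreaker {
    fn new(threshold: u32) -> Self {
        CircuitBreaker { consecutive_failures: 0, threshold }
    }

    // Run `op` unless the breaker is open. A success resets the failure
    // count; a failure increments it.
    fn call<T, E>(&mut self, op: impl FnOnce() -> Result<T, E>) -> Result<T, BreakerError<E>> {
        if self.consecutive_failures >= self.threshold {
            return Err(BreakerError::Open);
        }
        match op() {
            Ok(value) => {
                self.consecutive_failures = 0;
                Ok(value)
            }
            Err(e) => {
                self.consecutive_failures += 1;
                Err(BreakerError::Inner(e))
            }
        }
    }
}

// Stub business operation: fails for id 0, succeeds otherwise.
fn fetch_user(id: u32) -> Result<String, String> {
    if id == 0 {
        Err("unavailable".to_string())
    } else {
        Ok(format!("user-{id}"))
    }
}
```

With a threshold of 2, two consecutive failed calls open the breaker, and the next `breaker.call(|| fetch_user(id))` returns `BreakerError::Open` without invoking `fetch_user` at all. The breaker knows nothing about users; it wraps any closure that returns a `Result`.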

Now the same four strategies with exceptions:

// Strategy 1: Fail fast
User user = fetchUser(id);  // throws, caller deals with it

// Strategy 2: Return cached data on failure
User user;
try {
    user = fetchUser(id);
} catch (IOException e) {
    user = getCachedUser(id);
}

// Strategy 3: Retry with backoff
User user;
int retries = 0;
while (true) {
    try {
        user = fetchUser(id);
        break;
    } catch (IOException e) {
        if (++retries >= 3) throw e;
        Thread.sleep(100 * retries);
    }
}

// Strategy 4: Default value
User user;
try {
    user = fetchUser(id);
} catch (IOException e) {
    user = User.anonymous();
}

Each strategy is a different control flow structure. Retry needs a while loop. Fallback needs a try-catch. Default needs a try-catch with assignment. They don't compose—you rebuild from scratch.

Combining strategies makes it worse (retry then fallback to cache):

User user;
int retries = 0;
while (true) {
    try {
        user = fetchUser(id);
        break;
    } catch (IOException e) {
        retries++;
        if (retries >= 3) {
            try {
                user = getCachedUser(id);
                break;
            } catch (CacheException ce) {
                throw new UserFetchException("all strategies failed", e);
            }
        }
        Thread.sleep(100 * retries);
    }
}

Compare to Result: fetch_user(id).or_else(retry).or_else(retry).or_else(cache). The call to fetchUser is untouched in both, but with exceptions it's buried in 20 lines of nested control flow.
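For completeness, here's a sketch of what `retry` and `cache` in that chain might be. The `retry` helper, `User`, `UserError`, and both stubs are hypothetical names I'm introducing for illustration; the network stub always fails and the cache stub always hits, so the fallback path is the one exercised.

```rust
use std::thread;
use std::time::Duration;

// Hypothetical helper: run `op` up to `attempts` times (at least once),
// sleeping between failures with a linearly growing delay.
// Returns the last error if every attempt fails.
fn retry<T, E>(
    attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(e) if attempt >= attempts => return Err(e),
            Err(_) => {
                thread::sleep(base_delay * attempt);
                attempt += 1;
            }
        }
    }
}

#[derive(Debug, PartialEq)]
struct User {
    name: String,
}

#[derive(Debug, PartialEq)]
enum UserError {
    Network,
}

// Stubs: the network call always fails, the cache always hits.
fn fetch_user(_id: u32) -> Result<User, UserError> {
    Err(UserError::Network)
}

fn get_cached_user(id: u32) -> Result<User, UserError> {
    Ok(User { name: format!("cached-{id}") })
}

// Three attempts, then fall back to the cache. The retry policy and the
// fallback are separate links in one chain.
fn load_user(id: u32) -> Result<User, UserError> {
    retry(3, Duration::from_millis(10), || fetch_user(id))
        .or_else(|_| get_cached_user(id))
}
```

Changing the retry count is an edit to one argument, and dropping the cache fallback is deleting one line. Neither touches `fetch_user` or the loop inside `retry`.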

And with return codes (same strategy):

User* user = NULL;
int result, retries = 0;

while (retries < 3) {
    result = fetch_user(id, &user);
    if (result == 0) break;
    retries++;
    sleep_ms(100 * retries);
}

if (result != 0) {
    result = get_cached_user(id, &user);
}

if (result != 0) {
    log_error("all strategies failed");
    return ERR_USER_FETCH;
}

Same story: fetch_user is untouched, but buried in conditionals. Every strategy change means rewriting the if-checks around it. And if you forget one check, the error silently propagates as garbage data.

The difference isn't just syntax. Result strategies compose linearly—you chain them. Exception strategies nest hierarchically—try-catch inside try-catch. Return code strategies scatter conditionally—if-checks everywhere.

Note what doesn't change in any approach: the individual operations (fetchUser, getCachedUser) stay the same. The problem is the composition. With exceptions and return codes, how you combine operations is entangled with how you handle their failures. Want to add a third fallback? Restructure the whole block. Want to change retry count? Edit the same code that defines the operation sequence.

With Result, composition and strategy are separate concerns. fetch_user(id).or_else(f).or_else(g) reads as a pipeline where each combinator is independent. Swap .or_else(cache) for .unwrap_or(default) without touching the rest.

Change Driver 3: Business Logic Changes

What happens when you need to change the core operation itself—say, fetch_user now needs an extra parameter, or you're replacing it with fetch_user_v2?

Return codes force you to navigate through error-checking conditionals to find the actual call. The business logic is entangled with error handling:

// Where's the actual operation? Buried here:
while (retries < 3) {
    result = fetch_user(id, &user);  // <-- find this among the noise
    if (result == 0) break;
    retries++;
    sleep_ms(100 * retries);
}

Checked exceptions bury your logic in try blocks. Low cohesion—the business operation and its error handling are interleaved:

try {
    user = fetchUser(id);  // <-- the actual work
    break;
} catch (IOException e) {
    // 10 lines of recovery logic
}

Unchecked exceptions and Result types both keep business logic clean. The operation stands alone:

let user = fetch_user(id)?;  // Rust: clear what's happening

user = fetch_user(id)  # Python: equally clear

The difference: with Result, the return type forces someone in the chain to handle the error or explicitly pass it on. With unchecked exceptions, you're trusting that someone, somewhere, handles it.

The Scorecard

How well does each approach let you vary the three concerns independently?

| Change Driver | Return Codes | Checked | Unchecked | Result |
|---|---|---|---|---|
| Error types evolve | ❌ silent failures | ❌ infectious signatures | ⚠️ invisible | ✅ exhaustive matching |
| Handling strategy changes | ❌ rewrite conditionals | ❌ restructure try-catch | ⚠️ centralized but implicit | ✅ swap combinators |
| Business logic changes | ❌ entangled with checks | ⚠️ buried in try blocks | ✅ clean | ✅ clean |

Applying PIV

Result types win not because of syntax or fashion, but because they respect how change actually works. Success and failure are independent concerns—they come from different stakeholders, evolve on different timelines, respond to different pressures. PIV says: separate them.

The deeper insight is that PIV is grounded in business reality. Change drivers aren't technical abstractions—they're product requirements, compliance updates, operational incidents, scaling demands. The business doesn't care about your try-catch nesting. It cares that you can add circuit breakers before the next outage, or swap retry policies without a two-week refactor.

The only constant is change. Software that fights this reality accumulates friction. Software that embraces it—by separating what varies independently—stays malleable.

To apply PIV yourself:

  1. Identify the change drivers. What are the independent reasons this code might change? Not abstract categories—real forces. Who asks for these changes? How often? On what timeline?

  2. Trace the coupling. For each driver, what else has to change? If touching one concern ripples into another, you've coupled things that vary independently.

  3. Check cohesion. Is related code scattered? If what changes together lives in five different files, you've fragmented what should be grouped.

  4. Compare designs. The one that minimizes cross-concern coupling while keeping each concern cohesive is what PIV favors. That's not opinion—it's the design that will cost less to change.


Learn More

This article is based on research from:

Loth, Y. (2025). The Principle of Independent Variation. Zenodo.
https://doi.org/10.5281/zenodo.17677316

The PIV paper explores:

  • Formal derivation of SOLID principles from PIV
  • Mathematical framework for coupling and cohesion
  • The Knowledge Theorem (cohesion ↔ domain knowledge equivalence)
  • Extensions to functional programming, database design, and architectural patterns
  • Critical evaluation of coupling/cohesion metrics and their limitations

What error handling strategy did your team land on? Did coupling pain drive that decision?
