DEV Community

RyTheTurtle
RyTheTurtle

Posted on

Taming Complexity and Code Bloat with Lookup Tables

The lookup table pattern is an efficient, maintainable design pattern to reduce complexity in software. Lookup tables maintain a constant cyclomatic complexity and allow related concepts to be neatly organized in a way easy to read, write, and maintain. Lookup tables can be easily represented in most programming languages regardless of paradigms such as functional, object-oriented, and others. Lookup tables minimize the code maintenance burden compared to alternative approaches, specifically if-else and the popular object-oriented alternative to if-else: polymorphism. For these reasons, the lookup table pattern should be the first choice approach unless the complexity of the domain model makes polymorphism a more suitable approach.

Cyclomatic complexity is the measure of how many different branches or paths a particular piece of software can execute. Cyclomatic complexity is a language agnostic industry standard as one of the measures of maintainability of software. Most software deals with decision points where one of several paths must be taken based on some condition. Not surprisingly, most programming languages have a concept of conditional statements, also known as flow control. Usually, these come in the form of either the if-else statement or switch statements. For example, the following code has a cyclomatic complexity of 4 since there are 4 possible unique paths the code can executed.

if(car.make == "Honda"){
    console.log("The power of dreams")
} else if(car.make == "Toyota"){
    console.log("Let's go places")
} else if(car.make == "Mazda"){
    console.log("Zoom zoom")
} else {
    console.log("Just a car")
}
Enter fullscreen mode Exit fullscreen mode

These constructs are what I would consider fundamental to the concept of programming, but, as most things do, they have some drawbacks in practice

  • Each branch in an if-else block increases the cyclomatic complexity of it’s surrounding scope. A relatively tame method will quickly approach the limits of acceptable complexity for any non-trivial logic. This is especially true for nested if-else statements.

  • The number of unit tests required to maintain line and branch coverage grows proportionally with the number of conditions in the if-else block. This is directly caused by the cyclomatic complexity problem because new cases require new code paths and new code paths require new unit tests.

  • if-else blocks cause repetitive code. Software that makes many different decisions based on the same conditions will usually end up having the same if(someCondition()){} else {} boilerplate copied all over the place. If a required new scenario is added (which we will explore in a bit) such as a new type of user, every conditional block referencing the user type as the condition for the if-else statement must be identified and updated. Even with the help of modern IDEs, this is error prone and tedious.

In the object-oriented world, a common remedy prescribed for these problems is to use polymorphism. Polymorphism means that an object can take different forms depending on the situation. Refactoring if-else to classes using polymorphism typically involves creating a series of classes to represent the different paths of code

  1. an abstract base class, representing the scenario

  2. one or more implementations of the abstract base class, each representing one of the branches in an if-else statement. These concrete classes are where the logic is implemented that otherwise would have been the body of one of the branches of the if-else.

  3. a factory class (usually an abstract factory as well), that can take an input and return a concrete implementation of the abstract base class. Which implementation gets returned depends on logic that resides in the factory class.

For developers who have worked in heavily object-oriented environments such as Java, the concept of polymorphism is probably not new. Robert Martin’s book “Clean Code” prescribes polymorphism as the default answer to any scenario that would otherwise use if-else. A blog post that made it’s way around social media a few years back, If-Else Is a Poor Man’s Polymorphism, is less subtle about their support for polymorphism as an alternative to if-else.

We can easily stop here and call it a day. You refactored the nasty branching out and replaced it with polymorphism. Your code is now object-oriented and incredibly easy to maintain. Great job.

Polymorphism does reduce cyclomatic complexity, improve the readability of individual methods, shrink the size of each class, and provides a mechanism for some level of reusability of conditional logic. These benefits come at a price:

  • Polymorphism is expensive to write. It’s rarely taken seriously as a drawback to polymorphism in discussions of “clean code” and “maintainability”. Writing an abstract base class, implementation classes of the base class for each condition, and a factory to create the implementations based on some condition which still ends up using if-else anyways, and writing all of the associated unit test classes and unit tests, is a lot of code to write to ultimately achieve a fairly simple task.

  • Polymorphism is expensive to read. Code is generally harder to read than it is to write. Given how expensive it is to write polymorphic code, it should not be a surprise that reading and understanding polymorphic code is expensive as well. Each individual polymorphic class is simple to read, but the effort to identify and understand the broader picture is dramatically complicated even in simple examples.

  • Polymorphism is expensive to maintain. The number of code branches with if-else increases proportionally with the number of conditions. With polymorphism, the number of classes in the code increases proportionally with the number of conditions. If you work on a moderately sized domain model and lean heavily in to polymorphism as the approach to branching and decision logic, you’ll soon find yourself digging through a sea of classes and factories, drawing impressive looking class diagrams, and discussing substantial refactorings to accommodate minor changes to functional requirements.

Polymorphism has a time and place, but it should not be the default answer to the situation where code needs to execute one of several branches of logic based on some criteria.

The lookup table shines where if-else and polymorphism fall short. A lookup table is exactly what it sounds like: a table that you can use to look things up based on some criteria. When implemented in code, lookup tables are usually implemented as arrays, hashtables or hashmaps, or dictionaries. Conditions of an if-else are entries in the lookup table. The body of the if-else conditions are the values associated to the entries in the lookup table.

Lookup tables have several properties that make them a convenient alternative to if-else and polymorphism:

  • Lookup tables are inexpensive to write. Lookup tables require very little “noise” to implement. Most popular languages, including Javascript and Python, have concise syntax for initializing dictionary data structures. The logic for most lookup table implementations has a cyclomatic complexity of 2 (the key exists in the lookup table so use it’s value, or the key does not exist in the table) and that complexity remains constant regardless of the number of entries in the lookup table. The only unit tests required to achieve full branch and line coverage of a method that references a lookup table are tests to verify the behavior of if a key is found and if a key is not found. There is no need to write unit tests that test for every possible key in the lookup table.

  • Lookup tables are inexpensive to read. Lookup table definitions present a clear mapping between input keys and the associated values without the extraneous classes and boilerplate of a polymorphic approach. In contrasts to the polymorphic approach, all of the conditions and values for a lookup table are grouped together.

  • Lookup tables are inexpensive to maintain. Adding new conditions to existing lookup tables typically involves writing one or two lines of code. Regardless of the number of entries in the table, the only branches to test are “the value is present” or “the value is not present”, so adding extra unit tests for each new condition in the table is unnecessary.

Starting with a simple example, consider a scenario where we need to determine the price of a subscription for a streaming service.

public class Subscription { 
  public enum TIER {
    BASIC, PREMIUM, PLATINUM
  }

  private TIER tier;

  public double getMonthlyFee(){
    switch(this.tier){
      case BASIC:
        return 10.0; 
      case PREMIUM:
        return 14.0;
      case PLATINUM:
        return 20.0;
      default:
        throw new RuntimeException("No fees for tier {tier}");
  }
}

// in some other class.....
Subscription userSubscription = //...; 
double monthlyFee = userSubscription.getMonthlyFee();
Enter fullscreen mode Exit fullscreen mode

Pretty simple situation. Our switch statement causes us to have 4 different branches of execution in the getMonthlyFee() method but all in all, it’s pretty readable.

Here’s the same logic implemented in a polymorphic approach

// remember in java these each are in separate class files ....

public enum SubscriptionTier {
  BASIC,PREMIUM, PLATINUM
}

public abstract class Subscription {   
  abstract double getMonthlyFee();
}

public class BasicSubscription implements Subscription {
  @Override
  public double getMonthlyFee(){
    return 10.0;
  }
} 

public class PremiumSubscription implements Subscription {
  @Override
  public double getMonthlyFee(){
    return 14.0;
  }
} 

public class PlatinumSubscription implements Subscription {
  @Override
  public double getMonthlyFee(){
    return 20.0;
  }
} 

public interface ISubscriptionFactory {
  Subscription makeSubscription(SubscriptionTier t);
}

public class SubscriptionFactory implements ISubscriptionFactory { 

  @Override
  public Subscription makeSubscription(SubscriptionTier t){
    switch(this.tier){
      case BASIC:
        return new BasicSubscription();
      case PREMIUM:
        return new PremiumSubscription();
      case PLATINUM:
        return new PlatinumSubscription();
      default:
        throw new RuntimeException("No fees for tier {tier}");
    }
  }
}

// somewhere else in the code...
ISubscriptonFactory factory = new SubscriptionFactory();

Subscription userSubscription = factory.makeSubscription(user.getSubscriptionTier()); 

double monthlyFee = userSubscription.getMonthlyFee();
Enter fullscreen mode Exit fullscreen mode

Now lets see the same logic using a lookup table

public class Subscription { 
  public enum TIER {
    BASIC, PREMIUM, PLATINUM
  }

  private TIER tier;
  static final Map<TIER, Double> tierFees; 
  static {
    tierFees = new HashMap<>();
    tierFees.put(BASIC, 10.0);
    tierFees.put(PREMIUM, 14.0);
    tierFees.put(PLATINUM, 20.0);
  }

  public double getMonthlyFee(){
    return Optional.ofNullable(tierFees.get(tier))
      .orElseThrow(() -> new RuntimeException("Unknown tier"));
  }
}

// in some other class.....
Subscription userSubscription = //...; 
double monthlyFee = userSubscription.getMonthlyFee();
Enter fullscreen mode Exit fullscreen mode

For simple cases like these, the lookup table is great at achieving a high degree of readability, low cyclomatic complexity that doesn’t increase with additional cases or types, and does not bloat the code by spreading the configuration across numerous classes.

The example in If-Else is a Poor Man’s Polymorphism can also be written using lookup tables. Admittedly, this is uglier in languages that don’t have lambdas or first class functions, but I decided to write up the example scenario in Typescript to demonstrate that lookup tables are powerful beyond simple mappings between types and scalar values.

Recreating the example from If-Else is a Poor Man’s Polymorphism in Typescript, here’s the “complicated, headache inducing branching” logic for deciding what to do when a user is updated.

enum UserUpdatedReason {
    AddressChanged,
    UsernameChanged,
    EmailChanged, 
    EnabledTwoFactor
}

function UpdateAsync(reason: UserUpdatedReason , user: User) {
    if(reason == UserUpdatedReason.AddressChanged) {
      informMarketing();
      updateUser();
    } else if(reason == UserUpdatedReason.UsernameChanged){
      updateProfileSlug();
      notifyFollowers();
      updateUser();
    } else if(reason == UserUpdatedReason.EmailChanged){
      generateSecurityStamp();
      notifyMarketing();
      updateUser();
    } else if(reason == UserUpdatedReason.EnabledTwoFactor){
      generateSecurityStamp();
      updateUser(); 
    } else {
      throw new Error("Unknown reason");
    }
} 
Enter fullscreen mode Exit fullscreen mode

Pretty bad, we have a cyclomatic complexity of 5 and thats a lot of code to write and read. So what does the polymorphic approach look like?

enum UserUpdatedReason {
    AddressChanged,
    UsernameChanged,
    EmailChanged, 
    EnabledTwoFactor
}

interface IUserUpdatedReason{
    updateUser(u: User)
};

class AddressChanged implements IUserUpdatedReason { 
    updateUser(u: User) {
        informMarketing();
        updateUser();    
    }
}

class UsernameChanged implements IUserUpdatedReason {
    updateUser(u: User) {
        updateProfileSlug();
        notifyFollowers();
        updateUser();
    }
}

class EmailChanged implements IUserUpdatedReason { 
    updateUser(u: User) {
        generateSecurityStamp();
        notifyMarketing();
        updateUser();
    }
}

class EnabledTwoFactor implements IUserUpdatedReason { 
    updateUser(u: User) {
        generateSecurityStamp();
        updateUser(); 
    }
} 

function UpdateAsync(reason: IUserUpdatedReason , user: User) {
    reason.updateUser(user);
} 

Enter fullscreen mode Exit fullscreen mode

This is translated almost directly from the author’s C# example to Typescript. It conveniently leaves out the extra boilerplate for initializing each class, the factory class that would be needed to generate the IUserUpdatedReason implementations based on some criteria, and is just about complicated enough logic to where the polymorphism is a reasonable suggestion. So how would a lookup table handle this same scenario?

enum UserUpdatedReason {
    AddressChanged,
    UsernameChanged,
    EmailChanged, 
    EnabledTwoFactor
}

const USER_UPDATE_TASKS: Map<UserUpdatedReason, Array<(u:User)=>void>> = new Map ([
    [UserUpdatedReason.AddressChanged, 
        [informMarketing, updateUser]],
    [UserUpdatedReason.UsernameChanged, 
        [updateProfileSlug, notifyFollowers, updateUser]],
    [UserUpdatedReason.EmailChanged, 
        [generateSecurityStamp, notifyMarketing, updateUser]],
    [UserUpdatedReason.EnabledTwoFactor, 
        [generateSecurityStamp, updateUser]]
])

function UpdateAsync(reason: UserUpdatedReason , user: User) {
    if(USER_UPDATE_TASKS.has(reason)){
        USER_UPDATE_TASKS.get(reason)?.forEach(task => task(user))
    } else {
        throw Error("Unknown task")
    }
} 
Enter fullscreen mode Exit fullscreen mode

We’ve expressed the same complex logic as the polymorphism with literally half the code, and still maintained the ability to easily add new cases without increasing the size of our code with extra classes and boilerplate.

Both polymorphism and lookup tables can be used to improve the problems of if-else statements. Polymorphism is heavily object oriented and comes with the cost of dramatically increased amount of code to read, write, and maintain. Lookup tables are an efficient alternative to polymorphism that should be the first choice for developers when implementing or refactoring conditional logic. Polymorphism should be reserved for scenarios where the complexity of the cases is great enough to justify the additional cost.

Top comments (0)