Techniques For Tackling Legacy Code

#legacycode #refactoring

Legacy code is usually the code nobody likes to touch. It’s often complex, violates all kinds of best practices, has little or no tests, etc.

Yet I find it very rewarding to work in legacy code. It takes a while, but when you start to tame that beast and understand its intricacies, it can actually be fun!

But how do you start changing a legacy project, putting in modern standards, and making it maintainable? This article will explain some techniques here, based on my experience. I won’t pose as the go-to expert, but I’ve seen some legacy projects in my career and almost always enjoyed working in them. Plus, I always seem to tackle them using the same techniques. There are some preconditions though, which I’ll get into later.

Dependency injection

One of the first things I recommend is to add dependency injection. This sounds like a big undertaking in a legacy application, but you don’t need to do this all at once. The first step would be to identify the entry point of the application, and initialize an empty container there.

Your application probably contains one or more classes that then create instances of their dependencies. In an ASP.NET app, these will be your controllers. In a WPF app, your main view or viewmodel. Take one of these dependencies, register it in your DI container, and use your container to instantiate the class that depends on it (which should also be registered in the container).

Take this fictional controller for example:

public class CustomerController : ApiController
{
    public CustomerController() {}

    public IHttpActionResult Get()
    {
        var logger = new Logger();
        logger.Info("Retrieving all customers");

        var repository = new CustomerRepository();
        var customers = repository.GetAll();
        return customers;
    }
}

The CustomerRepository class will probably access the database, making it hard to test. We can easily change this to:

public class CustomerController : ApiController
{
    private CustomerRepository _repository;

    public CustomerController(CustomerRepository repository) 
    { 
       _repository = repository;
    }

    public IHttpActionResult Get()
    { 
        var logger = new Logger(); 
        logger.Info("Retrieving all customers");

        var customers = _repository.GetAll();
        return customers;
    }
}

Now all other dependencies will still be created by their dependents, but at least you now have a road in to add more and more DI. It would seem nothing has really changed. But it’s a first step to pulling your application apart in smaller building blocks that you can manage and test more easily and that you can combine as you wish.

There are some cases that pose a bigger challenge. But these too can be handled.

Multiple instances

What if you have a class that has a dependency to another class, but it creates multiple instances of it? You can’t inject this into the constructor as you would then only have one instance. Have a look at this WPF code that creates multiple viewmodels and adds it to an ObservableCollection:

public void LoadCustomers()
{
    var customers = _repository.GetAll();
    foreach (var customer in customers)
    {
        var customerViewModel = new CustomerViewModel(customer);
        CustomerViewModels.Add(customerViewModel);
    }
}

A solution is to inject a factory. Then, instead of creating multiple instances of the dependency itself, the dependent class can just call the (injected) factory multiple times:

public void CustomerOverviewViewModel(CustomerViewModelFactory factory)
{
    _factory = factory;
}

public void LoadCustomers()
{
    var customers = _repository.GetAll();
    foreach (var customer in customers)
    {
        var customerViewModel = _factory.Create(customer);
        CustomerViewModels.Add(customerViewModel);
    }
}

If you’re using Autofac, you don’t need to write your own factory classes. Autofac has a built-in concept of Delegate Factories that provide this functionality.

Constructor Parameters

What if the dependency is constructed with some parameters? Well, if it’s another class that can be registered in the container, your DI framework should be able to handle it. But if it’s a local variable at the time of construction, the factory technique explained above can be used again. In fact, in our example above, I pass on a different customer to each new CustomerViewModel.

Other special cases

You might encounter other special cases that you need to solve in a way that seems like you’re adding more legacy code. If there is no other option, that’s ok. Turning a legacy app around into a more modern and lean animal is a long-term effort that sometimes requires you to make things worse before they get better. Discipline is key here, so that you remember to improve it later.

Componentize

Another important technique is to start untangling those big classes into smaller ones.

A first step can be to split up giant methods into smaller logical parts. Often, you can identify what should be methods by the comments above blocks of code:

public void UpdatePoints(Player player)
{   
    // Calculate general multiplier based on special items in inventory
    var generalMultiplier = 1;
    var specialItems = player.Inventory.Where(x => x.IsSpecialItem).ToList();
    foreach (var specialItem in specialItems)
    {
        // Now for some complex logic where certain items add to the multiplier,
        // but some items can also block the abilities of other items.
    }

    // Add points based on items
    var inventoryScore = 0;
    foreach (var item in player.Inventory)
    {
        // Add to skill points based on characteristics of items in inventory
    }

    // And so on...

    // Now set the definite points using all calculated values above
}

Instead of using comments, why not use methods with descriptive names? This is easier to read:

public void UpdatePoints(Player player)
{   
    var generalMultiplier = CalculateGeneralMultiplier(player);
    var inventoryScore = CalculateInventoryScore(player);
    var achievementsScore = CalculateAchievementsScore(player);
    // And so on...

    var result = (inventoryScore + achievementsScore) * generalMultiplier;
}

This will make the flow more clear for everyone. It will also make it more maintainable. The details of how something is done are hidden away in private methods. They’re available if you need them, but if you just need to understand what a method is doing, this method is sufficient.

Later on, you can extract these methods to separate classes, if the code belongs together (cohesion, remember). You’ll end up with lots of small building blocks that can more easily be mixed and matched, but also refactored, changed, and tested.

And these building blocks will fit nicely in our DI container!

Tests

While you’re pulling those large classes apart, take the opportunity to write tests for these smaller blocks of code.

The tests will be easier to write than before, and adding more tests wil give you greater confidence to continue along this path.

Writing an extensive test suite is important in a legacy codebase. One of the main characteristics of a legacy codebase is the absence of tests. And when there are tests, developers usually don’t trust them enough. Either because there aren’t many tests, or because they’re written poorly. The tests don’t give the developers enough confidence to refactor code or add features.

Adding more and more tests to a legacy application will force you to componentize and decouple big chunks of code, improving the quality and architecture of the code. This will in turn make it easier to write tests. It’s a virtuous cycle that, given enough time, will pull you out of the legacy standstill you were previously in.

Coding standards

Something that often happens to legacy projects is that multiple styles of programming are mixed together. This is true on an architectural basis (the lava-layer anti-pattern), but also on a pure coding style level. While most smart developers can look over this, they will also be frustrated by it. And in a legacy project, there are enough frustrations already.

If the application has a consistent style, I will not try to change it immediately. Even if it’s a style I don’t like. A legacy project has other problems that need tackling, so if there’s a consistent style, there’s at least that.

But if it’s a mix of styles, it’s a good idea to agree on what route should be taken. Enforce a coding standard that everyone can agree on. Don’t get into too much discussion. It’s more important to actually have a standard, than what’s in the standard. Tabs or spaces? One isn’t better than the other. Just choose one and stick to it.

Your legacy application will probably break the standard in hundreds, if not thousands of places. Don’t worry about that. Just apply the boyscout rule and fix the violations one by one.

Running It Locally

When you start developing a new application, you can usually run it locally. Many applications lose that functionality along the way. Yet the importance cannot be overstated. Being able to run the application locally allows you to experiment and develop without bothering anyone else.

Running locally means running locally. This may seem silly, but it also means running the database locally. This is usually the issue with legacy applications.

It’s one of the first things I invest time in when working on a legacy application. Don’t underestimate the freedom you get when you can suddenly run the application in isolation. You can now mess about, try new things, run load tests, develop without an internet connection (commuting anyone?), etc.

Preconditions

Before you can even start with all of the above, there are some preconditions. These are generally organizational, i.e. management should be ok with you investing time (and their money) into it, mistakes should be allowed to happen, etc. But some are also technical, like having regular backups, good source control, etc. I’ll talk about those in a next post.