Refactoring Legacy Code

#refactoring

Introduction

Refactoring code is almost always met with fear and anxiety. The truth is, we should be making small changes that improve our code each time we work on it, but we all know that rarely happens. Instead, code deteriorates and becomes technical debt that eventually needs a massive overhaul: the dreaded refactor or migration, a mountain of work no one wants to own. I've heard everything from "we don't know how it works" to "we don't want the outages" when the topic has come up. Refactoring is not always smooth sailing, but there are plenty of examples to learn from, and there is a simple three-step process to follow: discover, stabilize, and execute.

Discovery

This phase is where you plan the approach and organize the pieces of the migration. It surfaces how much work there actually is, so you can plan and make time for it. A key element to examine is test coverage: how much exists and whether it is sufficient. The goal of a refactor or migration is to move the code without changing the behavior the user expects. As Martin Fowler stated:

Refactoring is the process of changing a software system in a way that does not alter the external behavior of the code yet improves its internal structure. It is a disciplined way to clean up code that minimizes the chances of introducing bugs. In essence, when you refactor, you are improving the design of the code after it has been written.¹

Understanding that the goal is to change the software in a way that does not impact behavior means that you can rely on unit tests, integration tests, and end-to-end tests as a metric for success. Thus, evaluating test coverage and where it falls short is a critical step for refactoring.

Additionally, review any internal wikis for helpful information and context on the system and its objectives. Often, there are resources explaining potential pitfalls and hazards to watch out for. You should also review informal channels, such as chats, commit messages, and code comments. Commit history is a fascinating thing to examine because it reveals how a file evolved into its current state and which other files are related to it or are frequently updated alongside it. It also lets you determine the change frequency for that file.

In some cases, it becomes evident that this file is prone to errors and is tightly coupled with other files. I've actually used this method to locate files that required specific reviewers on pull requests before they were approved. They had high change frequencies, a high number of commit messages mentioning bugs or defects, and were tightly coupled with each other. The extra scrutiny reduced defects in those files by over 30%.
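One way to rank files for this kind of extra scrutiny is to count how often each file changes and how many of those changes come from commits that mention a bug. The sketch below is only an illustration: the Commit shape, the keyword list, and the sample history are all assumptions, and in practice the data would come from your own version control history.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass

# Hypothetical keywords that mark a commit as a bug fix; tune these for your team.
BUG_KEYWORDS = ("bug", "defect", "fix")


@dataclass
class Commit:
    message: str
    files: list[str]


def rank_hotspots(commits: list[Commit]) -> list[tuple[str, int, int]]:
    """Return (file, change_count, bug_commit_count), most changed first."""
    changes = Counter()
    bug_changes = Counter()
    for commit in commits:
        message = commit.message.lower()
        is_bug_fix = any(word in message for word in BUG_KEYWORDS)
        for path in set(commit.files):  # count each file once per commit
            changes[path] += 1
            if is_bug_fix:
                bug_changes[path] += 1
    return sorted(
        ((path, count, bug_changes[path]) for path, count in changes.items()),
        key=lambda row: (-row[1], -row[2], row[0]),
    )


# Made-up sample history, for illustration only.
history = [
    Commit("Fix rounding bug in totals", ["billing.py", "invoice.py"]),
    Commit("Add export button", ["export.py"]),
    Commit("Fix defect in tax lookup", ["billing.py"]),
    Commit("Rename helper", ["billing.py", "export.py"]),
]

hotspots = rank_hotspots(history)
for path, change_count, bug_count in hotspots:
    print(f"{path}: {change_count} changes, {bug_count} bug-related")
```

Files near the top of the list with a high bug-related count are reasonable candidates for required reviewers. Keyword matching on commit messages is crude, so treat the numbers as a starting point for a conversation rather than a verdict.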

Tornhill also describes a method for pinpointing tightly coupled modules in your program by looking at sets of files modified within the same commit. To depict this idea, let's say we have three files, superheroes.js, supervillains.js, and sidekicks.js. In a subset of our commits, we have the following changes: commit one modifies both superheroes.js and sidekicks.js; commit two modifies all three files; commit three again modifies superheroes.js and sidekicks.js; and commit four only touches superheroes.js. From this subset of our version history, depicted in Table 3-3, we notice that of four commits, three of them modified both superheroes.js and sidekicks.js. This insinuates that some kind of coupling between these two files exists.²
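The counting described in the quote above can be sketched in a few lines. This is a minimal illustration using the four commits from the quote as hard-coded sets; the function name is my own, and a real analysis would read the file sets from your commit history instead.

```python
from collections import Counter
from itertools import combinations

# The four commits from the quote, each as the set of files it modified.
commits = [
    {"superheroes.js", "sidekicks.js"},
    {"superheroes.js", "supervillains.js", "sidekicks.js"},
    {"superheroes.js", "sidekicks.js"},
    {"superheroes.js"},
]


def co_change_counts(commits):
    """Count how many commits modified each pair of files together."""
    pairs = Counter()
    for files in commits:
        # Sort so a pair is always recorded in the same order.
        for pair in combinations(sorted(files), 2):
            pairs[pair] += 1
    return pairs


pairs = co_change_counts(commits)
for (first, second), shared in pairs.most_common():
    print(f"{first} + {second}: {shared} of {len(commits)} commits")
```

The pair that changes together most often floats to the top, which is the signal that the two files may be coupled.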

Reviewing lines of code, complexity metrics, and bug reports can all play a big part in discovery as well. Additionally, AI and LLMs make analyzing large sections of code a much easier task. You can include the files you are looking at as context and ask the LLM to summarize what the code does, locate edge cases, evaluate test coverage, and suggest improvements.

By the end of discovery, you should have a document outlining all the essential information related to what you are refactoring. You should understand the code and have a good idea of potential pitfalls or landmines. The following quote from Tanya Reilly is focused on joining a new company, but it actually explains discovery really well:

When you join a new company, most of the big picture is completely unknown to you. A big part of starting a new job is building context, learning how your new organization works, and uncovering everyone's goals. Think of it like the fog of war in a video game, where you can't see what awaits you in the parts of the map you haven't explored yet. As you scout around, you clear the fog and get a better picture of the terrain, learning what's surrounding you and whether there are wolves coming to bother your villagers. You can set out to uncover the obscured parts in all three of the maps and find ways to make that information easy for other people to understand. For instance: Your locator map can help you make sure the teams you work with really understand their purpose in the organization, who their customers are, and how their work affects other people. Your topographical map can help highlight the friction and gaps between teams and open up the paths of communication. Your treasure map can help you make sure everyone knows exactly what they're trying to achieve and why. You'll be able to clear some parts of the map through everyday learning, but you'll need to deliberately set out to clear other parts. A core theme of this chapter is how important it is to know things: to have continual context and a sense of what's going on. Knowing things takes both skill and opportunity, and you might need to work at it for a while before you start seeing what you're not seeing.³

Think about discovery in the same way: if you don't fully explore the map, you don't know what you're getting into. You have to clear the fog of war and understand the terrain.

Caution: If you reach this point and think you won't have time to complete the refactor, then you must narrow the scope. An incomplete or abandoned refactor is far worse than leaving the code as is. If the hill gets too big, narrow the scope and tackle the refactor or migration in smaller pieces.

Stabilize

At this point, you have a solid understanding of the code and a clear plan. Stabilizing is the intermediate step that allows you to execute that plan. In this phase, you focus on getting the code to a point where it is safe to migrate or completely refactor. You can, and should, perform small refactors in place during this phase, making minor changes or adjustments that help ensure the larger migration or refactor is successful. The first thing you should prioritize is bringing the test suite up to a level where you can validate that behavior is not changed or lost.

We can increase our confidence that nothing has changed by writing a suite of tests (unit, integration, end to end), and we should not seriously consider moving forward with any refactoring effort until we've established sufficient test coverage.²

Tests are critical to ensuring that you are not breaking, changing, or completely removing functionality. Increasing test coverage should be your first step in stabilizing the code.

Whenever I do refactoring, the first step is always the same. I need to ensure I have a solid set of tests for that section of code.¹
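When the legacy code has no tests at all, one common starting point is a set of tests that simply record what the code does today, sometimes called characterization tests. The sketch below assumes a hypothetical legacy_shipping_cost function; the function and its numbers are invented for illustration.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    """Hypothetical legacy function whose behavior we want to preserve."""
    cost = 5 + weight_kg * 2
    if express:
        cost = cost * 1.5
    if weight_kg > 20:
        cost += 10
    return cost


class ShippingCostCharacterization(unittest.TestCase):
    """Pins down what the code does today, quirks included."""

    def test_standard_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, express=False), 9)

    def test_express_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, express=True), 13.5)

    def test_heavy_surcharge_is_added_after_express_multiplier(self):
        # Possibly a quirk, but it is current behavior, so the test records it.
        self.assertEqual(legacy_shipping_cost(30, express=True), 107.5)
```

Saved in a test file, this runs with python -m unittest. The expected values come from running the existing code, not from a specification, so a failure after a refactor means behavior changed, whether or not the old behavior was "right".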

There is a strong possibility that you will need to modify the code to increase test coverage. An example of this is legacy PHP applications that use static method calls. These are not easily mocked and create dependencies that, in some cases, need to be broken to increase testing. You may need to adjust static calls or other dependencies. These changes should be kept to a minimum while allowing you to increase test coverage. After test coverage is raised, you can modify these areas more if needed.
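The same idea applies outside PHP. As a minimal sketch in Python, assuming a hypothetical PaymentGateway class that is reached through a static call, one small change is to turn the hard-wired call into a parameter that defaults to the legacy behavior. Every name here is invented for illustration.

```python
class PaymentGateway:
    """Hypothetical stand-in for a legacy class reached through static calls."""

    @staticmethod
    def charge(customer_id, amount_cents):
        raise RuntimeError("would contact the real payment provider")


# Before: the static call is hard-wired, so a unit test cannot avoid it.
def checkout_before(customer_id, amount_cents):
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    return PaymentGateway.charge(customer_id, amount_cents)


# After: the dependency is a parameter that defaults to the legacy call,
# so existing callers keep working and tests can pass in a fake.
def checkout(customer_id, amount_cents, charge=PaymentGateway.charge):
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    return charge(customer_id, amount_cents)


# In a test, a fake records the call instead of contacting the provider.
calls = []


def fake_charge(customer_id, amount_cents):
    calls.append((customer_id, amount_cents))
    return "ok"


result = checkout("cust-1", 1250, charge=fake_charge)
```

The change is deliberately small: callers that do not pass charge get exactly the old behavior, and the validation logic can now be tested in isolation.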

Dependency is one of the most critical problems in software development. Much legacy code work involves breaking dependencies so that change can be easier.⁴

Additionally, you should release these changes behind feature flags or an A/B system. This allows you to roll out adjustments slowly, monitor them, and react to defects and issues as they arise in a more controlled manner. Two standard options for making these smaller changes are Sprout Methods and Sprout Classes.⁴ These are methods or classes created alongside existing legacy code to slowly move functionality into a more stable state. For methods, you create a new, clean method and call it from the existing legacy method. If needed, you can add logging to determine all the paths calling that method and ensure they are working correctly. Eventually, the new method replaces the legacy method. A Sprout Class is similar, but it is used for gradually replacing entire classes over time. There are also Wrap Methods and Wrap Classes, which follow the decorator pattern: they intercept legacy calls, safely add functionality before and after the call, and delegate to the original methods or classes.⁴ Working Effectively with Legacy Code covers these techniques in great detail and is a great read.
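To make the two ideas concrete, here is a rough sketch of a Sprout Method and a Wrap Method in Python. The order-processing functions are hypothetical, and this is my own simplified reading of the techniques rather than code from the book.

```python
import logging

logger = logging.getLogger("legacy.orders")


def unique_items(items):
    """Sprout Method: new, separately testable logic written outside the legacy code."""
    return list(dict.fromkeys(items))  # drops duplicates, keeps first-seen order


def legacy_process_order(items):
    """Hypothetical legacy function; its existing body is left as it was."""
    items = unique_items(items)  # the only new line: a call to the sprout
    # ... existing legacy logic would continue here ...
    return {"count": len(items), "items": items}


def process_order_with_logging(items):
    """Wrap Method: add behavior around the legacy call, then delegate to it."""
    logger.info("process_order called with %d items", len(items))
    result = legacy_process_order(items)
    logger.info("process_order returned count=%d", result["count"])
    return result


order = process_order_with_logging(["cape", "mask", "cape"])
```

The sprout keeps new logic out of the tangled legacy body, where it can be unit tested on its own. The wrapper adds logging without editing the legacy function at all, which is useful for finding out which paths still call it.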

At this point, your legacy code should have sufficient test coverage, be refactored in place, and have been tested with real users through feature flags or A/B testing. It is possible to stop here; you may not always need to undertake the complete migration or refactor, and that is okay. If your code is stable, is not causing problems, and no longer carries the technical debt you set out to remove, you may have reached your goal.

Execute

This final step involves executing the last refactor or migration. This consists of consolidating all your discovery and stabilization efforts and pushing through to the end of the project. Your code should be in such a tested and stable spot that this step should be straightforward. If you are moving to a new framework, your efforts to stabilize give you the tests you need, identify broken dependencies, and map out the dangerous parts of the code. You have likely already identified and addressed some bugs resulting from refactoring in place and have a deep understanding of the codebase.

You should follow the same steps here, releasing changes in small chunks under feature flags or through A/B testing, though you will likely encounter fewer issues at this point. The workflow should involve making small changes, committing, and testing. This makes each change easier to review, adjust, and fix if something goes wrong. Working on small, functional pieces will make the final refactor or migration far easier than trying to do it all in a single massive commit.
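A percentage-based feature flag is one simple way to release a refactored path gradually. The sketch below is a minimal, self-contained illustration; the flag name, the hashing scheme, and the two total functions are assumptions, and a real project would more likely rely on its existing feature-flag service.

```python
import hashlib


def in_rollout(flag_name, user_id, percent):
    """Deterministically place a user in or out of a percentage rollout."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0-99
    return bucket < percent


def legacy_total(prices):
    """Hypothetical legacy implementation."""
    total = 0
    for price in prices:
        total = total + price
    return total


def refactored_total(prices):
    """Hypothetical refactored implementation with the same behavior."""
    return sum(prices)


def order_total(prices, user_id, rollout_percent):
    """Route a share of users to the refactored path; everyone else stays on legacy."""
    if in_rollout("refactored-total", user_id, rollout_percent):
        return refactored_total(prices)
    return legacy_total(prices)


total = order_total([500, 250, 125], user_id="user-42", rollout_percent=10)
```

Because the bucket is derived from the flag name and user ID, a given user stays on the same path as the percentage increases, and setting the percentage to 0 switches everyone back to the legacy code.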

First, committing small, incremental changes makes it much easier to author great code. By pushing bite-sized commits, you can get relevant feedback early and often from your tooling (e.g., integration tests running on a server through continuous integration). If you push a wide breadth of changes infrequently, you risk needing to wade through and fix a heap of test failures.

Second, reverting a small commit is much easier than reverting a big one. If something goes wrong, whether during development or well after the code has been deployed, reverting a small commit allows you to carefully extract only the offending change.

Third, because concise commits tend to be sufficiently focused, you'll also be able to write better, more precise commit messages.

Finally, it is nearly impossible for a teammate to review the entirety of the modified code adequately.²

Work small and commit frequently; you'll be glad you did at the end. Each push to production should be the smallest change that is still a complete feature.

Conclusion

Refactors and migrations can be intimidating, and they often are. If you follow a structured approach using these three phases, much of the guesswork is removed, and you can discover the landmines upfront. Following this approach can help you successfully migrate and stabilize your code base. As an added bonus, you will unlock a massive treasure chest of context and documentation about what your code is doing, what to look out for, and what parts of the code rely on each other.