loading...

Refactoring Basics

arne_mertz profile image Arne Mertz Updated on ・4 min read

This article was originally published on my personal blog "Simplify C++!". Simplify C++! is a blog about modern C++ and Clean Code.

Rules of engaging your code...

Small compilable steps

Have you ever worked on your code for hours, then you try to compile and test it - and nothing works anymore? Usually, in such a situation you try to fix the problems for a few more hours until you realize that you only got into even more trouble. In the end, you roll back everything and start over, frustrated. Or worse, you roll back and don't even bother to try another time.

That's why refactoring is best done in small steps where each single step compiles and passes the tests. Why pass the tests you may ask. You'll want to pass your tests because to not pass the tests means you changed some behavior. You either accidentally changed something you did not want to change, or you did change it on purpose, in which case you'll also have to change the tests.

Commit often

Every small step on the way is a small victory. Set a rollback point whenever your code compiles, passes the tests and when the change you made contributes to reaching the refactoring goal.

Of course, nobody wants to see a hundred commits consisting of single-line changes. But here is where version control systems like git can help. For example, I often stage a number of small changes vía git add before I actually commit something. If a change does not compile or does not have the desired effect, I can simply go back to only staged changes. When I have staged enough changes, I either create a new commit or amend the last commit. Finally, if needed, we're able to squash multiple commits into one.

Have a plan

Plan the small steps ahead. Try to think of a path from the current state of the code to the state you want to reach. Compare what you do to what you planned to do - that way you find out early if you are heading in the wrong direction.

Don't stray from your path. If you find bad variable names, violation of design principles or other things that need fixing, take notes, but don't fix them yet. Finish what you set out to do and fix your findings later. That way your changes stay focused and comprehensible. It also prevents you from being sidetracked and running into time problems. It's of no use if at the end of the day you started a dozen refactorings and finished none of them.

Back and forth

Doing small steps sometimes requires us to do some intermediate steps that have to be undone later. For example, to move a method out of a class we might have to make other members of the class public first. Later, when the method no longer requires access to the members, we can make them private again.

Undoing things we have done a few minutes before might look like a waste of time at first. However, doing it this way gives us some security that we would not have otherwise. Compare it to crossing a pitch black room: It might be safer to feel our way along the wall, the long way round, instead of stumbling through the dark and running into furniture.

The above example also shows something else: We are allowed, sometimes even required to break coding rules during our refactorings. We can violate style guides, duplicate code and commit all kinds of atrocities, as long as our plan includes steps to fix these things.

Don't mix things up

Sometimes you might encounter a problem that prevents you from continuing your current plan. Remember that refactoring means modifying code without changing the behavior of the larger picture. If you need to do such a change to fix your problem, there are basically two options. The first option is to bring the refactoring to an early end (but fixing the rule violations you committed) and apply the change as a next step.

The other option is to roll back the steps you have done so far, fix the problem, and apply the steps again. Rolling back does not necessarily mean throwing away everything. For example, in git, we can check out a point before the refactoring started, fix the problem there on a new branch and then rebase the refactoring onto the fixed branch.

... are more like guidelines.

The points above may seem overly pedantic and tedious. In addition, how small should a small step be? That's up to you. If you are confident that you can do a larger step at once without running into problems, go ahead. If you are refactoring code that is unfamiliar and hard to understand, smaller steps might be appropriate.

The same goes for the number of tests you run after each step. Running a three-minutes test suite after every twenty-seconds change will slow you down terribly. It's a compromise between investing time and sacrificing security. Depending on the size of your steps, even just compiling might be enough, as long as you run all affected tests once in a while.

Discussion

pic
Editor guide
Collapse
mathur_anurag profile image
Anurag Mathur 

My only grouse with re-factoring large and legacy code, is that backporting becomes a big headache. I am a sucker for refactoring in newer projects, but am one of the skeptical few when the project is already RTM.

Collapse
arne_mertz profile image
Arne Mertz Author

I understand. Is backporting something you regularly see? In my experience, it is close to impossible in legacy projects, because new functionality builds on basically everything that came before. Taking a single piece of functionality back to an earlier version usually does not pay off.

Collapse
mathur_anurag profile image
Anurag Mathur 

Yes, Arne. We backport a lot of bug fixes from the current development branch to production branches. Many-a-times, we had to deliver an enhancement required by a customer, and we would code that on a development branch, and then backport it to production branches.

Collapse
aleksikauppila profile image
Aleksi Kauppila

Thanks for posting Arne. I think this writing acknowledges all the things how to efficiently refactor code. Good job! 👍

Running a three-minutes test suite after every twenty-seconds change will slow you down terribly.

I can totally relate to this. That's why i prefer to have those very microscopical commits in the VCS. Small changes are easy to name: "remove unused argument from Class.method()", "rename method Class.method() to Class.newName()". It also creates a mental note that this change is safe, but this other was not that safe. Once the tests expose that i broke the code i can quickly review the commits and revert the one i think broke the tests.