Chun Fei Lung

Posted on Sep 6, 2021 • Edited on Dec 4, 2021 • Originally published at chuniversiteit.nl

Refactoring does not solve all problems… right away

Is it an improvement? I guess wheel never know with these uphill battles

I read and summarise software engineering papers for fun, and today we’re having a look at Old habits die hard: Why refactoring for understandability does not give immediate benefits (2015) by Ammerlaan, Veninga, and Zaidman.

Whenever shortcuts are taken during the development of a software system, it accumulates technical debt.

This debt makes it harder to understand and change the system, so development on a system with a lot of technical debt will eventually grind to a halt.

Why it matters

Refactoring is a process where the structure of code is improved without changing the functionality of the system. Many in the software development community argue that well-structured code is easier to understand, and thus easier to modify and less prone to bugs.

Unfortunately, there is little empirical evidence that refactoring actually has beneficial effects on developer productivity. This study tries to shed some light on the matter.

How the study was conducted

A comparative experiment was conducted at Exact, a software company that produces business software with development teams that are distributed over multiple continents.

The study consisted of five experiments and included 30 participants (all developers) from 11 teams in two countries (Malaysia and the Netherlands).

In each experiment, a developer was asked to perform a small coding task on components from a codebase with 2.7 million lines of code: they either had to fix a small bug or make a small change in functionality. Participants in the experimental group were given a refactored version of the code, while those in the control group were given the original code.

The experiment includes three types of refactorings:

Small: Rename field or variable and Extract function refactorings;
Medium: Extract class and Adapter pattern refactorings, accompanied by one or more unit tests;
Large: refactorings to divide responsibilities, also accompanied by unit tests.
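
To make the smallest category concrete, here is a minimal sketch of what Rename variable and Extract function refactorings might look like. It is not code from the study: the function names, the order data and the discount rule are all invented for illustration.

```python
# Hypothetical example: names, data shape and the discount rule are
# assumptions made for illustration, not taken from the paper.


def total_before(items):
    """Original version: terse names, all logic inline in one function."""
    t = 0.0
    for i in items:
        t += i["price"] * i["qty"]
    if t > 100:
        t = t * 0.9
    return t


def subtotal(order_lines):
    """Extracted function: sum of price times quantity over all lines."""
    total = 0.0
    for line in order_lines:  # renamed from `i` to `line`
        total += line["price"] * line["qty"]
    return total


def apply_bulk_discount(amount):
    """Extracted function: 10% off for amounts above 100."""
    if amount > 100:
        return amount * 0.9
    return amount


def total_after(order_lines):
    """Refactored version: same behaviour, composed from small helpers."""
    return apply_bulk_discount(subtotal(order_lines))
```

Both versions return the same result for the same input. The refactored one trades a single short function for three named steps, which is the kind of change participants in the experimental group would have encountered.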

What discoveries were made

Results were mixed.

Results

In the first (small) experiment some helper methods were extracted from the code. Surprisingly, developers who saw the refactored version needed more time to make the requested change, not less.

The second (small) experiment had a similar setup, but was (apparently) easier to complete. This means that the productivity measurements for this experiment are less noisy. In this case, about 75% of the participants in the experimental group finished before 25% of the developers with the original code.

The third (small) experiment used similar refactorings and, as in the first experiment, resulted in lower finishing times for those who saw the original code without refactorings. It’s possible that the flow of method arguments and return values between multiple smaller methods was harder to understand than a linear flow in a large method.
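
The difference between the two kinds of flow can be sketched as follows. This is a hypothetical illustration, not code from the experiment: the sensor readings and all function names are assumptions.

```python
# Hypothetical example: every name here is invented for illustration.


def report_linear(readings):
    """One larger function: values are defined right above where they are used."""
    valid = [r for r in readings if r >= 0]
    mean = sum(valid) / len(valid)
    spread = max(valid) - min(valid)
    return f"n={len(valid)} mean={mean:.1f} spread={spread}"


def keep_valid(readings):
    """Drop negative readings."""
    return [r for r in readings if r >= 0]


def summarise(valid):
    """Return (mean, spread) of the valid readings."""
    return sum(valid) / len(valid), max(valid) - min(valid)


def render(count, mean, spread):
    """Format the summary line."""
    return f"n={count} mean={mean:.1f} spread={spread}"


def report_split(readings):
    """Same behaviour, but the reader must follow values across three calls."""
    valid = keep_valid(readings)
    mean, spread = summarise(valid)
    return render(len(valid), mean, spread)
```

To change how `spread` is computed in the split version, a reader first has to trace it from `render` back through `report_split` to `summarise`. In the linear version the same value is visible in one place, which may be what made the original code quicker to work with for a one-off task.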

In the fourth (medium) experiment participants were asked to fix a bug. It appears that those in the experimental group had slightly lower finishing times than those in the control group. Another notable finding is that developers who were quite experienced in unit testing performed better than other participants.

In the fifth (large) experiment, developers who saw the original code once again did much better than developers who had to work with the refactored code, presumably because it takes more time to understand the relations between classes that emerge from a large refactoring. However, the quality of solutions also differed: whereas most developers in the control group fixed the bug using a “quick fix”, those in the experimental group managed to fix the root cause.

Discussion

The experimental results show that most of the time the original, unrefactored code was “better” for productivity. However, when the original and refactored code were shown to participants side-by-side, most preferred the refactored code.

The authors argue that this discrepancy can be explained by the habits of developers, who are used to reading long, procedural methods and thus simply need more time to get used to dealing with multiple classes and methods.

However, even if refactorings lead to a (possibly temporary) decrease in understandability, the possible increases in maintainability and testability could still make the refactoring worthwhile.
