Alex Omeyer for Stepsize

Posted on Mar 31, 2021 • Originally published at stepsize.com

What Lies Beneath Hard Work: Code Churn

#codequality #productivity #programming #refactorit

Originally published on Tech debt blog by Cate Lawrence.

Organizations are continuously looking for ways to track, measure, and evaluate developer workflows. Done effectively, this creates the means to improve performance and code quality, reduce time to market and increase profits. But it's not always easy to measure efficiency. What may first appear to be evidence of a team's hard work may be an indication of the bigger challenges and inefficiencies of code churn.

What is code churn?

Code churn is a measure of how often a file changes. It typically refers to how often a developer rewrites or throws out code (such as a function, file, or class) within the first 2-3 weeks of writing it.

Churn levels vary between team members, different levels of experience, and projects. There's no normal level of code churn that can be universally applied across all developers. Deletes and edits are normal as code is tested and refined, particularly while problem-solving or trying out new code. However, excessive code churn or sudden changes in churn levels can be a symptom of other problems affecting the developer team.

What can measuring code churn reveal?

As mentioned above, changes in code churn can point to deeper issues facing developers.

At a basic level, excessive or irregular code churn may indicate that a developer is struggling and would benefit from extra support such as mentoring or pair programming. It may signify an individual with a perfectionist streak or a tendency to reinvent the wheel in problem-solving, wasting valuable time and achieving little for their efforts. Left unattended, this behavior may result in work dissatisfaction and burnout. Constant edits may also mean that the team leader needs to spell out, in concrete terms, what 'done' and 'finished' mean.

Code churn also provides insights into current workloads and resource allocation.

Toby Osborne suggests code changes can generate useful insights. For example: "In the last two weeks, home.html was changed 50 times, and website_controller.rb was changed 20 times. These stats show you:

Parts of the system that may need more tests because they are often changed.
Parts of the system that are getting all of the development resources."
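As a rough illustration of how per-file counts like these could be produced, here is a minimal sketch in Python. It assumes you have captured the text printed by a command such as `git log --since="2 weeks ago" --name-only --format=`, which lists the paths changed by each commit. The function name and the sample paths are hypothetical and only loosely echo the file names in the quote.

```python
from collections import Counter


def count_file_changes(name_only_log: str) -> Counter:
    """Count how many commits touched each path.

    Expects text such as the output of
    `git log --since="2 weeks ago" --name-only --format=`,
    where each changed path appears on its own line, once per commit.
    """
    counts = Counter()
    for line in name_only_log.splitlines():
        path = line.strip()
        if path:  # ignore any blank lines between commits
            counts[path] += 1
    return counts


# Hypothetical log excerpt covering three commits.
sample_log = """app/views/home.html
app/controllers/website_controller.rb

app/views/home.html

app/views/home.html
app/controllers/website_controller.rb
"""

# Print the most frequently changed files first.
for path, changes in count_file_changes(sample_log).most_common():
    print(f"{changes:3d}  {path}")
```

Sorting by count puts the most frequently changed files at the top, which is where extra tests or a closer look at resource allocation are most likely to pay off.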

These factors are important as they show a project may need a different allocation of resources, greater testing, or a longer deadline. Research shows a strong connection between high volumes of code churn and the number of defects discovered while testing.

Timing is also essential. Code should become more stable as deadlines loom, and if the opposite happens every time, it suggests the code may be volatile and subject to post-release defects. If code churn is in response to customer feedback, it may indicate the need to improve workflow and timelines.

Code churn may also indicate communication problems within a team, for example where high-volume output is perceived as being rewarded at the expense of more decisive, efficient code writing.

How does code churn relate to refactoring and technical debt?

Source code refactoring is essential to maintain long-term code quality, security, and performance. It turns messy, incorrect and/or repetitive code into clean code. It addresses the standardization problems which can occur when multiple developers contribute their own code. Refactoring provides greater readability and improves the maintainability of the source code as well as the overall structure and functionality. Without refactoring regularly, developers are left with a mammoth amount of technical debt. This debt grows as more opportunities for code refactoring are missed and, as a result, new development becomes difficult, especially when it is built on legacy code.

The challenge for many organizations is where to start when it comes to reducing technical debt through refactoring.

Nicolas Carlo suggests that plotting complexity against churn on an XY graph is a way to prioritize the code refactoring that is both important and urgent. He demonstrates that the files that cause the most problems are those that are both complex and frequently touched. Thus, deploying this kind of metric can help you identify code hotspots that need to be attended to first in any refactoring efforts.
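To make the idea of a churn-versus-complexity ranking concrete, here is a small sketch in Python. It is not Nicolas Carlo's tooling. It assumes you already have a churn count and some complexity number for each file, and it ranks files by the product of the two after scaling each to the 0..1 range. The scoring rule, the class, the function, and the sample figures are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    churn: int       # e.g. number of commits touching the file
    complexity: int  # e.g. a complexity score from any tool you trust


def rank_hotspots(files: list[FileStats]) -> list[FileStats]:
    """Order files so that those high on BOTH axes come first.

    Each axis is scaled by its maximum and the two are multiplied,
    so a file needs churn and complexity together to score highly.
    """
    max_churn = max((f.churn for f in files), default=0) or 1
    max_complexity = max((f.complexity for f in files), default=0) or 1

    def score(f: FileStats) -> float:
        return (f.churn / max_churn) * (f.complexity / max_complexity)

    return sorted(files, key=score, reverse=True)


# Hypothetical figures for three files.
sample = [
    FileStats("utils.py", churn=45, complexity=10),
    FileStats("legacy_report.py", churn=2, complexity=200),
    FileStats("billing.py", churn=40, complexity=120),
]

for f in rank_hotspots(sample):
    print(f.path, f.churn, f.complexity)
```

In this made-up data, the file that is both fairly complex and frequently changed comes out ahead of the simple but busy file and the complex but rarely touched one, which is the top-right corner of the XY graph.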

It's worth stressing that code churn is not always problematic. It's normal in the prototyping and design phases where a developer invests time researching, testing, and investigating. This is likely to generate high code churn as ideas are developed and refined.

How to measure code churn

You can't reduce code churn without measurement. So the first step is measuring your code churn to create actionable insights. You need to determine the normal churn levels among your team and find the incidents and areas where it exceeds these levels.
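One simple way to express "normal levels" and "exceeds these levels" is a baseline with a tolerance band. The sketch below, in Python, treats a value as unusual when it is more than `k` standard deviations above the historical mean. The threshold rule, the default `k = 2.0`, the function name, and the weekly figures are assumptions chosen for illustration, not a recommended standard.

```python
from statistics import mean, stdev


def flag_unusual_churn(history: list[int], current: int, k: float = 2.0) -> bool:
    """Return True if `current` is more than k standard deviations
    above the mean of `history`."""
    if len(history) < 2:
        return False  # too little data to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    return current > baseline + k * spread


# Hypothetical weekly counts of changed lines for one team.
weekly_lines_changed = [120, 150, 130, 140, 160]

print(flag_unusual_churn(weekly_lines_changed, 155))  # within the usual range
print(flag_unusual_churn(weekly_lines_changed, 400))  # well above the baseline
```

A flagged week is a prompt to ask what happened, not a verdict. As noted earlier, prototyping and design work naturally produce spikes.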

There are many software options for measuring code churn depending on whether you want to buy or build, the size of your organization, and your budget. They typically focus on how many lines of code have changed (been modified, added, or removed) in the system over a specified period, usually several weeks.
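If you want to build rather than buy, the lines-changed figure can be derived from git itself. The following Python sketch assumes you have captured the output of a command such as `git log --since="3 weeks ago" --numstat --format=`, where each line holds the added count, the deleted count, and the path, separated by tabs. The function name and the sample data are hypothetical.

```python
def total_lines_changed(numstat_log: str) -> tuple[int, int]:
    """Sum added and deleted lines from `git log --numstat` style output.

    Each relevant line looks like "<added>\t<deleted>\t<path>".
    Binary files are reported as "-\t-\t<path>" and are skipped.
    """
    added = deleted = 0
    for line in numstat_log.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank or unrelated line
        a, d, _path = parts
        if not (a.isdigit() and d.isdigit()):
            continue  # binary file or unexpected content
        added += int(a)
        deleted += int(d)
    return added, deleted


# Hypothetical excerpt: two text files and one binary file.
sample_numstat = (
    "10\t2\tapp/views/home.html\n"
    "-\t-\tassets/logo.png\n"
    "3\t7\tapp/controllers/website_controller.rb\n"
)

added, deleted = total_lines_changed(sample_numstat)
print(f"added={added} deleted={deleted}")
```

Keeping added and deleted lines separate, rather than summing them straight away, makes it easier to tell growth from rework.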

If you are using git, you can use this git-churn script to see how many times you have changed a file. Alternatively, Patrick Mevzek notes that you can see how many times a file was committed by listing its commits, one per line:

git log --format=oneline [path_to_file]

There's also churn-php for PHP codebases.

For something more official, Pluralsight Flow aggregates historical git data into reports offering greater transparency of a team's engineering performance and process efficiency. CodeScene uses predictive analytics to find hidden risks and social patterns in your code. It measures the number of added lines of code, and the number of deleted lines. Stepsize is looking to calculate code churn for the code associated with technical debt.

Additionally, N-Cover makes it possible to identify how much of your code is tested and where the testing gaps are. This is helpful because defects tend to turn up in code with high churn.

For SaaS devs, Azure DevOps Server comes with a built-in mechanism for measuring code churn. It enables you to create reports that reveal:

The number of files with a specific file name extension changed in a particular build.
The number of lines of code in the source base for a particular build.
The changesets that have been submitted, with the details of each change (for example, who made the change, which files were changed, and on what date the change was made).

Conclusion

Code churn can have a significant impact on team capacity and effectiveness. High code churn may indicate team members who are highly innovative and should be steered towards more creative projects. But it could equally identify those who would benefit from additional support. Once you identify where code churn sits in your team's workflow, you have the opportunity to make improvements.

Code churn may result from external factors such as new information, new data, or customer feedback. In that case, it could indicate the need for better communication flows or more regular, clearer feedback. If a lack of skills or developer knowledge is one of the key causes, it's an opportunity to introduce dedicated training for the team members who need it, which improves the morale and effectiveness of the team overall and helps meet the wider business goals.
