Inspired by @ben's post about establishing a culture of valueing refactoring among devs (linked further below), I asked myself: "How could we achieve that? What could I do to make this happen?"
I reckon one of the main problems is to actually keep track of those shortcuts. The larger the code base, the easier it is to forget about places we need to refactor. Do you remember all the shortcuts you took a month or a year ago? I certainly don't.
My first impulse was to try and measure those things and to constantly remind myself about those places.
I'll start this little case study with the following bit of JS code, a quick and dirty FizzBuzz implementation that could use some refactoring:
for (let i = 1; i < 100; i++) {
if (i % 3 === 0) {
if (i % 5 === 0) {
console.log('FizzBuzz')
} else {
console.log('Fizz')
}
} else if (i % 5 === 0) {
console.log('Buzz')
} else {
console.log(i)
}
}
Keeping a list of shortcuts in a wiki or something seems like a good idea at first, but it needs additional discipline to keep it in sync with the code base. I'll mark these spots in the code directly instead - with a little comment:
// TODO: ...
for (let i = 1; i < 100; i++) {
// TODO: Move all mods to functions (1st commit)
if (i % 3 === 0) {
if (i % 5 === 0) { // TODO Make this nested if composed of resuable code (2nd commit)
console.log('FizzBuzz') // TODO: Reduce number of console.log calls (2nd commit)
} else {
console.log('Fizz')
}
} else if (i % 5 === 0) { // TODO: Remove code duplication (1st commit)
console.log('Buzz')
} else {
console.log(i)
}
}
I've split the added TODOs on two commits, to better illustrate change over time.
Off for a good start! A lot of IDEs nowadays recognize TODO
as a special comment already and offer lists of those. Perfect for refactoring. Now I'll write a bash script to keep track of those. Inspired by another post by @waylonwalker (linked further below), I'll try to make most of the script reusable.
I first define what I actually want to measure. Two useful numbers could be the amount of total TODO
s and how many of them I added/removed in my last commit. I'll call those numbers TODOS_TOTAL
and TODOS_CHANGE_ABS
and TODOS_CHANGE_PERCENT
.
Counting the total
To first get an accurate image of the status quo, I use this piece of bash code and put it in a file I'll call measureTodos.sh
#!/bin/bash
# Find all occurances of "TODO" in git, count the lines
git grep "TODO" | wc -l
And execute it:
user@~/fizzbuzz (main) $ ./measureTodos.sh
5
Wait - why are there 5? The code only contains 4, where does the other one come from? I'll analyze this using the git grep
command without wc -l
:
user@~/fizzbuzz (main) $ git grep "TODO"
fizzbuzz.js: // TODO: Move all mods to functions (1st commit)
fizzbuzz.js: if (i % 5 === 0) { // TODO Make this nested if composed of resuable code (2nd commit)
fizzbuzz.js: console.log('FizzBuzz') // TODO: Reduce number of console.log calls (2nd commit)
fizzbuzz.js: } else if (i % 5 === 0) { // TODO: Remove code duplication (1st commit)
measureTodos.sh:git grep "TODO" | wc -l # Find all occurances of "TODO" in git, count the lines
Looks like the script is also counting all TODOs in itself - I need to exclude this. I'll do this by using grep
instead of git grep
and pass it a list of files known by git, excluding measureTodos.sh
:
#!/bin/bash
# Find all occurances of "TODO" in git, count the lines
TODOS_TOTAL=$(grep "TODO" -- `git ls-files | grep -v $0` | wc -l)
echo $TODOS_TOTAL
This should now output the right amount of TODOs
:
user@~/fizzbuzz (main) $ ./measureTodos.sh
4
Amount of change
Now I want to track how many TODOs got added or removed during the last commit. I can do this with git diff
and grep
:
#!bin/bash
# Get the diff, remove all unnecessary output.
DIFF=$(git diff HEAD^ HEAD --unified=0 --ignore-all-space --ignore-blank-lines)
ADDED=$(echo "$DIFF"|grep "^+.*TODO" | wc -l) # Amount of TODOs added
REMOVED=$(echo "$DIFF"|grep "^\-.*TODO" | wc -l) # Amount of TODOs removed
Those two lines search in the entire diff for added (^+
) and removed lines (^\-
) containing the string TODO
.
Now I'll calculate the amount of change by subtracting the removed ones from the added ones.:
TODOS_CHANGE_ABS=$(expr $ADDED - $REMOVED)
This gives me the absolute number. Now I'll calculate the percentage changed with this piece of code:
TODOS_CHANGE_PERCENT=$(echo "scale=2; $TODOS_CHANGE_ABS * 100 / $TODOS_TOTAL" | bc)
echo $TODOS_CHANGE_ABS
echo $TODOS_CHANGE_PERCENT
My script now looks like this:
#!/bin/bash
# Find all occurances of "TODO" in git, count the lines
TODOS_TOTAL=$(grep "TODO" -- `git ls-files | grep -v $0` | wc -l)
# Count added and removed TODOs
DIFF=$(git diff HEAD^ HEAD --unified=0 --ignore-all-space --ignore-blank-lines)
ADDED=$(echo "$DIFF"|grep "^+.*TODO" | wc -l) # Amount of TODOs added
REMOVED=$(echo "$DIFF"|grep "^\-.*TODO" | wc -l) # Amount of TODOs removed
TODOS_CHANGE_ABS=$(expr $ADDED - $REMOVED)
# Calculate percentage of change
TODOS_CHANGE_PERCENT=$(echo "scale=2; $TODOS_CHANGE_ABS * 100 / $TODOS_TOTAL" | bc)
echo $TODOS_TOTAL
echo $TODOS_CHANGE_ABS
echo $TODOS_CHANGE_PERCENT
And it outputs the following:
user@~/fizzbuzz (main) $ ./measureTodos.sh
4
2
50.00
And that's already helpful! I now know that 50% of all TODOs were added in the last commit.
Make it reusable
To make it more reusable, I'll move parts of the calculations into functions:
#!/bin/bash
# Calculate the percentage of two numbers
get_percentage () {
if [ "$2" -eq "0" ]; then
# Possible division by zero
if [ "$1" -eq "0" ]; then
echo 0.0 # Error handling: Division by zero
else
# Division by zero, but all have been removed.
echo 100.0
fi
else
echo "scale=2; $1 * 100 / $2" | bc
fi
}
# Subtracts two numbers
subtract () {
echo $(expr $1 - $2)
}
# Get an uncluttered diff
get_diff () {
git diff HEAD^ HEAD --unified=0 --ignore-all-space --ignore-blank-lines
}
TODOS_TOTAL=$(grep "TODO" -- `git ls-files | grep -v $0` | wc -l)
DIFF=$(get_diff)
ADDED=$(echo "$DIFF"|grep "^+.*TODO" | wc -l) # Amount of TODOs added
REMOVED=$(echo "$DIFF"|grep "^\-.*TODO" | wc -l) # Amount of TODOs removed
TODOS_CHANGE_ABS=$(subtract $ADDED $REMOVED)
# Calculate percentage of change
TODOS_CHANGE_PERCENT=$(get_percentage $TODOS_CHANGE_ABS $TODOS_TOTAL)
echo $TODOS_TOTAL
echo $TODOS_CHANGE_ABS
echo $TODOS_CHANGE_PERCENT
I also added some error handling for the percentage calculation: If there's no more TODOs left, the calculation would've ended in a division by zero. Already feels a lot safer and more stable!
Make it configurable
Maybe the comment I want to look for is not TODO
. In some other project it might be REFACTOR
or similar. To cover this as well, I'll alter parts of my script to use a parameter instead of a hardcoded keyword:
# [...]
KEYWORD=$1
TODOS_TOTAL=$(grep "$KEYWORD" -- `git ls-files | grep -v $0` | wc -l)
DIFF=$(get_diff)
ADDED=$(echo "$DIFF"|grep "^+.*$KEYWORD" | wc -l)
REMOVED=$(echo "$DIFF"|grep "^\-.*$KEYWORD" | wc -l)
# [...]
I can now use it like this:
user@~/fizzbuzz (main) $ ./measureTodos.sh TODO
4
2
50.00
Make it look nice
I want to add a visual clue on how the amount of TODOs has changed since the last commit. For this, I'll color the output according to this scheme:
- No change or any removed: Green
- 1-25% added: Yellow
- 25% and more added: Red
I can add some colors and format the output by adding two more functions to the script:
#!/bin/bash
# [...]
# Color output according to given percentage
get_output_color () {
if (( $(echo "$1 <= 0" |bc -l) )); then # Green
echo -e "\e[30;42m"
elif (( $(echo "$1 > 25" |bc -l) )); then # Red
echo -e "\e[30;41m"
else # Yellow
echo -e "\e[30;43m"
fi
}
# Generates output according to change
get_todo_output () {
echo ""
echo " TODO Measurement"
echo " ----------------"
echo " Total found: $1"
if [[ "$2" -lt 0 ]]; then
echo " Removed: ${2#-} ($3%)"
elif [[ "$2" -gt 0 ]]; then
echo " Added: $2 (+$3%)"
else
echo " No change"
fi
}
... and use them futher below:
#!/bin/bash
# [...]
get_output_color $TODOS_CHANGE_PERCENT
get_todo_output $TODOS_TOTAL $TODOS_CHANGE $TODOS_CHANGE_PERCENT
echo -e "\e[0;0m"
This should be enough to generate nice readable outputs, like these:
The final script
The final script I've wrote now looks like the following:
#!/bin/bash
# Calculate the percentage of two numbers
get_percentage () {
if [ "$2" -eq "0" ]; then
# Possible division by zero
if [ "$1" -eq "0" ]; then
echo 0.0 # Error handling: Division by zero
else
# Division by zero, but all have been removed.
echo 100.0
fi
else
echo "scale=2; $1 * 100 / $2" | bc
fi
}
# Subtracts two numbers
subtract () {
echo $(expr $1 - $2)
}
# Get an uncluttered diff
get_diff () {
git diff HEAD^ HEAD --unified=0 --ignore-all-space --ignore-blank-lines
}
# Color output according to given percentage
get_output_color () {
if (( $(echo "$1 <= 0" |bc -l) )); then # Green
echo -e "\e[30;42m"
elif (( $(echo "$1 > 25" |bc -l) )); then # Red
echo -e "\e[30;41m"
else # Yellow
echo -e "\e[30;43m"
fi
}
# Generates output according to change
get_todo_output () {
echo ""
echo " TODO Measurement"
echo " ----------------"
echo " Total found: $1"
if [[ "$2" -lt 0 ]]; then
echo " Removed: ${2#-} ($3%)"
elif [[ "$2" -gt 0 ]]; then
echo " Added: $2 (+$3%)"
else
echo " No change"
fi
}
KEYWORD=$1
TODOS_TOTAL=$(grep "$KEYWORD" -- `git ls-files | grep -v $0` | wc -l)
DIFF=$(get_diff)
ADDED=$(echo "$DIFF"|grep "^+.*$KEYWORD" | wc -l)
REMOVED=$(echo "$DIFF"|grep "^\-.*$KEYWORD" | wc -l)
TODOS_CHANGE=$(subtract $ADDED $REMOVED)
# Calculate percentage of change
TODOS_CHANGE_PERCENT=$(get_percentage $TODOS_CHANGE $TODOS_TOTAL)
get_output_color $TODOS_CHANGE_PERCENT
get_todo_output $TODOS_TOTAL $TODOS_CHANGE $TODOS_CHANGE_PERCENT
echo -e "\e[0;0m"
Takeaway thoughts
I tried to explore some new kind of code quality measurement here and learned a lot about bash in the process. This little script can now be used within CI/CD pipelines or as a git commit hook. Adding some exit codes could enable it to fail CI builds if a dev has taken too many shortcuts. Please feel free to use and/or alter this script as you see fit. It's a tool after all!
Linked posts
As promised, here's the two posts that inspired me to write this script and post:
I write tech articles in my free time. If you enjoyed reading this post, consider buying me a coffee!
Top comments (2)
In bash, you can assign defaults to variables, so you could use
to make your script look for "TODO" or whatever you passed it as an argument.
Thank you for, I wasn't aware of that! Will add this to the article :)