DEV Community

Pascal Thormeier
Pascal Thormeier

Posted on • Edited on

//TODO: Refactor me - A guide to measuring shortcuts

Inspired by @ben's post about establishing a culture of valueing refactoring among devs (linked further below), I asked myself: "How could we achieve that? What could I do to make this happen?"

I reckon one of the main problems is to actually keep track of those shortcuts. The larger the code base, the easier it is to forget about places we need to refactor. Do you remember all the shortcuts you took a month or a year ago? I certainly don't.

My first impulse was to try and measure those things and to constantly remind myself about those places.

I'll start this little case study with the following bit of JS code, a quick and dirty FizzBuzz implementation that could use some refactoring:

for (let i = 1; i < 100; i++) {
  if (i % 3 === 0) {
    if (i % 5 === 0) {
      console.log('FizzBuzz')
    } else {
      console.log('Fizz')
    }
  } else if (i % 5 === 0) {
    console.log('Buzz')
  } else {
    console.log(i)
  }
}
Enter fullscreen mode Exit fullscreen mode

Keeping a list of shortcuts in a wiki or something seems like a good idea at first, but it needs additional discipline to keep it in sync with the code base. I'll mark these spots in the code directly instead - with a little comment:
// TODO: ...

for (let i = 1; i < 100; i++) {
  // TODO: Move all mods to functions (1st commit)
  if (i % 3 === 0) {
    if (i % 5 === 0) { // TODO Make this nested if composed of resuable code (2nd commit)
      console.log('FizzBuzz') // TODO: Reduce number of console.log calls (2nd commit)
    } else {
      console.log('Fizz')
    }
  } else if (i % 5 === 0) { // TODO: Remove code duplication (1st commit)
    console.log('Buzz')
  } else {
    console.log(i)
  }
}
Enter fullscreen mode Exit fullscreen mode

I've split the added TODOs on two commits, to better illustrate change over time.

Off for a good start! A lot of IDEs nowadays recognize TODO as a special comment already and offer lists of those. Perfect for refactoring. Now I'll write a bash script to keep track of those. Inspired by another post by @waylonwalker (linked further below), I'll try to make most of the script reusable.

I first define what I actually want to measure. Two useful numbers could be the amount of total TODOs and how many of them I added/removed in my last commit. I'll call those numbers TODOS_TOTAL and TODOS_CHANGE_ABS and TODOS_CHANGE_PERCENT.

Counting the total

To first get an accurate image of the status quo, I use this piece of bash code and put it in a file I'll call measureTodos.sh

#!/bin/bash

# Find all occurances of "TODO" in git, count the lines
git grep "TODO" | wc -l
Enter fullscreen mode Exit fullscreen mode

And execute it:

user@~/fizzbuzz (main) $ ./measureTodos.sh 
5
Enter fullscreen mode Exit fullscreen mode

Wait - why are there 5? The code only contains 4, where does the other one come from? I'll analyze this using the git grep command without wc -l:

user@~/fizzbuzz (main) $ git grep "TODO"
fizzbuzz.js:  // TODO: Move all mods to functions (1st commit)
fizzbuzz.js:    if (i % 5 === 0) { // TODO Make this nested if composed of resuable code (2nd commit)
fizzbuzz.js:      console.log('FizzBuzz') // TODO: Reduce number of console.log calls (2nd commit)
fizzbuzz.js:  } else if (i % 5 === 0) { // TODO: Remove code duplication (1st commit)
measureTodos.sh:git grep "TODO" | wc -l # Find all occurances of "TODO" in git, count the lines
Enter fullscreen mode Exit fullscreen mode

Looks like the script is also counting all TODOs in itself - I need to exclude this. I'll do this by using grep instead of git grep and pass it a list of files known by git, excluding measureTodos.sh:

#!/bin/bash
# Find all occurances of "TODO" in git, count the lines
TODOS_TOTAL=$(grep "TODO" -- `git ls-files | grep -v $0` | wc -l)
echo $TODOS_TOTAL
Enter fullscreen mode Exit fullscreen mode

This should now output the right amount of TODOs:

user@~/fizzbuzz (main) $ ./measureTodos.sh 
4
Enter fullscreen mode Exit fullscreen mode

Amount of change

Now I want to track how many TODOs got added or removed during the last commit. I can do this with git diff and grep:

#!bin/bash

# Get the diff, remove all unnecessary output.
DIFF=$(git diff HEAD^ HEAD --unified=0 --ignore-all-space --ignore-blank-lines)

ADDED=$(echo "$DIFF"|grep "^+.*TODO" | wc -l) # Amount of TODOs added
REMOVED=$(echo "$DIFF"|grep "^\-.*TODO" | wc -l) # Amount of TODOs removed
Enter fullscreen mode Exit fullscreen mode

Those two lines search in the entire diff for added (^+) and removed lines (^\-) containing the string TODO.

Now I'll calculate the amount of change by subtracting the removed ones from the added ones.:

TODOS_CHANGE_ABS=$(expr $ADDED - $REMOVED)
Enter fullscreen mode Exit fullscreen mode

This gives me the absolute number. Now I'll calculate the percentage changed with this piece of code:

TODOS_CHANGE_PERCENT=$(echo "scale=2; $TODOS_CHANGE_ABS * 100 / $TODOS_TOTAL" | bc)
echo $TODOS_CHANGE_ABS
echo $TODOS_CHANGE_PERCENT
Enter fullscreen mode Exit fullscreen mode

My script now looks like this:

#!/bin/bash
# Find all occurances of "TODO" in git, count the lines
TODOS_TOTAL=$(grep "TODO" -- `git ls-files | grep -v $0` | wc -l)

# Count added and removed TODOs
DIFF=$(git diff HEAD^ HEAD --unified=0 --ignore-all-space --ignore-blank-lines)
ADDED=$(echo "$DIFF"|grep "^+.*TODO" | wc -l) # Amount of TODOs added
REMOVED=$(echo "$DIFF"|grep "^\-.*TODO" | wc -l) # Amount of TODOs removed
TODOS_CHANGE_ABS=$(expr $ADDED - $REMOVED)

# Calculate percentage of change
TODOS_CHANGE_PERCENT=$(echo "scale=2; $TODOS_CHANGE_ABS * 100 / $TODOS_TOTAL" | bc)

echo $TODOS_TOTAL
echo $TODOS_CHANGE_ABS
echo $TODOS_CHANGE_PERCENT
Enter fullscreen mode Exit fullscreen mode

And it outputs the following:

user@~/fizzbuzz (main) $ ./measureTodos.sh 
4
2
50.00
Enter fullscreen mode Exit fullscreen mode

And that's already helpful! I now know that 50% of all TODOs were added in the last commit.

Make it reusable

To make it more reusable, I'll move parts of the calculations into functions:

#!/bin/bash

# Calculate the percentage of two numbers
get_percentage () {
    if [ "$2" -eq "0" ]; then
        # Possible division by zero
        if [ "$1" -eq "0" ]; then
            echo 0.0 # Error handling: Division by zero
        else
            # Division by zero, but all have been removed.
            echo 100.0
        fi
    else
        echo "scale=2; $1 * 100 / $2" | bc
    fi
}

# Subtracts two numbers
subtract () {
    echo $(expr $1 - $2)
}

# Get an uncluttered diff
get_diff () {
    git diff HEAD^ HEAD --unified=0 --ignore-all-space --ignore-blank-lines
}

TODOS_TOTAL=$(grep "TODO" -- `git ls-files | grep -v $0` | wc -l)

DIFF=$(get_diff)
ADDED=$(echo "$DIFF"|grep "^+.*TODO" | wc -l) # Amount of TODOs added
REMOVED=$(echo "$DIFF"|grep "^\-.*TODO" | wc -l) # Amount of TODOs removed

TODOS_CHANGE_ABS=$(subtract $ADDED $REMOVED)

# Calculate percentage of change
TODOS_CHANGE_PERCENT=$(get_percentage $TODOS_CHANGE_ABS $TODOS_TOTAL)

echo $TODOS_TOTAL
echo $TODOS_CHANGE_ABS
echo $TODOS_CHANGE_PERCENT
Enter fullscreen mode Exit fullscreen mode

I also added some error handling for the percentage calculation: If there's no more TODOs left, the calculation would've ended in a division by zero. Already feels a lot safer and more stable!

Make it configurable

Maybe the comment I want to look for is not TODO. In some other project it might be REFACTOR or similar. To cover this as well, I'll alter parts of my script to use a parameter instead of a hardcoded keyword:

# [...]
KEYWORD=$1

TODOS_TOTAL=$(grep "$KEYWORD" -- `git ls-files | grep -v $0` | wc -l)

DIFF=$(get_diff)
ADDED=$(echo "$DIFF"|grep "^+.*$KEYWORD" | wc -l)
REMOVED=$(echo "$DIFF"|grep "^\-.*$KEYWORD" | wc -l)
# [...]
Enter fullscreen mode Exit fullscreen mode

I can now use it like this:

user@~/fizzbuzz (main) $ ./measureTodos.sh TODO
4
2
50.00
Enter fullscreen mode Exit fullscreen mode

Make it look nice

I want to add a visual clue on how the amount of TODOs has changed since the last commit. For this, I'll color the output according to this scheme:

  • No change or any removed: Green
  • 1-25% added: Yellow
  • 25% and more added: Red

I can add some colors and format the output by adding two more functions to the script:

#!/bin/bash
# [...]
# Color output according to given percentage
get_output_color () {
    if (( $(echo "$1 <= 0" |bc -l) )); then # Green
        echo -e "\e[30;42m"
    elif (( $(echo "$1 > 25" |bc -l) )); then # Red
        echo -e "\e[30;41m"
    else # Yellow
        echo -e "\e[30;43m"
    fi
}

# Generates output according to change
get_todo_output () {
    echo ""
    echo "  TODO Measurement"
    echo "  ----------------"
    echo "  Total found: $1"
    if [[ "$2" -lt 0 ]]; then
        echo "  Removed:     ${2#-} ($3%)"
    elif [[ "$2" -gt 0 ]]; then
        echo "  Added:       $2 (+$3%)"
    else
        echo "  No change"
    fi
}
Enter fullscreen mode Exit fullscreen mode

... and use them futher below:

#!/bin/bash
# [...]

get_output_color $TODOS_CHANGE_PERCENT
get_todo_output $TODOS_TOTAL $TODOS_CHANGE $TODOS_CHANGE_PERCENT
echo -e "\e[0;0m"
Enter fullscreen mode Exit fullscreen mode

This should be enough to generate nice readable outputs, like these:

Screenshot of CLI output
Screenshot of CLI output
Screenshot of CLI output

The final script

The final script I've wrote now looks like the following:

#!/bin/bash

# Calculate the percentage of two numbers
get_percentage () {
    if [ "$2" -eq "0" ]; then
        # Possible division by zero
        if [ "$1" -eq "0" ]; then
            echo 0.0 # Error handling: Division by zero
        else
            # Division by zero, but all have been removed.
            echo 100.0
        fi
    else
        echo "scale=2; $1 * 100 / $2" | bc
    fi
}

# Subtracts two numbers
subtract () {
    echo $(expr $1 - $2)
}

# Get an uncluttered diff
get_diff () {
    git diff HEAD^ HEAD --unified=0 --ignore-all-space --ignore-blank-lines
}

# Color output according to given percentage
get_output_color () {
    if (( $(echo "$1 <= 0" |bc -l) )); then # Green
        echo -e "\e[30;42m"
    elif (( $(echo "$1 > 25" |bc -l) )); then # Red
        echo -e "\e[30;41m"
    else # Yellow
        echo -e "\e[30;43m"
    fi
}

# Generates output according to change
get_todo_output () {
    echo ""
    echo "  TODO Measurement"
    echo "  ----------------"
    echo "  Total found: $1"
    if [[ "$2" -lt 0 ]]; then
        echo "  Removed:     ${2#-} ($3%)"
    elif [[ "$2" -gt 0 ]]; then
        echo "  Added:       $2 (+$3%)"
    else
        echo "  No change"
    fi
}

KEYWORD=$1

TODOS_TOTAL=$(grep "$KEYWORD" -- `git ls-files | grep -v $0` | wc -l)

DIFF=$(get_diff)
ADDED=$(echo "$DIFF"|grep "^+.*$KEYWORD" | wc -l)
REMOVED=$(echo "$DIFF"|grep "^\-.*$KEYWORD" | wc -l)

TODOS_CHANGE=$(subtract $ADDED $REMOVED)

# Calculate percentage of change
TODOS_CHANGE_PERCENT=$(get_percentage $TODOS_CHANGE $TODOS_TOTAL)

get_output_color $TODOS_CHANGE_PERCENT
get_todo_output $TODOS_TOTAL $TODOS_CHANGE $TODOS_CHANGE_PERCENT
echo -e "\e[0;0m"
Enter fullscreen mode Exit fullscreen mode

Takeaway thoughts

I tried to explore some new kind of code quality measurement here and learned a lot about bash in the process. This little script can now be used within CI/CD pipelines or as a git commit hook. Adding some exit codes could enable it to fail CI builds if a dev has taken too many shortcuts. Please feel free to use and/or alter this script as you see fit. It's a tool after all!


Linked posts

As promised, here's the two posts that inspired me to write this script and post:


I write tech articles in my free time. If you enjoyed reading this post, consider buying me a coffee!

Buy me a coffee button

Top comments (2)

Collapse
 
moopet profile image
Ben Sinclair

In bash, you can assign defaults to variables, so you could use

KEYWORD=${1:-TODO}
Enter fullscreen mode Exit fullscreen mode

to make your script look for "TODO" or whatever you passed it as an argument.

Collapse
 
thormeier profile image
Pascal Thormeier

Thank you for, I wasn't aware of that! Will add this to the article :)