Jaakko Kangasharju

Posted on Mar 12, 2019

Fixing Regressions with Git

#git #tips #bug

I'm sure many of us have been there. A bug has appeared and you're sure it was working a couple of weeks ago. But it's a big project, with hundreds of commits and thousands of changes since then. Luckily Git has some tools to help track down exactly when and why that breaking change was introduced.

`git blame`

The simpler tool is git blame. This can be used when you know which code is causing the problem, and you only need to find the commit to figure out why the change was made. For example, say the code in boggles.lang looks like this

refresh_button.enable()
frequency = compute_frequency(next_time)
baffler.aeronate(frequency)

and the problem is that refresh_button is disabled.

Now, perhaps you're familiar with the code. Maybe you even wrote most of this functionality. You go "Of course! Aeronating the baffler starts the norigator and the refresh button is disabled whenever the norigator is running." So you know that the problem is with the line

baffler.aeronate(frequency)

First Attempt

Of course you can't just go deleting this line. Someone added it for some specific purpose. This is where git blame comes in. You start with

git blame boggles.lang

and browse to the line

baffler.aeronate(frequency)

It will show up as something like

30376ff1 (Tomasz 2018-10-23 14:56:11 +0100  65) baffler.aeronate(frequency)

This line has the following pieces of information, in order:

The commit identifier
The author of the commit
The date and time of the commit
The line number
The contents of the line itself

What this tells you is the last commit that touched this line. That doesn't mean that it was the commit that introduced the problem. To check the commit, you would run

git show 30376ff1

which might tell you something like

commit 30376ff158bd8713fbf3308efcc6a46d0b9cdf57
Author: Tomasz <tomasz@company.example>
Date:   Tue Oct 23 14:56:11 2018 +0100

    Clean up trailing whitespace

This is not very helpful...

Look Only at Actual Changes

Whitespace changes are practically never what I'm looking for when trying to track down a regression, so that's why I use the -w option of git blame to ignore whitespace changes:

git blame -w boggles.lang

Now the output might look like this

3f056217 (Lucia 2018-10-19 10:22:41 +0100  65) baffler.aeronate(frequency)

Again going in with

git show 3f056217

and scrolling to the appropriate line in the diff might now reveal something like this

 refresh_button.enable()
-fq = compute_frequency(next_time)
-baffler.aeronate(fq)
+frequency = compute_frequency(next_time)
+baffler.aeronate(frequency)

This is a diff that shows only that the local variable fq was renamed to frequency but the issue-causing aeronation is still present. We must dig deeper.

Go Further Back in History

git blame has a nice feature where it can begin looking from any specific commit. We now know that the issue we're tracking was introduced before commit 3f056217, so we can run

git blame -w 3f056217^ boggles.lang

(The ^ is important; it tells to start the search from the parent of 3f056217. You can also use ~ instead of ^ to get the parent; the differences of those two do not matter in this case.)

This sequence is already familiar. Scroll to the appropriate line in the output:

7e3d2394 (Peter 2018-10-17 14:02:01 +0100  65) baffler.aeronate(fq)

Check the commit:

git show 7e3d2394

Scroll down to the code in the diff:

 refresh_button.enable()
+fq = compute_frequency(next_time)
+baffler.aeronate(fq)

And success! Now we have found the commit that introduced the problematic aeronation.

So what next? Now I would read the commit message of 7e3d2394, and potentially the whole commit diff, to determine what functionality the commit introduced, and try to figure out how to reconcile those needs with the needs of the refresh button to stay enabled. If I could not figure that out, I would likely have a chat with Peter who introduced the aeronation to see if together we could come up with something. A typical case might be that the aeronation is doing too much and does not in fact need to start the norigator in all cases. This would be handled by splitting the method into two and calling only the aeronation part here.

`git bisect`

What if you can't tell where the problematic code is? You can tell that the code has changed since it was first written, but you have no idea how aeronating the baffler could have anything to do with the refresh button, so that's not where your mind goes first. And there have been many changes here recently, so it's not that there's only one change to consider.

Setting up Bisection

Git itself can help you here. The git bisect command is made for exactly this situation. If you know a commit that is "good" and another later commit that is "bad", you just tell git bisect that, and it will let you look through commits between them until you zero in on the one that caused the change from good to bad.

The way I begin this process is by first running

git log --first-parent

and look for some commit where the feature was still working. Of course I can't tell that by just reading the log, so I mostly rely on the commit timestamps and my memory of when things were definitely fine.

(The --first-parent option makes git log show commits only on the current branch, not commits from other branches that have been merged. It's useful for this kind of case.)

Let's say I picked commit 7102f250 as a likely one where the feature was still working. Now I will do

git checkout 7102f250

and test this version. If I discover that the feature is not working, I will make a note that it was broken already on this version and go back in the logs until I find a working version.

After I have a working version, it's time to start bisecting:

git bisect start
git bisect bad master
git bisect good 7102f250

(If I discovered some earlier commit where the feature was already broken, I would place that commit's identifier in place of master on the git bisect bad master line.)

Evaluating Commits

After executing all these statements, there will be output something like

Bisecting: 105 revisions left to test after this (roughly 7 steps)
[3f056217e73576fc6ba8f0cb8c277627fd154e43] <Imagine useful commit message here>

This means that git has determined there are 105 possible versions that could have introduced the bug. It estimates needing 7 tries to determine the actual version (the search is binary-search-like so it needs roughly the binary logarithm of the number of possibilities). And it has checked out a suitable version to try.

Now that git has checked out a version to test, you need to do exactly that. Test whether this now-checked-out version has the bug or not. If it does, tell git

git bisect bad

If it works correctly, tell git

git bisect good

And if for some reason you can't determine whether the bug is present or not, you can do

git bisect skip

For instance, if the version has another bug that prevents you from getting to where your bug of interest is, you would need to use skip.

Continuing with this process, testing each version that git checks out and marking it as good or bad, should eventually end up with something like

7e3d2394e73576fc6ba8f0cb8c277627fd154e43 is the first bad commit
commit 7e3d2394e73576fc6ba8f0cb8c277627fd154e43
Author: Peter <peter@company.example>
Date:   Wed Oct 17 14:02:01 2018 +0100

    Make sure baffler is aeronated...

Now we found the same commit that we found earlier with git blame. The bisect process is more time-consuming, requiring testing of each version, but it also requires less manual work, since git is taking care of checking out versions to test.

Finally, after finding where the problem was introduced, you should make a note of the commit identifier and run

git bisect reset

to stop the bisection process.

More `git bisect`

git bisect is not limited to finding regressions. It is a general tool to locate any change that happened because of a single commit. The "testing" phase doesn't need to involve running the program if the good/bad distinction can be made by inspecting the code. Also, when not specifically finding regressions, you can use the aliases old/new for good/bad, so that you don't get confused by the labels if, say, you're instead tracking down when a specific bug was fixed.

git blame is useful for finding when a specific line was added. But it cannot be used to find when a specific line was deleted. It occasionally happens that I notice a crucial line of code is no longer present, so I want to know when and why it was deleted. git bisect is a good tool for this: A version containing the line is good and a version not containing it is bad. With this, bisecting will find the commit that deleted the line.

There are even more features. You can record the bisect session with log and get to the same state later with replay. Editing the log in between lets you change your bad/good decisions, for instance if you made a mistake. It is also possible to use a script for the testing part with run, which makes the whole bisection process completely automatic. I rarely find a use for this, since for me the testing is typically unique to the issue, and trying to script it would often be more work. But in some cases, e.g., failing automated tests, it can be useful.

Bisect Stories

It's never as simple as in a constructed example specifically made to demonstrate how bisect works. In one of my projects, the result was often very unstable and didn't seem to find the actual problematic commit. Turned out, the compilation process didn't properly recompile everything that was needed after bisect checked out another commit. That's when I learned to run git clean -dxf between bisect steps to make sure I had a clean tree. Usually it's enough to do a regular clean build and installation, though.

Other times, it's happened that I haven't been able to find a working version, no matter how far back I go. That's usually caused by a toolchain update on my computer, or sometimes a dependency update that doesn't get rolled back when checking out earlier code. This can cause serious self-doubt when you first encounter it, but after a few times, you start recognizing it. It helps immensely if the dependency management system uses lock files, or if you specify precise versions for them (some teams ensure this by committing also the dependencies), so that the historical system behavior remains reproducible.

git bisect skip is there when some commit has other problems that make it not testable. But if you end up skipping many commits, git bisect may in the end just tell you "The problem is in one of these 10 commits". One common situation where this can happen is when a feature branch has been rebased and the main branch has changed something that new functionality on the feature branch depended on, making all the older commits on the feature branch unusable.

But after the obvious potential problems are dealt with, I find I can trust git bisect fully. One case that has stuck with me was when bisect pointed to a commit that seemed to have nothing to do with the bug. I did rerun the bisection but got the same result. After accepting that, yes, this was the commit that introduced the bug, I dug deeper and discovered that the change had in fact exposed a race condition that already existed in the code. This might have blown up in our face at any time. Being able to focus on just a small change discovered with bisect greatly reduced the time to fix that bug, and perhaps we never would have figured it out without that focus.