Using git bisect to spot borked yarn.lock update

Michal Bryxí ・3 min read

The scenario

While we were moving all our frontend apps into a monorepo backed by yarn workspaces, we ran into a somewhat unexpected problem: misbehaving code in one of the deep dependencies.

This happened because a monorepo has only one top-level yarn.lock. So when moving the code from application-A to the monorepo, we had to drop the application-A/yarn.lock file and run yarn to get all the application-A dependencies written into monorepo/yarn.lock.

If you have ever worked on an even slightly bigger project using yarn, you will know that this can easily mean tens of thousands of changed lines.

In theory, this should be fine: semver should guarantee that nothing breaks as long as we pin packages properly. In theory. In practice, it broke some tests and our CI went red.

And now the £10000 question is: How do you find out which dependency is the problematic one? One of my coworkers had a great idea to hunt for the flaw in the application-A repo instead of the monorepo so that we can see the diff of the packages that changed.

The fix:

1) Go to application-A repo
2) Remove yarn.lock
3) Run yarn
4) Run the following shell one-liner:

until git diff --quiet; do echo "yq" | git add -p; git commit -m 'hunk'; done

Let's break this down:

  • until - Like a while loop, but with an inverted condition: it keeps looping until the command succeeds
  • git diff --quiet - Suppresses all output and implies --exit-code, which makes the command exit with 1 if there are differences and 0 if there are none
  • echo "yq" - Sends the letters y and q to the next command
  • git add -p - Starts git staging in interactive patch mode. The letters from the previous command answer its prompts: y = stage this (one, first) hunk; q = quit; do not stage this hunk or any of the remaining ones
  • git commit -m - Commits the hunk we just staged

And now use git bisect to do all the hard work!

If you don't know git bisect, the tl;dr is: mark one commit as "good", one commit as "bad", and create a script that runs a test to determine whether a given commit is "good" or "bad". Git will then run a binary search to find the first "bad" commit for you automatically.

So let's create the test script named check.sh:

#!/bin/bash

# Install current dependencies.
yarn 

# Run your test that will tell whether the commit is "good" or "bad".
# The example here is running EmberJS test suite.
ember test -f 'failing test' 

# Capture the exit code of our test.
e=$? 

# Make sure yarn did not pollute working copy.
git checkout yarn.lock 

# Exit the script with the exit code of our test.
exit $e 
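One detail worth knowing: git bisect run reads more from the exit code than just pass or fail. 0 means "good", 125 means "skip this commit", any other code from 1 to 127 means "bad", and anything else aborts the bisect. A hedged variant of check.sh that uses this could look like the sketch below. The names install_deps, run_failing_test and bisect_check are made up for illustration; the first two simply wrap the yarn and ember commands from the script above.

```shell
#!/bin/bash
# Hypothetical wrappers around the commands used in check.sh.
install_deps() { yarn; }
run_failing_test() { ember test -f 'failing test'; }

# Map the outcome to exit codes that git bisect run understands.
bisect_check() {
  local status=0
  if install_deps; then
    # Report any test failure as 1, since codes above 127 abort the bisect.
    run_failing_test || status=1
  else
    # 125 asks git bisect to skip a commit that cannot be tested.
    status=125
  fi
  # Make sure the install step did not pollute the working copy.
  git checkout --quiet -- yarn.lock 2>/dev/null || true
  return "$status"
}
```

To use it as a check script, call bisect_check on the last line of the file so that its return code becomes the exit code of the script.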

And now the cool part:

1) Run git bisect start
2) Check out the latest commit in your branch and run git bisect bad
3) Check out the last commit before your series of "hunk" commits and run git bisect good
4) Execute git bisect run ./check.sh
5) Make yourself a coffee ☕️
6) When the script is done, look at the first "bad" commit and write the old version of the offending package into your monorepo/package.json resolutions section
7) Victory dance 🕺🏻
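The bisect steps above can also be scripted. The function below is a minimal sketch under a few assumptions: the name find_first_bad_commit and its two parameters are made up for illustration, HEAD is taken to be the "bad" commit, and the caller passes in the "good" commit and the path to the check script.

```shell
#!/bin/bash
# Sketch: bisect between a known-good ref and HEAD, print the first bad commit.
# find_first_bad_commit, good_ref and check_script are hypothetical names.
find_first_bad_commit() {
  local good_ref=$1 check_script=$2 first_bad
  # "git bisect start <bad> <good>" marks both ends in one step.
  git bisect start HEAD "$good_ref" >&2 || return 1
  # Run the check on each candidate commit; progress goes to stderr.
  git bisect run "$check_script" >&2
  # When the run finishes, refs/bisect/bad points at the first bad commit.
  first_bad=$(git rev-parse refs/bisect/bad)
  # Go back to the commit we started from.
  git bisect reset >&2
  echo "$first_bad"
}
```

Called from the repository root as find_first_bad_commit followed by the last commit before the "hunk" series and ./check.sh, it prints the hash of the first "bad" commit, which you can then inspect with git show.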

[Image: Git bisect example in Fork App]

The resolutions section in our case was:

  // monorepo/package.json
  ...
  "resolutions": {
    "fake-xml-http-request": "2.0.0"
  }
  ...

Conclusion

This gave us a "pixel-perfect" pointer to the problematic part in a huge diff of one file. I believe this technique can be leveraged in other scenarios as well.

As long as you can write a test script for git bisect, it should work for you.

The trick of dividing one big diff into multiple commits is the "new" part here. And I think the one-liner above can work in many different situations.
