In this week’s post join me as I stop my Monorepo from being greedy and instruct it to lint, test and build only what’s required using Yarn workspaces and some black magic spells.
A Monorepo allows you to manage multiple packages on a single repo.
When you manage multiple packages on the same repo you need to have some tools to execute scripts across all the packages for good DX and for solid pipelines.
For instance, let’s take the “test” command. Obviously each package should have the instructions for the test command in its package.json, but you would like to have the means to trigger the test command for all the packages under the Monorepo in one call. Another example is the build command, which builds every package the Monorepo has.
Luckily for us we don’t need to invent anything in order to incorporate this behavior into our Monorepo. If we take the 3 main tools out there to help us manage such a thing we will see that they all support executing commands across all packages -
Lerna has the lerna run command, NX has the nx run-many and Yarn has the yarn workspaces foreach.
In another post I wrote, titled “Rethinking the “One Ring To Rule Them all” Monorepo manager”, I dealt with the issue of Monorepo’s tools and stated that I find it better for each tool to be responsible for its domain, avoiding the “all in one” approach.
Because of that reason, the tool I selected to manage my workspaces is Yarn, so Yarn is basically responsible for running commands across all packages. This is the reason I will be focusing on Yarn in this post.
In my Pedalboard Monorepo there are several commands which get executed across all packages. Here is my root package.json file:
"scripts": {
    "test": "yarn workspaces foreach -p run test",
    "coverage": "yarn test --coverage",
    "lint": "yarn workspaces foreach -p run lint",
    "build": "yarn workspaces foreach -ptv run build",
    "publish:lerna": "lerna publish --yes --no-verify-access",
    "coverage:combined": "pedalboard-scripts aggregatePackagesCoverage && nyc report --reporter lcov"
},
(you can read more about the different params in the post mentioned above)
And I use these scripts in my GitHub Actions to test, lint and build the packages before publishing them. Here is a part of the .yml file responsible for that:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - uses: actions/setup-node@v2
        with:
          node-version: 16
      - run: yarn
      - run: yarn lint
      - run: yarn test
      - run: yarn build
Lint, Build and Test are commands which execute across all the packages, and here is where it may present a problem…
Running these scripts for all the packages is nice and dandy for a small Monorepo, but when you’re dealing with a big one, which has, let’s say, many components in its “components” package that each have a test, or a lot of packages that each need to be built, the pipeline becomes v..e….r……y slow.
To put it in simple words, introducing a change to the “components” package will now run the linters for all the packages, the tests for all the packages, and then build all the packages. Paying this time for a mere change to a component is what they call “bad dev velocity”.
So what do we do?
Since
For those of you who are using Lerna there is this nice filter option parameter to the lerna run command called “since”, so the command could look something like this:
lerna run test --since main
(This will tell Lerna to run the test command only on packages which changed compared to the main branch).
I’m pretty sure NX has the same, but I’m not interested in it since, as mentioned before, my workspace manager is Yarn. Does Yarn have the equivalent?
Indeed it does, and surprisingly (or not) it has the same name: “since”. To quote from the docs:
”If --since is set, Yarn will only run the command on workspaces that have been modified since the specified ref. By default Yarn will use the refs specified by the changesetBaseRefs configuration option.”
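To make “modified since the specified ref” a bit more concrete, here is a rough shell sketch of the idea. It is not how Yarn implements --since; it only maps a list of changed files to the package directories they belong to. The changed_packages helper and the packages/<name> layout are assumptions made up for this illustration.

```shell
#!/bin/sh
# Conceptual sketch only: this is NOT Yarn's implementation of --since.
# Assumes a hypothetical layout where every workspace lives at packages/<name>/.

# Reads changed file paths on stdin and prints each touched package directory once.
changed_packages() {
  awk -F/ '$1 == "packages" && NF > 2 { print $1 "/" $2 }' | sort -u
}

# Self-contained demo with a hand-written list of changed files.
# README.md is ignored because it does not belong to any package.
printf '%s\n' \
  'packages/components/src/Button.js' \
  'packages/components/src/Button.test.js' \
  'packages/hooks/index.js' \
  'README.md' | changed_packages
# prints:
#   packages/components
#   packages/hooks
```

In a real checkout the list of files would come from git, for example git diff --name-only master...HEAD piped into the helper.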
Let’s understand the changesetBaseRefs configuration first. If defined, Yarn will “look” at those settings to know what to compare against, but as stated here, if you don’t have this configuration set, the default refs include master, and I’m cool with that.
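If your base branch is not covered by the defaults, the setting can be spelled out explicitly. Below is a minimal sketch of what that could look like. The main and origin/main values are just an assumption for the example, and the file is written to a scratch directory so the snippet is harmless to run; in a real project the key belongs in the existing .yarnrc.yml at the repo root.

```shell
#!/bin/sh
# Sketch: pin the refs that --since compares against when no ref is passed.
# "main" and "origin/main" are example values, not something this repo uses.
demo_dir="$(mktemp -d)"

cat > "$demo_dir/.yarnrc.yml" <<'EOF'
changesetBaseRefs:
  - main
  - origin/main
EOF

# Show the resulting file.
cat "$demo_dir/.yarnrc.yml"

# The same setting can also be applied from the project root with Yarn's CLI:
#   yarn config set changesetBaseRefs --json '["main", "origin/main"]'
```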
Let’s try it on our test script. I haven’t changed anything in my packages, so basically nothing should run if I add --since to my command like so:
"test": "yarn workspaces foreach -p --since run test",
Running yarn test from the project’s root and… Nope. All my tests seem to be running. What gives?
Digging into more threads about this issue, I learned that in order to make this work as expected some black magic should be cast, mainly regarding which params should exist on the command and in what order.
To help whoever comes across this issue, here is the right way to apply the “since” param:
"test": "yarn workspaces foreach --since -pR run test",
(The “R” is for finding packages via dependencies/devDependencies instead of using the workspaces field. The “p” is for parallel. The “since” param should come first).
Running the yarn test command now will not run any test. But if I change one of the files in the components package and run it again… yes! Only that package's tests run.
That’s great, so I would like to add it to both the lint and the build scripts:
"scripts": {
    "test": "yarn workspaces foreach --since -pR run test",
    "lint": "yarn workspaces foreach --since -pR run lint",
    "build": "yarn workspaces foreach --since -ptvR run build",
    "publish:lerna": "lerna publish --yes --no-verify-access",
    "coverage": "yarn test --coverage",
    "coverage:combined": "pedalboard-scripts aggregatePackagesCoverage && nyc report --reporter lcov"
},
Now my GitHub Actions will run only on the relevant packages, reducing the time it takes to publish packages. When I pushed my project’s root package.json file changes, the 3 steps mentioned above did not take any time, since there was no need for them to run.
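One thing worth keeping in mind on CI is which ref the comparison is made against. The fetch-depth: 0 in the workflow fetches the full history, so the base branch is available to compare with. Here is a hedged sketch of picking that ref in a script. The base_ref helper is hypothetical, GITHUB_BASE_REF is the variable GitHub Actions sets on pull request events, and passing the ref as --since=<ref> assumes your Yarn version accepts that form.

```shell
#!/bin/sh
# Hypothetical helper: choose the ref that --since should compare against.
# GITHUB_BASE_REF holds the target branch name on pull_request events;
# outside of that we fall back to master, the default this post relies on.
base_ref() {
  if [ -n "${GITHUB_BASE_REF:-}" ]; then
    printf 'origin/%s\n' "$GITHUB_BASE_REF"
  else
    printf 'master\n'
  fi
}

# Printed rather than executed, so the sketch runs even without Yarn installed.
echo "yarn workspaces foreach --since=$(base_ref) -pR run test"
```

This is only one way to go about it; relying on changesetBaseRefs and a plain --since, as I do above, keeps the scripts simpler.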
Wrapping Up
It is important to insist on a good dev velocity while maintaining solid pipelines: avoid running processes that shouldn’t run given the set of changes you’ve introduced. I’m not saying that this should be the mindset from the get-go, but once your Monorepo’s pipelines are stable it is probably a good time to check whether they’re efficient or not.
Thankfully the tools out there, like Lerna, NX and Yarn, support most of these needs and provide APIs to fine-tune your pipelines.
Hey! If you liked what you've just read check out @mattibarzeev on Twitter 🍻
Photo by SELİM ARDA ERYILMAZ on Unsplash