Toward the end of 2020, npm released v7 of its package manager. The most noticeable change was support for workspaces, which meant you could finally manage a monorepo with Node’s built-in package manager, without extra steps or tools.
At Hotjar, a platform that gives people insights into user and customer behavior, we decided to switch back from yarn to npm to simplify our tooling. Now that npm gave our team the ability to manage a monorepo, it would become our package manager again.
But, it didn’t take long for us to realize that it probably wasn’t the best decision.
What are npm workspaces?
Npm workspaces provide the tooling you need to manage your dependencies in a monorepo. The setup is pretty simple with a new property in your package.json:
// package.json
{
  "name": "workspace-example",
  "workspaces": [
    "./packages/*"
  ]
}
It automates the installation and linking process, and there’s a special --workspace flag to run commands in the context of monorepo projects.
# example: run the tests for the workspace "foo"
npm run test --workspace foo
When Hotjar started using npm workspaces, we had fewer than 20 workspaces in our monorepo. Today, we have more than 50!
So, we realized the hard way that npm wasn’t a scalable monorepo tool. Our CI started taking way too long (to the point of exceeding our maximum average runtime), the node_modules size got out of hand, and a simple npm install became a slow daily chore. We weren’t actually adding that many dependencies to our monorepo, but working with the workspaces and installing their dependencies became tiring and tedious.
For reference: our node_modules took up 3.2GB of disk space, and installing the dependencies on CI took around 4 minutes. This wasn’t desirable at all, so we started digging into how the npm CLI works.
The disadvantages of npm
Npm docs are great in general. However, when it comes to understanding how the CLI works under the hood, you’re going in blind. After many hours of research, we concluded that npm workspaces are a great approach for simple monorepos, but not for massive repos like ours at Hotjar. Here’s why:
- Npm holds a copy of a dependency every time it’s installed: if we have 100 projects using a dependency, we’ll have 100 copies of that dependency saved on disk. It doesn’t work exactly like that in a monorepo, since npm can dedupe dependencies with the same version, but it still duplicates sub-dependencies that appear with different versions across the dependency tree.
- Npm has blocking stages of installation: for example, it can’t parallelize the installation, so each dependency follows the Resolving → Fetching → Writing flow, and each dependency must finish one stage before the next can start.
- Npm doesn’t have a proper mechanism to recognize that a dependency is another workspace from your monorepo: this means that every time you install your dependencies, npm requests that dependency from the registry, and only tries to link it locally when it isn’t found there.
- Npm hoists dependencies you install to the root of the modules directory: as a result, the source code has access to dependencies that are not added as dependencies to the project. This is known as ‘Phantom Dependencies’ and might break your dependencies or code in several unexpected ways.
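To make the last point concrete, here is a minimal sketch of what a phantom dependency looks like. The helper name, the manifest, and the list of imports are all hypothetical and are not part of npm or pnpm; the idea is only to compare what a workspace declares with what its code actually imports.

```javascript
// Hypothetical helper (illustrative only, not an npm or pnpm API).
// Returns the packages that the source code imports but the manifest never declares.
function findPhantomDependencies(manifest, importedPackages) {
  const declared = new Set([
    ...Object.keys(manifest.dependencies ?? {}),
    ...Object.keys(manifest.devDependencies ?? {}),
    ...Object.keys(manifest.peerDependencies ?? {}),
  ]);
  return [...new Set(importedPackages)]
    .filter((name) => !declared.has(name))
    .sort();
}

// Assumed example: the workspace "foo" declares only express, but its code
// also imports debug. With hoisting, that import can still resolve because
// debug ends up in the root node_modules as a sub-dependency of express.
const fooManifest = {
  name: "foo",
  dependencies: { express: "^4.18.0" },
};
const fooImports = ["express", "debug"];

console.log(findPhantomDependencies(fooManifest, fooImports)); // [ 'debug' ]
```

In this sketch the import of debug works today, but it can break as soon as express changes its own dependencies or the hoisting layout changes.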
A performant npm
The way npm works makes it complicated (if not impossible) to scale properly for workspaces, which made our dependency installation slow, unmanageable, and unsafe. That’s why Hotjar needed a faster, lighter, and safer alternative to npm that also scaled properly for our monorepo, so that adding more workspaces didn’t degrade the whole installation process.
‘Performant npm’ (Pnpm), which is sponsored by Vercel and Prisma and is making a name for itself within the Node ecosystem, is an alternative to npm that caught Hotjar’s special attention. With its promise of being ‘fast, efficient, strict, and supporting monorepos’, it looked like the perfect package manager candidate.
You can see a full benchmark comparison of JavaScript package managers here.
Switching to fast monorepos
After running some tests, we estimated that the improvement in installation speed and disk space usage would be massive. It was time to leave npm behind again, so we got down to work. We spent a few weeks working on this migration, and then finally made the switch to pnpm.
The setup is quite simple too:
# pnpm-workspace.yaml
packages:
- 'packages/*'
The whole installation and linking process is intelligently automated, and you can run any command in the context of a workspace with the --filter flag:
# example: run the tests for the workspace "foo"
pnpm --filter foo test
Pnpm also provides advanced features like the workspace protocol and partial installs for your workspaces, giving you granular control over your monorepo.
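As a rough illustration of the workspace protocol: in pnpm, a dependency whose version starts with `workspace:` (for example `workspace:*`) has to resolve to a package inside the monorepo rather than to the registry. The sketch below uses a hypothetical helper and made-up package names to split a manifest’s dependencies into local and registry ones; it is not how pnpm is implemented.

```javascript
// Hypothetical helper (illustrative only, not a pnpm API).
// Splits dependencies by whether their version uses the workspace: protocol.
function splitDependencies(manifest) {
  const local = [];
  const registry = [];
  for (const [name, range] of Object.entries(manifest.dependencies ?? {})) {
    (range.startsWith("workspace:") ? local : registry).push(name);
  }
  return { local, registry };
}

// Assumed example manifest for a workspace called "bar".
const barManifest = {
  name: "bar",
  dependencies: {
    foo: "workspace:*", // sibling workspace, linked from the monorepo
    react: "^18.2.0", // regular dependency, fetched from the registry
  },
};

console.log(splitDependencies(barManifest));
// { local: [ 'foo' ], registry: [ 'react' ] }
```

Declaring sibling workspaces this way makes the intent explicit in package.json, so a local package is never confused with a registry package of the same name.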
However, the best improvement came in the form of performance. Let’s take a look at what changed after Hotjar made the switch:
Disk space comparison
Metric | NPM (before) | PNPM (after) | Δ (reduction)
---|---|---|---
Docker image size | 4.08GB | 2.54GB | 37% ⬇️
node_modules size | 3.2GB | 1.3GB | 59% ⬇️
Installation time comparison (in local dev env)
Scenario | NPM (before) | PNPM (after) | Δ (reduction)
---|---|---|---
without cache, without node_modules | 4min 50sec | 1min 30sec | 69% ⬇️
with cache, without node_modules | 5min | 48sec | 84% ⬇️
with cache and node_modules | 1min | 8sec | 86% ⬇️
The result? We eventually got our CI back into its average runtime.
We couldn’t be happier with pnpm and can’t recommend it enough as we continue to use it: it makes your monorepo scalable and maintainable without adding any complexity. Even if you have a small or simple monorepo, you can take advantage of pnpm’s installation speed improvements. In our next post, we’ll talk about Phantom Dependencies and how they complicated our migration to pnpm.