Yarn doesn't handle the underlying NPM infrastructure with elegance — and it might never do so.
I’ve spent the last few days wrangling with Yarn errors. Our builds were failing in weird, seemingly random ways, and all signs pointed at Yarn. The TL;DR of the investigation is this: Yarn doesn’t handle upstream NPM infrastructure errors well.
But the problem is not (only) that Yarn’s code is buggy. The bigger problem is the disconnect between Yarn (the client) and NPM’s infrastructure. The errors are significant enough to start conversations about moving to the NPM client. But moving back to NPM raises a bigger question about the viability of third-party package managers that rely on NPM infrastructure.
The Problem With Yarn
The problems I’ve had to debug recently relate to the fact that Yarn wraps the NPM infrastructure. Yarn doesn’t host any of its own packages, and therefore does not have much say in how these packages are served, what errors are thrown, etc.
The known NPM CDN failure scenarios are written into the Yarn client, but if the CDN fails in unpredictable ways (such as failed integrity checks, problems installing private packages, or even too many published versions), the Yarn client does not handle these cases appropriately. In the best case, steps like install fail outright. In the worst case, Yarn exits cleanly as if nothing ever happened.
So what were the issues we were facing?
False Positives On Install
Firstly, sometimes Yarn appears to hang midway through an install. And sometimes it actually is hanging. But, worryingly, sometimes Yarn will exit cleanly in the middle of the install step. In some scenarios the partial install happens to be enough for the build to work, and in others it is not, giving you false positives on builds.
These false positives have been happening throughout Yarn’s history. A quick Google search shows these types of issues being raised as far back as 2016, but they have been dismissed by Yarn maintainers as, for instance, trivial “internet issues”. They will presumably come and go based on NPM’s availability. Interestingly, though, the NPM status page reports do not correlate with the issues seen in Yarn.
Half Downloaded Files
Secondly, whilst the errors sometimes cause the Yarn client to exit as above, at other times the NPM infra fails in different ways, such as closing connections early. That leads to an error which points to an “unexpected end of file”.
Annoyingly, neither error directs your attention to the NPM CDN. Instead, both send you down a rabbit hole thinking the problem is on your end.
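One way to tell a half-downloaded file from a problem on your end is to hash the file yourself and compare it with the integrity value recorded for that package. The following is a minimal Python sketch, assuming the `sha512-<base64 digest>` style of string that yarn.lock stores in its integrity fields. `matches_integrity` is a hypothetical helper name, not part of Yarn or NPM.

```python
import base64
import hashlib


def matches_integrity(data, integrity):
    """Check bytes against an SRI-style string such as 'sha512-<base64 digest>'."""
    # Split "sha512-AbC..." into the algorithm name and the expected digest.
    algorithm, _, expected = integrity.partition("-")
    digest = hashlib.new(algorithm, data).digest()
    return base64.b64encode(digest).decode("ascii") == expected
```

A truncated tarball produces a different digest, so a failed comparison points at the download rather than at your project.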
The plight of Yarn
Right now you might be thinking: “Okay, so Yarn has some kinks, but so does all open source. Why not make a contribution and be done with it?”
But the problem goes deeper. And my concern extends more to the inherent relationship that exists between Yarn and NPM. Let me explain…
Yarn Dances to NPM’s Tune
We need to remind ourselves that Yarn is only a client wrapped around the NPM infrastructure. Because NPM holds all of the package infrastructure, Yarn is susceptible, now and in the future, to any upstream issues with NPM. It also means that Yarn will always lag behind NPM in adopting any client changes made necessary by CDN changes.
Yarn Is Being Ignored
To add to the difficulties that face the Yarn ecosystem, it doesn’t help that important players such as GitHub are choosing to prioritise the NPM client instead of the Yarn client.
Yarn 1.0 Is Being Deprecated
And lastly, on top of the CDN issues, Yarn 1.0 is being mostly left in the dark so that contributors can work on 2.0. But no amount of features in Yarn 2.0 is going to fix the disconnect between NPM and the Yarn client. For instance, look at the contribution graph of the current Yarn project.
And compare it to that of the Yarn 2.0 repo.
You see what I mean? The shift in attention only exacerbates the problem. Fixes don’t make it into the Yarn client as quickly or as readily, and those fixes might help to lessen the pain of the errors caused by NPM.
The Fix(es)
Whilst these issues are well out of your hands, or mine, there are a few things you can do to fix, or at least lessen, the pain you might be feeling.
Fix 1: Use latest node and NPM
The first thing to check is that you’re running the latest versions. Running the latest ensures that you pick up any additional error handling built into Yarn.
Fix 2: Validate your installs
Since installs sometimes fail midway through, you should manually ensure that your install has the packages you expect. Yarn has a built-in utility for this, which checks the current package.json against the node_modules. To run the command, run: yarn check --verify-tree
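As a rough illustration of what validating an install means, the sketch below compares the dependencies declared in package.json with what is actually present in node_modules. It is written in Python to stay independent of the package manager being checked. It is not how yarn check is implemented, and `missing_dependencies` is a hypothetical helper name. It only looks at direct dependencies and ignores versions.

```python
import json
from pathlib import Path


def missing_dependencies(project_dir):
    """Return declared dependencies that have no package folder in node_modules."""
    project = Path(project_dir)
    manifest = json.loads((project / "package.json").read_text(encoding="utf-8"))

    # Collect the names declared for runtime and for development.
    declared = {}
    for field in ("dependencies", "devDependencies"):
        declared.update(manifest.get(field, {}))

    missing = []
    for name in sorted(declared):
        # Scoped names such as "@scope/pkg" map to node_modules/@scope/pkg.
        installed_manifest = project / "node_modules" / name / "package.json"
        if not installed_manifest.is_file():
            missing.append(name)
    return missing
```

In a build script, a non-empty result could be treated as a failed install even when the install command itself exited with code 0.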
Fix 3: Hard Install
Another trick is to ensure you’re doing a full install by passing the --force flag to yarn install, which refetches all packages, even ones that were previously installed.
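Fixes 2 and 3 can be combined in a build script: install, validate, and only fall back to a full install when something looks wrong. Below is a minimal Python sketch of that flow. `install_with_fallback` is a hypothetical helper, and the commands are passed in as arguments rather than hard-coded, so nothing here depends on a particular Yarn version.

```python
import subprocess


def install_with_fallback(install_cmd, verify_cmd, force_cmd, cwd=None):
    """Install, verify, and fall back to a forced install if either step fails.

    Returns True when verification finally passes, False otherwise.
    """
    def ok(cmd):
        return subprocess.run(cmd, cwd=cwd).returncode == 0

    # A zero exit code from the install is not trusted on its own:
    # the verify step has to pass as well.
    if ok(install_cmd) and ok(verify_cmd):
        return True

    # The first attempt failed or left an incomplete tree: force a full install.
    return ok(force_cmd) and ok(verify_cmd)
```

Assuming the forced install is spelled `yarn install --force`, as in the Yarn 1 documentation, a call might look like `install_with_fallback(["yarn", "install"], ["yarn", "check", "--verify-tree"], ["yarn", "install", "--force"])`.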
Fix 4: Swap to NPM and NPM CI
And last but not least — should all your other efforts fail — you can swap to NPM. Swapping to NPM won’t fix any CDN flakiness, but it will likely lead to better error handling for edge case scenarios.
Working Around Yarn Limitations
And that’s it. I wanted to share with you some of the difficulties that we’ve been having with Yarn, the reasoning and the potential fixes. Sadly, though, it raises interesting questions about the future of third-party clients that work with NPM infrastructure.
It seems that, without some changes to the way the ecosystem works, third-party clients are doomed to have a second-rate experience. Maybe they can fight back with better features? We can’t predict the future, but hopefully you can at least fix your build system for now!
Lou is the editor of The Cloud Native Software Engineering Newsletter, a newsletter dedicated to making Cloud Software Engineering more accessible and easy to understand. Every 2 weeks you’ll get a digest of the best content for Cloud Native Software Engineers right in your inbox.
Top comments (15)
Howdy! I'm Yarn's maintainer 🙂
It's an interesting article - here are some insights I can provide. Keep in mind I'm a bit biased, of course!
First, consider that the Yarn v1 codebase was written in a quite different time. The team changed, new major features got released (such as workspaces), and even the way we write Javascript shifted! Lots of its internals would be designed differently if we had the chance... And that's precisely what's happening!
The v2 effort you reference is an acknowledgement that Yarn wouldn't have been able to continue forever in its current state. It represents the new, more stable and mature foundation which will give us the tools to maintain Yarn for the foreseeable future.
No amount of features is going to fix the design issues, that's true. Fortunately, we're working on both fronts! It's an incredible amount of work, but the results are showing. Writing features and fixing bugs has never been so easy.
I don't think that the registry is constraining us in any way. Something nice about open source is that prioritization happens by default: if something is important, someone will work on it. With the v2 plugin system this will become even more true, as users won't have to be synced with our releases to integrate third-party features into their projects.
I wouldn't say that Yarn is "ignored". We have good relations with many other projects and companies. I met with the GitHub folks a few times, and while we didn't discuss this particular issue I know they have an eye on Yarn.
I'm not too worried about this in particular. I think the key question will be how easy it will be for our open source contributors to work with us. By making the codebase more approachable, I'm certain we have a bright future ahead of us.
Hey Mael!
Thanks very much for your time and your insight. I knew writing this was a risk, since there was a lot that I did not know. But I felt it important to open up a conversation on the topic, and I'm glad that despite the points I raise you feel hopeful V2 will address a lot of these issues. It seems V1 served its purpose in showing that package management could be done better, and hopefully V2 will do the same all over again. I'm quite excited for the modules, and the concept of running Yarn as an API, too.
The majority of what I discuss is merely from digging around in the past few days. I was also conscious that I didn't want to downplay all the hard work that goes into open source, and how grateful I always am for the maintainers like yourself — so thank you!
I wanted to give you some feedback on this. I also have problems with yarn hanging in different situations, like running tests or removing or installing a package. The problem is I am very bad at package management, and I know it is the reason for at least half of my problems here. But I want to code, not to manage packages. Yet a lot of my precious free time (it is a hobby project) is wasted because of things like that. I really want to use this time to code. I decided to move back to npm. If things go wrong there, at least there is an error message I can use to try to find the problem.
I wanted to share this as feedback. If you can get some use cases or a 'user persona' from my story that you can build on, please go for it. I mean it, since there was a good reason I switched to yarn in the first place, and there is a great chance I will come back. Especially if your team keeps up the hard work and makes this a better experience for users.
Very informative, thanks for sharing.
You said in the beginning that:
But why wouldn't you consider this a buggy behaviour?
Personally I would say that's a bug. Even if npm returns a totally new error all of a sudden, yarn must have an implementation that allows it to fail gracefully, without leaving the user in the dark.
Ah — good spot Renato!
I think I meant to say "the problem is not only that yarn is buggy". The point I want to stress is less that yarn has bugs, which are temporary and can be fixed, and more that these problems have been going on for years and there is no clear sign that they'll go away with any amount of bug fixes.
I'll make a quick edit!
🙏🙏🙏🙏🙏
That makes sense! ;)
Great article, it sheds light on some parts of Yarn I did not know before. Very instructive.
I might say, managing relatively small projects, that Yarn has never failed me since I switched to using it. I like the speed of installation and upgrade. Not having to add "run" before each of your scripts only saves me a few milliseconds, but that can make a difference when I do several installations in a day.
That Github issue seems more like a Yarn problem than a Github problem. If you're going to be an NPM client, you better make sure you accept what NPM accepts. It would have been a quick "fix your client" from me too, and I can't really see that being an example of ignoring anything.
For me, using CRA I had some issues getting up and running. Using yarn resolved my issues and I've been using it without any problem since.
Glad to hear it! We're running yarn in some production build systems that run hundreds of times a day, and some pipelines currently run yarn install ~100 times a build. We probably go through thousands of installations a day, if not more. I guess at larger scale, problems surface faster!
If it works for you, and you like it, go for it! I'm not bashing Yarn or its capabilities, more highlighting some difficulties we face as an ecosystem! :)
Oh I know you're not bashing it! Yes I'm using it on a MUCH smaller scale.
I simply can't live without yarn's offline mirror capability 😢
Your title talks about "third party NPM clients" but in the end you just talk about Yarn?
I mean you take one example (not sure it's a good one) and make it a rule?
I'm simply trying to highlight some difficulties in the infrastructure that make it difficult for third parties to compete. There aren't many others to speak of (except github.com/pnpm/pnpm), but that's the premise of the article.
Hi Lou, did you start using Yarn v2 already?