
Yarn and the dark future of third party NPM clients

Lou, Cloud Native Software Engineer. Originally published at thedevcoach.co.uk ・5 min read

Yarn doesn't handle the underlying NPM infrastructure with elegance — and it might never do so.

I’ve spent the last few days wrangling with Yarn errors. Our builds were failing in some weird and random ways, and all signs pointed at Yarn. I can give you the TL;DR of the investigation, and it’s this: Yarn doesn’t handle upstream NPM infrastructure errors gracefully.

But the problem is not (only) that Yarn code is buggy — the problem is more the disconnect which exists between Yarn (the client) and NPM’s infrastructure. The errors caused are significant enough to start conversations for moving to the NPM client. But moving back to NPM raises a bigger question about the viability of third-party package managers that rely on NPM infrastructure.

The Problem With Yarn

The problems I’ve had to debug recently relate to the fact that Yarn wraps the NPM infrastructure. Yarn doesn’t host any of its own packages, and therefore does not have a lot of say in how these packages are served, what errors they throw, etc.

Each of the expected NPM CDN failure scenarios is written into the Yarn client, but if the CDN fails in unpredictable ways (such as failed integrity checks, installation of private packages or even too many published versions) the Yarn client is not appropriately handling these cases. In the best case, things like install steps fail; in the worst, Yarn exits cleanly as if nothing ever happened.

So what were the issues we were facing?

False Positives On Install

Firstly, sometimes Yarn appears to hang midway through an install. And sometimes it is (actually) hanging. But, worryingly, sometimes Yarn will exit cleanly mid-install. In some cases the partial install will happen to work, and in others it won’t, giving you false positives on builds.

These false positives have been happening throughout Yarn’s history. A quick Google shows these types of issues being raised right back to 2016, but they have been dismissed by Yarn maintainers as trivial “internet issues”, for instance. They will presumably come and go based on NPM’s availability. Interestingly enough, though, the NPM status page reports do not correlate with the issues seen in Yarn.

Half Downloaded Files

Secondly, whilst sometimes the errors cause the Yarn client to exit as above, sometimes the NPM infra fails in different ways, such as closing connections early. This leads to errors that point to an “unexpected end of file”.

Annoyingly, neither error directs your attention to the NPM CDN; instead they send you down a rabbit hole thinking the problem is on your end.

The plight of Yarn

Maybe right now you could be thinking: “Okay so Yarn has some kinks, but so does all open source — why not make a contribution and be done with it?”

But the problem goes deeper. And my concern extends more to the inherent relationship that exists between Yarn and NPM. Let me explain…

Yarn Dances to NPM’s Tune

We need to remind ourselves that Yarn is only a client wrapped around the NPM infrastructure. Because NPM holds all of the package infrastructure it makes Yarn susceptible now (and continually) to any upstream issues with NPM. Which means that Yarn will always lag behind NPM on adopting any necessary client changes that are based on CDN changes.

Yarn Is Being Ignored

To add to the difficulties facing the Yarn ecosystem, it doesn’t help that important players such as GitHub are choosing to prioritise the NPM client instead of the Yarn client.

Yarn 1.0 Is Being Deprecated

And lastly, on top of the CDN issues, Yarn 1.0 is being mostly left in the dark so that contributors can work on 2.0. But no amount of features in Yarn 2.0 is going to fix the disconnect between NPM and the Yarn client. For instance, look at the contribution graph of the current Yarn project (Yarn 1.0).

And compare to the Yarn 2.0 repo.

Yarn 2.0

You see what I mean? The shift in attention only exacerbates the problem. Fixes don’t land in the Yarn client as quickly or as readily, even though they might help to lessen the pain of the errors caused by NPM.

The Fix(es)

Whilst these issues are well out of your hands, or mine, there are a few things you can do to fix, or at least lessen, the pain you might be feeling.

Fix 1: Use latest node and NPM

The first thing to check is that you’re running the latest versions. Running the latest ensures that you pick up any additional error handling scenarios built into Yarn.

Node Versions
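One way to make this check routine is to print the tool versions at the start of every build, so the logs show what actually ran. This is only a minimal sketch: the function name print_tool_versions is made up for illustration, and it relies on nothing more than the standard --version flag of node, npm and yarn.

```shell
# Hypothetical helper: log the versions of the tools a build depends on.
print_tool_versions() {
  for tool in node npm yarn; do
    # `command -v` succeeds when the tool can be found by the shell.
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool $("$tool" --version)"
    else
      echo "$tool not found"
    fi
  done
}
```

Calling print_tool_versions as the first step of a pipeline makes it easier to spot a build agent that is lagging behind on an old release.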

Fix 2: Validate your installs

Since installs sometimes fail midway through, you should check that your install has the packages you expect. Yarn has a built-in utility for this, which checks the current package.json against node_modules. To use it, run: yarn check --verify-tree

Yarn Check
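In a build script, the check is most useful when it runs straight after the install and fails the build if the tree does not match. A minimal sketch, assuming Yarn v1 (where yarn check --verify-tree is available); the function name install_and_verify is hypothetical.

```shell
# Hypothetical helper: install, then verify node_modules against package.json.
# Assumes Yarn v1, which provides `yarn check --verify-tree`.
install_and_verify() {
  # Stop at the first failing step so a partial install cannot pass silently.
  yarn install || return 1
  yarn check --verify-tree || return 1
}
```

A pipeline step could then call install_and_verify || exit 1. This catches the case where the install exits cleanly but node_modules is incomplete, which is exactly the false positive described earlier.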

Fix 3: Hard Install

Another trick is to ensure you’re doing a full install by passing the --force flag to Yarn, which refetches all packages, even ones that were previously installed.

NPM install --force
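For reference, Yarn v1 documents a --force flag on yarn install that refetches every package, which appears to be the full install meant here. A small sketch under that assumption; the wrapper name force_install is made up for illustration.

```shell
# Hypothetical wrapper around a forced full install.
# Assumes Yarn v1, where `yarn install --force` refetches all packages,
# including ones that are already present in node_modules.
force_install() {
  yarn install --force
}
```

A forced install is slower than a normal one, so it probably makes more sense as a recovery step after a failed build than as the default.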

Fix 4: Swap to NPM and NPM CI

And last but not least — should all your other efforts fail — you can swap to NPM. Swapping to NPM won’t fix any CDN flakiness, but it will likely lead to better error handling for edge case scenarios.

NPM CI install
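One practical detail when swapping: npm ci only works when a package-lock.json (or npm-shrinkwrap.json) already exists, and a Yarn project will typically only have a yarn.lock. The sketch below handles the package-lock.json case; the function name npm_clean_install is hypothetical, and falling back to npm install to generate the lockfile is just one possible approach.

```shell
# Hypothetical helper for moving a project from Yarn to the npm client.
npm_clean_install() {
  if [ -f package-lock.json ]; then
    # `npm ci` installs exactly what the lockfile specifies and removes
    # any existing node_modules first.
    npm ci
  else
    # No npm lockfile yet (for example, the project only has yarn.lock):
    # a plain install generates package-lock.json for later `npm ci` runs.
    npm install
  fi
}
```

Once the generated package-lock.json is committed, every later build takes the npm ci path.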

Working Around Yarn Limitations

And that’s it. I wanted to share with you some of the difficulties that we’ve been having with Yarn, the reasoning and the potential fixes. Sadly though it raises interesting questions about the future of third-party clients that work with NPM infrastructure.

It seems that, without some changes to the way the ecosystem works, third-party clients are doomed to have a second-rate experience. Maybe they can fight back with better features? We can’t predict the future, but hopefully you can at least fix your build system for now!


Lou is the editor of The Cloud Native Software Engineering Newsletter, a newsletter dedicated to making Cloud Software Engineering more accessible and easy to understand. Every 2 weeks you’ll get a digest of the best content for Cloud Native Software Engineers right in your inbox.


@loujaybee

I write stuff on Cloud Engineering @ thedevcoach.co.uk. I made splitoo.com. Also likes: 🏋️‍♂️🎸🚴🏻‍♂️🏍

Discussion


Howdy! I'm Yarn's maintainer 🙂

It's an interesting article - here are some insights I can provide. Keep in mind I'm a bit biased, of course!

But the problem is not (only) that Yarn code is buggy — the problem is more the disconnect which exists between Yarn (the client) and NPM’s infrastructure

First, consider that the Yarn v1 codebase was written in a quite different time. The team changed, new major features got released (such as workspaces), and even the way we write Javascript shifted! Lots of its internals would be designed differently if we had the chance... And that's precisely what's happening!

But no amount of features in yarn 2.0 is going to fix the disconnect between NPM and the Yarn client

The v2 effort you reference is an acknowledgement that Yarn wouldn't have been able to continue forever in its current state. It represents the new, more stable and mature foundation which will give us the tools to maintain Yarn for the foreseeable future.

No amount of features is going to fix the design issues, that's true. Fortunately, we're working on both fronts! It's an incredible amount of work, but the results are showing. Writing features and fixing bugs has never been so easy.

Which means that Yarn will always lag behind NPM on adopting any necessary client changes that are based on CDN changes.

I don't think that the registry is constraining us in any way. Something nice about open source is that prioritization happens by default: if something is important, someone will work on it. With the v2 plugin system this will become even more true, as users won't have to be synced with our releases to integrate third-party features into their projects.

it doesn’t help that important players such as Github are choosing to prioritise the NPM client instead of the Yarn client.

I wouldn't say that Yarn is "ignored". We have good relations with many other projects and companies. I met with the GitHub folks a few times, and while we didn't discuss this particular issue I know they have an eye on Yarn.

It seems without some changes to the way the ecosystem works that third-party clients are doomed to have a second-rate

I'm not too worried about this in particular. I think the key question will be how easy it will be for our open source contributors to work with us. By making the codebase more approachable, I'm certain we have a bright future ahead of us.

 

Hey Mael!

Thanks very much for your time and your insight. I knew writing this was a risk, since there was a lot that I did not know. But I felt it important to open up a conversation on the topic, and I'm glad that despite the points I raise you feel hopeful V2 will address a lot of these issues. It seems V1 served its purpose in showing that package management could be done better, and hopefully V2 will do the same all over again. I'm quite excited for the modules, and the concept of running Yarn as an API, too.

The majority of what I discuss is merely from digging around in the past few days. I was also conscious that I didn't want to downplay all the hard work that goes into open source, and how grateful I always am for the maintainers like yourself — so thank you!

 

I wanted to give you some feedback on this. I also have problems with yarn hanging in different cases, like running tests, removing or installing a package etc. The problem is I am very bad at package management, and I know it is the reason for at least half of my problems here. But I want to code, not to manage packages. Yet a lot of my precious free time (it is a hobby project) is wasted because of things like that. I really want to use this time to code. I decided to move back to npm. If things go wrong there, at least there is an error message I can use to try to find the problem.

I wanted to share this as feedback. If you can get some use cases or a 'user persona' from my story that you can build on, please go for it. I mean it, since there was a good reason I switched to yarn in the first place, and there is a great chance I will come back. Especially if your team keeps up the hard work and makes this a better experience for users.

 

Very informative, thanks for sharing.

You said in the beginning that:

"the problem is not that Yarn code is buggy"

But why wouldn't you consider this a buggy behaviour?

"if the CDN fails in unpredictable ways (...) the Yarn client is not appropriately handling these cases".

Personally I would say that's a bug. Even if npm returns a totally new error all of a sudden, yarn must have an implementation that allows it to fail gracefully, without leaving the user in the dark.

 

Ah — good spot Renato!

I think I mean to say "the problem is not only that yarn is buggy". The point I want to stress here is less that yarn has bugs, which are temporary and can be fixed, and more that these problems have been going on for years and there is no clear sign that they'll go away, whatever the number of bug fixes.

I'll make a quick edit!

🙏🙏🙏🙏🙏

 
 

Great article, it sheds light on some parts of Yarn I did not know before, very instructive.

I might say, managing relatively small projects, that Yarn has never failed me since I switched to using it. I like the speed of installation and upgrade. Not having to type "run" before each of my scripts only saves me a few milliseconds, but that can make a difference if I do several installations in a day.

 

That Github issue seems more like a Yarn problem than a Github problem. If you're going to be an NPM client, you better make sure you accept what NPM accepts. It would have been a quick "fix your client" from me too, and I can't really see that being an example of ignoring anything.

 

For me, using CRA I had some issues getting up and running. Using yarn resolved my issues and I've been using it without any problem since.

 

Glad to hear it! We're running yarn in some production build systems that are running hundreds of times a day, and some pipelines currently run yarn install ~100 times a build. We probably go through thousands of installations a day, if not more. I guess at a larger scale, problems surface faster!

If it works for you, and you like it, go for it! I'm not bashing Yarn or its capabilities, more highlighting some difficulties we face as an ecosystem! :)

 

Oh I know you're not bashing it! Yes I'm using it on a MUCH smaller scale.

 

I simply can't live without yarn's offline mirror capability 😢

 

Your title talks about "third party NPM clients" but in the end you just talk about Yarn?
I mean you take one example (not sure it's a good one) and make it a rule?

 

I'm simply trying to highlight some difficulties in the infrastructure that make it difficult for third parties to compete. There aren't many others to speak of (except github.com/pnpm/pnpm), but that's the premise of the article.

 

Hi Lou, did you start using Yarn v2 already?