The node_modules problem
Leonardo Teteo Nov 19
I think I'm not the first one to talk about this problem even here in dev.to. I quick research trying to found any solution concluded with the image that is the head of this text. The node_modules folder is where your project dependencies are stored, common knowledge. Its weight is also common knowledge.
Why I decided to vent my frustration now
Black Friday is here! It means discounts and the opportunity to update your computer. Therefore I decided to buy a SSD to boost the performance of my laptop, from 1 TB HDD to 500 GB SSD. All my files right now sums 299 GB, so I will not lose much space, but I decided to do the housekeeping work even so, this includes making backups of my projects. Not all projects I make I put on GitHub, sometimes I'm just experimenting and it is not worth the trouble, but I keep them anyway.
When I started the copy & paste process I remembered how heavy node_modules are...
One example that shows clearly the problem is the node_modules folder of my ToRead CLI project as you can see in the imagem below.
The size of the folder is not really the problem although I will get to that later, but 15.000 files and more than 1800 folders!? Are you kidding me?! It is a simple CLI project with 5 files! Just for a comparison, let's see how many files and folders there is in the Windows folder:
While the system was counting I really thought node_modules would win this, but no. In any case, the folder has almost half of the amount of files an entire operating system has!
As I've said, the problem when copying node_modules folder from one place to another is not the size, it is the amount of files and folders, the complexity of the tree. It is a nightmare for a HDD. It takes many minutes to discover all files let alone copy them. In the end, it also impacts npm performance and there are memes for that also.
What other languages do
Besides NodeJS I have experience dealing with dependencies in two more languages: Java and C#. They have, in my opinion, a very similar way of handling dependencies and a much more efficient way than NodeJS.
Java has Maven, Gradle and other dependency management applications that works basically the same. There is a remote repository of the dependencies, generally Maven Central and a local repository. Maven always looks for the dependency in the local repository first and if not found it downloads from the remote repository. The dependencies are not within the project, like node_modules folder, it is more global, it is downloaded once and can be used by many projects, just add to your pom.xml.
C# follows the same idea, you list your dependencies in a .csproj file and Nuget takes care of the dependencies having also a remote and a local repository. It is much more efficient to handle dependencies this way, download once use in any project locally.
I think there is also a difference in culture and the way the languages were structured and what people see as libraries. Java has a very mature core of libraries that can deal with almost anything, common scenarios or not. Therefore, libraries in Java generally are meant to be an abstraction of what Java already has, making it easier to use. Therefore, the libraries have a more shallow dependency tree, reaching the Java core libraries much quicker.
What I see in NodeJS on the other hand is the opposite, everything can become a library, even a library that sums two numbers (hypothetical example, I hope) and libraries are heavily dependent of one another generating deep dependency trees, many files and folders.
Conclusion & Discussion
I am certainly not qualified to criticize NodeJs structure and engineering, but as a user I clearly see a problem and some lessons from other languages that can be used to improve the dependency management, which is paramount nowadays for almost every application. What do you think this problem came to be and what has been done to solve it? It would be very interesting to hear from more experienced developers what you do to remedy this.