As one of the most popular package managers, pnpm is known mainly for its speed and its efficient use of disk space. This article explains how pnpm works so that you can understand the principles behind it.
Introduction
pnpm stands for performant npm. The benchmarks on the pnpm website show that pnpm has a clear performance advantage over npm, yarn, and Yarn PnP in many scenarios.
Directory structure of node_modules
Nested structure
In npm@2 and earlier, which shipped with Node.js 4.x and previous versions, packages were installed into node_modules as a nested structure.
Here is a simple case: both demo-foo and demo-baz depend on demo-bar. When demo-foo and demo-baz are installed in the same project, we get the following node_modules structure:
node_modules
├─ demo-foo
│  ├─ index.js
│  ├─ package.json
│  └─ node_modules
│     └─ demo-bar
│        ├─ index.js
│        └─ package.json
└─ demo-baz
   ├─ index.js
   ├─ package.json
   └─ node_modules
      └─ demo-bar
         ├─ index.js
         └─ package.json
Although this directory structure is relatively clear, every package gets its own node_modules directory and identical dependencies are never reused. In the example above, the same dependency demo-bar is installed twice.
Another problem is the maximum path length limit on Windows. In complex projects with deep dependency trees, the nested paths often exceed that limit.
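To make the path-length problem concrete, the following shell sketch grows a nested dependency path one level at a time and reports the depth at which it passes 260 characters, the classic Windows MAX_PATH limit. The project root and the package name are hypothetical, and forward slashes are used for simplicity.

```shell
# Hypothetical project root and dependency name, chosen only for illustration.
path="C:/projects/my-app"
limit=260   # classic Windows MAX_PATH
depth=0

# Each nesting level appends another node_modules/<package> segment.
while [ "${#path}" -le "$limit" ]; do
  path="$path/node_modules/some-dependency-name"
  depth=$((depth + 1))
done

printf 'limit exceeded at depth %s (path length %s)\n' "$depth" "${#path}"
```

With longer package names or a deeper project root, the limit is reached at an even shallower depth.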
Flat structure
To solve these problems, npm v3 adopted a flat structure: dependencies are hoisted to the top level of node_modules wherever possible. Yarn uses a similar layout, so installing the above example with yarn or npm@3+ gives the following flat directory structure:
node_modules
├─ demo-bar
│  ├─ index.js
│  └─ package.json
├─ demo-baz
│  ├─ index.js
│  └─ package.json
└─ demo-foo
   ├─ index.js
   └─ package.json
In addition, when different versions of the same dependency are required, only one of them is hoisted, and the remaining versions stay nested inside the packages that depend on them.
For instance, if we upgrade the above demo-bar to v1.0.1 (its dependency demo-foo is also v1.0.1), we get the following structure. Which version is hoisted to the top depends on the installation order:
node_modules
├─ demo-bar
│  ├─ index.js
│  └─ package.json
├─ demo-baz
│  ├─ index.js
│  ├─ package.json
│  └─ node_modules
│     └─ demo-bar
│        ├─ index.js
│        └─ package.json
└─ demo-foo
   ├─ index.js
   ├─ package.json
   └─ node_modules
      └─ demo-bar
         ├─ index.js
         └─ package.json
Problems with flat structures
The flat solution is not perfect and comes with some new problems:
Phantom dependencies
Phantom dependencies are dependencies that are not declared in package.json but can still be used directly in your project. They are a consequence of the flat structure: the dependencies of your dependencies are also hoisted to the top level of node_modules, so your project can reference them directly. If one day such a sub-dependency is no longer required by the package that brought it in, the references in your project will break.
For example, in a project with demo-foo and demo-baz as dependencies, demo-bar also appears at the top level of node_modules as a transitive dependency:
node_modules
├─ demo-bar
│  ├─ index.js
│  └─ package.json
├─ demo-baz
│  ├─ index.js
│  └─ package.json
└─ demo-foo
   ├─ index.js
   └─ package.json
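A rough way to spot the phantom dependency in this layout is to compare the top-level entries of node_modules with what package.json declares. The sketch below recreates the flat layout in a temporary directory. The project name my-app and the version numbers are assumptions, and the check is a naive grep rather than a real JSON parser.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# The project declares only demo-foo and demo-baz (versions are assumed).
cat > package.json <<'EOF'
{
  "name": "my-app",
  "dependencies": {
    "demo-foo": "1.0.0",
    "demo-baz": "1.0.0"
  }
}
EOF

# Simulate the flat layout: demo-bar is hoisted next to the declared packages.
mkdir -p node_modules/demo-foo node_modules/demo-baz node_modules/demo-bar

# Any top-level package that package.json does not mention is a phantom candidate.
phantoms=""
for dir in node_modules/*/; do
  name=$(basename "$dir")
  if ! grep -q "\"$name\":" package.json; then
    phantoms="$phantoms $name"
  fi
done

echo "phantom candidates:$phantoms"
```

Because demo-bar sits at the top level, code in the project could import it even though it is missing from package.json.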
NPM doppelgangers
NPM doppelgangers arise when different versions of the same dependency are required. Because of the hoisting mechanism, only one version is hoisted, and the other versions may be installed repeatedly. In the same example as above, when we upgrade demo-bar to v1.0.1, the v1.0.0 that demo-baz and demo-foo depend on is installed twice, nested inside each of them:
node_modules
├─ demo-bar // v1.0.1
│  ├─ index.js
│  └─ package.json
├─ demo-baz
│  ├─ index.js
│  ├─ package.json
│  └─ node_modules
│     └─ demo-bar // v1.0.0
│        ├─ index.js
│        └─ package.json
└─ demo-foo
   ├─ index.js
   ├─ package.json
   └─ node_modules
      └─ demo-bar // v1.0.0
         ├─ index.js
         └─ package.json
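The duplication can be counted directly. The following sketch rebuilds the structure above in a temporary directory and counts how many physical copies of demo-bar exist and how many of them are v1.0.0. The make_pkg helper and the versions given to demo-baz and demo-foo are assumptions made for illustration.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: create a package directory with a minimal package.json.
make_pkg() {
  mkdir -p "$1"
  printf '{ "name": "%s", "version": "%s" }\n' "$(basename "$1")" "$2" > "$1/package.json"
}

make_pkg node_modules/demo-bar 1.0.1                        # hoisted version
make_pkg node_modules/demo-baz 1.0.0
make_pkg node_modules/demo-baz/node_modules/demo-bar 1.0.0  # nested copy
make_pkg node_modules/demo-foo 1.0.0
make_pkg node_modules/demo-foo/node_modules/demo-bar 1.0.0  # nested copy again

# Count every installed copy of demo-bar, then only the v1.0.0 copies.
copies=$(find node_modules -type d -name demo-bar | wc -l | tr -d ' ')
dupes=$(find node_modules -path '*/demo-bar/package.json' \
  -exec grep -l '"version": "1.0.0"' {} + | wc -l | tr -d ' ')

echo "demo-bar copies: $copies, of which v1.0.0: $dupes"
```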
How pnpm solves these problems
First, pnpm installs dependencies into a global store and then uses symbolic links and hard links to organize the directory structure. Package files are linked from the global store into the project, and only direct dependencies are linked into the top level of node_modules. All dependencies are laid out flat under the node_modules/.pnpm directory. This lets every project share the same global dependencies and solves both phantom dependencies and NPM doppelgangers.
Symbolic link and hard link
A link is a way of sharing files in the operating system. There are two kinds: the symbolic link, also known as a soft link, and the hard link. In everyday use they behave almost the same: both support reading and writing, and a link to an executable file can be executed directly. The main difference lies in how they work underneath:
Hard link
- A hard link does not create a new index node (inode); the source file and the hard link point to the same inode
- Hard links work only on files, not directories, and cannot span disk partitions
- The file data is not actually deleted until the source file and all of its hard links are deleted
Symbolic link
- A symbolic link stores the path of the source file and points to it, similar to a shortcut on Windows
- A symbolic link can point to either a directory or a file. It is a separate file from the source, with its own inode and its own file type, so it can also point across partitions
- After the source file is deleted, the symbolic link still exists, but the source file can no longer be accessed through it
How to create links from the command line
# symbolic link
ln -s myfile mysymlink
# hard link
ln myfile myhardlink
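The differences listed above can be observed with a short experiment. This sketch creates both kinds of link in a temporary directory, compares inode numbers, and then deletes the source file. The inode_of helper is a hypothetical convenience wrapper around ls.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

echo "hello" > myfile
ln -s myfile mysymlink   # symbolic link: stores the path "myfile"
ln myfile myhardlink     # hard link: a second name for the same inode

# Print the inode number of the entry itself (-d keeps ls from following the link).
inode_of() { ls -di "$1" | awk '{ print $1 }'; }

src_inode=$(inode_of myfile)
hard_inode=$(inode_of myhardlink)
sym_inode=$(inode_of mysymlink)
echo "source=$src_inode hardlink=$hard_inode symlink=$sym_inode"

# Remove the original name: the data survives through the hard link,
# while the symbolic link is left dangling.
rm myfile
hard_content=$(cat myhardlink)
if [ -e mysymlink ]; then sym_state=ok; else sym_state=dangling; fi
echo "hardlink content: $hard_content, symlink: $sym_state"
```

The source file and the hard link report the same inode number, while the symbolic link has its own.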
How pnpm works
First, pnpm installs the dependencies into <home dir>/.pnpm-store on the current partition. The current store location can be obtained with the following command:
pnpm store path
Then the required package files are hard linked from the store into node_modules/.pnpm. Finally, the top-level dependencies in node_modules, as well as each package's own dependencies, are symbolically linked to their locations under node_modules/.pnpm. For example, for a project that depends on demo-foo@1.0.1 and demo-baz@1.0.0, the node_modules structure is:
node_modules
├─ .pnpm
│  ├─ demo-bar@1.0.0
│  │  └─ node_modules
│  │     └─ demo-bar -> <store>/demo-bar
│  ├─ demo-bar@1.0.1
│  │  └─ node_modules
│  │     └─ demo-bar -> <store>/demo-bar
│  ├─ demo-baz@1.0.0
│  │  └─ node_modules
│  │     ├─ demo-bar -> ../../demo-bar@1.0.0/node_modules/demo-bar
│  │     └─ demo-baz -> <store>/demo-baz
│  └─ demo-foo@1.0.1
│     └─ node_modules
│        ├─ demo-bar -> ../../demo-bar@1.0.1/node_modules/demo-bar
│        └─ demo-foo -> <store>/demo-foo
├─ demo-baz -> ./.pnpm/demo-baz@1.0.0/node_modules/demo-baz
└─ demo-foo -> ./.pnpm/demo-foo@1.0.1/node_modules/demo-foo
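The three steps can be reproduced by hand for a single dependency chain. The sketch below is only an approximation of the layout and not what pnpm actually runs: the store directory here is a plain folder standing in for the real global store, and the file contents are made up.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the global store (hypothetical layout and contents).
mkdir -p store/demo-bar store/demo-foo
echo "module.exports = 'bar'" > store/demo-bar/index.js
echo "module.exports = require('demo-bar')" > store/demo-foo/index.js

virtual=project/node_modules/.pnpm
mkdir -p "$virtual/demo-bar@1.0.1/node_modules/demo-bar"
mkdir -p "$virtual/demo-foo@1.0.1/node_modules/demo-foo"

# 1. Hard link the package files from the store into node_modules/.pnpm.
ln store/demo-bar/index.js "$virtual/demo-bar@1.0.1/node_modules/demo-bar/index.js"
ln store/demo-foo/index.js "$virtual/demo-foo@1.0.1/node_modules/demo-foo/index.js"

# 2. Symlink demo-foo's own dependency next to it inside .pnpm.
ln -s ../../demo-bar@1.0.1/node_modules/demo-bar \
  "$virtual/demo-foo@1.0.1/node_modules/demo-bar"

# 3. Symlink only the direct dependency into the top level of node_modules.
ln -s .pnpm/demo-foo@1.0.1/node_modules/demo-foo project/node_modules/demo-foo

# Show every symbolic link that was created.
find project/node_modules -type l
```

Only demo-foo appears at the top level of node_modules. demo-bar is reachable from demo-foo through the symbolic link inside .pnpm, but not from the project itself, which is what rules out phantom dependencies.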
Here is a screenshot from the pnpm website to help you better understand how symbolic links and hard links are organized in the project structure:
The following pnpm source code shows how the package import method is actually selected:
function createImportPackage (packageImportMethod?: 'auto' | 'hardlink' | 'copy' | 'clone' | 'clone-or-copy') {
  // this works in the following way:
  // - hardlink: hardlink the packages, no fallback
  // - clone: clone the packages, no fallback
  // - auto: try to clone or hardlink the packages, if it fails, fallback to copy
  // - copy: copy the packages, do not try to link them first
  switch (packageImportMethod ?? 'auto') {
  case 'clone':
    packageImportMethodLogger.debug({ method: 'clone' })
    return clonePkg
  case 'hardlink':
    packageImportMethodLogger.debug({ method: 'hardlink' })
    return hardlinkPkg.bind(null, linkOrCopy)
  case 'auto': {
    return createAutoImporter()
  }
  case 'clone-or-copy':
    return createCloneOrCopyImporter()
  case 'copy':
    packageImportMethodLogger.debug({ method: 'copy' })
    return copyPkg
  default:
    throw new Error(`Unknown package import method ${packageImportMethod as string}`)
  }
}
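The fallback idea behind the auto method can be illustrated for a single file in shell. This is a minimal sketch under stated assumptions, not pnpm's implementation: import_file is a hypothetical helper, and --reflink=always is a GNU cp flag that simply fails on systems or file systems without copy-on-write cloning.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
echo "module.exports = 'bar'" > store-file.js

# Hypothetical helper: try clone, then hard link, then plain copy.
import_file() {
  src=$1
  dst=$2
  # Attempt a copy-on-write clone first (GNU cp only).
  if cp --reflink=always "$src" "$dst" 2>/dev/null; then
    echo clone
    return
  fi
  rm -f "$dst"   # clean up anything a failed clone attempt left behind
  # Next, try a hard link, which fails across partitions.
  if ln "$src" "$dst" 2>/dev/null; then
    echo hardlink
    return
  fi
  # Last resort: an ordinary copy.
  cp "$src" "$dst"
  echo copy
}

method=$(import_file store-file.js imported.js)
echo "imported via: $method"
```

Which method is reported depends on the operating system and the file system the temporary directory lives on.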
Other abilities
pnpm can now be installed and used without a Node.js runtime already present. It can also manage Node.js versions through pnpm env, similar to nvm. For a full feature comparison with npm/yarn, see feature-comparison.
Limitations
- Due to compatibility issues with symbolic links in certain scenarios, pnpm's default layout cannot currently be used for applications deployed on Electron or Lambda, as outlined in: discussion
Adding node-linker=hoisted to .npmrc creates a flat node_modules directory without symbolic links, similar to the directory structure created by npm/yarn.
- Because the same store is shared globally, modifying files inside node_modules directly affects the corresponding content in the global store, which in turn affects other projects.
For this issue, the recommended approach is to use clone (copy-on-write). Multiple references initially point to the same file data, and a copy is made only when one of them is modified, so the change does not affect other references that are reading the original content.
However, not all operating systems and file systems support this. By default, pnpm attempts to use clone and falls back to hard links if cloning is not supported. You can also set the package import method manually by specifying package-import-method in your .npmrc configuration file.
- For other limitations, see: https://pnpm.io/limitations
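The two .npmrc settings mentioned above could be written as follows. This is a sketch that creates a project-level .npmrc in a temporary directory. The two settings address different limitations and are independent of each other, and the value clone-or-copy is just one of the options listed in the source code shown earlier.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

cat > .npmrc <<'EOF'
# Create a flat node_modules without symbolic links (for Electron, Lambda, etc.)
node-linker=hoisted
# Import packages by cloning, falling back to copying, so that edits inside
# node_modules do not write through to the shared store
package-import-method=clone-or-copy
EOF

cat .npmrc
```

The new settings take effect the next time dependencies are installed.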
Other tools
- bun: https://github.com/oven-sh/bun
  - bun is a JS runtime written in Zig. It also includes a package manager, but it may run into some compatibility issues.
- Volt: https://github.com/dimensionhq/volt
  - Volt is a Node.js package manager written in Rust, known for its extremely fast performance. It is currently in beta, but it has not been updated for several months.