DEV Community

ShreeJ

How npm install Works Internally?

Most popular JavaScript runtimes and frameworks, such as Node.js, React, Vue and Angular, rely on npm as their backbone. The npm registry hosts the libraries (dependencies) that these frameworks use.

This post explains:

  1. the logic behind what happens when we execute npm install.
  2. the order in which dependencies are downloaded and the resulting node_modules folder structure.

Prerequisites:

  1. Basic knowledge of any JS framework.
  2. Any one of the following installed, to try out the samples below:
    • node and npm
    • nvm (Node Version Manager, to manage different versions of node and npm on one machine)
    • docker-compose (to run a node app in a container)

What happens when we execute npm install?

The npm install command downloads dependency modules from the npm registry.
It can be invoked in any of the following ways:

  1. npm install - fetches all the dependencies listed in package.json (and pinned in package-lock.json, when present).
  2. npm install <dependency_name> or npm install <dependency_name>@<version> - fetches a particular dependency by name and version (if no version is specified, the latest version is pulled).
  3. npm install <git remote url> - fetches a library hosted in a git repository, for example on GitHub, Bitbucket or GitLab.

Algorithm behind npm install:

  1. Check whether a node_modules folder or package-lock.json exists, trace the existing dependency tree (folder structure) from it, and clone that tree (or create an empty tree if neither exists).
  2. Fetch the relevant dependencies (dev, prod or direct dependencies) from package.json and add them to the clone (from step 1).
    • find the difference between the trees and add the missing dependencies.
    • dependencies are added as close to the top of the tree as possible.
    • dependencies are added without disturbing the other roots/branches of the tree.
  3. Compare the original tree (from step 1) with the updated clone (from step 2) and make a list of actions needed to replicate the new tree in node_modules.
    • the actions are install (new dependencies), update (existing dependency versions), move (change the place of a dependency within the tree) and remove (uninstall libraries that the new tree no longer needs).
    • execute all the identified actions (deepest first).
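
The comparison in step 3 can be sketched in a few lines of JavaScript. This is a simplified illustration under stated assumptions, not npm's actual implementation: each tree is modelled as a flat map from an install path to a version, the helper name planActions and the package old-lib are made up for the example, and the move action is left out for brevity.

```javascript
// Simplified model: each tree maps an install path to the version found there.
// planActions is a hypothetical helper, not part of npm.
function planActions(currentTree, idealTree) {
  const actions = [];

  // Anything in the ideal tree that is missing or different must be installed or updated.
  for (const [path, version] of Object.entries(idealTree)) {
    if (!(path in currentTree)) {
      actions.push({ type: 'install', path, version });
    } else if (currentTree[path] !== version) {
      actions.push({ type: 'update', path, from: currentTree[path], to: version });
    }
  }

  // Anything on disk that the ideal tree no longer contains must be removed.
  for (const path of Object.keys(currentTree)) {
    if (!(path in idealTree)) {
      actions.push({ type: 'remove', path });
    }
  }

  // "Deepest first": paths with more nested node_modules segments come first.
  const depth = (p) => p.split('node_modules/').length - 1;
  return actions.sort((a, b) => depth(b.path) - depth(a.path));
}

const currentTree = {
  'node_modules/A': '1.0.0',
  'node_modules/alpha': '1.0.0',
  'node_modules/old-lib': '0.1.0',
};
const idealTree = {
  'node_modules/A': '1.1.0',
  'node_modules/alpha': '1.0.0',
  'node_modules/B': '2.0.0',
  'node_modules/B/node_modules/alpha': '2.0.0',
};

const plannedActions = planActions(currentTree, idealTree);
console.log(plannedActions);
```

In this sample the nested alpha under B is handled first because it sits deepest in the tree; the remaining actions (update A, install B, remove old-lib) all operate at the root level.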

Folder-Structure in node_modules :

The folder structure that npm produces depends on which of the following scenarios applies:

  1. No existing node_modules or package-lock.json or dependencies in package.json.
  2. No existing node_modules or package-lock.json, but package.json with dependency list is available.
  3. No existing node_modules, but package-lock.json and package.json with dependency list are available.
  4. The node_modules, package-lock.json and package.json with dependency list are all available.

1. No existing node_modules or package-lock.json or dependencies in package.json:
This simple case arises when a JS application starts out without any dependencies and adds them one by one.
In this scenario, the dependencies are downloaded in the order of installation, as below:
Example: execute npm install B in a new application.
Here B is a dependency; assume it internally depends on alpha@v2.0. Both of them get installed at the root level of node_modules.

Inference: All the dependencies and their internal dependencies try to get a place in the root of node_modules, unless there is a conflict with a different version of the same dependency.

node_modules
|_ B
|_ alpha @v2.0

2. No existing node_modules or package-lock.json, but package.json with dependency list is available:

In this scenario, an application has dependencies listed in package.json, without a lock file.

Example: execute npm install in the application directory which has a package.json with dependencies like below:

{
  "dependencies": {
    "A": "1.0.0",
    "B": "2.0.0"
  }
}

Here, A internally depends on alpha@v1.0 and B depends on alpha@v2.0.
Inference: All the dependencies and their internal dependencies try to get a place in the root of node_modules, unless there is a conflict with a different version of the same dependency. When a conflict arises, npm creates a nested node_modules folder under the dependency that needs the conflicting version and places that version there.

node_modules
|_ A
|_ alpha @v1.0
|_ B
    |_ node_modules
        |_ alpha @v2.0
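
The placement rule above can be illustrated with a small sketch. It is a deliberately simplified model, not npm's real resolver: the registry object is made-up metadata for the example packages, versions are matched exactly instead of by semver range, and a conflicting package is only ever nested directly under the package that needs it.

```javascript
// Hypothetical registry metadata for the example packages (not real packages).
const registry = {
  'A@1.0.0': { alpha: '1.0.0' },
  'B@2.0.0': { alpha: '2.0.0' },
  'alpha@1.0.0': {},
  'alpha@2.0.0': {},
};

// Builds a map of install path -> version for the given direct dependencies.
function buildTree(rootDeps) {
  const tree = {};

  // Direct dependencies always sit at the root of node_modules.
  for (const [name, version] of Object.entries(rootDeps)) {
    tree[`node_modules/${name}`] = version;
  }

  // Place the internal dependencies of one package, hoisting when possible.
  function placeDepsOf(name, version, ownerPath) {
    for (const [dep, depVersion] of Object.entries(registry[`${name}@${version}`])) {
      const rootPath = `node_modules/${dep}`;
      let path = rootPath;
      if (rootPath in tree) {
        if (tree[rootPath] === depVersion) continue; // already satisfied at the root
        path = `${ownerPath}/node_modules/${dep}`;    // conflict: nest under the owner
      }
      tree[path] = depVersion;
      placeDepsOf(dep, depVersion, path);
    }
  }

  for (const [name, version] of Object.entries(rootDeps)) {
    placeDepsOf(name, version, `node_modules/${name}`);
  }
  return tree;
}

const hoistedTree = buildTree({ A: '1.0.0', B: '2.0.0' });
console.log(hoistedTree);
```

Because A is processed before B, alpha v1.0 claims the root slot and alpha v2.0 ends up nested under B, which matches the folder structure shown above.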

3. No existing node_modules, but package-lock.json and package.json with dependency list are available:
Assume A internally depends on alpha@v1.0, whereas B depends on alpha@v2.0 and beta@v3.0.
package-lock.json snippet:

{
  "dependencies": {
    "A": {
      "version": "1.0.0",
      "resolved": "NPM REGISTRY URL of A",
      "requires": {
        "alpha": "1.0.0"
      }
    },
    "alpha": {
      "version": "1.0.0",
      "resolved": "NPM REGISTRY URL of alpha v1"
    },
    "B": {
      "version": "2.0.0",
      "resolved": "NPM REGISTRY URL of B",
      "requires": {
        "alpha": "2.0.0",
        "beta": "3.0.0"
      },
      "dependencies": {
        "alpha": {
          "version": "2.0.0",
          "resolved": "NPM REGISTRY URL of alpha v2"
        }
      }
    },
    "beta": {
      "version": "3.0.0",
      "resolved": "NPM REGISTRY URL of beta v3"
    }
  }
}

Inference: Irrespective of the order of the dependencies in package.json, the packages are installed in the tree structure defined by package-lock.json.

The resulting dependency tree structure would be:

node_modules
|_ A
|_ alpha @v1.0
|_ B
|    |_ node_modules
|        |_ alpha @v2.0
|_ beta @v3.0
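
To make the mapping from lock file to folders concrete, here is a minimal sketch that walks the nested "dependencies" shape used in this post (the older lockfileVersion 1 layout) and prints the folder each entry would occupy. The helper name listInstallPaths is hypothetical and the "resolved" fields are omitted; this only illustrates the idea and is not how npm itself reads the file.

```javascript
// Lock-file data in the nested shape used in this post
// ("resolved" fields omitted for brevity).
const lock = {
  dependencies: {
    A: { version: '1.0.0', requires: { alpha: '1.0.0' } },
    alpha: { version: '1.0.0' },
    B: {
      version: '2.0.0',
      requires: { alpha: '2.0.0', beta: '3.0.0' },
      dependencies: { alpha: { version: '2.0.0' } },
    },
    beta: { version: '3.0.0' },
  },
};

// Recursively turn nested "dependencies" entries into node_modules paths.
// listInstallPaths is a hypothetical helper, not an npm API.
function listInstallPaths(dependencies = {}, prefix = '') {
  const paths = [];
  for (const [name, entry] of Object.entries(dependencies)) {
    const path = `${prefix}node_modules/${name}`;
    paths.push(`${path} @${entry.version}`);
    // A nested "dependencies" object means a nested node_modules folder.
    paths.push(...listInstallPaths(entry.dependencies, `${path}/`));
  }
  return paths;
}

const installPaths = listInstallPaths(lock.dependencies);
console.log(installPaths.join('\n'));
```

Each nested "dependencies" object in the lock file becomes a nested node_modules folder, so the layout is fully described by the file and does not depend on the order in which packages were originally installed.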

4. The node_modules, package-lock.json and package.json are all available:
The node_modules folder is re-arranged to match the incoming tree from package-lock.json, and packages are installed in the order defined in package-lock.json.

Package.json (vs) Package-lock.json :

Let's consider the following two sequences of dependency installation in a new application that has no existing dependency tree or node_modules.
Example:
Assume A internally depends on alpha@v1.0, whereas B depends on alpha@v2.0.

Scenario-1

Commands:
npm install A
npm install B

package.json:
{
  "dependencies": {
    "A": "1.0.0",
    "B": "2.0.0"
  }
}

package-lock.json:
{
  "dependencies": {
    "A": {
      "version": "1.0.0",
      "requires": {
        "alpha": "1.0.0"
      }
    },
    "alpha": {
      "version": "1.0.0"
    },
    "B": {
      "version": "2.0.0",
      "requires": {
        "alpha": "2.0.0"
      },
      "dependencies": {
        "alpha": {
          "version": "2.0.0"
        }
      }
    }
  }
}

Folder structure:
node_modules
|_ A
|_ alpha @v1.0
|_ B
|    |_ node_modules
|        |_ alpha @v2.0

Scenario-2

Commands:
npm install B
npm install A

package.json:
{
  "dependencies": {
    "A": "1.0.0",
    "B": "2.0.0"
  }
}

package-lock.json:
{
  "dependencies": {
    "A": {
      "version": "1.0.0",
      "requires": {
        "alpha": "1.0.0"
      },
      "dependencies": {
        "alpha": {
          "version": "1.0.0"
        }
      }
    },
    "alpha": {
      "version": "2.0.0"
    },
    "B": {
      "version": "2.0.0",
      "requires": {
        "alpha": "2.0.0"
      }
    }
  }
}

Folder structure:
node_modules
|_ A
|    |_ node_modules
|        |_ alpha @v1.0
|_ alpha @v2.0
|_ B

The above comparison shows why package-lock.json matters.
If the package 'alpha' is imported in the application code with var alpha = require('alpha');, scenario-1 resolves to v1 whereas scenario-2 resolves to v2.
Thus, code that depends on the imported package may behave differently.
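
The lookup behind require('alpha') can be approximated with the sketch below. It is a simplified imitation of Node's node_modules lookup, not the real resolver: resolvePackage is a hypothetical helper, the /app directory is made up, and the installed folders are held in a Map instead of being read from disk.

```javascript
const path = require('path');

// Hypothetical helper that mimics how Node.js looks for a package: start in the
// requiring file's directory and walk up, checking node_modules at each level.
function resolvePackage(name, fromDir, installedDirs) {
  let dir = fromDir;
  while (true) {
    const candidate = path.posix.join(dir, 'node_modules', name);
    if (installedDirs.has(candidate)) return candidate;
    const parent = path.posix.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}

// Scenario-2 layout from the comparison above, rooted at a made-up /app directory.
const scenario2 = new Map([
  ['/app/node_modules/A', '1.0.0'],
  ['/app/node_modules/A/node_modules/alpha', '1.0.0'],
  ['/app/node_modules/alpha', '2.0.0'],
  ['/app/node_modules/B', '2.0.0'],
]);

// The application code sees the root-level alpha; code inside A sees its nested copy.
const fromApp = resolvePackage('alpha', '/app', scenario2);
const fromA = resolvePackage('alpha', '/app/node_modules/A', scenario2);
console.log(scenario2.get(fromApp));
console.log(scenario2.get(fromA));
```

In this layout the application resolves alpha v2, while A still gets the v1 copy nested under it. Swapping in the scenario-1 layout would make the application resolve v1 instead.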

package.json alone does not determine the tree structure: without a lock file, npm install downloads the dependencies in the alphabetical order in which they are saved in package.json, regardless of the order in which they were originally installed.

Remember: The best practice is to commit package-lock.json to source control (such as git) and keep it up to date, so that every member working on the project uses the same dependency tree.


Top comments (2)

Mayur Panchal

This is really super informative! Thanks a lot Shree J for sharing this out with examples to make it easy to understand.

Amin SAIGHI

Can npm use its cache before downloading/installing from the npm registry (over the internet, I mean)?