ndesmic

Posted on Oct 10, 2020

An Overview of Javascript Module Types

#javascript #modules #webdev #esm

Javascript has gone through various forms of modules, most of which still exist to this day. While bundlers have done a good job of making certain things Just Work™, they also sweep a lot of things under the rug. This can lead to things that do work but are not properly optimized.

IIFE

The simplest kind of module is the IIFE, which stands for Immediately-Invoked Function Expression. What does this mean? Well, I find it makes more sense when you look at the code:

(() => {
  function foo(){
     console.log("Hello World!");
  }
  globalThis.myModule = {
    foo
  };
})();

globalThis.myModule.foo() //Hello World!

Basically, we wrap the whole thing in a function so that the scope prevents private implementations from leaking out. We can then attach things to the global scope under a namespace to avoid conflicts. I'm using the canonical and modern globalThis; window is perhaps more common, but that name doesn't work in workers or node. The IIFE refers to defining a function and having it executed inline. In the above code this is done by wrapping it in parens and then calling it, however you may see it in other forms like this:

~function(){
  window.myModule = {
    foo: function(){ console.log("Hello World!"); }
  };
}();

window.myModule.foo() //Hello World!

This is a bit of magic. You might notice the leading ~, which is the bitwise NOT operator. Prefixing the function with a unary operator forces it to be parsed as an expression, so it can be invoked immediately without wrapping parens (thus saving one character in minification). Note that any unary operator works, so !function(){}() is also common.

This pattern can also have pseudo imports:

((myDep) => {
  function foo(){
     console.log(myDep.message);
  }
  globalThis.myModule = {
    foo
  };
})(myDep);

globalThis.myModule.foo() //Hello World!

By passing things into the parameter list of the self execution call, we expose them to the inside of the module. This doesn't buy us a whole lot but we can do things like alias them, give them defaults, or locally clone data to protect it from outside mutation.

((myDep, otherDep) => {
  function foo(){
     console.log(myDep.message + otherDep.message);
  }
  globalThis.myModule = {
    foo
  };
})(myDep || { message: "default" }, myDep2);

globalThis.myModule.foo() //Hello World!

The first "import" in the code above uses a default. JS has evolved better ways of doing this, such as nullish coalescing (??) and default parameters, but using || to do "truthy coalescence" was a common method in the period when these modules were popular. The second import internally aliases the dependency as otherDep rather than myDep2.
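As a rough sketch of the same idea with the newer tools, the version below uses a default parameter for the first pseudo import and a shallow copy of the second so that later outside mutation doesn't leak in. The globals myDep and myDep2 are hypothetical stand-ins for scripts loaded earlier, and foo returns its string here (rather than logging it) only so the result is easy to check:

```javascript
// Hypothetical globals standing in for dependencies loaded by earlier scripts.
globalThis.myDep = undefined;                 // pretend this one never loaded
globalThis.myDep2 = { message: " World!" };

((myDep = { message: "Hello" }, otherDep) => {
  // Shallow copy: outside changes to myDep2 no longer affect this module.
  const localDep = { ...otherDep };

  function foo() {
    return myDep.message + localDep.message;
  }

  globalThis.myModule = { foo };
})(globalThis.myDep, globalThis.myDep2 ?? { message: "" });

globalThis.myDep2.message = " (mutated)";
console.log(globalThis.myModule.foo()); // Hello World!
```

Note that a default parameter only kicks in for undefined, while ?? also covers null; || would additionally replace values like 0 or an empty string.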

While I have not seen it much in the wild as the IIFE technique largely predates ES6, you can also get a lot of the benefits using braces to create a block scope and using let or const:

{
  const foo = () => console.log("Hello World!");
  globalThis.myModule = { foo };
}

globalThis.myModule.foo() //Hello World!

This does do all the same encapsulation but there are no clear imports, you just grab them from the global scope. This only works with block scope variable declarations:

//Don't do this
{
  var foo = () => console.log("Hello World!");
  globalThis.myModule = { foo };
}

globalThis.myModule.foo() //Hello World!

Here not only does foo get hoisted but it also creates a property window.foo and we've completely polluted global scope.

The biggest problem with this is that it can become unwieldy when you have many modules. If one IIFE needs a function from another, it has to be loaded below it, otherwise the function will not exist when it comes time to use it. This means the user ultimately must understand the load order and get it right.
In complex applications this is very difficult, and because those references might be used at various points in the app's lifecycle we may not even find them all without interacting with the page.
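To illustrate the load-order problem, here is a small sketch where two hypothetical "script files" are modeled as functions (scriptA and scriptB are made-up names) so we can run them in either order:

```javascript
// Each function stands in for one <script> file containing an IIFE module.
const scriptA = () => (() => {
  globalThis.moduleA = { greeting: () => "Hello" };
})();

const scriptB = () => (() => {
  // Uses moduleA immediately, at load time.
  const text = globalThis.moduleA.greeting() + " World!";
  globalThis.moduleB = { text };
})();

// Wrong order: B runs before A has attached itself to the global scope.
let loadError = null;
try {
  scriptB();
} catch (e) {
  loadError = e;
}
console.log(loadError instanceof TypeError); // true

// Correct order: A first, then B.
scriptA();
scriptB();
console.log(globalThis.moduleB.text); // Hello World!
```

Here the failure is immediate, which is the lucky case. If scriptB only touched moduleA inside an event handler, the error would stay hidden until someone triggered it.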

CommonJS

CommonJS (or CJS) modules became widespread through nodeJS. In node there is no HTML document to structure the script loading, and unless you want one giant file you need a way to split the code up. This is the problem CJS modules solve. CJS defines two things: a per-file module object holding an exports object (module.exports), and a require function. Functions and data are assigned to the exports object, and that object is what require returns when another module imports it. It looks like this:

//myModule.js

function foo(){
  console.log("Hello World!");
}

module.exports.foo = foo;

//main.js
const myModule = require("./myModule");
myModule.foo(); //Hello World!

This simplifies things quite a bit. The problem is that it was specifically designed for node and does not work in the browser. This is partly because it expects the environment to have an exports object and a require function. But even had browsers added those, the biggest problem is that require is synchronous. Synchronous XHR is a big no-no (and is deprecated on the main thread) because it literally freezes the UI. This is why nearly all modern web APIs are async. In order for this to work you need to bundle the entire code tree into one payload, and it cannot be used to dynamically fetch things. However, the ubiquity of node meant this became the most common format to export code, and bundlers made it easy to support by providing wrappers and doing some code re-writing.

Also, note that there is a lot of interesting behavior in the way node resolves these. They are string identifiers, but they can refer to standard library packages, paths, or things installed from npm in node_modules. The files might have an extension, they might not, they might refer to an index.js in a directory or be redirected to some other script with a package.json main key. None of these work very well for browsers, which have no knowledge of directory structures and do not use file extensions as a way to determine type. What this boils down to is a lot of magic in the build tool to make this work properly.
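To make the "wrappers" idea concrete, here is a minimal sketch of what a bundler-style CJS shim might look like. It is not how node or any particular bundler actually implements it; the files table, cjsCache and miniRequire are hypothetical names, and each "file" is just a function that receives its own module, exports and require:

```javascript
// Hypothetical in-memory "files", keyed by the id passed to require.
const files = {
  "./myModule": (module, exports, require) => {
    function foo() {
      return "Hello World!";
    }
    module.exports.foo = foo;
  },
  "./main": (module, exports, require) => {
    const myModule = require("./myModule");
    module.exports.result = myModule.foo();
  },
};

const cjsCache = {}; // each module runs once; later requires reuse its exports

function miniRequire(id) {
  if (cjsCache[id]) return cjsCache[id].exports;
  const module = { exports: {} };
  cjsCache[id] = module;
  // Synchronous: the module body runs to completion before we return.
  files[id](module, module.exports, miniRequire);
  return module.exports;
}

console.log(miniRequire("./main").result); // Hello World!
```

Because every file is already in memory, the synchronous require is harmless here, which is why bundling everything into one payload sidesteps the problem.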

AMD

AMD (Asynchronous Module Definition) had life for a short while as a way to make modules more front-end friendly, however it is not widely used anymore. AMD modules do not require a bundler, and dependencies can be resolved by dynamically fetching them. Pretty much the de facto way to use these was through requireJS. Unlike CJS, dependencies are declared first and the module code is executed inside a function closure. It looks like this:

define("myModule", [], () => {
   return {
     foo: () => console.log("Hello World!")
   };
});
define("main", ["myModule"], (myModule) => {
  myModule.foo(); //Hello World!
});

The AMD loader knows how to take these registrations and order them correctly. The first parameter of define is usually the module name but it can be anonymous and the bundlers can find a way to give it a name, such as using the file name.
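A loader's ordering logic can be sketched in a few lines. This is a simplified, synchronous illustration and not requireJS itself (a real loader would also fetch missing modules over the network); amdDefine and amdRequire are hypothetical names:

```javascript
const amdRegistry = new Map();  // name -> { deps, factory }
const amdInstances = new Map(); // name -> value returned by the factory

// Registering only records the module; nothing runs yet.
function amdDefine(name, deps, factory) {
  amdRegistry.set(name, { deps, factory });
}

// Resolve dependencies first, then run the factory with them as arguments.
function amdRequire(name) {
  if (amdInstances.has(name)) return amdInstances.get(name);
  const { deps, factory } = amdRegistry.get(name);
  const value = factory(...deps.map(amdRequire));
  amdInstances.set(name, value);
  return value;
}

// Deliberately registered "out of order": main before its dependency.
amdDefine("main", ["myModule"], (myModule) => myModule.foo());
amdDefine("myModule", [], () => ({ foo: () => "Hello World!" }));

console.log(amdRequire("main")); // Hello World!
```

Since the dependency list is known up front, registration order stops mattering, which is exactly what the IIFE approach could not offer.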

AMD also provides a way to wrap CJS:

define("myModule", [], () => {
   return {
     foo: () => console.log("Hello World!")
   };
});
define("main", ["require", "myModule"], (require) => {
  const myModule = require("myModule");
  myModule.foo(); //Hello World!
});

Note how the inner part of the "main" module looks like CJS. This creates a nice compatibility layer. It's fairly limited though. CJS imports are dynamic, meaning they can appear at any arbitrary place in code and this means that it's hard to statically analyze, and since non-node environments do not allow importing of files and network resources to be synchronous, these simply don't work. In fact, to overcome this, bundlers typically need to inline the code where the dynamic require statement is.

UMD

UMD seeks to unify AMD and CJS so that the module can be used in either system, and this is probably the most common way to export to a browser. As mentioned above, AMD is able to wrap CJS, and so with a little extra boilerplate we can make the module work in either environment.

Let's say I were to take the following code and run it through rollup with UMD format:

//my-module.js
export function foo(){
  console.log("Hello!");
}

//main.js
import { foo } from "./my-module.js";
foo();
export const main = "MAIN";

The export on main is mostly to demonstrate exports. Rollup gives us this:

(function (global, factory) {
    typeof exports === 'object' && typeof module !== 'undefined' ? factory(exports) :
    typeof define === 'function' && define.amd ? define(['exports'], factory) :
    (global = typeof globalThis !== 'undefined' ? globalThis : global || self, factory(global.main = {}));
}(this, (function (exports) { 'use strict';

    function foo(){
        console.log("Hello!");
    }

    foo();

    const main = "MAIN";

    exports.main = main;

    Object.defineProperty(exports, '__esModule', { value: true });
})));

Let's break it down. The meat of the module code is at the bottom and is a function that is passed into the IIFE. We can see that rollup did a little optimization to unwrap the module code and inline foo. The module code is passed in as factory. It then does 3 checks to decide how to deal with it.

If exports exists and module is defined, we're in an environment that supports CJS. We then pass the exports to the factory so that it can assign itself like a normal CJS module.

If define exists and define.amd exists, then we're in an environment that supports AMD. We can then define the module. Note that the factory depends on exports, so it creates a dependency on it, but it needs to be defined elsewhere.

Lastly, we're in an environment that supports neither so it'll try to expose the exports on globalThis. Except older environments don't support globalThis so it also checks self (worker global scope) and this that gets passed in under global. It then uses a code golf trick factory(global.main = {}) to both assign main to window and pass it in at the same time. Since global.main is referenced by exports it will be attached to global scope. In the browser this means we can access the main module at window.main.

The last little thing is it assigns a property to exports __esModule = true. This is a little book-keeping for other libraries so they know where it came from. If the code was written in CJS you wouldn't get this. If it was part CJS and part ESM you'd get some interesting results where myModule is "imported":

var myModule = /*#__PURE__*/Object.freeze({
    __proto__: null,
    foo: foo
});
function getAugmentedNamespace(n) {
    if (n.__esModule) return n;
    var a = Object.defineProperty({}, '__esModule', {value: true});
    Object.keys(n).forEach(function (k) {
        var d = Object.getOwnPropertyDescriptor(n, k);
        Object.defineProperty(a, k, d.get ? d : {
            enumerable: true,
            get: function () {
                return n[k];
            }
        });
    });
    return a;
}
var foo$1 = /*@__PURE__*/getAugmentedNamespace(myModule);

What this does is first freeze the object since ESM namespaces can't be modified like CJS export objects. Then, if the module is ESM it passes it along and if it's CJS then it creates a new object, iterates through all the keys in the module and assigns a getter either using the one that existed on the module or the simple property access. This effectively makes it read-only to maintain ESM behavior.
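The getter trick is easier to see in a stripped-down form. The sketch below is not rollup's helper, just the same idea applied to a hypothetical cjsExports object: the wrapper cannot be written to, yet it still reflects later changes made by the original module.

```javascript
function demo() {
  "use strict"; // ESM is always strict, so writes to getter-only properties throw

  const cjsExports = { foo: "FOO" }; // hypothetical CJS exports object
  const namespace = {};
  Object.keys(cjsExports).forEach((k) => {
    Object.defineProperty(namespace, k, {
      enumerable: true,
      get: () => cjsExports[k], // read through to the original on every access
    });
  });

  let threw = false;
  try {
    namespace.foo = "changed"; // no setter, so this fails in strict mode
  } catch (e) {
    threw = e instanceof TypeError;
  }

  cjsExports.foo = "UPDATED"; // the original module can still update its export
  return { threw, value: namespace.foo };
}

console.log(demo()); // { threw: true, value: 'UPDATED' }
```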

There are lots of slight variations, simplifications and modernizations to UMD, but the basic principle is that we use AMD to wrap CJS and then figure out what to inject based on the environment. UMD has some good documentation that shows different forms and simplifications as templates.

ESM

ESM, or ECMAScript Modules, is the official module format chosen for Javascript. It has a long and storied history, taking many years of debate, and had a very slow integration into browsers and eventually node. At this point you can use it everywhere though. The primary reason it took so long was that so much code had already been written in CJS and it simply wasn't compatible. CJS couldn't be used because of its synchronous expectations, and ESM fixes that by expecting imports to load asynchronously. Another issue was that of static analyzability. As mentioned above, CJS imports are very dynamic, which makes it hard if not impossible to tell what code is actually used. Even when using a bundler ESM can improve the situation, because all imports and exports must be static, meaning they can be analyzed at build time, and code that is not referenced can simply be removed, leaving you with a smaller bundle.

Perhaps a downside of ESM is it comes with a lot of features, many of which were designed for compatibility with CJS but just never actually worked out.

One such feature is default exports. In CJS we can define things like

module.exports = "FOO";

and import them like

const foo = require("foo");

To be compatible in ESM you can do

export default "FOO";

and import like

import foo from "./foo.js"

This didn't turn out as expected, as there are quite a few places where this breaks down. My advice is to avoid using these, especially when working in a mixed-module context; use named exports instead.

Another is the * (star) import. In CJS we can do

module.exports = { foo: "FOO", bar: "BAR" };

and then import like

const mod = require("mod"); 
console.log(mod.foo);

So in ESM it was decided you can take a module like this

export const foo = "FOO"; 
export const bar = "BAR";

and import like this

import * as mod from "./mod.js"
console.log(mod.foo);

Again, it's not quite the same, especially as ESM namespaces are immutable. It has its uses when you want to namespace imported functions.

By the way, we could have also defined the last module like:

//mod.js
const foo = "FOO";
const bar = "BAR";
export { foo, bar }

This is an export list. Sometimes it's helpful to draw attention to the things you are exporting in one place. You can rename exports:

const foo = "FOO";
export { foo as baz };

You can rename imports too:

import { foo as baz } from "./mod.js";

You can also re-export parts of modules:

export { foo } from "./mod.js"
//with a rename
export { bar as baz } from "./mod.js"
//or all of the module
export * from "./mod.js"

Sometimes modules just do stuff but don't need to give you back anything, such as when declaring a custom element in its own file. You can import it like this:

import "./my-element.js"

There's also a replacement for dynamic requires. If you need to load code dynamically you can use import("./foo.js"), which is natively asynchronous. You can treat this sort of like an async function that will return the module namespace, same as if you did import *. Technically it's actually a keyword and not a function though, so you can't do things like import.call or hold references to it. This import also has a "property" called meta, and import.meta.url gives you the URL of the current module. This can be handy to rebuild some of node's built-in module functionality like __dirname.
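As a small sketch of the __dirname idea: the helper below (dirnameFromUrl is a hypothetical name) takes the URL as a parameter so it can be shown with a made-up path; inside a real module you would pass import.meta.url instead.

```javascript
// metaUrl is what import.meta.url would give you inside a real module.
function dirnameFromUrl(metaUrl) {
  // Resolving "." against a file URL yields the containing directory.
  return new URL(".", metaUrl).pathname;
}

// Hypothetical module location, used here in place of import.meta.url.
console.log(dirnameFromUrl("file:///home/user/project/src/main.js"));
// /home/user/project/src/
```

In node you would more likely run import.meta.url through fileURLToPath from node:url, which also handles Windows paths and percent-encoded characters.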

Also worth noting is that ESM is always strict mode and you always need to provide the file extension for browser compatibility.

Anyway, you should be doing as much as you can in ESM as it is the standard moving forward and provides some nice benefits even if the syntax can be a bit much. My advice: just stick to named exports and you'll be fine.

Typescript

Many flows now use Typescript (TS). TS does not have any module format of its own, but it does interact with modules and provides its own syntax. For the most part everything in TS is ESM, however you can import things that don't normally exist in JS.

//ifoo.ts
export interface IFoo {
   foo: string
}
//main.ts
import { IFoo } from "./ifoo";
const myFoo: IFoo = {
  foo: "Hello!"
}
console.log(myFoo.foo);

You need a TS compiler to strip this stuff out, because even if we erased the typing for myFoo to make this module 100% compatible with JS syntax, the import would break. Maybe the module needed to perform a side-effect so we can't erase it, or maybe we only have the single-file context and can't know if it's a type or not. In fact, because of this sort of confusion TS also lets you use import type { IFoo } from "./ifoo" to be more explicit, and these are guaranteed to be erased.

TS also deals with CJS. Above I mentioned that ESM default exports and CJS default exports aren't the same. TS has ways to deal with that. If you were writing CJS with TS and wanted to make a CJS default you'd do so like this:

//foo.ts
const foo = "Foo";
export = foo;

That export isn't an object; this is TS-specific syntax for assigning to module.exports so that TS can understand these are exports and not just assignments to a global called module.exports. Likewise, require is a global function, but there's nothing to say the user hasn't created their own global require function apart from CJS, so TS needs to know that what you are trying to do is import a CJS module. You do so like this:

import foo = require("./foo");

Since import statements ordinarily can't contain function calls, TS can use this to tell that, actually, we want a CJS import.

TS can also let us type modules. While this is typically done via type annotations in the source itself, you can augment modules in a d.ts file or inline where you use them.

If I have:

//foo.js
export function foo(i){
  console.log("Hello" + i);
}

Let's say that i was supposed to be a number. You can write a d.ts file:

//foo.d.ts
declare module "foo.js" {
   export function foo(i: number): void;
}

And if you use foo.js and try to use a string for i the type checker will stop you.

SystemJS

This is more of a footnote, as SystemJS was never very popular, but you might occasionally see it. SystemJS largely existed to allow devs to write ESM for browsers that did not support it. I'm not sure if there was even an expectation for it to be written by hand or if, like UMD, it's more of an output specification.

System requires the system module loader similar to how AMD needs the require module loader. The output looks like this:

System.register('main', [], function (exports, context) {
    'use strict';
        //let dep
    return {
                //setters: [_d => { dep = _d; }],
        execute: function () {

            function foo(){
                console.log("Hello!");
            }

            foo();

            var main = exports('default', "MAIN");

        }
    };
});

Much like with UMD, rollup did some optimization to inline the modules, but we can still talk about it. System modules are registered similarly to AMD define. They take a name, a list of dependencies and a function. The function doesn't return stuff directly but rather returns an object with setters and execute. We don't see setters in this example, so I've tried to show it in comments, but if we did they would be an array of setter functions, in the same order as the dependencies were defined, that are called when a dependency updates. The execute function is where the module code executes from, and this can be async. exports is a function that can take either a name/value pair or an object and set them, which in turn calls the setters of code that depends on this module. context contains functions like import that allow you to do dynamic imports. This allows it to have all the features of ESM and run in the browser.
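To see how setters, execute and exports fit together, here is a toy loader. It is only a sketch of the mechanics described above and not the real SystemJS implementation (no fetching, no async execute, an empty context); miniSystem, sysRegistry and sysLoaded are hypothetical names:

```javascript
const sysRegistry = new Map(); // name -> { deps, declare }
const sysLoaded = new Map();   // name -> { ns, importerSetters }

const miniSystem = {
  register(name, deps, declare) {
    sysRegistry.set(name, { deps, declare });
  },
  load(name) {
    if (sysLoaded.has(name)) return sysLoaded.get(name);
    const record = { ns: {}, importerSetters: [] };
    sysLoaded.set(name, record);

    // exports("key", value) or exports({ key: value }); notifies dependents.
    const exportFn = (keyOrObject, value) => {
      if (typeof keyOrObject === "string") record.ns[keyOrObject] = value;
      else Object.assign(record.ns, keyOrObject);
      record.importerSetters.forEach((setter) => setter(record.ns));
      return value;
    };

    const { deps, declare } = sysRegistry.get(name);
    const { setters = [], execute } = declare(exportFn, {});
    deps.forEach((depName, i) => {
      const dep = miniSystem.load(depName);
      if (setters[i]) {
        dep.importerSetters.push(setters[i]); // called again on later updates
        setters[i](dep.ns);                   // hand over the current exports
      }
    });
    execute();
    return record;
  },
};

miniSystem.register("dep", [], (exports) => ({
  execute() {
    exports("message", "Hello!");
  },
}));

miniSystem.register("main", ["dep"], (exports) => {
  let dep;
  return {
    setters: [(d) => { dep = d; }],
    execute() {
      exports("default", dep.message + " MAIN");
    },
  };
});

console.log(miniSystem.load("main").ns.default); // Hello! MAIN
```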

Hopefully this gives you the tools to understand what's going on especially if you wind up looking at complicated mixed module projects. Bundlers do a lot to hide this but understanding it can help you solve some tricky errors when things don't work as expected. And remember to use ESM whenever you can!
