Alex 🦅 Eagle for Bazel

Posted on Oct 10, 2019 • Edited on Oct 25, 2019

Layering in Bazel for Web

#bazel

Bazel is fast, general-purpose, and now a stable 1.0

Bazel is a build tool that gives you a typical 10x improvement in your build and test times, using a deterministic dependency graph, a distributed cache of prior intermediate build results, and parallel execution over cloud workers.

Bazel has just released 1.0, which is a huge deal. Congratulations to the Bazel team on this culmination of over four years of hard work! Large companies cannot afford to take risks on beta software and need a guarantee of stability. Now that we have 1.0, Bazel is ready to use! But Bazel by itself is generally not sufficient.

Bazel is like an execution engine. It's really just the core of a build system, because on its own it doesn't know how to build any particular language. You could use Bazel alone, in theory, using low-level primitives. But in practice you rely on a plugin to translate your higher-level build configuration into Actions, which are subprocesses that Bazel spawns in a variety of ways, for example on remote workers. So Bazel is a layer beneath a variety of language-ecosystem-specific plugins.
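To make the idea of cached Actions concrete, here is a minimal Python sketch. It is a toy model, not Bazel's implementation: the ActionCache class, the fake_transpile tool, and the key layout are all made up for illustration. It only shows the principle that an action is re-executed when its command line or the content of an input changes, and is otherwise served from the cache.

```python
import hashlib


class ActionCache:
    """Toy model of an action cache (hypothetical, not Bazel's real design)."""

    def __init__(self):
        self._results = {}   # cache key -> output bytes
        self.executions = 0  # how many times a tool actually ran

    def run(self, command, inputs, tool):
        """Run `tool` on `inputs` unless an identical action ran before."""
        key = self._key(command, inputs)
        if key not in self._results:
            self.executions += 1
            self._results[key] = tool(inputs)
        return self._results[key]

    @staticmethod
    def _key(command, inputs):
        # The key covers the command line and the content of every input,
        # so a change to either one produces a different key.
        digest = hashlib.sha256()
        for arg in command:
            digest.update(arg.encode() + b"\0")
        for name in sorted(inputs):
            digest.update(name.encode() + b"\0" + inputs[name] + b"\0")
        return digest.hexdigest()


def fake_transpile(inputs):
    # Stand-in for a real tool such as Babel.
    return inputs["app.js"].replace(b"const", b"var")


cache = ActionCache()
cmd = ["babel", "app.js", "--out-file", "app.es5.js"]

first = cache.run(cmd, {"app.js": b"const x = 1;"}, fake_transpile)
second = cache.run(cmd, {"app.js": b"const x = 1;"}, fake_transpile)  # cache hit
third = cache.run(cmd, {"app.js": b"const x = 2;"}, fake_transpile)   # input changed

print(cache.executions)  # 2
```

The tool ran twice for three requests. With a cache shared between machines, the second request could just as well have come from a teammate or a CI worker.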

These plugins are called rules, and they are at a wide range of maturities. Some rules, like those for Java and C++, are distributed along with the Bazel core and are mature. Others, like those for .NET and Ruby, are community-contributed and not yet at 1.0 quality.

JavaScripters' delight: Bazel can run almost any tool on npm

The plugin I work on is for JavaScript/Node.js. It is called rules_nodejs, and it is also getting close to 1.0.

Bazel just orchestrates the execution of the programs you tell it to run. Any npm package that publishes a binary just works, without someone needing to write any Bazel plugin specific to that package. rules_nodejs has a "core" distribution that is just the stuff needed to teach Bazel how to run yarn/npm and consume the resulting packages. I'll get into more specifics about how we do it later, but let me first show you how it looks. You can skip to the final solution at https://github.com/alexeagle/my_bazel_project

First you need Bazel with some accompanying config files.

$ npx @bazel/create my_bazel_project
# If you used Bazel before, make sure you got @bazel/create version 0.38.2 or later so later steps will work
$ cd my_bazel_project

This gives you a new workspace to play in.

Now let's install some tools. For this example, we'll use Babel to transpile our JS, Mocha for running tests, and http-server to serve our app. This is an arbitrary choice; you probably have some tools you already prefer.

$ npm install mocha domino @babel/core @babel/cli @babel/preset-env http-server

Let's run these tools with Bazel. First we need to import them, using a load statement.

So edit BUILD.bazel and add

load("@npm//@babel/cli:index.bzl", "babel")
load("@npm//mocha:index.bzl", "mocha_test")
load("@npm//http-server:index.bzl", "http_server")

This shows us that rules_nodejs has told Bazel that a workspace named @npm is available (think of the at-sign like a scoped package for Bazel). rules_nodejs will add index.bzl files exposing all the binaries the package manager installed (same as the content of the node_modules/.bin folder). The three tools we installed are in this @npm scope and each has an index file with a .bzl extension.

Loading from these index files is just like importing symbols into your JavaScript file. Having made our load()s we can now use them. Each of the symbols is a function that we call with some named parameter arguments.

Now we write some JavaScript and some tests. To save time I won't go into that for this article.

Okay, how will we build it? We need to think in terms of a graph of inputs, tools, and outputs, in order to express to Bazel what it needs to do to build a requested output, and how to cache the intermediate results.
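As a rough illustration of what thinking in a graph buys you, the Python sketch below models the targets used in this article as a dictionary of direct inputs and computes what is affected when one file changes. The dictionary layout, the affected_by helper, and the app.spec.js file name are assumptions made for this example; Bazel's own graph representation is different and far richer.

```python
# Each node lists its direct inputs. Names mirror the targets in this article;
# "app.spec.js" is a hypothetical test file standing in for the *.spec.js glob.
DEPS = {
    "app.es5.js": ["app.js", "es5.babelrc"],      # produced by :compile
    "server": ["index.html", "app.es5.js"],
    "unit_tests": ["app.spec.js", "app.es5.js"],
}


def affected_by(changed, deps=DEPS):
    """Return every node that transitively depends on `changed`."""
    affected = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for node, inputs in deps.items():
            if current in inputs and node not in affected:
                affected.add(node)
                frontier.append(node)
    return affected


print(sorted(affected_by("app.js")))      # ['app.es5.js', 'server', 'unit_tests']
print(sorted(affected_by("index.html")))  # ['server']
```

Editing index.html leaves the Babel output and the tests untouched, so nothing needs to be re-run for them. That is the property that makes incremental builds and caching of intermediate results possible.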

Add this to BUILD.bazel:

babel(
    name = "compile",
    data = [
        "app.js",
        "es5.babelrc",
        "@npm//@babel/preset-env",
    ],
    outs = ["app.es5.js"],
    args = [
        "app.js",
        "--config-file",
        "$(location es5.babelrc)",
        "--out-file",
        "$(location app.es5.js)",
    ],
)

This just calls the Babel CLI, so you can see their documentation for what arguments to pass. We use the $(location) helper in Bazel so we don't need to hardcode paths to the inputs or outputs.
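To show what the $(location) helper does conceptually, here is a small Python model of the substitution. It is only a sketch: the expand_locations function is hypothetical, and the paths in the table are placeholders, since the real paths are chosen by Bazel and vary by platform and configuration.

```python
import re

# Hypothetical label -> path table. The output path below is a placeholder;
# Bazel decides the real location of generated files.
PATHS = {
    "es5.babelrc": "es5.babelrc",
    "app.es5.js": "bazel-out/k8-fastbuild/bin/app.es5.js",
}

LOCATION = re.compile(r"\$\(location ([^)]+)\)")


def expand_locations(args, paths):
    """Replace each $(location <label>) in args with the path for that label."""
    def lookup(match):
        label = match.group(1).strip()
        if label not in paths:
            # In this model, only declared inputs and outputs can be resolved.
            raise KeyError("label %r is not a declared input or output" % label)
        return paths[label]
    return [LOCATION.sub(lookup, arg) for arg in args]


args = [
    "app.js",
    "--config-file",
    "$(location es5.babelrc)",
    "--out-file",
    "$(location app.es5.js)",
]
expanded = expand_locations(args, PATHS)
print(expanded)
```

The BUILD file names files by label, and the tool receives concrete paths only at execution time, so the same rule keeps working wherever the outputs end up.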

We can try it already: npm run build
The .js outputs from Babel appear in the dist/bin folder.

Let's serve the app to see how it looks, by adding to BUILD.bazel:

http_server(
    name = "server",
    data = [
        "index.html",
        "app.es5.js",
    ],
    args = ["."],
)

And add to the scripts in package.json: "serve": "ibazel run :server"

ibazel is the watch mode for Bazel.
Note that on Windows, you need to pass the --enable_runfiles flag to Bazel. That's because the server expects the directory Bazel creates where inputs and outputs appear together for convenience, and on Windows that directory is not populated by default.

Now we can serve the app: npm run serve

Finally, let's run mocha:

mocha_test(
    name = "unit_tests",
    args = ["*.spec.js"],
    data = glob(["*.spec.js"]) + [
        "@npm//domino",
        "app.es5.js",
    ],
)

Note that we installed the domino package here so we could test the webapp including DOM interactions in node, which is faster than starting up a headless browser.

Run it:

$ npm test

Without Bazel knowing anything about babel, http-server, or mocha, we just assembled a working, incremental, remote-executable toolchain for building our little app.

More examples

Bazel running a React app written in TypeScript/Sass with Webpack: https://github.com/bazelbuild/rules_nodejs/pull/1255
Bazel running more mocha tests: https://github.com/bazelbuild/rules_nodejs/blob/0.38.2/examples/webapp/BUILD.bazel#L34-L51
Bazel running TypeScript tsc: https://github.com/bazelbuild/rules_nodejs/blob/0.38.2/examples/app/BUILD.bazel#L51-L74
Bazel using Babel, then packing the resulting application into a Docker image: https://github.com/bazelbuild/rules_nodejs/blob/0.38.2/examples/angular/src/BUILD.bazel#L131-L204
Bazel running Less and Stylus CSS preprocessors: https://github.com/bazelbuild/rules_nodejs/blob/0.38.2/examples/app/styles/BUILD.bazel
Bazel using Google Closure Compiler for smallest bundle sizes: https://github.com/bazelbuild/rules_nodejs/blob/0.38.2/examples/closure/BUILD.bazel#L4-L15
Bazel running Nuxt.js build for server-side rendered Vue: https://github.com/albttx/bazel-nuxt

Going further: custom rules and macros

It's great that Bazel can run arbitrary npm tools, but this required that we know about the CLI arguments needed for these tools. It also wasn't very ergonomic (we had to use syntax like $(location) to adapt to Bazel's paths), and we didn't take advantage of lots of Bazel features like workers (keep tools running in --watch mode), providers (let rules produce different outputs depending on what's requested) and a lot more.

It also required too much learning and evaluating. Toolchain experts like the engineers on the Angular CLI team spend half their time understanding the capabilities and tradeoffs of the many available tools and choosing something good for you.

As end-users, we would get tired of assembling all our Bazel configuration out of individual tools. Our current experience is generally at a much higher level: we expect a framework we use to provide a complete out-of-the-box build/serve/test toolchain. Bazel is perfect for toolchain experts to provide this developer experience.

For example, Angular CLI has a "differential loading" feature where modern browsers can get smaller, modern JS without polyfills, in a way that doesn't break old browsers. This requires quite a few tricks with the underlying tools.

Angular CLI can make a differential loading toolchain using Bazel to compose the rules we saw above. Bazel has a high-level composition feature called macros. We can use a macro to simply wire together a series of tool CLI calls, and make that available to users. Let's say we want to let users call it this way:

load("@npm//http-server:index.bzl", "http_server")
load("@cool-rules//:differential_loading.bzl", "differential_loading")

differential_loading(
    name = "app",
    srcs = glob(["*.ts"]),
    entry_point = "index.ts",
)
http_server(
    name = "server",
    data = [":app"],
    templated_args = ["app"],
)

We need a macro called differential_loading that takes a bunch of TypeScript sources and an entry point for the app, and produces a directory that's ready to serve to both old and modern browsers.

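Here's what a toolchain vendor would write to implement differential loading (the load statements for ts_library, rollup_bundle, babel, terser_minified, and web_package are omitted here):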

def differential_loading(name, entry_point, srcs):
    "Common workflow to serve TypeScript to modern browsers"

    ts_library(
        name = name + "_lib",
        srcs = srcs,
    )

    rollup_bundle(
        name = name + "_chunks",
        deps = [name + "_lib"],
        sourcemap = "inline",
        entry_points = {
            entry_point: "index",
        },
        output_dir = True,
    )

    # For older browsers, we'll transform the output chunks to es5 + systemjs loader
    babel(
        name = name + "_chunks_es5",
        data = [
            name + "_chunks",
            "es5.babelrc",
            "@npm//@babel/preset-env",
        ],
        output_dir = True,
        args = [
            "$(location %s_chunks)" % name,
            "--config-file",
            "$(location es5.babelrc)",
            "--out-dir",
            "$@",
        ],
    )

    # Run terser against both modern and legacy browser chunks
    terser_minified(
        name = name + "_chunks_es5.min",
        src = name + "_chunks_es5",
    )

    terser_minified(
        name = name + "_chunks.min",
        src = name + "_chunks",
    )

    web_package(
        name = name,
        assets = [
            "styles.css",
        ],
        data = [
            "favicon.png",
            name + "_chunks.min",
            name + "_chunks_es5.min",
        ],
        index_html = "index.html",
    )

This looks long, but it's much simpler than what we've built in the current default for Angular CLI that uses pure Webpack. That's because the composition model here has clear separation of concerns between the different tools.
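One way to see that separation of concerns is to look at which targets the macro expands into. The Python sketch below replaces every rule with a stub that only records its call, then runs a condensed version of the macro. The stubs, the TARGETS list, and the consumes parameter are invented for this illustration; they are not part of Bazel or rules_nodejs, and the real rules take the attributes shown in the macro above.

```python
TARGETS = []  # (rule kind, target name, names of the targets it consumes)


def _stub_rule(kind):
    # Stand-in for a real Bazel rule: it only records that it was called.
    def rule(name, consumes=(), **_ignored):
        TARGETS.append((kind, name, tuple(consumes)))
    return rule


ts_library = _stub_rule("ts_library")
rollup_bundle = _stub_rule("rollup_bundle")
babel = _stub_rule("babel")
terser_minified = _stub_rule("terser_minified")
web_package = _stub_rule("web_package")


def differential_loading(name, entry_point, srcs):
    """Condensed version of the macro: same targets, attributes trimmed."""
    ts_library(name=name + "_lib", srcs=srcs)
    rollup_bundle(name=name + "_chunks", consumes=[name + "_lib"],
                  entry_point=entry_point)
    babel(name=name + "_chunks_es5", consumes=[name + "_chunks"])
    terser_minified(name=name + "_chunks_es5.min",
                    consumes=[name + "_chunks_es5"])
    terser_minified(name=name + "_chunks.min", consumes=[name + "_chunks"])
    web_package(name=name,
                consumes=[name + "_chunks.min", name + "_chunks_es5.min"])


differential_loading(name="app", entry_point="index.ts", srcs=["index.ts"])

for kind, name, consumes in TARGETS:
    print("%-16s %-20s <- %s" % (kind, name, ", ".join(consumes) or "(sources)"))
```

The user declares a single target named app, but each intermediate step is a separate target with its own inputs and outputs, so each step can be built and cached independently of the others.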

Summary: the layering

Each layer can work on its own, but users prefer the higher level abstractions.

Raw tool, like babel: you can call this yourself to transpile JavaScript. These are written by lots of open-source contributors.
Bazel: use low-level primitives to call babel from a genrule, but tied to your machine. The Bazel team at Google supports this.
Bazel + rules_nodejs: use the binary provided to load and run babel. Written by the JavaScript build/serve team at Google.
Bazel + custom rules/macros: use a higher level API to run a build without knowing the details.

I expect that more tooling vendors, such as CLI teams on various frameworks, will provide a high-level experience that uses Bazel under the covers. This lets them easily assemble toolchains from existing tools, using their standard CLI, and you get incremental, cacheable, and remote-parallelizable builds automatically.