Josh Gibson

Posted on Nov 26, 2021

Introducing the @nxpansion/opentelemetry-tasks-runner

#opentelemetry #nx #ci #observability

As your applications grow in complexity, so do the tools required to build them. The Nx Build Framework and its accompanying product [Nx Cloud](https://nx.app/) have made it easier than ever for JavaScript (and even non-JavaScript) developers to manage their build pipelines. Through its executors and tasks runners, Nx provides an extensible monorepo that helps developers manage dependencies between applications, create build scripts, cache results, and much more. When paired with Nx Cloud, these build artifacts can be cached in the cloud and distributed across build servers to speed up your CI pipeline.

With this complexity, however, developers need a way to understand how their build process is performing. CI platforms have come a long way in delivering insights about your pipelines and workflows, but if you're using Nx, you're probably still left wondering "What is going on when I run nx affected build?" Nx Cloud is a great first step to see build performance and statistics, but that data still lives in Nx Cloud and can't be queried or correlated with your other telemetry. This is where @nxpansion/opentelemetry-tasks-runner comes in. This plugin instruments every command run by the Nx CLI using OpenTelemetry, allowing you to generate traces and send them to the observability platform of your choice. Below are example traces sent to Honeycomb from an example application. Whether you're using stock Nx build executors, incremental builds with the Nx Cloud build cache, or the Nx Cloud distributed build agents, the OpenTelemetry Tasks Runner can instrument your build commands.

Setup

The @nxpansion/opentelemetry-tasks-runner works by wrapping your existing tasks runner. For most users this will be either the @nrwl/workspace/tasks-runners/default or @nrwl/nx-cloud runners. In your nx.json file, simply replace the tasks runner with @nxpansion/opentelemetry-tasks-runner and move your previous settings into the wrappedTasksRunner and wrappedTasksRunnerOptions properties.

Install the plugin

yarn add --dev @nxpansion/opentelemetry-tasks-runner

Default Tasks Runner Example

{
  "tasksRunnerOptions": {
    "default": {
      "runner": "@nxpansion/opentelemetry-tasks-runner",
      "options": {
        "wrappedTasksRunner": "@nrwl/workspace/tasks-runners/default",
        "wrappedTasksRunnerOptions": {
          "cacheableOperations": ["build", "lint", "test", "e2e"]
        }
      }
    }
  }
}

Nx Cloud Example

{
  "tasksRunnerOptions": {
    "default": {
      "runner": "@nxpansion/opentelemetry-tasks-runner",
      "options": {
        "wrappedTasksRunner": "@nrwl/nx-cloud",
        "wrappedTasksRunnerOptions": {
          "cacheableOperations": ["build", "lint", "test", "e2e"],
          "accessToken": "SECRET_VALUE"
        },
        "accessToken": "SECRET_VALUE"
      }
    }
  }
}

Note that the access token is left at the top level and also provided in the wrappedTasksRunnerOptions. The nx-cloud CLI expects the accessToken value to live in the options object of the default tasks runner, while the wrapped tasks runner needs the token to be passed into it when it is wrapped by the OpenTelemetry Tasks Runner. Putting the token in both places keeps both cases working.
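The migration described above can be sketched as a small script. This is only an illustration of the nx.json reshaping, not something the plugin ships: the helper name wrapWithOpenTelemetry is hypothetical, and it assumes your existing runner lives under tasksRunnerOptions.default.

```javascript
// Hypothetical helper (not part of the plugin): reshapes an nx.json object so
// that the existing default runner is wrapped by the OpenTelemetry runner.
function wrapWithOpenTelemetry(nxJson) {
  const current = nxJson.tasksRunnerOptions.default;
  const previousOptions = current.options || {};

  const options = {
    // The previous runner and its settings move into the "wrapped" properties.
    wrappedTasksRunner: current.runner,
    wrappedTasksRunnerOptions: { ...previousOptions },
  };

  // Nx Cloud case: the token is also kept at the top level of options.
  if (previousOptions.accessToken !== undefined) {
    options.accessToken = previousOptions.accessToken;
  }

  return {
    ...nxJson,
    tasksRunnerOptions: {
      ...nxJson.tasksRunnerOptions,
      default: {
        runner: '@nxpansion/opentelemetry-tasks-runner',
        options,
      },
    },
  };
}

// Example input: a workspace that currently uses the Nx Cloud runner.
const before = {
  tasksRunnerOptions: {
    default: {
      runner: '@nrwl/nx-cloud',
      options: {
        cacheableOperations: ['build', 'lint', 'test', 'e2e'],
        accessToken: 'SECRET_VALUE',
      },
    },
  },
};

const after = wrapWithOpenTelemetry(before);
console.log(JSON.stringify(after, null, 2));
```

The printed result has the same shape as the Nx Cloud example, with accessToken present both at the top level of options and inside wrappedTasksRunnerOptions.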

Exporting Traces

Out of the box, the @nxpansion/opentelemetry-tasks-runner supports sending traces to an OpenTelemetry Collector via OTLP over gRPC, or printing them to the console. Either exporter is configured in the tasksRunnerOptions.

OTLP

{
  "tasksRunnerOptions": {
    "default": {
      "runner": "@nxpansion/opentelemetry-tasks-runner",
      "options": {
        "wrappedTasksRunner": "@nrwl/workspace/tasks-runners/default",
        "wrappedTasksRunnerOptions": {
          "cacheableOperations": ["build", "lint", "test", "e2e"]
        },
        "exporter": "otlp",
        "otlpOptions": {
          "url": "grpc://localhost:4317"
        }
      }
    }
  }
}

Console

{
  "tasksRunnerOptions": {
    "default": {
      "runner": "@nxpansion/opentelemetry-tasks-runner",
      "options": {
        "wrappedTasksRunner": "@nrwl/workspace/tasks-runners/default",
        "wrappedTasksRunnerOptions": {
          "cacheableOperations": ["build", "lint", "test", "e2e"]
        },
        "exporter": "console"
      }
    }
  }
}

Custom OpenTelemetry Setup

In some cases you may want more control over the OpenTelemetry SDK used to collect the traces. For example, if you want to send your traces directly to a platform like Honeycomb rather than first sending them through an OpenTelemetry Collector, you will need to set up a custom NodeSDK with that configuration. To do so, provide a custom setup file that exports a function returning a NodeSDK configured the way you need.

Honeycomb Example

nx.json

{
  "tasksRunnerOptions": {
    "default": {
      "runner": "@nxpansion/opentelemetry-tasks-runner",
      "options": {
        "wrappedTasksRunner": "@nrwl/workspace/tasks-runners/default",
        "wrappedTasksRunnerOptions": {
          "cacheableOperations": ["build", "lint", "test", "e2e"]
        },
        "setupFile": "setup-honeycomb.js"
      }
    }
  }
}

setup-honeycomb.js

const { credentials, Metadata } = require('@grpc/grpc-js');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-otlp-grpc');
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { BatchSpanProcessor } = require('@opentelemetry/sdk-trace-base');

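// Receives the runner's default SDK configuration and returns a NodeSDK
// that exports traces directly to Honeycomb.
const getOtelNodeSdk = (defaultConfiguration) => {
  // Honeycomb reads the API key and target dataset from gRPC metadata headers.
  const metadata = new Metadata();
  metadata.set('x-honeycomb-team', process.env.HONEYCOMB_WRITEKEY);
  metadata.set('x-honeycomb-dataset', process.env.HONEYCOMB_DATASET);
  const traceExporter = new OTLPTraceExporter({
    url: 'grpc://api.honeycomb.io:443/',
    credentials: credentials.createSsl(),
    metadata,
  });
  // Batch spans before export instead of sending them one at a time.
  const spanProcessor = new BatchSpanProcessor(traceExporter);

  // Override only the exporter and processor; keep the other defaults.
  defaultConfiguration.traceExporter = traceExporter;
  defaultConfiguration.spanProcessor = spanProcessor;

  const sdk = new NodeSDK(defaultConfiguration);

  return { sdk };
};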

module.exports = getOtelNodeSdk;
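The setup file above reads HONEYCOMB_WRITEKEY and HONEYCOMB_DATASET straight from the environment, so a missing variable only surfaces once the exporter is built. A minimal sketch of a guard you could add is shown here; readHoneycombEnv is a hypothetical helper name, not part of the plugin or of OpenTelemetry.

```javascript
// Hypothetical helper: validates the two environment variables used by the
// Honeycomb setup file and returns them keyed by their gRPC header names.
function readHoneycombEnv(env = process.env) {
  const required = ['HONEYCOMB_WRITEKEY', 'HONEYCOMB_DATASET'];
  const missing = required.filter((name) => !env[name]);

  if (missing.length > 0) {
    // Fail early with a message that names every missing variable.
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }

  return {
    'x-honeycomb-team': env.HONEYCOMB_WRITEKEY,
    'x-honeycomb-dataset': env.HONEYCOMB_DATASET,
  };
}

// Example with an explicit object instead of the real process.env.
const headers = readHoneycombEnv({
  HONEYCOMB_WRITEKEY: 'example-key',
  HONEYCOMB_DATASET: 'nx-builds',
});
console.log(headers);
```

Inside getOtelNodeSdk, the returned entries could then be copied onto the Metadata object with metadata.set(name, value) for each pair.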

Example Traces

Example Repo

In the examples below, we are building a simple Nx repo that has two apps, example-app-1 and example-app-2. Both apps share the libraries lib-2 and lib-3, while lib-1 is used exclusively by example-app-1 and lib-4 is used exclusively by example-app-2. common-lib-5 and common-lib-6 are both libs that are used internally by libs 1-4. Below we will use a variety of strategies to build these two apps and see how the resulting traces from the @nxpansion/opentelemetry-tasks-runner look in Honeycomb.

Standard Build (Non-incremental)

This image shows a trace with two applications being built by the @nrwl/node:build executor. One span is created for the command to build these two apps and one child span is created for each application. Since this build is not using buildable libs, both applications have to rebuild all the libraries shared between them, so each app takes about 7 seconds to compile all of the libs into a single webpack bundle.

Incremental Build

For the above trace, we have enabled incremental builds for the repo using the @nrwl/node:package executor. The trace shows a span for each library, indicating that each lib was built individually. As a result, the libraries were only built once between both apps. However, there is significant overhead to build each library. Instead of taking 14 seconds to build both apps, it took about 3.5 seconds per lib, 1.5 seconds per app, and 25 seconds in total. This use case clearly isn’t as helpful in a full repository rebuild scenario, but it is promising if we can cache the results of each library. If only one or two libraries change, we should be able to build both apps in well under the initial 14 seconds.

Partial Incremental Build

This trace shows an example of a partial build. We have the same number of spans as the previous incremental build, but for all libraries other than lib-3, their build time is only a few milliseconds (the time it takes to determine the result was already cached). As a result, the only time spent building is the 3.5 seconds for lib-3 and the 1.5 seconds per app, bringing us in at under 7 seconds.
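The timings quoted for the three build strategies can be checked with some quick arithmetic. This sketch only reuses the approximate per-task durations from the text and assumes the tasks run one after another; it ignores runner overhead and the few milliseconds spent on cache hits.

```javascript
// Approximate durations taken from the traces described above (seconds).
const LIB_COUNT = 6; // lib-1 to lib-4 plus common-lib-5 and common-lib-6
const APP_COUNT = 2;
const SECONDS_PER_APP_STANDARD = 7;
const SECONDS_PER_LIB = 3.5;
const SECONDS_PER_APP_INCREMENTAL = 1.5;

// Standard build: each app recompiles every library it uses.
const standardSeconds = APP_COUNT * SECONDS_PER_APP_STANDARD;

// Incremental build with a cold cache: every lib is built once, then each app.
const incrementalColdSeconds =
  LIB_COUNT * SECONDS_PER_LIB + APP_COUNT * SECONDS_PER_APP_INCREMENTAL;

// Partial incremental build: only lib-3 changed, everything else is cached.
const partialSeconds =
  1 * SECONDS_PER_LIB + APP_COUNT * SECONDS_PER_APP_INCREMENTAL;

console.log({ standardSeconds, incrementalColdSeconds, partialSeconds });
```

The cold incremental estimate comes out at 24 seconds, close to the roughly 25 seconds observed; the remaining second is presumably scheduling overhead, though that is an assumption rather than something the traces confirm. The partial build estimate of 6.5 seconds matches the "under 7 seconds" figure.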

Distributed Build

In this trace, we have taken the repository and built it in a distributed GitHub Actions workflow using Nx Cloud’s distributed execution agents. The spans in this trace were generated across 5 different virtual machines. Using W3C Trace Context propagation and the otel-cli, we were able to collect spans from all of these machines into one trace. The @nxpansion/opentelemetry-tasks-runner can create spans under a parent span via the TRACEPARENT environment variable. Thus, it is possible to create a root span for your workflow, pass that context to the agents executing the builds, and create a single trace representing builds split across multiple machines.
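To make the TRACEPARENT mechanism more concrete, here is a minimal sketch of parsing a value in the W3C Trace Context traceparent format (version, trace ID, parent span ID, and flags, separated by hyphens). It is not the plugin's implementation: parseTraceparent is a hypothetical helper, and it only handles the version 00 layout.

```javascript
// traceparent layout: 2 hex (version) - 32 hex (trace id) - 16 hex (parent id) - 2 hex (flags)
const TRACEPARENT_PATTERN =
  /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Hypothetical helper: returns the parsed fields, or null if the value is invalid.
function parseTraceparent(value) {
  if (typeof value !== 'string') return null;

  const match = TRACEPARENT_PATTERN.exec(value.trim());
  if (!match) return null;

  const [, version, traceId, parentId, flags] = match;

  // Version "ff" and all-zero ids are invalid per the specification.
  if (version === 'ff') return null;
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;

  return {
    version,
    traceId,
    parentId,
    // The lowest bit of the flags byte marks the trace as sampled.
    sampled: (parseInt(flags, 16) & 1) === 1,
  };
}

// Example value in the format an agent would receive via TRACEPARENT.
const parent = parseTraceparent(
  '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01'
);
console.log(parent);
```

Every agent that receives the same trace ID through TRACEPARENT reports its spans into the same trace, which is what allows builds on separate machines to appear as one trace.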

Help Contribute!

Hopefully you can see all of the possibilities of the OpenTelemetry Tasks Runner. Check us out on GitHub, and consider giving the project a star and contributing if you have suggestions!
