Alejandro Oviedo

Posted on Sep 30, 2020

Instrumenting Node.js internals

#node #instrumentation #javascript

Back when I was working with one of my clients, there was a specific process that every engineer was too afraid to change, because it was prone to failures and interacted with a few different parts of the infrastructure. It was a queue worker, and I was told that whenever the worker froze someone would have to restart it manually. I initially thought it made no sense, but after a few failed attempts to reproduce the scenario I started to think differently. I didn't have much time to spend on fixing it, since I was hired to focus on a newer architecture, but the lack of tooling to quickly get an overview of what a process is doing always stuck with me.

Fast-forward to today, I still think there's a lot of room to cover in the developer tooling space. With this in mind I've worked on an experiment called instrument that can collect information from your Node.js process and aggregate it in different ways.

Choosing your instrumentation spot

There are a few places where I can imagine you could intercept internal calls and collect data:

Intercepting system calls is doable and would also work for other runtimes/VMs, not just Node, but you would have to target a specific operating system and it's generally more complex.
With C++ (through node-gyp), you won't have to target a specific operating system but you will add some additional requirements (appmetrics falls in this space).
Intercepting calls from the JavaScript realm is not only doable but also works great for portability: you don't have to install a different runtime or meet additional requirements, you just monkey-patch on top of it.
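To make the last option concrete, here's a minimal sketch of monkey-patching a built-in module function from JavaScript. This is not taken from instrument's source; the `intercept` helper and the shape of the recorded entries are made up for illustration.

```javascript
const fs = require('fs');

// Hypothetical helper (not part of instrument): replaces `target[name]` with a
// wrapper that records each call and then delegates to the original function.
function intercept(target, name, calls) {
  const original = target[name];
  target[name] = function (...args) {
    calls.push({ name, args });
    // Preserve `this` and the return value so callers see no difference.
    return original.apply(this, args);
  };
  // Return a function that undoes the patch.
  return () => {
    target[name] = original;
  };
}

const calls = [];
const restore = intercept(fs, 'existsSync', calls);

fs.existsSync('.'); // recorded
restore();
fs.existsSync('.'); // not recorded, the original function is back

console.log(calls.length); // 1
```

Because every CommonJS `require('fs')` returns the same object, a patch applied at your entrypoint is also seen by any module loaded afterwards.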

Configurability

I wanted to support different ways of running the tool for your existing application, much like dotenv does: include it programmatically or add an -r instrument/config flag to your command.
On top of these two alternatives, you can use a standalone configuration file ("instrument.config.js") or pass your own configuration as a parameter:

// at your entrypoint file
require('instrument')({
  summary: true,
  output: 'instrument-logs.txt'
})

My server is not a one-off

For cases when you're running a server and your process doesn't run to completion, you can still get the instrumentation running and separate its output from your original process.

Reducing external noise

In most Node.js applications it's not uncommon to have a large list of dependencies, and sometimes you may not be interested in instrumenting the calls originated by your dependencies. Based on what you're looking for, you can switch on/off these logs by using the dependencies property from the configuration.
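One way to tell whether a call came from a dependency is to look at the stack trace and check if the nearest caller lives under node_modules. The sketch below shows that idea only; I'm not claiming it is how instrument decides, and `originatesFromDependency` is a hypothetical name.

```javascript
// Matches a `node_modules` path segment on both POSIX and Windows paths.
const NODE_MODULES = /[\\/]node_modules[\\/]/;

// Hypothetical helper: given an Error stack string in V8's format, reports
// whether the nearest caller frame (after skipping `skip` frames that belong
// to the instrumentation itself) lives inside node_modules.
function originatesFromDependency(stack, skip = 0) {
  const frames = stack
    .split('\n')
    .filter((line) => line.trim().startsWith('at '));
  const caller = frames[skip];
  return caller !== undefined && NODE_MODULES.test(caller);
}

// Hand-written stacks, used here so the example does not depend on file layout.
const fromApp = 'Error\n    at readConfig (/app/src/config.js:10:5)';
const fromDep = 'Error\n    at load (/app/node_modules/some-lib/index.js:3:9)';

console.log(originatesFromDependency(fromApp)); // false
console.log(originatesFromDependency(fromDep)); // true
```

Inside a real wrapper you would pass `new Error().stack` and a `skip` value that jumps over the wrapper's own frames.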

You can also choose the modules you want to instrument instead of having them enabled by default:

require('instrument')({
  modules: ['http', 'https']
})

The configuration above will only instrument the modules http and https.

Require-tree

I figured another useful feature would be to instrument the required modules. A great tool for this is madge, which focuses on modules of your own and not node_modules dependencies. For my case, I chose a slightly different approach: you can choose to include the dependencies required by your dependencies or to include only one level of dependencies.
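For a rough idea of how a require-tree can be collected at runtime, here's a sketch that wraps `Module.prototype.require`, the method behind CommonJS `require` calls, and records who required what. It's an illustration under that assumption rather than instrument's actual code, and the `edges` structure is hypothetical.

```javascript
const Module = require('module');

// Each entry records one `require` call: the requiring file and the requested id.
const edges = [];
const originalRequire = Module.prototype.require;

Module.prototype.require = function (id) {
  // `this` is the module doing the requiring.
  edges.push({ from: this.filename || this.id, to: id });
  return originalRequire.apply(this, arguments);
};

require('os'); // recorded
require('path'); // recorded

// Undo the patch once we are done collecting.
Module.prototype.require = originalRequire;

console.log(edges.map((edge) => edge.to));
```

Grouping the entries by `from` gives you the tree, and filtering on whether `from` contains node_modules is one way to stop at the first level of dependencies.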

As an example, I exported the require tree from a run of npm ls and graphed it using D3.js to end up with this SVG.

Measuring the overhead

Even if this experiment is a thin layer on top of some APIs it could have unexpected outcomes in terms of performance. I took fastify-benchmark and ran a few of the most common frameworks for HTTP servers with and without instrument enabled:

Library/framework	Throughput difference
built-in http	-11.32%
connect	-4%
express	-0.37%
fastify	-8.8%
hapi	-0.72%

Another benchmark I tinkered with was benchmarks-of-javascript-package-managers but for some reason I couldn't get consistent results out of the instrumented processes.

I see this mostly as a thought exercise, since my goal with this experiment is to target local development environments, where the performance overhead shouldn't matter that much.

Room for improvement

There were a lot of built-in modules or globals I didn't intercept because I failed to see any added value (like process, path, util and the list goes on). It doesn't mean that those couldn't be instrumented, it would just take more time.
An interesting feature would be to measure the time for each of the calls that are instrumented, but it would require some extra work on figuring out how to graph the data to make sense of it.
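As a starting point for the timing idea, here's a minimal sketch that wraps a function and accumulates the call count and total elapsed time per name. None of this exists in instrument today; `timed` and the `stats` map are hypothetical, and the sketch only covers synchronous time.

```javascript
const { performance } = require('perf_hooks');

// Hypothetical helper: wraps `fn` so every call updates `stats` with the
// number of calls and the total elapsed milliseconds recorded under `name`.
function timed(name, fn, stats) {
  return function (...args) {
    const start = performance.now();
    try {
      return fn.apply(this, args);
    } finally {
      // Runs whether `fn` returned or threw, so failures are timed too.
      const entry = stats.get(name) || { calls: 0, totalMs: 0 };
      entry.calls += 1;
      entry.totalMs += performance.now() - start;
      stats.set(name, entry);
    }
  };
}

const stats = new Map();
const timedStringify = timed('JSON.stringify', JSON.stringify, stats);

timedStringify({ a: 1 });
timedStringify([1, 2, 3]);

console.log(stats.get('JSON.stringify').calls); // 2
```

For callback or promise based APIs the wrapper returns almost immediately, so you would need to stop the clock when the callback fires or the promise settles instead.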
The code is Open Source in case you want to take a look or if you find a bug:

a0viedo / instrument

Tool to collect information about Node.js native module calls

Another relevant question would be: could you monkey-patch other objects or functions that are inherent to JavaScript? The answer is yes!

const originalReference = Promise.resolve;
Promise.resolve = (...params) => {
  console.log('calling Promise.resolve', params);
  return originalReference.call(Promise, ...params);
}

It doesn't mean you should, though. Not only can I not think of a good reason for anyone to do it, but it could also clutter your stack traces severely.
In the following weeks I want to explore exporting hooks from instrument to be able to build a warning mechanism for filesystems with strict permissions.

Shout-out to Marco Buono for donating the "instrument" package name!

DEV Community