DEV Community

Craig β˜ οΈπŸ’€πŸ‘»
Craig β˜ οΈπŸ’€πŸ‘»

Posted on

Lazy debug logging for Node.js

This may come as a surprise, but I can be very lazy. If I can get good results without much effort, then πŸ₯³! So the idea of adding debug logging to a whole codebase sounds absolutely exhausting to me, and I want to find a better (lazier) way...

The problem:

I have a tool that I've been working on for a while, and debugging it can be kind of a pain - especially when it's running inside VS Code. It'd be nice to have an easy way to get information about what is going on when the tool runs, without having to manually write a bunch of debug logging code and release a new version. That means that the usual approaches are not going to work:

  • no straight up console.log() statements
  • no debug(function foo () {}) wrappers
  • no fancy @debug decorators

I want something I can set and forget, and probably turn on and off via an environment variable. Is this even possible?

Attempt 1 - override Function.prototype.constructor:

I know that in the past I've done something where I wrapped every function call in some logging, so I want to try to get that working again. I'm pretty sure I did it by overwriting Function.prototype.constructor, which is a pretty insane hack πŸ˜…

JavaScript is a very flexible language, so I can just do stuff like this:

Function.prototype.constructor = (() => {
    const original = Function.prototype.constructor;
    return function (...args) {
        // do stuff...
        return original.call(this, ...args);
    };
})();
Enter fullscreen mode Exit fullscreen mode

I can store a reference to the original constructor function, and replace it with a new function that does some debug logging, and then calls the original function. Seems pretty smart! Job done!

Unfortunately, this doesn't actually work... 😐

function foo () { return 'foo'; } <- Function.prototype.constructor doesn't get called here
foo();
Enter fullscreen mode Exit fullscreen mode

So perhaps I can overwrite Function.prototype.call instead?

Function.prototype.call = (() => {
    const original = Function.prototype.call;
    return function (...args) {
        // do stuff...
        return original.apply(this, args);
    };
})();

function foo () { return 'foo'; }
foo(); // <- Function.prototype.call doesn't get called here either 
Enter fullscreen mode Exit fullscreen mode

I know I've done this in the past, so I must just be missing something. Off to Stack Overflow I go!

I found a few relevant questions/answers, like this one, which does exactly what I just tried? I also found an answer for this exact problem that I wrote 7 years ago - thanks past-Craig! πŸ¦„

This is one alternative, not entirely sure how reliable it is though, it feels a bit wrong:

(function () {
  var oldCall = Function.prototype.call;
  var newCall = function(self) {
    Function.prototype.call = oldCall;
    console.log('Function called:', this.name);
    var args = Array.prototype.slice.call(arguments, 1);
    Function.prototype.call = newCall;
    this.apply(self, args);
  }
  Function.prototype.call = newCall;
})();

…

It even handles a few edge cases, like handling the correct version of this that is passed in, and swapping back to the original for the actual call to console.log... But I tried it again and still no cigar. Maybe it used to work but doesn't now?

I used n to go back to an older version of Node.js (0.12), and it still doesn't seem to work. Maybe my old answer was just wrong?

Now that I think about it, this approach is probably a bit too much anyway - I would get debug logs for every single function call, including internal Node.js functions! Probably a bit too verbose.

Attempt 2 - Wrap require in a Proxy?

JavaScript has evolved quite a bit since I answered that question in 2013 - and one of the things that has been added is meta-programming capabilities. There's a special JavaScript object called a Proxy which is designed for this kind of function wrapping. But I still don't want to have to wrap all functions in a call to new Proxy...

One of the cool things about Node.js is that a lot of it is written in JavaScript, which means I can mess with the core functionality, for example require. I can do the same monkey-patching kind of thing from before.

global.require = (() => {
    const original = global.require;
    return function (...args) {
        const result = original.apply(this, args);
        // do stuff...
        return result;
    };
})();
Enter fullscreen mode Exit fullscreen mode

So now I can examine the result of every call to require and mess with it!

I want to look at each of the object on the result (the module.exports object of the module) and find which ones are functions:

function getFunctions (exports) {
    const functions = {};
    Object.keys(exports)
      .filter((functionName) => typeof exports[functionName] === 'function')
      .forEach((functionName) => {
        functions[functionName] = map[functionName];
      });
    return functions;
}
Enter fullscreen mode Exit fullscreen mode

So I can change the // do stuff to something like:

global.require = (() => {
    const original = global.require;
    return function (...args) {
        const exports = original.apply(this, args);
        const functions = getFunctions(exports);
        // wrap functions:
        return exports;
    };
})();
Enter fullscreen mode Exit fullscreen mode

And the wrap functions code will look something like this:

Object.keys(functions).forEach((functionName) => {
    Object.defineProperty(exports, functionName, {
        value: new Proxy(exports[functionName], {
            // ... Proxy options:
        })
    });
});
Enter fullscreen mode Exit fullscreen mode

This is looking pretty promising I think! But right now it is still going to wrap every single exported function from every module, and that's quite a lot of functions. It'd be great to exclude all dependencies, and only instrument the code from my library. So how does require work?

require('fs'); // Built in module
require('express'); // Third-party code in the node_modules folder
require('./some/thing'); // Local code for my library
Enter fullscreen mode Exit fullscreen mode

I only want to intercept the last case. I can just intercept any call to require where the argument starts with a ., that should get ./ and ../ calls.

global.require = (() => {
    const original = global.require;
    return function (id) {
        const exports = original.apply(this, [id]);
        if (shouldWrap(id) {
          const functions = getFunctions(exports);
          // wrap functions:
        }
        return exports;
    };
})();

function shouldWrap (id) {
    return id.startsWith('.');
}
Enter fullscreen mode Exit fullscreen mode

Ah, so slight problem. Third-party code in node_modules will refer to other local files in the same module, which also start with ., so I'm still getting too many functions. I need some better way of filtering the require ids!

I need to dig into require a bit more. It turns out that the global require function actually delegates to the Node.js Module class, which has a require function on its prototype. That function internally calls a _resolveFilename function which turns the require id into a file path. We can call that directly, and check if the path contains node_modules. We can also use the list of builtinModules to filter out things like path and fs:

immport { buildinModules, Module } from 'module';

Module.prototype.require = (() => {
    const original = Module.prototype.require;
    return function (id) {
      const requirePath = Module._resolveFilename(id, this);
      const module = original.apply(this, [id]);
      if (module && shouldWrap(requirePath)) {
        // wrap module:
      }
      return module;
    };
})();

function shouldWrap (requirePath) {
    const isNodeModule = builtinModules.includes(requirePath) || requirePath.includes('node_modules');
    return !isNodeModule;
}
Enter fullscreen mode Exit fullscreen mode

One more little issue is that the require function isn't just a function - it's also an object that has some other stuff on it like require.resolve and require.cache. Right now I am just clobbering everything, so I have to do that a bit better:

immport { buildinModules, Module } from 'module';

const original = Module.prototype.require;
const debugRequire = function (id) {
  // ...
};
Module.prototype.require = Object.assign(debugRequire, original);
Enter fullscreen mode Exit fullscreen mode

Okay, that's sorted, so how will the actual wrapping work? So far I have this:

Object.keys(functions).forEach((functionName) => {
    Object.defineProperty(exports, functionName, {
        value: new Proxy(exports[functionName], {
            // ... Proxy options:
        })
    });
});
Enter fullscreen mode Exit fullscreen mode

The way a Proxy works is that you have a set of traps that describe different operations in JavaScript. I care about functions, so there are two traps that I care about, apply and construct.

Proxy apply trap:

The apply trap gives you control to what happens when a function is called - exactly what I want! The basic form of an apply trap handler is like this:

function apply (target, thisArg, argumentsList) {
    return target.apply(thisArg, argumentsList);
}
Enter fullscreen mode Exit fullscreen mode

I can add a bit more stuff to make it more useful:

import { performance } from 'perf_hooks';

function apply (target, thisArg, args) {
    const startTime = performance.now();
    console.log(args);
    const result = target.apply(thisArg, args);
    const endTime = performance.now();
    console.log(result);
    console.log(endTime - startTime);
    return result;
}
Enter fullscreen mode Exit fullscreen mode

One other thing is that result could be a promise, so I need to handle that case as well. Usually I'd prefer to not use the .then() syntax, but using async would make this function always return a Promise, which I don't want!

import { types } from 'util';

function apply (target, thisArg, args) {
    // ...
    const result = target.apply(thisArg, args);
    if (types.isPromise(result)) {
        return result.then((result) => {
            const endTime = performance.now();
            console.log(result);
            console.log(endTime - startTime);
            return result;
        });
      }
    // ...
}
Enter fullscreen mode Exit fullscreen mode

Nice, that's pretty rough but it mostly works!

Proxy construct trap:

The construct trap gives you control to what happens when a function is called with new! The basic form of a construct trap handler is like this:

function construct (target, args) {
    return new target(...args);
  }
Enter fullscreen mode Exit fullscreen mode

I'm going to do basically the same thing as the apply trap, but I don't need to worry about Promises, since constructors are synchronous.

import { performance } from 'perf_hooks';

function construct (target, args) {
    const startTime = performance.now();
    console.log(args);
    const instance = new target(...args);
    const endTime = performance.now();
    console.log(endTime - startTime);
    console.log(instance);
    return instance;
}
Enter fullscreen mode Exit fullscreen mode

It'd be great if I could also wrap any methods of the instance too! I can access them via target.prototype and use the getFunctions function again:

function construct (target, args) {
    // ...
    const proto = target.prototype;
    const prototypeFunctions = getFunctions(proto);
    Object.keys(prototypeFunctions).forEach((functionName) => {
      Object.defineProperty(proto, functionName, {
        value: new Proxy(proto[functionName], {
          apply: applyTrapHandler
        })
      });
    });
    // ...
    return instance;
}
Enter fullscreen mode Exit fullscreen mode

Okay so now I have Proxies that wrap Proxies 🀯, but it should still work!

Adding function names:

One thing that would be good to add is for the traps to log the name of the functions. I have access to the name everywhere I need it, but I need the trap to know about it:

function createFunctionCallWrap (name) {
  return function apply (target, thisArg, args) {
      // ...
      console.log(name);
      const result = target.apply(thisArg, args);
      // ...
      return result;
  };
}

function createConstructorCallWrap (name) {
  return function apply (target, thisArg, args) {
      // ...
      console.log(name);
      const instance = new target(...args);
      // ...
      return instance;
  };
}
Enter fullscreen mode Exit fullscreen mode

Nice! Now I get logs for all function calls, all constructor calls, and all instance method calls!

Handling function arguments:

It would also be good to have logging for any function which are passed as arguments (like callback functions). Both traps give me access to the function arguments, so I just need to add another wrapper:

function wrapArgs(args) {
    return args.map((arg) => {
        if (typeof arg !== 'function') {
            return arg;
        }
        return new Proxy(arg, {
            apply: ...
        });
    });
  }
Enter fullscreen mode Exit fullscreen mode

Okay but now I have a little problem. In all the other cases I have access to the name of the function or constructor that is being wrapped, whether it is from the module.exports object, or the target.prototype object. I don't have the names of the arguments to the function, so I can't know the name of any argument that happens to be a function!

I do have access to the function though, which means I can call toString() and get the full text of the function.

function foo (callback) { callback(); };

foo.toString() // "function foo (callback) { callback(); };"
Enter fullscreen mode Exit fullscreen mode

I could do some RegExp magic and read the argument name?

const [argsString] = foo.toString().match(/function .* \((.*?)\)/);
const argNames = argsString.split(', ');
Enter fullscreen mode Exit fullscreen mode

This is going to fall apart very quickly - I'd have to handle async functions, generators, arrow functions, optional arguments, etc. etc.

But I could use a proper parser and it could handle all that for me!

import { parseScript } from 'esprima';

function getFunctionArgumentNames(func) {
    const AST = parseScript(`const f = ${func.toString()}`)
    // find argument names in AST
  }
Enter fullscreen mode Exit fullscreen mode

I can use esprima to get an Abstract Syntax Tree for the function. If func is an anonymous function (like function () {}) the parser will throw, so I need the const f = bit to handle that case.

Once I've got the AST, I can query it for the bits I care about:

import { parseScript } from 'esprima';
import { query } from 'esquery';

function getFunctionArgumentNames(func) {
    const [func] = query(parseScript(`const f = ${func.toString()}`), '[type=/Function/]');
    return func.params.map((param) => {
      const [identifier] = query(param, 'Identifier');
      return identifier.name;
    });
  }
Enter fullscreen mode Exit fullscreen mode

I find the first (and only) function in the AST (which is the representation of the target function), and get its list of params. This could be normal parameters like a, b, c, or complex parameters like d = 6, e = getE(). In either case I want to grab the first Identifier, and return its name. That would be 'a', 'b', 'c', 'd', and 'e' in my above examples.

Now I can finish off wrapArgs

function wrapArgs(argNames, args) {
    return args.map((arg, index) => {
        if (typeof arg !== 'function') {
            return arg;
        }
        return new Proxy(arg, {
            apply: createFunctionCallWrap(argNames[index])
        });
    });
  }
Enter fullscreen mode Exit fullscreen mode

And change createFunctionCallWrap and createConstructorCallWrap:


function createFunctionCallWrap (name) {
  return function apply (target, thisArg, args) {
      // ...
      const result = target.apply(thisArg, wrapArgs(args));
      // ...
      return result;
  };
}

function createConstructorCallWrap (name) {
  return function construct (target, args) {
      // ...
      const instance = new target(...wrapArgs(args));
      // ...
      return instance;
  };
}
Enter fullscreen mode Exit fullscreen mode

Nicer logs:

Okay, so now I have wrapped all functions, constructors, prototype methods, and any of their arguments that happen to be functions. And each time any of those are called I have access to their name, arguments, return value, and start and end times. It would be awesome to render these as a list with each entry and exit on their own line so I can scan it and see how the program ran.

It would also be cool to show somehow how deep in the call stack the program is. I can track that by incrementing a count before executing the real function, and decrementing it once it has run. I'm going to wrap all this logic up into logStart and logEnd functions:

let depth = 0;

function logStart (name, args) {
    depth += 1;
    let debugString = printDepth(name);
    debugString += ` args: ${JSON.stringify(args)}`;
    process.stdout.write(`${debugString}\n`);
}

function logEnd (name, startTime, endTime, result) {
    let debugString = printDepth(name);
    debugString += ` time: ${endTime - startTime}ms`;
    debugString += ` return: ${JSON.stringify(result)}`;
    process.stdout.write(`${debugString}\n`);
    depth -= 1;
}

function printDepth (name) {
    return `${'β–Έ'.repeat(depth)} ${name}`;
}
Enter fullscreen mode Exit fullscreen mode

One issue with this is that if the args or result have any circular references in them, then JSON.stringify will throw. Node.js has already solved that for me though, so I can replace JSON.stringify with util.inspect.

I'm going to wrap that in a little helper so I can have all the options in one place:

import { inspect } from 'util';

function printObject(object) {
    return inspect(object, { getters: true, depth: Infinity }).replace(/\n/g, '');
}
Enter fullscreen mode Exit fullscreen mode

This will print all the details about an object that I need, and then strip any new lines so that each log only uses one line.

So with some dummy code this is the output I get:

β–Έ init args: []
β–Έβ–Έ Fixture args: [ 1, [ 'hello' ], [Function] ]
β–Έβ–Έ Fixture return: Fixture { _a: 1, _b: [ 'hello' ], _c: [Function] }
β–Έ init return: Fixture { _a: 1, _b: [ 'hello' ], _c: [Function] }
β–Έ Fixture.func args: []
β–Έβ–Έ Fixture._func args: [ 1 ]
β–Έβ–Έβ–Έ _c args: [ 1 ]
β–Έβ–Έβ–Έ _c return: 1
β–Έβ–Έ Fixture._func return: [ 'hello' ]
β–Έ Fixture.func return: undefined
Enter fullscreen mode Exit fullscreen mode

Adding this into my library:

So this is all going great and I want to try to add it to library. Since I want to turn it on and off with a environment variable, I have wrapped it all in a function like this:

function debug () {
    if (process.env.DEBUG) {
        // Overwrite require so it wraps everything.
    }
};
Enter fullscreen mode Exit fullscreen mode

I've added the debug() call inside library and now when I run it nothing happens... The code looks something like this:

import { blah } from './blah';
import { whatever } from './whatever';

export function library () {
    debug();
    // run library
}
Enter fullscreen mode Exit fullscreen mode

And when I put a debugger; inside the overwritten require call, it is never hit... 😐

So there are two things going on here.

1) All the import ... statements are converted into const { blah } = require('./blah'); calls.
2) The whole dependency graph is traversed long before my call to debug() happens.

All the calls to require happen before I've even turned on my debug mode...

To check that I'm right, I move the debug() call to the first line of my index file, so now it looks more like:

debug();

import { blah } from './blah';
import { whatever } from './whatever';

export function betterer () {
    // run Betterer
}
Enter fullscreen mode Exit fullscreen mode

And cool, now it works! πŸ˜…

This is probably fine, but it means that the tool is much less configurable, and I can't change settings in between different Betterer runs. Maybe that's good enough?

Wrapping require.cache:

I think I can do better than that! As a Node.js application executes, and the dependency graph is traversed, a special cache on the require function is populated. I can access this cache using require.cache, and it contains the complete exports of any given dependency.

When I call debug() I can iterate over that cache and wrap all the functions that have already been required:

Object.keys(require.cache).forEach((requirePath) => {
    const module = require.cache[requirePath];
    if (module && shouldWrap(requirePath)) {
      wrap(module.exports);
    }
});
Enter fullscreen mode Exit fullscreen mode

Boom, just like that. Now I can have my debug() call wherever I want and all past and future requires will be wrapped correctly!

Wrapping up 😎

So that's pretty neat and is working quite well! I'm quite happy with how it ended up, cause it looked pretty unlikely here and there! I've taken my prototype from here, rewritten it in TypeScript, added a bunch of options, and released it on npm at @phenomnomnominal/debug. Maybe that will be of some use to someone else?!

Hit me up with questions/comments/suggestions πŸŽ‰

Top comments (0)