
Five Misconceptions on How NodeJS Works

By Deepal Jayasekara. Originally published at blog.insiderattack.net.

Background Image Courtesy: https://wallpapercave.com/mythology-wallpapers

This article is based on a Brown Bag session I did at comparethemarket.com on “Five Misconceptions on How NodeJS Works”.

NodeJS was born in 2009 and has gained massive popularity over the years for one reason: it’s just JavaScript! Well, it’s a JavaScript runtime designed for writing server-side applications, but the statement that “it’s just JavaScript” is not 100% true.

JavaScript is single-threaded, and it was not designed to run on the server side, where scalability is a critical requirement. With Google Chrome’s high-performance V8 JavaScript engine, the super cool asynchronous I/O implementation of libuv, and a few other spicy additions, NodeJS was able to bring client-side JavaScript to the server side, making it possible to write super-fast web servers in JavaScript that can handle thousands of socket connections at a time.

Building Blocks of Node JS

NodeJS is a massive platform built from a bunch of interesting building blocks, as the above diagram describes. However, because they lack an understanding of how these internal pieces work, many NodeJS developers make false assumptions about the behaviour of NodeJS and develop applications that suffer from serious performance issues as well as hard-to-trace bugs. In this article, I’m going to describe five such false assumptions which are quite common among NodeJS developers.

Misconception 1 — EventEmitter and the Event Loop are related

The NodeJS EventEmitter is used extensively when writing NodeJS applications, but there’s a misconception that the EventEmitter has something to do with the NodeJS Event Loop, which is incorrect.

The NodeJS Event Loop is the heart of NodeJS, providing its asynchronous, non-blocking I/O mechanism. It processes completion events from different types of asynchronous operations in a particular order.

(Please check out my article series on the NodeJS Event Loop, if you are not familiar with how it works!)

NodeJS Event Loop

In contrast, the NodeJS EventEmitter is a core NodeJS API which allows you to attach listener functions to a particular event; they will be invoked once the event is fired. This behaviour looks asynchronous because the event handlers are usually invoked at a later time than when they were registered.

An EventEmitter instance keeps track of all events and their listeners within the instance itself. It does not schedule anything in the event loop queues. The data structure where this information is stored is merely a plain old JavaScript object whose property names are the event names (or “types”, as some may call them) and whose property values are either a single listener function or an array of listener functions.

Oversimplified Diagram of how event handlers are attached to an event in EventEmitter

When the emit function is called on the EventEmitter instance, the emitter will SYNCHRONOUSLY invoke the listener functions registered to the event in a sequential manner.
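As a rough mental model only (this is not the actual NodeJS source), the storage and dispatch described above can be sketched as follows. `MiniEmitter` is a hypothetical name, and the sketch leaves out `once`, error handling, and many other details of the real EventEmitter.

```javascript
// Hypothetical, heavily simplified emitter: a sketch, not the real implementation.
class MiniEmitter {
  constructor() {
    // Plain object: event name -> one listener function, or an array of them.
    this._events = Object.create(null);
  }

  on(type, listener) {
    const existing = this._events[type];
    if (existing === undefined) {
      this._events[type] = listener; // first listener: store the function itself
    } else if (typeof existing === 'function') {
      this._events[type] = [existing, listener]; // second listener: switch to an array
    } else {
      existing.push(listener);
    }
    return this;
  }

  emit(type, ...args) {
    const handlers = this._events[type];
    if (handlers === undefined) return false;
    const list = typeof handlers === 'function' ? [handlers] : [...handlers];
    // A plain synchronous loop: nothing is queued on the event loop.
    for (const handler of list) handler.apply(this, args);
    return true;
  }
}

const mini = new MiniEmitter();
const calls = [];
mini.on('ping', (n) => calls.push(`first:${n}`));
mini.on('ping', (n) => calls.push(`second:${n}`));
mini.emit('ping', 1);
calls.push('after emit');
console.log(calls); // [ 'first:1', 'second:1', 'after emit' ]
```

Because `emit` is just a loop over stored functions, both listeners have already run by the time the line after `emit` executes.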

If you consider the following snippet:
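The original snippet is not reproduced here. A minimal sketch that produces the output listed below could look like this; the variable name and handler bodies are assumptions inferred from that output.

```javascript
const EventEmitter = require('events');

const emitter = new EventEmitter();

// Three listeners registered for the same event, in this order.
emitter.on('myevent', () => console.log('handler1: myevent was fired!'));
emitter.on('myevent', () => console.log('handler2: myevent was fired!'));
emitter.on('myevent', () => console.log('handler3: myevent was fired!'));

// emit() runs all three listeners before it returns.
emitter.emit('myevent');
console.log('I am the last log line');
```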

The output of the above snippet would be:

handler1: myevent was fired!
handler2: myevent was fired!
handler3: myevent was fired!
I am the last log line

Since the event emitter synchronously executes all the event handlers, the line I am the last log line won’t be printed until all the listener functions are invoked.

Misconception 2—All callback-accepting functions are asynchronous

Whether a function is synchronous or asynchronous depends on whether it creates any asynchronous resources during its execution. By this definition, a given function is asynchronous if it:

  • Calls a native JavaScript/NodeJS asynchronous function (e.g., setTimeout, setInterval, setImmediate, process.nextTick, etc.)
  • Calls a native NodeJS async function (e.g., async functions in child_process, fs, net, etc.)
  • Uses the Promise API (including async/await)
  • Calls a function from a C++ addon which is written to be asynchronous (e.g., bcrypt)

Accepting a callback function as an argument does not make a function asynchronous. However, usually asynchronous functions do accept a callback as the last argument (unless it’s wrapped to return a Promise). This pattern of accepting a callback and passing the results to the callback is called the Continuation Passing Style. You can still write a 100% synchronous function using the Continuation Passing Style.
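For example, the following function (a hypothetical `addCps`, not taken from any library) accepts a callback yet does all of its work synchronously, so the callback runs before the call returns.

```javascript
// Hypothetical continuation-passing function that performs no asynchronous work.
function addCps(a, b, callback) {
  callback(null, a + b); // invoked immediately, while addCps is still on the stack
}

const order = [];
addCps(1, 2, (err, sum) => order.push(`callback: ${sum}`));
order.push('after addCps');

console.log(order); // [ 'callback: 3', 'after addCps' ]
```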

Synchronous and asynchronous functions differ significantly in how they use the stack during execution.

A synchronous function occupies the stack for the entire duration of its execution, so nothing else can occupy the stack until it returns.

In contrast, an asynchronous function schedules some async task and returns immediately, removing itself from the stack. Once the scheduled async task is completed, any callback provided will be called, and the callback function will be the one that occupies the stack. At this point, the function which initiated the async task is no longer on the stack since it has already returned.

With the above definition in mind, try to determine whether the following function is asynchronous or synchronous.
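The function in question is not reproduced here. A sketch consistent with the description that follows might look like this; the name `writeToMyFile` and the error message are assumptions.

```javascript
const fs = require('fs');

// Hypothetical function whose timing depends on its input.
function writeToMyFile(data, callback) {
  if (!data) {
    // Falsy data: the callback runs right away, before writeToMyFile returns.
    callback(new Error('No data provided'));
  } else {
    // Truthy data: the callback runs later, once the file write has finished.
    fs.writeFile('myfile.txt', data, callback);
  }
}
```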

In fact, the above function can be either synchronous or asynchronous, depending on the value passed as data.

If data is a falsy value, the callback will be called immediately with an error. In this execution path, the function is 100% synchronous as it does not perform any asynchronous task.

If data is a truthy value, it’ll write data into myfile.txt and will call the callback after the file I/O operation is completed. This execution path is 100% asynchronous due to the async file I/O operation.

Writing a function in such an inconsistent way (where it behaves both synchronously and asynchronously) is highly discouraged because it makes an application’s behaviour unpredictable. Fortunately, these inconsistencies can easily be fixed as follows:
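The fixed version is not reproduced here either. A sketch of such a fix, using a hypothetical `writeToMyFile` function with the behaviour described above, could be:

```javascript
const fs = require('fs');

// Hypothetical function: both execution paths now invoke the callback asynchronously.
function writeToMyFile(data, callback) {
  if (!data) {
    // Defer the error callback so it never runs before writeToMyFile returns.
    process.nextTick(callback, new Error('No data provided'));
  } else {
    fs.writeFile('myfile.txt', data, callback);
  }
}
```

Extra arguments given to `process.nextTick` are passed on to the callback, which avoids wrapping it in another function.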

process.nextTick can be used to defer the invocation of the callback function, thereby making that execution path asynchronous as well.

Alternatively, you can use setImmediate instead of process.nextTick in this case, which will give more or less the same result. However, process.nextTick callbacks have a higher priority: they run as soon as the current operation completes, before the event loop moves on, so they are invoked earlier than setImmediate callbacks.
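A small sketch of that ordering (the `sequence` array is only there to record what ran when):

```javascript
const sequence = [];

setImmediate(() => sequence.push('setImmediate'));
process.nextTick(() => sequence.push('nextTick'));
sequence.push('synchronous code');

// Logged from a later setImmediate, after both callbacks above have run.
setImmediate(() => console.log(sequence));
// [ 'synchronous code', 'nextTick', 'setImmediate' ]
```

Even though setImmediate was called first, the nextTick callback runs before it.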

If you need to learn more about the difference between process.nextTick and setImmediate, have a look at the following article from my Event Loop series.

Misconception 3— All CPU-intensive functions are blocking the event loop

It is widely held that CPU-intensive operations block the Node.js Event Loop. While this statement is true to a certain extent, it is not 100% true, as there are some CPU-intensive functions which do not block the event loop.

In general, cryptographic operations and compression operations are highly CPU-bound. For this reason, there are async versions of certain crypto and zlib functions which perform their computations on the libuv thread pool so that they do not block the event loop. Some of these functions are:

  • crypto.pbkdf2()
  • crypto.randomFill()
  • crypto.randomBytes()
  • All zlib async functions
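As an illustration, the sketch below starts an asynchronous crypto.pbkdf2() call and then schedules a timer. The password, salt, and iteration count are arbitrary values chosen for the example. On a typical machine the timer fires while the key derivation is still running, although the exact order depends on how long the hashing takes.

```javascript
const crypto = require('crypto');

const events = [];

// Asynchronous pbkdf2: the key derivation runs on a libuv thread pool thread.
const derived = new Promise((resolve, reject) => {
  crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err, key) => {
    if (err) return reject(err);
    events.push('pbkdf2 done');
    resolve(key);
  });
});

// The event loop is free to handle this timer while the hashing is in progress.
setTimeout(() => events.push('timer fired'), 0);
events.push('main script finished');

derived.then((key) => console.log(events, key.length));
```

Replacing the call with crypto.pbkdf2Sync() would keep the main thread busy until the key is derived, and the timer could only fire afterwards.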

However, as of this writing, there’s no way to run a CPU-intensive operation on the libuv thread pool using pure JavaScript. You can, however, write your own C++ addon, which gives you the ability to schedule work on the libuv thread pool. Certain third-party libraries (e.g. bcrypt) perform CPU-intensive operations and use C++ addons to implement asynchronous APIs for them.

Misconception 4— All asynchronous operations are performed on the thread pool

Modern operating systems have built-in kernel support for native asynchronous network I/O using efficient event notification mechanisms (e.g., epoll on Linux, kqueue on macOS, IOCP on Windows). Therefore, network I/O is not performed on the libuv thread pool.

However, when it comes to file I/O, there are a lot of inconsistencies across operating systems and, in some cases, within the same operating system. This makes it extremely hard to implement a generalised, platform-independent API for file I/O. Therefore, file system operations are performed on the libuv thread pool to expose a consistent asynchronous API.

The dns.lookup() function in the dns module is another API which utilises the libuv thread pool. The reason is that resolving a domain name to an IP address with dns.lookup() is a platform-dependent operation (it relies on the operating system’s getaddrinfo call), and it is not purely network I/O.

You can read more about how NodeJS handles different I/O operations here:

Misconception 5— NodeJS should not be used to write CPU-intensive applications

This is not really a misconception, but rather a once well-known fact about NodeJS which became obsolete with the introduction of Worker Threads in Node v10.5.0. Although it was introduced as an experimental feature, the worker_threads module has been stable since Node v12 LTS and is therefore suitable for production applications with CPU-intensive operations.

Each Node.js worker thread has its own V8 instance and its own event loop. (The libuv thread pool, on the other hand, is shared by the whole process.) Therefore, one worker thread performing a blocking CPU-intensive operation does not affect the other threads’ event loops, leaving them available for any incoming work.
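A minimal sketch of offloading a CPU-bound loop to a worker is shown below. The helper name `sumInWorker` is hypothetical, and the worker code is passed as a string with `eval: true` only to keep the example in one file; a separate worker file is the more usual arrangement. It assumes a Node version where worker_threads is available without a flag.

```javascript
const { Worker } = require('worker_threads');

// Code that runs inside the worker thread.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.limit; i++) sum += i; // CPU-bound loop
  parentPort.postMessage(sum);
`;

// Hypothetical helper: runs the summation in a worker and resolves with the result.
function sumInWorker(limit) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { limit } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

const result = sumInWorker(1e7);

// The main thread's event loop is not blocked by the loop running in the worker.
setImmediate(() => console.log('main thread is still responsive'));
result.then((sum) => console.log('sum from worker:', sum));
```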

If you are interested in learning how Worker Threads work in detail, I encourage you to read the following article:

However, at the time of this writing, IDE support for worker threads is not the greatest. Some IDEs do not support attaching the debugger to code running inside a worker thread other than the main thread. However, development support will mature over time, as a lot of developers have already started adopting worker threads for CPU-bound operations such as video encoding.

I hope you learned something new after reading this article, and please feel free to provide any feedback you have by responding to this.
