Simon Plenderleith

Posted on Nov 9, 2020 • Originally published at simonplend.com on Nov 9, 2020

Notes from NodeConf Remote 2020

#node #nodeconf #notes

NodeConf Remote 2020 was held on November 2nd to 6th as a free online event, standing in for the annual NodeConf EU conference, which couldn’t be held in person because of the pandemic. I tuned in when I could during the two days of excellent talks and wrote up notes on the talks I watched.

Fair warning: The talk notes I’m sharing here are in a fairly raw state and provided with no warranty – I tried to make sure I noted down all the details accurately, but I can’t guarantee that it’s all 100% correct!

You can watch all of the talks from NodeConf Remote 2020 on the NearForm YouTube channel.

Jump links

Talk: Aaaaaaaaaaaaaah, They’re Here! ES Modules in Node.JS
Talk: Can we double HTTP client throughput?
Talk: AsyncLocalStorage: usage and best practices
Talk: Examining Observability in Node.js
Talk: Node.js startup performance

Talk: Aaaaaaaaaaaaaah, They’re Here! ES Modules in Node.JS

Speaker

Gil Tayar (@giltayar)

One big takeaway

The ECMAScript modules (ESM) implementation in Node.js is in a very mature state, so it’s a great time to start gradually migrating your packages and applications from CommonJS modules. You might also be able to ditch Babel if you’re only using it for import/export support.

Talk abstract

Yes, they’re here. Node v13.2.0 marked the first version of Node.JS where ESM support is unflagged, meaning you can start using ES Modules. It’s been a long, four year journey from defining them in the spec (June 2015!) till they could be used in Node.JS (November 2019).

Why did it take so long? What were the major hurdles? Should we migrate? How does the migration path look like? Are they really better than CommonJS Modules? What is in store for the future?

Gil Tayar, a former member of the Node.JS Modules Working Group, and now just a passionate observer of it, will try and navigate these confusing waters, and hopefully make you understand why, when, and how to migrate your Node.JS code to use ES Modules.

My notes

Unfortunately I missed a big chunk of this talk, but my main takeaway was the reasons why ECMAScript modules (ESM) are better than CommonJS (CJS) modules or Babel transforms. It’s because they’re:

Strict
Browser compatible – whoop, standards!
Statically parsed
Async + supports top-level await
Native 🎉

Talk: Can we double HTTP client throughput?

Speaker

Matteo Collina (@matteocollina)

One big takeaway

If you’re running Node.js microservices which make HTTP requests to each other, to keep things fast, you should 1. always create an HTTP agent with keepAlive, 2. use HTTP pipelining. Undici can take care of both of these things for you, and is capable of a much higher throughput than the Node.js core http module.

Talk abstract

The Node.js HTTP client is a fundamental part of any application, yet many think it cannot be improved. I took this as a challenge and I’m now ready to present a new HTTP client for Node.js, undici, that doubles the throughput of your application.

The story behind this improvement begins with the birth of TCP/IP and it is rooted in one of the fundamental limitations of networking: head-of-line blocking (HOL blocking). HOL blocking is one of those topics that developers blissfully ignore and yet it deeply impacts the runtime experience of the distributed applications that they build every day. Undici is a HTTP/1.1 client that avoids HOL blocking by using keep-alive and pipelining, resulting in a doubling of your application throughput.

My notes

I missed a bunch of this talk too 🙈 I’m planning to watch the full talk video (link below) to learn more about the TCP fundamentals which affect HTTP request performance in Node.js, but here’s what I noted live on the day:

Microservices typically communicate with each other over HTTP/1.1 – without any tuning, requests can get slow ← I’ve personally experienced this in projects I’ve worked on in the past.
To have decent request throughput you should always create an HTTP agent with keep alive enabled – this allows for connection reuse between requests.
You should also use HTTP pipelining so you can send concurrent HTTP requests over a single connection.
The Undici HTTP/1.1 client allows you to create a "pool" which you can then make requests through. Using Undici with pool + pipelining is FAST: over three times the throughput of the Node.js core http agent with keep-alive 🚀
Main takeaways:
- Always use an http(s).Agent
- Undici can drastically reduce the overhead of your distributed system
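
To make the keep-alive advice concrete, here is a minimal sketch using only the Node.js core http module (not Undici). The helper name countConnections is my own, and the local test server exists purely for illustration. It sends a few sequential requests through a given agent and reports how many TCP connections the server had to accept.

```javascript
const http = require("http");

// Hypothetical helper: sends `count` GET requests one after another through
// `agent` and resolves with the number of TCP connections the server accepted.
async function countConnections(agent, count = 3) {
  let connections = 0;
  const server = http.createServer((req, res) => res.end("ok"));
  server.on("connection", () => {
    connections += 1;
  });

  // Port 0 asks the operating system for any free port.
  await new Promise((resolve) => server.listen(0, "127.0.0.1", resolve));
  const { port } = server.address();

  const get = () =>
    new Promise((resolve, reject) => {
      http
        .get({ host: "127.0.0.1", port, agent }, (res) => {
          res.resume(); // drain the body so the socket can be released
          res.on("end", resolve);
        })
        .on("error", reject);
    });

  for (let i = 0; i < count; i += 1) {
    await get();
  }

  agent.destroy(); // close any idle keep-alive sockets
  await new Promise((resolve) => server.close(resolve));
  return connections;
}
```

Calling countConnections(new http.Agent({ keepAlive: true })) should report that the requests shared a connection, while an agent created with keepAlive: false opens a fresh connection for each request. That connection setup cost is what the talk suggests avoiding.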

Talk: AsyncLocalStorage: usage and best practices

Speaker

Vladimir de Turckheim (@poledesfetes)

One big takeaway

Node.js is constantly evolving and there are some powerful new APIs being implemented that don’t always make headlines. The AsyncLocalStorage API is one of those, and I’m hoping I’ll have an opportunity soon to give it a try.

Talk abstract

During Spring, a curious API was added to Node.js core: AsyncLocalStorage. Most Node.js users are not familiar with the power of such tool.

That’s too bad: it can be used to drastically improve an application’s code and allow building powerful tooling around Node.js applications.

So, let’s discover what this API is and how to use it to leverage the unlimited powers of AsyncLocalStorage (ok, I might have exaggerated it a bit here).

My notes

Given that efforts are being made to align Node.js more closely with browser standards, it seems odd that the AsyncLocalStorage API is named as it is: it has nothing to do with the browser Local Storage API 🤔

In threaded languages e.g. PHP

A request enters the process → a thread is created
The request has its own thread – it’s basically a thread-singleton

In single-threaded world e.g. Node.js, a single thread handles multiple requests.

Exception handling is weird in Node.js: async operations scheduled with nextTick lose the call stack.

Let’s create contexts for asynchronous environments: AsyncLocalStorage is an asynchronous-proof store that can create async contexts for you to use.

Basic example:

const { AsyncLocalStorage } = require("async_hooks");

const context = new AsyncLocalStorage();

context.run(new Map(), () => {
    // Do stuff
});

You can always know what the current request context is. Using process.on('uncaughtException') is normally advised against, but because AsyncLocalStorage lets us create per-request application state, the handler can still work out which request it is dealing with. This allows for unified error handling.
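
Here is a slightly fuller sketch of the "current request context" idea. The names requestContext, currentRequestId, loadUser and handleRequest are hypothetical, and the timer stands in for a real database call. Each simulated request runs inside its own store, and code deeper in the async call chain can read it without the ID being passed as an argument.

```javascript
const { AsyncLocalStorage } = require("async_hooks");

// One shared instance for the whole application.
const requestContext = new AsyncLocalStorage();

// Can be called from anywhere inside a request's async call chain.
function currentRequestId() {
  const store = requestContext.getStore();
  return store ? store.get("requestId") : undefined;
}

// Stand-in for a database call: note that no request ID is passed in.
async function loadUser() {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return `user loaded for request ${currentRequestId()}`;
}

// Everything started inside run() sees this request's Map as its store.
function handleRequest(requestId) {
  const store = new Map([["requestId", requestId]]);
  return requestContext.run(store, () => loadUser());
}
```

Running two requests concurrently, for example with Promise.all([handleRequest("a"), handleRequest("b")]), should give each one its own ID even though their timers overlap.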

Other use cases:

User management – store current user and use in DB abstraction
Monitoring – build your own monitoring tool to log/track/monitor what your apps are doing
Single DB transaction for HTTP request

Key points:

Memory safe and pretty fast.
It’s experimental, but production ready.
It won’t work with queues.
Don’t share AsyncLocalStorage instances.
Don’t create too many AsyncLocalStorage instances.
Consider the store as immutable if using basic types
Use a Map for everything else
Use the run method, but enterWith only if you need to
Call exit() if you are not sure if it will be GCed

Talk: Examining Observability in Node.js

Speaker

Liz Parody (@lizparody23)

One big takeaway

Observing = Exposing internal state of an application so it can be viewed externally and continuously analysed.

Monitoring = Waiting for problems to happen.

Talk abstract

Imagine your productivity and confidence developing web apps without chrome dev tools. Many do exactly that with Node.js.

It is important to observe and learn what’s happening in your app to stay competitive and create the most performant and efficient Node.js applications, following the best practices.

In this talk, we will explore useful tools to examine your Node.js applications and how observability will speed up development, produce better code while improving reliability and uptime.

My notes

What is observability? It’s a measure of how well the internal state of a system can be determined from the outside.

Observing or asking questions from outside the system – no new code should be needed.

Tools to the rescue!

Software becoming exponentially more complex: microservices, Docker, Kubernetes etc. Great for products, hard for humans.

Big growth in observability tools, but hard to choose one.

Why is observability important? Just monitoring for problems not enough – new issues could be "unknown unknowns".

A good observability tool:

Helps you find where problem is
Doesn’t add overhead to app
Has great security
Flexible integrations
Doesn’t require code changes

Observing = Exposing internal state to be externally accessed.

Monitoring = Waiting for problems to happen.

Layers of observability:

Cloud/Network
Service/Host
Node.js
Internals

Node.js + Internals tools

A. Node.js Performance Hooks

Performance monitoring should be part of development process, not an afterthought when problems arise.

Using the perf_hooks module allows you to collect performance metrics from the running Node.js application.

Requires code to implement in your application.
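
As a rough illustration of what that code can look like, here is a sketch that times a synchronous function using performance marks and a PerformanceObserver. The helper name measureSync and the label format are my own choices, not something shown in the talk.

```javascript
const { performance, PerformanceObserver } = require("perf_hooks");

// Hypothetical helper: times a synchronous function and resolves with the
// resulting "measure" performance entry.
function measureSync(label, fn) {
  return new Promise((resolve) => {
    const observer = new PerformanceObserver((list) => {
      const [entry] = list.getEntriesByName(label);
      if (entry) {
        observer.disconnect();
        resolve(entry);
      }
    });
    observer.observe({ entryTypes: ["measure"] });

    performance.mark(`${label}:start`);
    fn();
    performance.mark(`${label}:end`);
    performance.measure(label, `${label}:start`, `${label}:end`);
  });
}
```

The resolved entry has a name and a duration in milliseconds, so something like measureSync("parse", () => JSON.parse(bigString)).then((entry) => console.log(entry.duration)) would log how long the parse took (bigString being whatever input you want to test).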

B. Profiling

Flame graphs can be very useful, but they’re very intensive to collect so cannot be captured in production.

C. Trace Events

Enable with the --trace-event-categories flag or the trace_events module

node, node.async_hooks, v8 – enabled by default

To get the output of several events: --trace-events-enabled

Connect to the locally running application: chrome://tracing

Tracing has less overhead, but it can become tricky to work with as it exposes a lot of Node.js internals.
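
Tracing can also be switched on from code with the core trace_events module. The sketch below is a small wrapper of my own (withTracing is a hypothetical name) that enables a set of categories only while a function runs. As far as I understand, Node.js writes the collected events to a node_trace log file in the working directory.

```javascript
const traceEvents = require("trace_events");

// Hypothetical helper: enables the given trace categories while `fn` runs,
// then disables them again, even if `fn` throws.
function withTracing(categories, fn) {
  const tracing = traceEvents.createTracing({ categories });
  tracing.enable();
  try {
    return fn(tracing);
  } finally {
    tracing.disable();
  }
}
```

For example, withTracing(["node.perf"], () => doWork()) would capture perf_hooks measurements made by a doWork function of your own, without tracing the rest of the process lifetime.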

D. Heap Snapshot

A static snapshot of memory usage details at a point in time, giving a glimpse into V8 heap usage.

Useful for finding and fixing memory + performance issues in Node.js applications.

Built-in heap snapshot signal flag: --heapsnapshot-signal

Chrome DevTools allows you to compare snapshots.
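
Heap snapshots can also be taken from code via the core v8 module. This is a minimal sketch: captureHeapSnapshot is a hypothetical helper name, and writing to the OS temp directory is just a convenient default I picked.

```javascript
const os = require("os");
const path = require("path");
const v8 = require("v8");

// Hypothetical helper: writes a heap snapshot into `directory` and returns
// the path of the generated file.
function captureHeapSnapshot(directory = os.tmpdir()) {
  const file = path.join(
    directory,
    `heap-${process.pid}-${Date.now()}.heapsnapshot`
  );
  // Synchronous: this blocks the event loop while the snapshot is written.
  return v8.writeHeapSnapshot(file);
}
```

The resulting .heapsnapshot file can be loaded into the Memory tab of Chrome DevTools. Taking a snapshot is expensive in both time and memory, so this is something to trigger deliberately rather than on every request.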

E. The V8 Inspector

Chrome DevTools was integrated directly into Node.js a few years ago.

--inspect flag, listens by default on 127.0.0.1:9229

--inspect-brk for using the inspector with breakpoints

Go to chrome://inspect so you can connect DevTools to your Node.js application

Allows you to… edit code on-the-fly, diagnose problems quickly, access sourcemaps for transpiled code, LiveEdit, console evaluation, sampling JavaScript profiler with flame graph, heap snapshot inspection, async stacks for native promises.

Only suitable for development, not for production.

Problems with Node.js internals tools

Tells you there’s a problem, but not where.

Not easy to implement, not enough information.

Not presented in user-friendly way, data overload.

Significant overhead, not viable in production.

External Tools for Node.js Observability

A. Blocked Library

Available in Node.js 8+. Helps you check whether the event loop is blocked and provides a stack trace pointing to the blocking function. The blocked() function reports every value over the configured threshold.
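
To show the general idea behind this kind of detector, here is a core-only sketch based on timer drift. It is not the library's actual implementation or API: watchEventLoop and its parameters are names I made up, and unlike the library it does not produce a stack trace.

```javascript
// Hypothetical helper: calls onBlocked(lagMs) whenever a repeating timer fires
// more than `thresholdMs` later than scheduled, which suggests that something
// was hogging the event loop in the meantime.
function watchEventLoop(thresholdMs, onBlocked, intervalMs = 20) {
  let last = process.hrtime.bigint();
  const timer = setInterval(() => {
    const now = process.hrtime.bigint();
    // Time since the previous tick, minus the delay we asked for.
    const lagMs = Number(now - last) / 1e6 - intervalMs;
    last = now;
    if (lagMs > thresholdMs) {
      onBlocked(lagMs);
    }
  }, intervalMs);
  // Returns a function that stops the watcher.
  return () => clearInterval(timer);
}
```

A typical use would be const stop = watchEventLoop(50, (lag) => console.warn(`event loop blocked for ~${Math.round(lag)}ms`)), calling stop() on shutdown so the timer doesn’t keep the process alive.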

B. New Relic (hosted service)

Offers application performance monitoring (APM).

C. DataDog (hosted service)

Similar service to New Relic.

D. Instana (hosted service)

APM for microservices – trace every distributed request, map all service dependencies, profile every production process.

E. Dynatrace (hosted service)

Another APM, with a focus on "advanced observability".

F. Google Cloud Stackdriver

Another APM, for Google Cloud and Amazon Web Services.

Problems with APMs

They have to be integrated into your applications and can cause a significant amount of overhead.

Accuracy might be questionable as the APM modules themselves can have

N|Solid

Native C++ agent which runs alongside your application, doesn’t require integration with your application code, resulting in minimal overhead on application performance. [N|Solid is a product of NodeSource, the speaker’s employer]

Talk: Node.js startup performance

Speaker

Joyee Cheung (@JoyeeCheung)

One big takeaway

There is a tremendous amount of important work being done in the background by developers like Joyee who are working on the Node.js core. If you want to get a deeper understanding of what Node.js is doing under the hood, and why improvements to the Node.js core are so important, I thoroughly recommend that you watch Joyee’s talk.

Talk abstract

In this talk, we will break down how Node.js spends its time starting up from scratch, and look into recent changes in the Node.js code base that have improved the startup performance by using different tricks and various V8 APIs.

My notes

The journey of Node.js startup performance

Refactoring to avoid unnecessary work
Implement code caching
Integrating V8 startup snapshot

Used to take ~60ms on a modern server.

After optimisations ended, startup time on same server dropped to 21ms.

Between Node.js v10 and v15, startup time was reduced by 40 to 50%.

Overview of the Node.js bootstrap process

Around half of the Node.js core is written in JavaScript, the rest in C++.

Initialize the process e.g. processing command line flags, setting up signal handlers, creating default event loop etc. (C++)
Initialize V8 isolate (C++)
- v8::Isolate is instance of the v8 JavaScript engine
- Encapsulates the JS heap, microtask queue, pending exceptions…
Initialize V8 context (JavaScript)
- Sandboxed execution context
- Encapsulates JavaScript builtins (primordials) e.g. globalThis, Array, Object
- Node.js copies original JS built-ins at beginning of bootstrap for built-in modules to use.
- In Node.js userland JS executed in main V8 context by default, shares same context as one used by built-ins of Node.js
Initialize Node.js environment (JavaScript and C++)
- Initialize runtime-independent states (JavaScript)
- Initialize event loop (C++)
- Initialize V8 inspector (C++) – can only debug JavaScript once the inspector is initialized
- Initialize runtime dependent states (JavaScript)
Load main script (JavaScript)
- Execution from CLI (node index.js) – Create + initialize environment, select a main script → Load run_main_module.js, detect module type → Read and compile ${cwd}/index.js with CommonJS or ECMAScript module loader → Start event loop
- Execution for worker initialized by code in main thread – Create + initialize environment, select a main script → Load worker_thread.js, setup message port and start listening → Start event loop → Compile and run the script sent from the port
Start the event loop – will be kept running until nothing is keeping it open.
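
Some of these bootstrap milestones can be observed from userland. This wasn’t part of the talk as far as my notes go, but the perf_hooks module exposes performance.nodeTiming, which records timestamps (in milliseconds since the process started) for several of the phases above. The helper name bootstrapMilestones is my own.

```javascript
const { performance } = require("perf_hooks");

// Hypothetical helper: collects the startup timestamps that Node.js records
// for the current process.
function bootstrapMilestones() {
  const { nodeStart, v8Start, environment, bootstrapComplete, loopStart } =
    performance.nodeTiming;
  return { nodeStart, v8Start, environment, bootstrapComplete, loopStart };
}

console.table(bootstrapMilestones());
```

When this is called synchronously from the main script, loopStart is reported as -1 because the event loop hasn’t started yet, which lines up with the event loop being the last step in the list above.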

Refactoring

Lazy-load builtins that are not always used
- Lots of builtin modules depend on each other
- Caveat: more time would be spent loading them on demand later
- Can be reverted when startup snapshot covers these modules
Initialization of runtime states was cleanly separated as part of the refactoring work.

Code caching

This speeds up JS compilation.
Previously: parse and compile source code of JS native modules at Node.js run time and execute them to make them available as built-in modules.
Now: parse and compile source code of JS native modules at Node.js executable build time, then deserialize them from the Node.js executable in the Node.js process (run time), and execute them to make them available as built-in modules.
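
The same compile once, deserialize later idea is available to userland code through the vm module. The sketch below is only an analogy for what Node.js core does at build time, not the core implementation. The helper names buildCodeCache and loadWithCodeCache are hypothetical.

```javascript
const vm = require("vm");

// "Build time": compile the source once and serialize V8's code cache.
function buildCodeCache(source, filename) {
  const script = new vm.Script(source, { filename });
  return script.createCachedData(); // a Buffer that could be written to disk
}

// "Run time": hand the cache back to V8 so it can skip most of the
// parse/compile work, then execute the script.
function loadWithCodeCache(source, filename, cachedData) {
  const script = new vm.Script(source, { filename, cachedData });
  return {
    rejected: script.cachedDataRejected, // true if V8 refused the cache
    value: script.runInThisContext(),
  };
}
```

For example, building a cache for the source "(function add(a, b) { return a + b; })" and then loading it with loadWithCodeCache gives back the add function as value. V8 rejects a cache that doesn’t match the source or the V8 version, which is why the rejected flag is worth checking.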

Refactoring for snapshot integration

This was enabled by splitting runtime initialization into two separate phases (as mentioned earlier).
Before: At Node.js process run time: Array, Object, String etc. → Runs through initialization scripts → Initialize Primordials: process, URL, Buffer etc. → Node.js process
After: At Node.js executable build time: Array, Object, String etc. → Runs through initialization scripts → Initialize Primordials: process, URL, Buffer etc. → Serialize and compile into snapshot blob. At Node.js process run time, deserialize snapshot blob from executable.
Saves quite a lot of time at startup.

Ongoing work

During this refactoring work for Node.js, contributions were made to V8
Supporting more language features in the V8 snapshot
- JSMap and JSSet rehashing (previously disabled in Node.js v8)
- Class field initializers

Future work

Userland snapshotting

Take a snapshot of an application and write it to disk
Load from file system or build into an executable
Tracking issue: https://github.com/nodejs/node/issues/35711

Questions & Answers

What’s inside the startup snapshots?

Two types:

Isolate snapshots – e.g. V8 strings, numbers
Context snapshots – e.g. objects you create

What are runtime dependent states?

Runtime dependent states = things configured with command line flags or environment variables
