<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Amir Blum</title>
    <description>The latest articles on DEV Community by Amir Blum (@blumamir).</description>
    <link>https://dev.to/blumamir</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F589075%2F7f196809-c9ca-46ae-8d11-c1fcb2dfffb0.jpeg</url>
      <title>DEV Community: Amir Blum</title>
      <link>https://dev.to/blumamir</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/blumamir"/>
    <language>en</language>
    <item>
      <title>Aspecto OpenTelemetry Sampler for NodeJS</title>
      <dc:creator>Amir Blum</dc:creator>
      <pubDate>Mon, 19 Dec 2022 15:17:00 +0000</pubDate>
      <link>https://dev.to/aspecto/aspecto-opentelemetry-sampler-for-nodejs-n7</link>
      <guid>https://dev.to/aspecto/aspecto-opentelemetry-sampler-for-nodejs-n7</guid>
      <description>&lt;p&gt;At &lt;a href="https://www.aspecto.io/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=aspecto-sampler-nodejs"&gt;Aspecto&lt;/a&gt;, we enable users to configure advanced sampling rules (both head and tail sampling) in an easy-to-use UI with reach features and centralized management. It's a control plane for your trace sampling. &lt;/p&gt;

&lt;p&gt;The Aspecto Sampler for NodeJS is an implementation of an OpenTelemetry head sampler that lets you remotely configure all your head sampling needs from one UI with zero code changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Until today, head sampling capabilities were part of our OpenTelemetry distribution, which included a full OpenTelemetry SDK, instrumentations, resource detectors, payload collection, and more. Due to ongoing demand, we are introducing AspectoSampler. Now you can craft your own custom vendor-agnostic OpenTelemetry NodeJS SDK setup and plug the Aspecto sampler with just a few lines of code.&lt;/p&gt;

&lt;p&gt;Here is an example of setting up instrumentation for your TypeScript service from &lt;a href="https://opentelemetry.io/docs/instrumentation/js/getting-started/nodejs/#setup"&gt;the official OpenTelemetry docs&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/*tracing.ts*/
import * as opentelemetry from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
const sdk = new opentelemetry.NodeSDK({
  traceExporter: new opentelemetry.tracing.ConsoleSpanExporter(),
  instrumentations: [getNodeAutoInstrumentations()]
});
sdk.start()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your real-life setup will probably include more components, like resource detectors and an OTLP exporter to ship the telemetry data to an OpenTelemetry Collector or your chosen vendor (Aspecto, ahem ahem 👀).&lt;/p&gt;

&lt;p&gt;This simple installation of OpenTelemetry will record &lt;em&gt;all&lt;/em&gt; traces -- each operation going on inside your service. While collecting all spans is fine for POCs, production workloads usually create a firehose of costly and noisy data. &lt;a href="https://www.aspecto.io/product/opentelemetry-sampling/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=aspecto-sampler-nodejs"&gt;Sampling&lt;/a&gt; is a powerful tool in your OpenTelemetry toolbox to reduce your costs and focus on telemetry that is useful for you. Here is the same code with an Aspecto sampler integrated (just two lines of code):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/*tracing.ts*/
import * as opentelemetry from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
import { AspectoSampler } from "@aspecto/opentelemetry-sampler";
const sdk = new opentelemetry.NodeSDK({
  traceExporter: new opentelemetry.tracing.ConsoleSpanExporter(),
  instrumentations: [getNodeAutoInstrumentations()],
  sampler: new AspectoSampler(),
});
sdk.start()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is it! Just two lines of code that you will never need to touch again, and you unlocked the FULL power of sampling 🕺&lt;/p&gt;

&lt;p&gt;Now let's see how it works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Plain OpenTelemetry Sampling
&lt;/h2&gt;

&lt;p&gt;Once you realize you collect millions of boring and useless &lt;em&gt;health-check&lt;/em&gt; traces, or that one busy endpoint in your system floods your trace viewer and blows up your monthly bill, you ask yourself -- how do I get rid of them!? And the immediate answer is &lt;em&gt;&lt;a href="https://www.aspecto.io/blog/opentelemetry-sampling-everything-you-need-to-know/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=aspecto-sampler-nodejs"&gt;sampling&lt;/a&gt;&lt;/em&gt;. It is a widely used practice and the go-to solution for this problem.&lt;/p&gt;

&lt;p&gt;OpenTelemetry provides out-of-the-box samplers such as &lt;em&gt;TraceIdRatioBased&lt;/em&gt;. These are great if you want to blindly record a fixed percentage of your traces (e.g., sample 1% of them).&lt;/p&gt;

&lt;p&gt;It is a simple strategy to implement, and while it reduces the overall trace volume, it samples at random -- dropping critical traces, the ones you &lt;em&gt;always&lt;/em&gt; want to collect, along with the noise.&lt;/p&gt;
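
&lt;p&gt;To build intuition for what &lt;em&gt;TraceIdRatioBased&lt;/em&gt; does, here is a simplified sketch of a deterministic per-trace-ID decision. This is an illustration of the idea only -- the real sampler uses its own algorithm over the trace ID:&lt;/p&gt;

```typescript
// Illustrative sketch only: how a ratio-based, per-trace decision can work.
// The real TraceIdRatioBased sampler uses its own accumulator over the
// trace ID; this just demonstrates the principle.
function shouldSampleTrace(traceId: string, ratio: number): boolean {
  // Map the first 8 hex characters of the trace ID to a number in [0, 1).
  const value = parseInt(traceId.slice(0, 8), 16) / 0x100000000;
  // Same trace ID, same decision -- every service agrees without coordination.
  return ratio > value;
}

console.log(shouldSampleTrace("00000000aabbccdd00000000aabbccdd", 0.01)); // true
console.log(shouldSampleTrace("ffffffff00000000ffffffff00000000", 0.01)); // false
```

&lt;p&gt;Because the decision depends only on the trace ID, every service participating in a distributed trace reaches the same verdict independently.&lt;/p&gt;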

&lt;p&gt;After you realize you need more fine-grained control of your sampling, you can write your own custom sampler by implementing the Sampler interface. You might start with something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class MySampler implements Sampler {
  shouldSample(context: Context, traceId: string, spanName: string, spanKind: SpanKind, attributes: Attributes) {
      if (attributes[SemanticAttributes.HTTP_TARGET] === '/health-check') {
          return { decision: SamplingDecision.NOT_RECORD };
      }
      // fallback
      return { decision: SamplingDecision.RECORD_AND_SAMPLED };
  }
  toString() { return 'MySampler'; }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You then wrap it with the following to apply the root span's sampling decision to the entire trace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const sampler = new ParentBasedSampler({ root: new MySampler() });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then you test this code, create a PR, get it approved, and deploy it to your dozens of microservices. &lt;/p&gt;

&lt;p&gt;A few days later, your vendor's Billing dashboard already looks happier with fewer spans and less cost.&lt;/p&gt;

&lt;p&gt;Then you browse your Trace Search screen and realize that 95% of your remaining traces come from one heavily used endpoint that gets executed millions of times daily. You still want observability into it, but would ideally collect fewer spans than you do today. You wish you could sample just 1% of this endpoint.&lt;/p&gt;

&lt;p&gt;You might go back to your custom sampler and add something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { Sampler, ParentBasedSampler, TraceIdRatioBasedSampler } from '@opentelemetry/sdk-trace-base';
import { SemanticAttributes } from '@opentelemetry/semantic-conventions';
import { Context, SpanKind, Attributes, SamplingDecision } from '@opentelemetry/api';
class MySampler implements Sampler {
  private probabilitySampler = new TraceIdRatioBasedSampler(0.01);
  shouldSample(context: Context, traceId: string, spanName: string, spanKind: SpanKind, attributes: Attributes): SamplingResult {
      if (attributes[SemanticAttributes.HTTP_TARGET] === '/health-check') {
          return { decision: SamplingDecision.NOT_RECORD };
      }
      if (attributes[SemanticAttributes.HTTP_TARGET] === '/my/busy/endpoint') {
          return this.probabilitySampler.shouldSample(context, traceId);
      }
      // fallback
      return { decision: SamplingDecision.RECORD_AND_SAMPLED };
  }
  toString() { return 'MySampler'; }
}
const sampler = new ParentBasedSampler({ root: new MySampler() });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And again, create a PR, review, merge, deploy, test, and watch your bill drop and your trace viewer become less busy.&lt;/p&gt;

&lt;p&gt;At this point, you, your manager, and every other trace consumer in your company are happy, but that is not where the story ends. &lt;/p&gt;

&lt;p&gt;Now, maintaining this sampler becomes one of your tasks. You might need to implement it in multiple languages, add dozens of complex conditions, synchronize it across dozens of microservices and serverless functions, and debug it when it fails to do what you intended. Requests keep flowing: "I want to see more of this", "can you hide that?", and "why can't I find my trace??"&lt;/p&gt;

&lt;p&gt;One day, an incident in production hits your &lt;em&gt;/my/busy/endpoint&lt;/em&gt;. You say to yourself -- I wish I could see all traces for just a few hours until we understand what is going on. But modifying and deploying new versions is not your first priority while the incident keeps everyone stressed.&lt;/p&gt;

&lt;p&gt;If you deploy changes in sampling, you need to remember to go back and revert these changes after the incident is resolved.&lt;/p&gt;

&lt;p&gt;This problem does not scale well. Managing sampling for a real-life production environment in code can be a massive headache.&lt;/p&gt;

&lt;p&gt;Introducing Aspecto Sampler 👏🎉&lt;/p&gt;

&lt;h2&gt;
  
  
  Aspecto Sampler
&lt;/h2&gt;

&lt;p&gt;Aspecto has years of experience implementing OpenTelemetry in high-workload production environments. We have collected needs and use cases from our customers and iterated on them to crystallize a best-in-class managed sampling mechanism, so you can focus on your tasks and leave the hard work and technical implementation to us.&lt;/p&gt;

&lt;p&gt;The Aspecto Sampler abstracts away all the nitty-gritty details and implementation code and exposes a UI where you configure the logic through a simple interface. It is basically the sampler from above, but you do not need to write it yourself. Plus, it provides advanced features you can explore to customize your sampling to your specific needs.&lt;/p&gt;

&lt;p&gt;Let's review how it works and its features.&lt;/p&gt;

&lt;h3&gt;
  
  
  Remote Configuration
&lt;/h3&gt;

&lt;p&gt;All sampling configurations are centralized and managed on the Aspecto platform. You only need to install the sampler into your OpenTelemetry SDK once or use our OpenTelemetry distribution and never touch it again.&lt;/p&gt;

&lt;h4&gt;
  
  
  Up-to-date Configuration
&lt;/h4&gt;

&lt;p&gt;The Aspecto Sampler always fetches the latest configuration, so your sampling configuration stays up-to-date and consistent across all your services. You do not need to deploy any code or worry that some services are still running an outdated configuration.&lt;/p&gt;

&lt;h4&gt;
  
  
  Automatic Updates
&lt;/h4&gt;

&lt;p&gt;Your changes are immediately and automatically pushed to all your samplers in seconds. With one click, all your services are updated live. No need to restart, deploy or monitor anything.&lt;/p&gt;

&lt;h4&gt;
  
  
  No Code
&lt;/h4&gt;

&lt;p&gt;Adding, updating, or removing sampling rules does not require any code changes. It is all done in an easy-to-use UI and involves zero code. Anyone in your organization can manage it, not only developers who have mastered a specific programming language.&lt;/p&gt;

&lt;h4&gt;
  
  
  Instant
&lt;/h4&gt;

&lt;p&gt;Any change takes effect immediately. You can experiment and fine-tune your configurations and get fast feedback.&lt;/p&gt;

&lt;h3&gt;
  
  
  UI
&lt;/h3&gt;

&lt;p&gt;The entire sampling workflow is managed and configured in a dedicated UI in the Aspecto app, built to give you focused and easy yet powerful tools to achieve your sampling goals.&lt;/p&gt;

&lt;h4&gt;
  
  
  Multiple Languages
&lt;/h4&gt;

&lt;p&gt;One unified interface to customize your sampling for multiple programming languages. No need to master or jump between languages to implement a sampling policy in your organization.&lt;/p&gt;

&lt;h4&gt;
  
  
  Timer Rules
&lt;/h4&gt;

&lt;p&gt;If you are debugging something or working on a specific service or endpoint, it is convenient to "sample everything" while you need it, but you also do not want to forget about it and see your tracing costs inflate. You can add rules with a timer -- so you get instant visibility into the desired area and sleep well knowing that your tracing bill will not blow up.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kXMHCY1S--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.aspecto.io/wp-content/uploads/2022/12/Group-2422-1-1024x558.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kXMHCY1S--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.aspecto.io/wp-content/uploads/2022/12/Group-2422-1-1024x558.png" alt="Aspecto head sampling Timer Rules" width="880" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Turn Rules On and Off
&lt;/h4&gt;

&lt;p&gt;With a single click, you can turn a rule on and off, dynamically adapting sampling as you go and integrating sampling tools into your everyday workflow to find the right balance between costs and telemetry verbosity.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5XStnhb1--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.aspecto.io/wp-content/uploads/2022/12/head-sampling-Turn-Rules-On-and-Off-1-1024x230.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5XStnhb1--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.aspecto.io/wp-content/uploads/2022/12/head-sampling-Turn-Rules-On-and-Off-1-1024x230.png" alt="Aspecto head sampling Turn Rules On and Off" width="880" height="198"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Search Capabilities
&lt;/h4&gt;

&lt;p&gt;An OpenTelemetry implementation can easily accumulate many sampling rules. Search your rules with free text to quickly find what you need. You can sort them and check who added each rule and when.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rich Sampling Language
&lt;/h3&gt;

&lt;p&gt;Define your sampling strategy with rules, which are evaluated in order to derive a sampling decision for each new trace.&lt;/p&gt;

&lt;h4&gt;
  
  
  Conditions
&lt;/h4&gt;

&lt;p&gt;Use span attributes like HTTP path, HTTP method, messaging queue or topic name, or any custom span attribute to add conditions. Describe each one with a rich set of operators, such as HTTP path starts with "&lt;em&gt;/user&lt;/em&gt;", or even write regular expressions for parameterized routes like &lt;code&gt;*/account/:id/users*&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--feiUdzj0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.aspecto.io/wp-content/uploads/2022/12/head-sampling-http-path-1024x557.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--feiUdzj0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.aspecto.io/wp-content/uploads/2022/12/head-sampling-http-path-1024x557.png" alt="Aspecto head sampling Conditions" width="880" height="479"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Services and Environments
&lt;/h4&gt;

&lt;p&gt;Narrow a rule to affect only specific services or environments. For example, you can easily add a rule that only affects the users-service in the production environment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GDrHGA-D--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.aspecto.io/wp-content/uploads/2022/12/head-sampling-conditions-1024x560.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GDrHGA-D--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.aspecto.io/wp-content/uploads/2022/12/head-sampling-conditions-1024x560.png" alt="Aspecto head sampling Services and Environments" width="880" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Sampling Rates and Priority
&lt;/h4&gt;

&lt;p&gt;Arrange your rules by priority: place specific, important rules at the top and general fallback rules at the bottom. Each rule defines a sampling rate, which can be as low as 0% (record nothing).&lt;/p&gt;
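
&lt;p&gt;Ordered evaluation with a fallback at the bottom can be pictured with a small sketch. Everything here -- the rule shape, the attribute names, the helper function -- is hypothetical and only illustrates the concept, not Aspecto's actual rule model:&lt;/p&gt;

```typescript
// Hypothetical sketch of ordered rule evaluation (illustration only;
// the rule shape and names are invented, not Aspecto's actual model).
interface Rule {
  matches: (attributes: { [key: string]: string }) => boolean;
  sampleRate: number; // 0 records nothing, 1 records everything
}

function effectiveRate(rules: Rule[], attributes: { [key: string]: string }, fallback: number): number {
  // Rules are evaluated top to bottom; the first matching rule wins.
  for (const rule of rules) {
    if (rule.matches(attributes)) {
      return rule.sampleRate;
    }
  }
  return fallback; // the general fallback rule at the bottom
}

const rules: Rule[] = [
  { matches: (a) => a["http.target"] === "/health-check", sampleRate: 0 },
  { matches: (a) => a["http.target"] === "/my/busy/endpoint", sampleRate: 0.01 },
];

console.log(effectiveRate(rules, { "http.target": "/health-check" }, 1)); // 0
console.log(effectiveRate(rules, { "http.target": "/other" }, 1));        // 1
```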

&lt;h3&gt;
  
  
  Installation Instructions
&lt;/h3&gt;

&lt;p&gt;Visit &lt;a href="https://docs.aspecto.io/v1/send-tracing-data-to-aspecto/opentelemetry/nodejs/aspecto-sampler?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=aspecto-sampler-nodejs"&gt;our docs&lt;/a&gt; for complete instructions on installation and configuration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aspecto.io/v1/send-tracing-data-to-aspecto/opentelemetry/nodejs/aspecto-sampler?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=aspecto-sampler-nodejs"&gt;SAMPLER DOCS&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Tail Sampling
&lt;/h3&gt;

&lt;p&gt;The sampler above is a Head Sampler that applies sampling decisions in the SDK as spans are created. &lt;/p&gt;

&lt;p&gt;A popular alternative is to use the Aspecto Tail Sampler -- an OpenTelemetry Collector distribution that aggregates spans into traces and applies a sampling decision on an entire trace. Each option has pros and cons, and we encourage you to research and choose the option that best fits your needs. &lt;/p&gt;

&lt;p&gt;Not sure what to choose? &lt;a href="https://calendly.com/eran-18/60min"&gt;Schedule a call&lt;/a&gt; with our OpenTelemetry experts to assist in the process.&lt;/p&gt;

&lt;h3&gt;
  
  
  Feedback
&lt;/h3&gt;

&lt;p&gt;We are in a constant process of learning and improving and would love to hear from you. Do not hesitate to &lt;a href="https://www.aspecto.io/contact-us/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=aspecto-sampler-nodejs"&gt;contact us&lt;/a&gt; (also via our website chat) to share feedback, ask questions, or get support for our managed sampling product.&lt;/p&gt;

&lt;p&gt;Our OpenTelemetry experts are here for you.&lt;/p&gt;

&lt;h3&gt;
  
  
  Supported Languages
&lt;/h3&gt;

&lt;p&gt;Our managed sampler currently supports the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  NodeJS -- integrated with our OpenTelemetry distribution, or as a standalone sampler that you can integrate into your custom OpenTelemetry setup&lt;/li&gt;
&lt;li&gt;  Ruby -- integrated into our Ruby OpenTelemetry distribution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Support for more languages is coming soon.&lt;/p&gt;

&lt;p&gt;Our Tail Sampler can be used with any OpenTelemetry SDK and components that produce trace data.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Running Jaeger Locally: How to Get Started</title>
      <dc:creator>Amir Blum</dc:creator>
      <pubDate>Thu, 08 Jul 2021 14:33:22 +0000</pubDate>
      <link>https://dev.to/aspecto/running-jaeger-locally-how-to-get-started-51g3</link>
      <guid>https://dev.to/aspecto/running-jaeger-locally-how-to-get-started-51g3</guid>
      <description>&lt;p&gt;In this article, you’ll learn how to run Jaeger locally, why and when you should do it, and what Jaeger’s limitations are when running locally.&lt;/p&gt;

&lt;p&gt;Let’s start with the basics: a distributed tracing system is generally composed of client and backend components.&lt;/p&gt;

&lt;p&gt;I will touch briefly on client components, though most of this post is about backend components.&lt;/p&gt;

&lt;h2&gt;
  
  
  Client Components
&lt;/h2&gt;

&lt;p&gt;The client part is usually a set of libraries installed inside an application which “instrument” it — generating a “span” object for each interesting event happening in runtime inside the service.&lt;/p&gt;

&lt;p&gt;A modern and recommended open-source client SDK that does that is &lt;a href="https://opentelemetry.io/" rel="noopener noreferrer"&gt;OpenTelemetry&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Spans on the client alone are meaningless — they need to be accessible to a person who consumes them. Consumers are usually dev-ops teams monitoring a system or developers maintaining the system and adding new features.&lt;/p&gt;

&lt;h3&gt;
  
  
  Trace Usages
&lt;/h3&gt;

&lt;p&gt;There are many ways in which collected trace data can be used and provide value. These are the common ones:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Aggregate spans to trace — group all spans (events) which are part of the same trace (logical operation) arriving from different distributed services, into a single entity&lt;/li&gt;
&lt;li&gt;Query the collected data (show me all traces in the last hour starting at endpoint &lt;code&gt;GET /users&lt;/code&gt; in service X)&lt;/li&gt;
&lt;li&gt;Visualize the data — usually in a graph, or timeline&lt;/li&gt;
&lt;li&gt;Find errors (exceptions, 500s, etc) and investigate their root-cause&lt;/li&gt;
&lt;li&gt;Investigate performance bottlenecks&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Backend Components
&lt;/h2&gt;

&lt;p&gt;To fulfill the requirements above, we need to set up backend components.&lt;/p&gt;

&lt;p&gt;They collect spans from the client components, process them, store them in a database, expose an API for the data, and provide a UI for viewing traces and performing queries.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.jaegertracing.io/" rel="noopener noreferrer"&gt;Jaeger&lt;/a&gt; (a &lt;a href="https://www.cncf.io/" rel="noopener noreferrer"&gt;CNCF&lt;/a&gt; graduated project) is a popular open-source project with backend components that does that and is easy to set up.&lt;/p&gt;

&lt;p&gt;To use Jaeger in production, it is recommended to install it in a cloud environment with load-balancing, auto-scaling, replications, and all that jazz.&lt;/p&gt;

&lt;p&gt;However, it is sometimes enough to just run it locally in a lightweight and simple setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running Locally
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.jaegertracing.io/docs/1.23/getting-started/" rel="noopener noreferrer"&gt;The recommended&lt;/a&gt; approach for running Jaeger backend locally is to use docker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ docker run -d --name jaeger \
  -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 14250:14250 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.23
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And access the UI at &lt;a href="http://localhost:16686" rel="noopener noreferrer"&gt;http://localhost:16686&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You can then configure an OpenTelemetry client SDK installation or the &lt;a href="https://github.com/open-telemetry/opentelemetry-collector" rel="noopener noreferrer"&gt;OpenTelemetry Collector&lt;/a&gt; to use the Jaeger exporter and send trace data to this local Jaeger.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { NodeTracerProvider } from "@opentelemetry/node";
import { SimpleSpanProcessor } from "@opentelemetry/tracing";
import { JaegerExporter } from "@opentelemetry/exporter-jaeger";

const provider = new NodeTracerProvider();
provider.addSpanProcessor(new SimpleSpanProcessor(new JaegerExporter()));
provider.register();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;(This is what exporting data to a local Jaeger looks like in Node.js.)&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Running Local Jaeger: Benefits
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Faster Debugging
&lt;/h3&gt;

&lt;p&gt;If you work on a service codebase (e.g., fixing a bug, developing a new feature, or implementing an integration with other services, databases, or messaging systems), this is most likely what you do:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You start an instance of the service on your local dev station&lt;/li&gt;
&lt;li&gt;Send traffic to it to test your changes&lt;/li&gt;
&lt;li&gt;Validate the behavior you were expecting&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By instrumenting it locally, you can debug development issues faster.&lt;/p&gt;

&lt;p&gt;For example: find the point in your app where an error occurred with less console logging, fewer breakpoints, etc.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7mbq1ax2y0mfkt887g2t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7mbq1ax2y0mfkt887g2t.png" alt="An example of a trace in Jaeger showing error while accessing Redis key as a list"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;An example of a trace in Jaeger showing error while accessing Redis key as a list (the red underlines are not part of the Jaeger UI)&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Running Tests
&lt;/h3&gt;

&lt;p&gt;When running your integration test suite locally — if a test fails, it can sometimes be easier to understand what went wrong by examining it in Jaeger UI, where you can view highlighted errors and events organized into a hierarchical structure.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Instrumentation Development
&lt;/h3&gt;

&lt;p&gt;If you are writing a new instrumentation library, observing the tracing output in a UI can be much easier than browsing through textual logs. You can browse the &lt;a href="https://github.com/open-telemetry/opentelemetry-js/tree/main/packages/opentelemetry-instrumentation" rel="noopener noreferrer"&gt;JS instrumentation package&lt;/a&gt; for more info and examples.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running Local Jaeger: Limitations
&lt;/h2&gt;

&lt;p&gt;Jaeger is free, relatively easy to set up, and will do a good job for most basic setups and tracing needs.&lt;/p&gt;

&lt;p&gt;The UI and feature set are quite basic, and you may quickly find yourself in need of &lt;a href="https://www.aspecto.io/compare-jaeger-aspecto/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=why-and-how-to-run-jaeger-locally"&gt;more advanced features&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.aspecto.io/compare-jaeger-aspecto/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=why-and-how-to-run-jaeger-locally"&gt;Other alternatives&lt;/a&gt; can give value and increase development productivity in the following cases:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Lack of End to End Visibility in Async Messaging
&lt;/h3&gt;

&lt;p&gt;When using async messaging systems, generally, there are two cases for traces.&lt;/p&gt;

&lt;p&gt;The first one is when the flow through the message broker generates a single trace. Jaeger does a great job of displaying that one trace.&lt;/p&gt;

&lt;p&gt;The second case, common in batch processing scenarios, is when the senders and receivers in message brokers like Kafka and AWS SQS generate multiple traces (for example, each receive starts a new trace). In this case, Jaeger displays these traces separately, which makes it more complicated to track and debug complex transactions.&lt;/p&gt;

&lt;p&gt;More advanced backends might have an out-of-the-box solution for that and will &lt;a href="https://www.aspecto.io/product/production-troubleshooting/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=why-and-how-to-run-jaeger-locally"&gt;detect it and merge those traces into a single logical flow&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Cross Environment Traces
&lt;/h3&gt;

&lt;p&gt;If your organization works with Jaeger in production and, let’s say, you want to use Jaeger for your tests, sending your local traces to the production Jaeger is highly suboptimal.&lt;/p&gt;

&lt;p&gt;Not only can it pollute the production environment, but it makes it difficult to find your traces within this trace jungle. By running Jaeger locally, you get an isolated playground for your tests and development.&lt;/p&gt;

&lt;p&gt;However, one major pitfall in this scenario is that your local Jaeger will show only the part of the trace generated from your local dev station. It means you lose the context of how your traces communicate and affect downstream and upstream services (i.e., production and staging).&lt;/p&gt;

&lt;p&gt;In this case, &lt;em&gt;you won’t have an isolated and local development session while seeing the full effect of your changes&lt;/em&gt; across the different environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Advanced Search
&lt;/h3&gt;

&lt;p&gt;Free-text search across all data or based on trace attributes: for example, searching for a token in the payload of specific traces.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Trace Data Processing and Insights
&lt;/h3&gt;

&lt;p&gt;Jaeger presents raw traces and highlights errors; however, generating insights based on that data alone is not trivial.&lt;/p&gt;

&lt;p&gt;Examples of such insights include API breaking-change detection, aggregation of traces based on structure, a parameter’s journey through a trace, dependency analysis, comparison to a baseline from production or staging, etc.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Enhanced UI
&lt;/h3&gt;

&lt;p&gt;Jaeger UI enumerates all attributes of a span in one long list. It does not group or organize related data, show JSON content as a tree, highlight common data like HTTP status codes, or do other good stuff that makes our lives easier.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Running Jaeger locally offers great benefits. It is easy to set up, and when it comes to faster debugging and running tests, it gives you that extra confidence you want when working locally.&lt;/p&gt;

&lt;p&gt;Jaeger is a great tool and does an awesome job answering your basic tracing needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;However, when your work with services gets a bit more complex, or when you want to take your productivity to the next level when working locally, you might want to &lt;a href="https://www.aspecto.io/compare-jaeger-aspecto/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=why-and-how-to-run-jaeger-locally"&gt;consider other alternatives&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If the limitations I mentioned above are a deal-breaker for you (if not now, they might be in the future), there are several vendors in the market supplying various solutions that enhance your tracing-based workflow. One that can help you overcome all these issues is Aspecto.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.aspecto.io/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=why-and-how-to-run-jaeger-locally"&gt;Aspecto&lt;/a&gt; gives you everything you get with Jaeger but with enhanced UI, search, and troubleshooting capabilities for &lt;a href="https://www.aspecto.io/product/local-debugging/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=why-and-how-to-run-jaeger-locally"&gt;local development and debugging&lt;/a&gt;. It takes 2 minutes to &lt;a href="(https://www.aspecto.io/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=why-and-how-to-run-jaeger-locally)"&gt;get started&lt;/a&gt; with, it’s free and OpenTelemetry-based. Think of it as Jaeger and Chrome DevTools fusion for your distributed applications.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>microservices</category>
      <category>opensource</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Is Protobuf.js Faster Than JSON?</title>
      <dc:creator>Amir Blum</dc:creator>
      <pubDate>Wed, 21 Apr 2021 13:59:20 +0000</pubDate>
      <link>https://dev.to/aspecto/is-protobuf-js-faster-than-json-ed4</link>
      <guid>https://dev.to/aspecto/is-protobuf-js-faster-than-json-ed4</guid>
      <description>&lt;p&gt;When you have structured data in JavaScript, which needs to be sent over the network (for another microservice, for example) or saved into a storage system, it first needs to be serialized.&lt;/p&gt;

&lt;p&gt;The serialization process converts the data object you have in the JavaScript program memory into a buffer of bytes, which then can be deserialized back into a JavaScript object.&lt;/p&gt;

&lt;p&gt;Two popular serialization methods are JSON and Google Protocol Buffers (Protobuf).&lt;/p&gt;

&lt;h2&gt;
  
  
  JSON
&lt;/h2&gt;

&lt;p&gt;Serializing data to JSON is as easy as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const data = { name: 'foo', age: 30 };
const serialized = JSON.stringify(data); // produce: '{"name":"foo","age":30}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
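&lt;p&gt;Deserializing reverses the process with &lt;code&gt;JSON.parse&lt;/code&gt; (a small round-trip sketch, not from the original post):&lt;/p&gt;

```javascript
// Round trip: serialize to a JSON string, then parse it back
// into an equivalent JavaScript object.
const data = { name: 'foo', age: 30 };
const serialized = JSON.stringify(data);   // '{"name":"foo","age":30}'
const deserialized = JSON.parse(serialized);
console.log(deserialized.age); // 30
```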



&lt;h2&gt;
  
  
  Protobuf.js
&lt;/h2&gt;

&lt;p&gt;Google Protocol Buffers is a method of serializing structured data based on a schema (written in a .proto file).&lt;/p&gt;

&lt;p&gt;Here is how to serialize the previous payload to Protobuf with the protobufjs package. First, define the schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;syntax = "proto3";
message Message {
    string name = 1;
    uint32 age = 2;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const protobuf = require("protobufjs");

protobuf.load("message.proto", (err, root) =&amp;gt; {
    if (err)
        throw err;

    const Message = root.lookupType("Message");
    const data = { name: 'foo', age: 30 };
    const errMsg = Message.verify(data);
    if (errMsg)
        throw Error(errMsg);

    const serialized = Message.encode(data).finish(); // produce: &amp;lt;Buffer 0a 03 66 6f 6f 10 1e&amp;gt;
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that the generated output is only 7 bytes long, much smaller than the 23 bytes we got with JSON serialization.&lt;/p&gt;

&lt;p&gt;Protobuf can serialize data so compactly mainly because it does not need to embed the field names as text in the data, possibly many times (“name” and “age” in this example are replaced by short descriptors of 2 bytes).&lt;/p&gt;
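&lt;p&gt;You can verify the JSON size yourself (a small sketch, not part of the original post; the 7-byte protobuf output is the one shown in the encode example above):&lt;/p&gt;

```javascript
// Measure the serialized JSON size of the example payload.
const data = { name: 'foo', age: 30 };
const json = JSON.stringify(data); // '{"name":"foo","age":30}'
const jsonBytes = Buffer.byteLength(json); // 23
console.log(`JSON: ${jsonBytes} bytes`);
// The protobuf encoding of the same message (0a 03 66 6f 6f 10 1e)
// is only 7 bytes: short field tags replace the "name"/"age" key strings.
```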

&lt;h2&gt;
  
  
  Picking the Right Format
&lt;/h2&gt;

&lt;p&gt;Choosing the correct serialization format that works best for you is a task that involves multiple factors.&lt;/p&gt;

&lt;p&gt;JSON is usually easier to debug (the serialized format is human-readable) and easier to work with (no need to define message types, compile them, install additional libraries, etc.).&lt;/p&gt;

&lt;p&gt;Protobuf, on the other hand, usually compresses data better and has built-in protocol documentation via the schema.&lt;/p&gt;

&lt;p&gt;Another major factor is CPU performance: the time it takes for the library to serialize and deserialize a message. In this post, we want to compare just the performance in JavaScript.&lt;/p&gt;

&lt;p&gt;You might eventually choose a format that is less performant but delivers value in other areas. But if performance is a big concern for you, keep reading.&lt;/p&gt;

&lt;h2&gt;
  
  
  Encode Performance
&lt;/h2&gt;

&lt;p&gt;At Aspecto, we wrote an &lt;a href="https://docs.aspecto.io/v1/?utm_source=dev.to&amp;amp;utm_medium=post&amp;amp;utm_campaign=json-vs-protobuf"&gt;SDK&lt;/a&gt; that collects trace events and exports them to an OpenTelemetry collector. &lt;/p&gt;

&lt;p&gt;The data is formatted as JSON and sent over HTTP. &lt;/p&gt;

&lt;p&gt;The exporter and collector can also communicate in protobuf using the protobufjs library.&lt;/p&gt;

&lt;p&gt;Since the protobuf format is so compact, one might assume that encoding to protobuf requires less CPU (measured as the number of encode/decode operations per second).&lt;/p&gt;

&lt;p&gt;A quick Google search on the topic seems to strengthen this thesis.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/protobufjs/protobuf.js#performance"&gt;Performance Section in the protobufjs documentation&lt;/a&gt; led us to switch our SDK exporter from JSON to protobuf payloads, expecting better performance.&lt;/p&gt;
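&lt;p&gt;For intuition, ops/sec can be estimated even without a benchmark library. The following is a minimal sketch (not the actual protobufjs benchmark; the numbers it prints are machine-dependent and illustrative only):&lt;/p&gt;

```javascript
// Count how many times a function runs within a fixed time window
// and normalize to operations per second.
const data = { name: 'foo', age: 30 };

function opsPerSec(fn, durationMs = 200) {
  let ops = 0;
  const end = Date.now() + durationMs;
  while (Date.now() < end) {
    fn();
    ops++;
  }
  return ops / (durationMs / 1000);
}

const jsonOps = opsPerSec(() => JSON.stringify(data));
console.log(`JSON.stringify: ~${Math.round(jsonOps)} ops/sec`);
```

Real benchmark libraries (like the one protobufjs uses) also handle warm-up and statistical sampling, which this sketch omits.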

&lt;h2&gt;
  
  
  Actual Performance
&lt;/h2&gt;

&lt;p&gt;After changing from JSON serialization to protobuf serialization, we ran our SDK benchmark.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;To our surprise, the performance decreased.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That observation, which we first believed to be a mistake, led us to investigate the issue further.&lt;/p&gt;

&lt;h3&gt;
  
  
  Benchmark — baseline
&lt;/h3&gt;

&lt;p&gt;We first ran the original &lt;a href="https://github.com/protobufjs/protobuf.js/tree/master/bench"&gt;benchmark of the protobufjs library&lt;/a&gt; to get a solid starting point. Indeed, we got results similar to those in the library README:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;benchmarking encoding performance ...

protobuf.js (reflect) x 724,119 ops/sec ±0.69% (89 runs sampled)
protobuf.js (static) x 755,818 ops/sec ±0.63% (90 runs sampled)
JSON (string) x 499,217 ops/sec ±4.02% (89 runs sampled)
JSON (buffer) x 394,685 ops/sec ±1.75% (88 runs sampled)
google-protobuf x 376,625 ops/sec ±1.05% (89 runs sampled)


   protobuf.js (static) was fastest
  protobuf.js (reflect) was 4.2% ops/sec slower (factor 1.0)
          JSON (string) was 36.1% ops/sec slower (factor 1.6)
          JSON (buffer) was 48.4% ops/sec slower (factor 1.9)
        google-protobuf was 50.4% ops/sec slower (factor 2.0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These results show that protobuf.js performs better than JSON, contrary to our earlier observation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Benchmark — telemetry data
&lt;/h3&gt;

&lt;p&gt;We then modified the benchmark to encode our example data, which is &lt;a href="https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/trace/v1/trace.proto"&gt;OpenTelemetry trace data&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We copied the proto files and data to the benchmark and got the following results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;benchmarking encoding performance ...

protobuf.js (reflect) x 37,357 ops/sec ±0.83% (93 runs sampled)
JSON (string) x 52,952 ops/sec ±2.63% (89 runs sampled)
JSON (buffer) x 45,817 ops/sec ±1.80% (89 runs sampled)

          JSON (string) was fastest
          JSON (buffer) was 12.8% ops/sec slower (factor 1.1)
  protobuf.js (reflect) was 28.2% ops/sec slower (factor 1.4)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These were the results we expected — for this data, protobuf was actually slower than JSON.&lt;/p&gt;

&lt;h3&gt;
  
  
  Benchmark — strings
&lt;/h3&gt;

&lt;p&gt;We got different results for two different data schemas.&lt;/p&gt;

&lt;p&gt;In one, protobufjs was faster; in the second, JSON was faster.&lt;/p&gt;

&lt;p&gt;Looking at the schemas, the immediate suspect was the number of strings.&lt;/p&gt;

&lt;p&gt;Our schemas were composed almost entirely of strings, so we created a third test, populating a simple schema with a very large number of strings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;syntax = "proto3";
message TestStringArray {
    repeated string stringArray = 1;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We ran the benchmark with this payload (10,000 strings, of length 10 each).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;var payload   = {
    stringArray: Array(10000).fill('0123456789')
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the results proved our suspicion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;benchmarking encoding performance ...

protobuf.js (reflect) x 866 ops/sec ±0.68% (92 runs sampled)
JSON (string) x 2,411 ops/sec ±0.91% (94 runs sampled)
JSON (buffer) x 1,928 ops/sec ±0.85% (94 runs sampled)

          JSON (string) was fastest
          JSON (buffer) was 20.0% ops/sec slower (factor 1.2)
  protobuf.js (reflect) was 64.0% ops/sec slower (factor 2.8)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When your data is composed of many strings, protobuf performance in JavaScript drops below that of JSON.&lt;/p&gt;

&lt;p&gt;This is likely related to the JSON.stringify function being implemented in C++ inside the V8 engine and highly optimized, compared to the pure-JavaScript implementation of protobufjs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Decoding
&lt;/h3&gt;

&lt;p&gt;The benchmarks above are for encoding (serializing). The benchmark results for decoding (deserializing) are similar.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;If you have the time, our recommendation is to profile your common data, understand the expected performance of each option, and choose the format that works best for your needs.&lt;/p&gt;

&lt;p&gt;It is essential to be aware that protobuf is not necessarily the fastest option. &lt;/p&gt;

&lt;p&gt;If your data is composed mainly of strings, then the JSON format might be a good choice.&lt;/p&gt;

</description>
      <category>tutorial</category>
      <category>javascript</category>
      <category>performance</category>
      <category>testing</category>
    </item>
    <item>
      <title>OpenTelemetry KafkaJS Instrumentation for Node.js</title>
      <dc:creator>Amir Blum</dc:creator>
      <pubDate>Wed, 10 Mar 2021 15:23:43 +0000</pubDate>
      <link>https://dev.to/aspecto/opentelemetry-kafkajs-instrumentation-for-node-js-423g</link>
      <guid>https://dev.to/aspecto/opentelemetry-kafkajs-instrumentation-for-node-js-423g</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR — Our JS OpenTelemetry plugin for &lt;code&gt;kafkajs&lt;/code&gt;, is available &lt;a href="https://github.com/aspecto-io/opentelemetry-ext-js/tree/master/packages/instrumentation-kafkajs" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This article describes our plugin for the &lt;code&gt;kafkajs&lt;/code&gt; package and the thought process behind it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Some Background
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://opentelemetry.io/" rel="noopener noreferrer"&gt;OpenTelemetry&lt;/a&gt; is a CNCF project, which, among other things, enables the collection of distributed traces. &lt;/p&gt;

&lt;p&gt;At &lt;a href="https://bit.ly/2OgbrGV" rel="noopener noreferrer"&gt;Aspecto&lt;/a&gt;, we use OpenTelemetry at the core of our product.&lt;/p&gt;

&lt;p&gt;While implementing it in our backend, we found a few plugins that were missing, especially when dealing with asynchronous communication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One of them was KafkaJS.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We took this opportunity to give back to the community and developed it ourselves. &lt;/p&gt;

&lt;h3&gt;
  
  
  The Plugin
&lt;/h3&gt;

&lt;p&gt;This plugin lets you track all Kafka interactions in your collected traces, which means you get a more comprehensive view of how your application behaves when using Kafka as a message broker.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;kafkajs&lt;/code&gt; plugin captures &lt;code&gt;producer&lt;/code&gt; and &lt;code&gt;consumer&lt;/code&gt; operations and creates spans according to &lt;a href="https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/messaging.md" rel="noopener noreferrer"&gt;the semantic conventions for Messaging Systems&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Each message being produced and consumed is represented by a span with attributes such as &lt;code&gt;messaging.destination&lt;/code&gt; (the topic name).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Context is propagated from producers to consumers. When a message is sent to Kafka, the trace will reveal which services consume it and what other cascading operations happen down the pipe.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Batch operations can aggregate multiple messages into a single batch and receive/process them together. This is handled in the plugin according to the specification.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The plugin can be extended with hooks, which enable users to run custom logic for adding span attributes depending on the Kafka message.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
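&lt;p&gt;For completeness, wiring the plugin into an OpenTelemetry Node.js setup looked roughly like this at the time of writing. This is only a sketch: the package name, the &lt;code&gt;KafkaJsInstrumentation&lt;/code&gt; class, and the &lt;code&gt;producerHook&lt;/code&gt; option are taken from the repository linked above and may differ between versions.&lt;/p&gt;

```javascript
// Sketch: registering the kafkajs instrumentation with a tracer provider.
// Package/class names come from aspecto-io/opentelemetry-ext-js; verify
// them against the version you install.
const { NodeTracerProvider } = require('@opentelemetry/node');
const { registerInstrumentations } = require('@opentelemetry/instrumentation');
const { KafkaJsInstrumentation } = require('opentelemetry-instrumentation-kafkajs');

const provider = new NodeTracerProvider();
provider.register();

registerInstrumentations({
  tracerProvider: provider,
  instrumentations: [
    new KafkaJsInstrumentation({
      // Optional hook to add custom span attributes per produced message
      // (the hook name and signature are assumptions; check the README).
      producerHook: (span, topic, message) => {
        span.setAttribute('app.message.size', Buffer.byteLength(message.value));
      },
    }),
  ],
});
```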

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwe6g24vells4l5sfj230.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwe6g24vells4l5sfj230.png" alt="simple-trace-example-in-Jaeger-view-Aspecto-opentelemetry-kafkajs-plugin-1024x270.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above screenshot shows an example of a producer application named &lt;code&gt;kafka-producer&lt;/code&gt; that exposes an HTTP endpoint (first line), routes it in Express (second line), and produces two messages to a Kafka topic named &lt;code&gt;test&lt;/code&gt;, which are then consumed by another application called &lt;code&gt;kafka-consumer&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;As mentioned above, kafkajs was one of the missing plugins we found, and as you read this, we are working on adding more.&lt;/p&gt;

&lt;p&gt;Feel free to &lt;a href="https://www.aspecto.io/contact-us/?utm_source=dev-to&amp;amp;utm_medium=post&amp;amp;utm_campaign=kafka-plugin" rel="noopener noreferrer"&gt;reach out to us&lt;/a&gt; with any questions, as we are very much invested in OpenTelemetry and the OpenTelemetry community.&lt;/p&gt;

</description>
      <category>distributedsystems</category>
      <category>kafka</category>
      <category>opentelemetry</category>
      <category>node</category>
    </item>
    <item>
      <title>How to Reduce RAM Consumption by 6x When Using ts-node</title>
      <dc:creator>Amir Blum</dc:creator>
      <pubDate>Wed, 03 Mar 2021 17:38:44 +0000</pubDate>
      <link>https://dev.to/aspecto/how-to-reduce-ram-consumption-by-6x-when-using-ts-node-4d8p</link>
      <guid>https://dev.to/aspecto/how-to-reduce-ram-consumption-by-6x-when-using-ts-node-4d8p</guid>
      <description>&lt;p&gt;It turns out that running &lt;code&gt;ts-node-dev / ts-node&lt;/code&gt; constantly consumes hundreds of megabytes of RAM, even for small and simple applications.&lt;/p&gt;

&lt;p&gt;In development, this is usually not a big concern. However, it can be if your application runs inside a Docker container with limited resources (for example, with Docker Desktop on Mac, which by default allocates only 2GB of RAM to all containers in total).&lt;/p&gt;

&lt;p&gt;TypeScript code must be transpiled to JavaScript, which can be done either before running the process (tsc) or at runtime (ts-node).&lt;/p&gt;

&lt;p&gt;The most efficient way is transpiling before running; however, this is not as developer-friendly since it takes forever. &lt;code&gt;ts-node-dev&lt;/code&gt; loads everything into memory, watches the changes the developer makes, and quickly retranspiles the project on every change.&lt;/p&gt;

&lt;p&gt;We encountered the issue while building a demo application to showcase our product at &lt;a href="https://www.aspecto.io/?utm_source=dev_to&amp;amp;utm_medium=post&amp;amp;utm_campaign=ts-node-ram-consumption"&gt;Aspecto&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We were running multiple typescript services with docker-compose and started seeing arbitrary &lt;code&gt;ts-node-dev&lt;/code&gt; processes exiting without even running the application, displaying the message “Done in 79.06s”.&lt;/p&gt;

&lt;p&gt;This was due to a lack of memory. Each TypeScript service was using ~600MB of RAM out of the 2GB available for all containers.&lt;/p&gt;

&lt;p&gt;After digging a bit, we found a few possible solutions and wanted to share them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Run ts-node-dev with option &lt;code&gt;--transpile-only&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;In our case, adding the &lt;code&gt;--transpile-only&lt;/code&gt; option to ts-node-dev reduced RAM consumption from ~600MB to ~170MB.&lt;/p&gt;

&lt;p&gt;The price is that the TypeScript code is only transpiled, and typechecking is skipped. Most modern IDEs (VS Code, WebStorm) have built-in TypeScript IntelliSense that highlights errors, so for us, it was a fair price to pay.&lt;/p&gt;

&lt;p&gt;If you use &lt;code&gt;ts-node&lt;/code&gt; to run code in production that was already successfully compiled and tested in the CI, you can only benefit from setting this option.&lt;/p&gt;
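&lt;p&gt;For reference, enabling the option is a one-flag change in package.json (a sketch; the &lt;code&gt;src/index.ts&lt;/code&gt; entry point and the &lt;code&gt;--respawn&lt;/code&gt; flag are assumptions to adjust for your project):&lt;/p&gt;

```json
{
  "scripts": {
    "dev": "ts-node-dev --transpile-only --respawn src/index.ts"
  }
}
```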

&lt;h2&gt;
  
  
  Compile the code with tsc and monitor file changes with nodemon
&lt;/h2&gt;

&lt;p&gt;Instead of using &lt;code&gt;ts-node-dev&lt;/code&gt;, which consumes a lot of memory, it is possible to compile the application directly with &lt;code&gt;tsc&lt;/code&gt; and then run it from dist/build like this: &lt;code&gt;node dist/index.js&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For automatic reload on source file changes, nodemon / node-dev can be used.&lt;/p&gt;

&lt;p&gt;This is our “start” script in package.json:&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"scripts": {
  "start": "nodemon --watch src -e ts --exec \"(tsc &amp;amp;&amp;amp; node dist/index.js) || exit 1\""
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach reduced the RAM usage of our service from ~600MB to ~95MB (though there was still a spike to 600MB for a few seconds while &lt;code&gt;tsc&lt;/code&gt; was compiling).&lt;/p&gt;

&lt;p&gt;Unlike the previous option, this approach does check for TypeScript errors and warnings, and the service does not start if the code contains errors.&lt;/p&gt;

&lt;p&gt;The price to pay here is a longer compilation time. In our setup, it’s about 10 seconds from saving the file until the service restarts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Increase Docker desktop available RAM
&lt;/h2&gt;

&lt;p&gt;This is the easiest fix. Just allocate more memory to Docker Desktop by going to Preferences =&amp;gt; Resources =&amp;gt; Memory and increasing the value.&lt;/p&gt;

&lt;p&gt;While it fixes the immediate problem, the containers still consume a lot of memory, and if you have many of them, it might become a problem soon enough.&lt;/p&gt;

&lt;p&gt;In addition, every user who wants to run the system with docker-compose must change this default configuration, which adds complexity to installation and usage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;If memory consumption is not an issue for you, just use &lt;code&gt;ts-node&lt;/code&gt; in production and &lt;code&gt;ts-node-dev&lt;/code&gt; in development.&lt;/p&gt;

&lt;p&gt;However, if you do care about memory, you have a tradeoff: a fast restart after each modification, with typechecking done only in the IDE (set &lt;code&gt;--transpile-only&lt;/code&gt;), or a slower restart on each modification, with full typechecking at compile time (use &lt;code&gt;tsc&lt;/code&gt; directly with &lt;code&gt;nodemon&lt;/code&gt; / &lt;code&gt;node-dev&lt;/code&gt;).&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>microservices</category>
      <category>node</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
