Fernando Doglio

Logging at Scale Done Right

How to avoid siloed logs in your distributed Node.js platform

Originally published here: https://blog.bitsrc.io/logging-at-scale-done-right-714896554d94

Distributed platforms are fantastic for solving a lot of problems, such as scaling, high availability, and even the maintainability of a big code base.

But for all the great benefits they provide, they also come with some added baggage you need to take into account when working on one. In this article, I want to cover one of them: distributed logging.

Logging is easy when you’re doing it locally for a single service, but when you start to span tens or even hundreds of them in parallel, things start getting a little crazy.

What can go wrong with your logs?

Moving from a single-instance type of application to a microservices-based platform can be quite a project in and of itself.

Specifically, when it comes to logging, a few things can go wrong:

  1. Fragmented truth: this is the most obvious and most common problem: your log files are saved locally on each server, so whenever you need to check what happened, you only get part of the story. In order to fully understand what is going on in your entire platform, you’d need to manually collect all the log files, merge them, and study them together.

  2. Missing context: another side effect of not taking the big picture into consideration while writing your logging code is that you’re only focusing on a single process. You might fail to log things like the IP or name of the server running your service, or how many copies were active at any given time. Context is everything when there are multiple moving pieces, not so much when there is only one (see the sketch after this list for one way to attach it).

  3. Running out of storage space: logs aren’t something you look at all the time, unless you’re running some sort of mission-critical service. So logs stored locally will eventually fill whatever storage you assign to them. And even if you rotate them (with something like logrotate), spikes in activity can cause data loss due to the fast increase in size.
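To make that second point concrete, here’s a minimal sketch (my own addition, using the Winston library we’ll cover in detail later in this article) of attaching host-level context to every log entry through Winston’s defaultMeta option:

const os = require('os');
const winston = require('winston');

// Every entry carries the server's identity, so the context survives
// once logs from many services end up merged in one place.
const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  defaultMeta: {
    host: os.hostname(), // which server produced the event
    pid: process.pid     // which of the running copies produced it
  },
  transports: [new winston.transports.Console()]
});

logger.info('User request handled');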

I could keep going, but I think you get the point by now: there are many things that can go wrong with logs, and you’ll especially regret not having a better logging strategy when things go wrong and you find yourself going through thousands of log lines manually.

In order to try and avoid these problems, we should start considering a different way of going about it.

Traditional logging setup vs scalable setup

What are the benefits of a scalable logging strategy?

A scalable logging strategy is exactly what the name implies: you’re able to log as much as you need. Just like you can (and should) scale your processing power or your bandwidth when your platform is experiencing a spike in traffic, your logging capabilities should have a similar elasticity.

The rule of thumb should be:

The heavier your platform is working, the more logging it’ll require

So what are the benefits of a scalable strategy then?

  1. For starters, you’re not limited by the hardware of your existing server. You can have a tiny hard drive in your server, while massive, cloud-powered storage waits to receive log messages.

  2. Your logging activities don’t affect your server’s I/O operations. In other words, you’re not constantly writing to your disk, freeing up cycles for the actual needs of your app.

  3. By centralizing your logs, they’re easier to browse and examine. You don’t have to go server by server, manually downloading log files and then trying to merge them before being able to look at them. With a distributed approach, you’re sending your logs elsewhere, and through that process you can merge them before storing them in a central and common place.

  4. Log & forget. Normally when you’re logging locally, you have to worry about things like log format, log file size, periodicity, and other variables. In a distributed setup, you can let the logging service take care of all that when it receives the log, so your developers (and the services they develop) don’t need to worry about it; they just send the log event and forget about it.

  5. Easier to keep a standard format among all services. Related to the previous point: if you have a centralized logging service capable of receiving and processing log events from different places, you can centralize the ETL code inside it. That way you gain control over the format without affecting or adding extra work to the rest of the platform.

And that is just off the top of my head, depending on your particular situation and platform, other benefits might start cropping up as you start considering this architecture.

Now that I’ve (hopefully) convinced you about the benefits of going distributed, let me explain what kind of tools you can use for that.

The tools for the job

There are many options when moving into a distributed setting, some of them are completely free whilst others will charge you quite a lot of money. Of course, free comes at the price of a required manual installation, whilst paid services will be hosted on the cloud and all you have to do is point your logs at them.

Paid, third-party services offer to act as elastic log storage, with the added bonus of providing a web UI capable of browsing the logs and gathering statistics from them.

For this particular case, I’m going to cover the ELK (Elasticsearch, Logstash, Kibana) stack, but you’re more than welcome to search for other options and pick the one that best fits your needs.

The ELK stack

This stack works by providing the three products you need: one to transfer the data, one to store it and make it searchable, and one to provide a UI for browsing the logs and gathering statistics from them.

The way to do that is by using the three components of this wonderful, open-source, and free stack:

  • Elasticsearch: This is basically a NoSQL database, and in particular one that specializes in search. It’ll act as the main storage for your log events, making them really easy to search and retrieve later on.

  • Logstash: This is how you get your logs from your servers into Elasticsearch. By installing small agents on your servers, you can configure them to read, transform, and transfer your log files’ lines all the way to your Elasticsearch server.

  • Kibana: Finally, once your logs have been transferred to and stored in Elasticsearch, Kibana acts as a user-friendly UI, capable of interacting with Elasticsearch’s REST API.

Connecting to ELK from your Node.js app

So you have your ELK stack up and running (and if you don’t, just follow one of the many tutorials online), but no content. Let’s now connect our app to it; you’ll see how easy that is.

Since we’re dealing with Node.js, I’d say there are two ways we can go about it: we can either keep logging the way we already do, most likely into a file, and configure Logstash to capture updates to that file and re-send them to Elastic; or we can use a logging library, such as Winston, and configure one of its transports to do it for us.

Guess which one I’m going to be talking about?

Going from Winston to Elastic

The beauty of Winston is that we can even avoid having to configure Logstash. Don’t get me wrong, Logstash is a very useful tool; it can do a lot for us in the realm of transporting and formatting logs, which can sometimes be a godsend, especially in those cases where we’re unable to access an application’s code and manipulate the way it logs.

If we can’t change that, then we need to grab whatever is being saved and manipulate it enough to fit our storage needs, after which we’ll send it over to Elastic. This is where Logstash shines. You can find many resources covering the most common log formats from other applications and how to configure Logstash for them.

But if you *are* in charge of your app’s code, then there is no need for any of this. Thanks to libraries such as Winston, we can easily redirect (or even add to) our logging destinations so our information ends up where we need it.

In order to do this, we’ll be using Winston with its corresponding plugin, called winston-elasticsearch.

So in order to install things, we can simply do:

    $ npm i winston --save
    $ npm i winston-elasticsearch --save

After that, here is how you’d create a new logger object that can later be modified. If you already have a Winston-based logger, just grab the transport-related code and add it to your own.


const winston = require('winston');
const Elasticsearch = require('winston-elasticsearch');
// Note: recent versions of winston-elasticsearch export the transport as
// Elasticsearch.ElasticsearchTransport instead (see the comments at the end).

const esTransportOpts = {
  level: 'info'
};

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.File({ filename: "logfile.log", level: 'error' }), // save errors to a file
    new Elasticsearch(esTransportOpts) // everything info and above goes to Elastic
  ]
});

if (process.env.NODE_ENV !== 'production') {
  logger.add(new winston.transports.Console({ // also log to the console when not in production
    format: winston.format.simple()
  }));
}

This code creates a new logger object with two or three different transports, depending on the environment. Clearly, I’m playing with the default values here and letting the plugin connect to my local copy of Elastic.

So using the following code I can log into my local copy:

// Logging tests...
logger.info("Test!");
logger.error("This is an error message!");
logger.error("This is an error message with an object!", { error: true, message: "There was a problem!" });

If you’re not using Kibana yet, you can simply query Elastic’s REST API like so:

    $ curl http://localhost:9200/logs-2019.07.29/_search

Notice how the index is created by date, so you might want to adapt that part to your current date. This is what you’d get:

{
    "took": 994,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 4,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [{
            "_index": "logs-2019.07.29",
            "_type": "_doc",
            "_id": "Cl2KP2wBTq_AEn0ZM0t0",
            "_score": 1.0,
            "_source": {
                "@timestamp": "2019-07-29T21:01:57.472Z",
                "message": "Test!",
                "severity": "info",
                "fields": {}
            }
        }, {
            "_index": "logs-2019.07.29",
            "_type": "_doc",
            "_id": "C12KP2wBTq_AEn0ZM0t0",
            "_score": 1.0,
            "_source": {
                "@timestamp": "2019-07-29T21:01:57.474Z",
                "message": "This is an error message!",
                "severity": "error",
                "fields": {}
            }
        }, {
            "_index": "logs-2019.07.29",
            "_type": "_doc",
            "_id": "DF2KP2wBTq_AEn0ZM0t0",
            "_score": 1.0,
            "_source": {
                "@timestamp": "2019-07-29T21:01:57.475Z",
                "message": "This is an error message with an object!There was a problem!",
                "severity": "error",
                "fields": {
                    "error": true
                }
            }
        }]
    }
}

The most interesting bit of the above JSON is the last hit (check the hits array): notice how the fields element only has one property. That’s because the library merged the message property of the logged object into the log message itself, appending it to the first parameter I passed to the error method.
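If you want that extra text to stay under fields instead of being merged into the message, a simple workaround (my own suggestion, not something from the library’s docs) is to avoid naming any meta property message:

// "details" is a hypothetical name; anything other than "message" avoids
// the merge and shows up under "fields" in Elastic.
logger.error("This is an error message with an object!", { error: true, details: "There was a problem!" });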

Connecting to a remote instance of Elastic

Ideally, you’d want to connect to a remote Elastic instance, and to do so, you can simply pass the Elastic client configuration to the ES transport’s config object, like this:

const esTransportOpts = {
  level: 'info',
  clientOpts: {
    host: "http://your-host:your-port",
    log: "info"
  }
};

With that, you’re automatically sending your log messages out into the ether(net).
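One caveat: newer versions of winston-elasticsearch delegate to the official @elastic/elasticsearch client, whose options take a node property instead of host (one of the comments at the end of this article runs into exactly that). A minimal sketch under that assumption:

const esTransportOpts = {
  level: 'info',
  clientOpts: {
    node: "http://your-host:your-port" // @elastic/elasticsearch expects "node" rather than "host"
  }
};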

Transforming your data before sending it

You can do some pre-processing of your log messages just for Elastic, thanks to the transformer property you can set up in the ES transport properties. For example:

const esTransportOpts = {
  level: 'info',
  transformer: logData => {
    return {
      "@timestamp": (new Date()).getTime(),
      severity: logData.level,
      message: `[${logData.level}] LOG Message: ${logData.message}`,
      fields: {}
    };
  }
};

That transformer function will ignore all meta properties (basically any objects we might want to log) and extend the actual message a bit by prefixing it with a “[LEVEL] LOG Message:” string.
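If you’d rather keep those meta objects instead of dropping them, a variant of the same idea (my sketch, assuming the transport hands the logged object to the transformer as logData.meta, which is the shape I’ve seen in this plugin) could forward them into fields:

const esTransportOpts = {
  level: 'info',
  transformer: logData => ({
    "@timestamp": new Date().toISOString(),
    severity: logData.level,
    message: `[${logData.level}] LOG Message: ${logData.message}`,
    fields: logData.meta || {} // keep whatever object was logged alongside the message
  })
};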

Conclusion

That’s it. Sorry for the long intro, but as you can see, actually setting up a centralized logging platform and connecting your Node.js applications to it is quite straightforward:

  1. Set up Elastic
  2. Install winston and winston-elasticsearch
  3. Use the code or the transport code I gave you above
  4. ????
  5. Profit!!!

And you’re done! (maybe that last part is a bit of an exaggeration, but the first 3 steps are quite valid :P)

Let me know down in the comments if you’ve had any experience working with Elastic for a centralized logging platform.

Otherwise, see you on the next one!

Oldest comments (5)

Pubudu Kodikara

Thanks a lot! This article saved my day! I was trying to pass logs from my Node app to the ELK stack, but I tried to send the logs to Filebeat or Logstash. Since I'm deploying to a Kubernetes cluster, that won't be that easy. Sending directly to Elasticsearch seems to be the best way to do it.

Fernando Doglio

I'm really glad it helped! Thanks for letting me know!

vikas

How do you log the request/response bodies of HTTP calls, and how do you stop logging them when going this route?

AniketSalvi

Hi,
I am pushing logs to Elasticsearch using the winston-elasticsearch npm module.
But in my case, the module does not push logs to Elasticsearch in real time.

Thank you!

2-amithap

Hi,

Thanks a lot for good article, it was very useful.

I have resolved a few errors with the changes below when using the latest versions.

  1. The "Elasticsearch is not a constructor" error was resolved by using the ElasticsearchTransport constructor instead: new Elasticsearch.ElasticsearchTransport(esTransportOpts) //everything info and above goes to elastic
  2. clientOpts only accepts an object with the field "node" instead of "host": clientOpts: { node: "localhost:9200/", log: NullLogger }