DEV Community

loading...
LogDNA

Tailing Logs with Node.js

Darin Spivey
I'm a software engineer who loves to focus on thorough testing, scalability and efficiency.
・6 min read

Most techies will be familiar with the age-old tail -f <filename> command in Unix-like systems. It's great for shell scripting and CLI commands, but what about being able to tail a file in a language such as Node.js? Sure, you could fork a child process and scrape stdout, but that's not fun (or reliable) no matter what language you're in. Let's examine what it takes to do it right.

Why Do We Need It?

The ability to "tail" a file in Node.js can have many use cases. It could be for a sysadmin dashboard that looks for certain errors in /var/log/system.log for which you'd want to examine every line across log rolls for a particular pattern. Even if log rolling isn't a concern, if a file needs to be tailed programmatically, something that avoids creating a child process to run the real tail -f command is less expensive and easier to manage.

For LogDNA, the tail of a file is the foundation of our Node-based agents: They need to watch (many) files for changes and send those lines up to LogDNA servers for ingestion, so they need a Node-based tailing method. Unfortunately for us, several of the packages available on NPM, although they optimally use streams, do not properly respect stream backpressure and blindly push data through the stream regardless of whether or not something is consuming it. That's a big no-no when working at large scales since that can lead to data loss in the stream. If the idea is to have a tail read stream as with a logging system sending data, then it needs to properly implement this functionality.

What is Stream Backpressure?

Backpressure is a condition that happens in both readable and writable streams. Although the word stream implies a constant flow of data, there is still an internal buffer that acts as a temporary bucket for data to live while it's being written or read. Think of a busy line at a continental buffet breakfast. There is a steady flow of patrons wanting their morning bagel. As they move past the bagel tray and take one, the employee behind the table must provide (periodically) fresh bagels to keep the tray full. The same concept applies to streams. The internal buffer (the tray) exists so that data (bagels) can periodically be provided and is always available when it's needed. Streams place data into the buffer by calling a push() method (for readables), or a write() method (for writables). The problem is that the buffer size is NOT unlimited and therefore can fill up. When that happens, Node.js terms it as backpressure. Whatever is trying to put data into the buffer is told to stop (by returning false from push() or write() calls) until Node.js signals that it's ready for more data. Mind you, most of this control flow is internal to Node's various stream classes, but implementers must define functions like _read() since Node will call it when backpressure has ended.

What Are Some Other Pitfalls?

The main difficulty with doing file I/O properly at scale is efficiency. Reading chunks of a file at scale, especially in production, should not be done by reading all of the changes into a buffer. The size of the data that you need to consume may vary widely depending on throughput to the log file. For example, if the log is getting flooded with entries, then a one-second poll could result in thousands of kilobytes (kB) or even megabytes (mB) of log lines that need to be read. Trying to read that into a buffer all at once will, at best, slow down your system; at worst, it will fall over. Just think, then, that a server that does 1000+ requests per second, which is a very reasonable expectation, will have a TON of log entries every second. The sheer scale of that data means backpressure issues are a very real possibility.

However, creating an efficient tailing package isn’t just dealing with the backpressure problem. Here are some of the challenges that any solution needs to consider:

  • Since file "watchers" are not reliable across operating systems (even with node's built-in watcher module), we need a polling solution to repeatedly query the file for changes. This problem requires the code to keep the state of the last position (kind of like remembering where a cursor was when you reopen a document) and whether or not the file has been renamed.

  • Consuming the added lines should be done via a stream to avoid reading file chunks into memory all at once.

  • How can we ensure that no lines are lost? If a file is rolled between polls, then the "old" file may contain lines that will not be read on the next poll of the "new" file.

  • Similar to log rolling, if the file is truncated manually or otherwise, the code cannot resume reading from its previous position. It will have to detect this case and start reading from the beginning of the file.

Overall, a tailing solution that accounts for backpressure needs to be able to work with the common problems of log files where data flow is large and the file itself changes constantly and rapidly, whether to be renamed, moved, or truncated, without being overwhelmed by memory concerns.

How Did We Do It?

For TailFile, the open-source package we’ve released, we decided to grapple with the overall problem of file I/O, including the use of streams, the identification of filename changes, and the management of backpressure. As with other packages in the wild, a Node Readable stream implementation is the efficient way to read data from a file. That means the main TailFile class in the new package needed to be a Readable class implementation to consume the tailed bytes. The new TailFile class also uses a stream to read the underlying file resource. This pairing allowed us to use async/await iterators to read the file's data rather than use static buffers that would consume much more memory. When that data is read, it is pushed through the main TailFile implementation as if the data came from a single file, despite the possibility of log rolling.

A differentiator of this code is that it maintains an open filehandle to the log file. This is the key to being able to handle log rolling. When the file changes, the filehandle is still attached to the original file, no matter what the new name (which isn't possible to know) is. Although we cannot use createReadStream() to read from the filehandle, a one-time operation to read the remainder of the file from the last known position can be done. Since we track "start position", the remainder of the file is just fileSize - startPos. By reading that chunk, we will get any data added between the previous poll and the rename, and no data will be lost. Successive polls of the new file are allowed to use createReadStream() as normal, and an async/await flow ensures that we read from the file descriptor prior to streaming data from the newly-created file with the same name.

Another accomplishment of TailFile is its proper implementation of stream backpressure. Backpressure from a stopped consumer can happen if the data is unpiped after running for a bit or if, upon starting, does not immediately add data events or a pipe to put it in “flowing mode.” Following the Readable implementation rules, if the calls to push() return false, then TailFile pauses until _read() is called, signifying that there is a consumer reading the data.

The combination of all of these choices means that TailFile can handle large amounts of data amidst the occasional renaming of the target file without losing any lines.

How Can You Help?

Do you have a project that needs tail functionality in node? Please try our package! Open GitHub issues on the repo for bug tracking or even to add new features. If you like the project, please give it a "star" on GitHub. We are confident that this package can become the best tail package that exists on NPM.

Discussion (0)