Discussion on: why you should not use console.log( ) for debugging ?

View post

So, this is nice. I think I will look into it. But, I do go more for the debugger in the browser. It works. Choose between Chromium based or Mozilla. At least Mozilla knows how to step through promises.

But, the log stuff is no small deal in organization. (For me, it's just for getting stuff to work.) But, the reason it's console.log - get it - it's supposed to put stuff in the log of the program. That's those oversized files that end up in places like /var.

You would think it's not much of a deal. But, putting in logs and using custom log levels is important to places like banks, insurance companies, places that sell huge numbers of ticket, hospitals, etc. etc.

So, I was at Google and saw some politics happen over it. In fact, it turned rather ugly.

So, a managers at a large financial institution wanted more logging. So, he defined his lovely grandiose format. In the mean time, a person on his staff who worked closely with the logs wanted a simple JSON format. A man at Google posed a solution that addressed the JSON format. It was a simple as loading in a module and pressing the run button. Anyway, a certain amount of misery happened around the nice, simple solution that could work perfectly for the people in the trenches and some guys wanting stupid work for lots of money for friends in their home town. Needless to say, the best solution did not win in this debate.

In fact, something worse than a bad solution won. Maybe the worst possible.

Next you had a team of guys taking months to rewrite console.log statements into something that takes forever to run and gobbles up memory to boot.

So, I had to check out just what happened.

The chosen solution uses objects that have to be formatted over a number of lines of code to make a log event happen. If the log level is high enough, most of the objects are not used. But, JavaScript is a language that goes in and allocates memory for objects ahead of time. So, even if the objects aren't used they take up space and there is lots of time lost to memory allocation. (Most programmers overlook memory allocation and think that calling the system allocator all the time is just fine. This is great for tiny little apps that run and then exit. This is really bad for things that stay up for days on end. That's why programs like Nginx and Memmap use slabs.)

So, I did some timing. Yep. Just the console methods by themselves are hundreds of times faster and consume much less memory.

But, how you invoke a function is important. And, lots of mechanics can be set up for a log context long before the program gets to it. This is just shifting some work to initialization. So, trying some of that out, the programs get considerably faster.

Now, what about a preprocessor? I used gpp for a test. Yes, gpp improves all the implementations. But, people who go for JavaScript might feel uneasy about the precompile nature of such a tool. One Google manager said the idea was "Scary". I brought it up just to scare them. But, if they wanted the best product, they would use a precompiler. Why? That's because the interpreter deals with all the code it sees. And, without the code there, the program will just deal with it's intended purpose.

So, some people here have suggested editing tools that remove statements. Great idea! Only, if the client (being extra large and rich) wants log levels, you can't take log statements out of the code. You can take them out of the code just before the program runs. And, you can do that reliably. Perhaps the "debug" module does that?

Now, I did not want to stop there. One really needs to address the use of program memory.

No matter what programming language you use. Extraneous log messages use up more memory. In compiled language, that just might be some buffer the compiler leaves off to the side. But, there is still a lot of last minute formatting.

So, what about shared memory?

The idea here is that all the extra junk that log messages make can be shoved out of the program before messing with the execution environment. And, with many cores on a CPU, one core can just deal with the junk (like a garbage man).

Sure enough this speeds up the programs radically. In fact, you never have to really format a message. Just write the junk into an available spot in shared memory, and let another process worry about storing it. In fact, the log messages don't have to be formatted until someone looks at them. So, you can dump coded buffers to a file (fixed sized buffers if you like spending less time on all aspects of dealing with the junk.) Later, someone could write a specialized version of "cat". This would format the messages for human readers.

So, what about the time lost just making a call to a method that dumps logging junk?

Here is where the language gets to be problematic.

I made a Lua version of the logging junk dumper and found that it was considerably faster than anything in JavaScript. So, millions of message can leave the program in eight hundredths of a second, while the Google JavaScript logging system will take tens of minutes to dump the same stuff. Somewhere in between the Google implementation and the Lua implementation the just plain old console.log works fine.

So, what is the moral of this story?

Basically, if you are debugging, then "Hey" some log statements that you remove, stepping in the wonderful graphics debuggers, cool edit macros. Those are all good.

But, if you have to go to trouble to set up systems logging and you really don't want all that junk to just take up time and space, then you might do something a little more clever with the OS than a lazy programmer or manager might consider.

Certainly, there is no reason for a company of any size to spend huge amounts of time to put bloat into a program in order to fill up disks with logging information. If the binary dump of parameters can go through a fast backdoor out of the program and stay unformatted on disk that will be a huge saving in the long run. Later, this files can be read and formatted in any number of ways for the systems they have to feed.

Now, don't raise the "gathering in real time" issue. The backend can be a pipe to, with formatting happening closer to the consumption of data if that is ever needed.

Keep in mind that formatting is for people and not the machines.

Hope this comment was not too bothersomely long.