DEV Community

loading...

Discussion on: Logging done right, makes your life bright!

Collapse
phlash909 profile image
Phil Ashby

Hi Shashank, Gajanan makes a good point about being able to adjust logging levels at runtime /while an application is exhibiting a fault/, he says: "once the application is in some problematic state, it would remain in that state".

This is an assumption that is not always true (sadly!), hence my approach of creating a logging system as a detailed (debug level) history recorder albeit with limited capacity. Provided faults are detectable (eg: system crash, exceptions), then the debug level history buffer can be saved at that point, hopefully with enough duration & detail to find the root cause. If you are familiar with the concept of core dump files, this is similar, but more manageable than a dirty great memory blob, and with event history, not just a memory snapshot at the point of failure.

If, in addition to crash traces, we also want to create a long term activity log, we would apply a filter (eg: log levels, log sources or both) to the history buffer, and emit these filtered events to suitable storage (eg: log file).

Such a filter can respond to configuration changes in much the same way that Gajanan describes for setting a 'log config' at source, that is: detecting file changes and reloading itself, thereby changing the activity log content. The important difference is that the debug level history buffer and crash dumps are not affected by changes to an activity filter.

FYI: In one system, we had a separate management card that held the event history in RAM for a number of daughter boards, and filtered this information to disk storage. The management card had a CLI, so activity logging could be managed interactively as well as via configuration files (Cisco IOS has similar features).

Thread Thread
grhegde09 profile image
Gajanan H Author

I mostly agree with Phill. However unfortunate it is, the following is true! 1. It may be very hard to reproduce an issue with proper log settings. 2. System may not remain in the problematic state forever.

But like I mentioned the size of the log file on some systems are limited. We have a rotating log file of 50MB size. You may not believe me when I say 50MB is not sufficient to capture the logs for 30 minutes when we set log level to debug/trace. We don't have a way of backing up these logs to some server to analyse later. What we have is just 50MB! So we have to apply filters at the generation point!

We have to smartly design our logging system after considering all the limitations of the target system!

Thread Thread
phlash909 profile image
Phil Ashby

Sucks to be limited to 50MB and no way to siphon off to longer term storage! How about allocating some of that space to a short term circular log at debug level, enough to capture a few seconds of history or a couple of cycles through the main application loop (probably <1MB?), then as/when bad things happen, copy that buffer to a new file and raise the alarm... you'll likely get 50+ chances to see what's going wrong before running out of space and having to remove the oldest logs.