
Application logging and production monitoring

muthandir ・ 3 min read

Back in my corporate days, I worked as a developer, tech lead, architect, etc. In those days I rarely worried about how we should do logging & monitoring; we always had the means and ways to achieve end-to-end visibility.

Later on, I co-founded a startup, and my partner and I had to pick our tech stack. Me being a .NET guy forever and him being a Laravel pro, we went with Node.js :) (for several reasons, but that is another story).

Back to logging: what I needed was the ability to capture the entire lifetime of an incoming request. This means the request body/header info, service layer calls and their respective responses, DB calls, etc. Additionally, we wanted to use microservices back then (again, another story with lots of pros and cons), so the entire lifetime also includes the communication between the microservices, back and forth. So we needed a request id; with it, we could filter the logs and sort by time. Lemme break it down into different steps:

UI: We use a SPA on our front-end. The UI makes HTTPS calls to our API with a unique req-id per request.
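A minimal sketch of that UI piece, assuming the uuid package and a hypothetical fetch wrapper (the header name and API host are made up):

```js
// api-client.js - hypothetical SPA-side wrapper that tags every call
import { v4 as uuidv4 } from 'uuid';

export async function callApi(path, options = {}) {
  const reqId = uuidv4(); // unique req-id per request
  const response = await fetch(`https://api.example.com${path}`, {
    ...options,
    headers: {
      ...(options.headers || {}),
      'x-request-id': reqId, // everything downstream logs with this id
    },
  });
  return response.json();
}
```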

API Layer: Our business services in the APIs are instantiated by factories which inject the dependencies. In theory, you could create a custom logger, enrich it with the req-id, and pass it down to the business services for developers to use, but honestly we have never needed to take that route. Instead, I wanted a stable and automatic way to create and manage the logs. I also never wanted logging-related code in the service layer: it reduces readability and can itself introduce bugs, which should never happen inside business logic. This way, developers simply know that the entire flow will be logged without any extra effort. So rather than injecting the logger into the services, the factories wrap the service objects with an in-house logging library that adds another promise layer to capture the input parameters and the resolved responses.
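The library itself is in-house, but the core idea can be sketched with a generic Proxy that wraps every service method in a promise and logs the inputs and the resolved outputs (all names below are hypothetical):

```js
// loggingWrapper.js - hypothetical sketch of what the factories do
function wrapWithLogging(service, serviceName, logger, reqId) {
  return new Proxy(service, {
    get(target, prop) {
      const original = target[prop];
      if (typeof original !== 'function') return original;
      return async (...args) => {
        logger.info({ reqId, service: serviceName, method: prop, args });
        try {
          // Extra promise layer: await the call and capture the resolved response.
          const result = await original.apply(target, args);
          logger.info({ reqId, service: serviceName, method: prop, result });
          return result;
        } catch (err) {
          logger.error({ reqId, service: serviceName, method: prop, error: err.message });
          throw err;
        }
      };
    },
  });
}

// A factory hands out wrapped instances, so services stay free of logging code:
// const orders = wrapWithLogging(new OrderService(db), 'OrderService', logger, reqId);
```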

Microservice Communication: We created another in-house library, a forked version of Request Promise Native. It helps our developers with the authentication mechanism and injects the out-of-band req-id info so the target microservice can read it and use it throughout the lifetime of its underlying services. This means all our microservices can read the incoming req-id and forward it on outgoing microservice calls.
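Our fork is private, but the propagation idea can be sketched on top of the stock request-promise-native package (the header name is an assumption, and the auth part is omitted):

```js
// serviceClient.js - hypothetical sketch of req-id forwarding between services
const rp = require('request-promise-native');

// Build a client bound to the req-id of the request currently being handled.
function createServiceClient(incomingReqId) {
  return rp.defaults({
    json: true,
    headers: {
      'x-request-id': incomingReqId, // forward the id we received, out of band
    },
  });
}

// In a hypothetical Express handler:
// const client = createServiceClient(req.headers['x-request-id']);
// const invoice = await client.get('http://billing-service/invoices/42');
```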

Logger: There are lots of good logging libraries out there. A word of caution, though: please mask your messages and don't log any sensitive data! I've seen logs with credit card info in them; please don't do it. Your users depend on you, and this is your responsibility!
Anyways, we decided to use Winston because:
1- Winston is good
2- It has Graylog2 support, which brings us to our next item:
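A minimal setup could look like this, assuming the winston-graylog2 transport package and a naive masking format (the masked field names are just examples):

```js
// logger.js - hypothetical Winston setup with masking and a Graylog2 transport
const winston = require('winston');
const WinstonGraylog2 = require('winston-graylog2');

// Naive masking: redact known sensitive fields before anything leaves the process.
const mask = winston.format((info) => {
  for (const field of ['cardNumber', 'cvv', 'password']) {
    if (info[field]) info[field] = '***';
  }
  return info;
});

const logger = winston.createLogger({
  format: winston.format.combine(mask(), winston.format.json()),
  transports: [
    new WinstonGraylog2({
      graylog: { servers: [{ host: 'graylog.internal', port: 12201 }] },
    }),
  ],
});

module.exports = logger;
```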

Log Repository: In the last 10 years or so, I don't remember a single case where I had to check the raw server log files for monitoring purposes. It is simply impractical to walk through those files, with a line for req1 right after another line for req2; it won't help. In fact, at one of the US banks I used to work at, the DevOps folks suggested that we could simply stop creating them. Of course, that doesn't mean you can stop logging. Au contraire! It is very important that you have a log repository where you can search, filter, export, and manage your logs (including creating alerts; this is important because it means you will know there is a problem even before the customer sees the error message!). So we reduced our options to the following tools:
-Splunk
-Graylog
We selected Graylog because we had experience administrating a Graylog server, it is an open-source tool (meaning much lower costs, since it only needs a mid-sized server), and it just does the job.

I remember how, back in the day, I used to walk through the logs before each release to understand whether we were about to introduce any unexpected errors. Your logs will give you lots of different insights about your application. They will potentially help you uncover bugs, detect fraudulent activity, receive error alerts, and much more. Embrace them, love them, monitor them.


Discussion


Nice post.

...logging library that adds another promise layer to capture the input parameters and the resolved responses
Will there be any performance impact as the number of microservices and messages in between increases?
Also, how did you monitor system state changes from the logs?
I am new to monitoring and currently studying event-store mechanisms.

Hi, thanks!
For logging, we ran a round of perf tests; the impact was negligible considering the benefits.
The perf impact of the microservice architecture itself is another topic to go through; one should seriously consider whether it is really needed. Anyways, in our case we use AWS auto-scaling groups and a pub-sub mechanism built on SNS-SQS coupling, so I am really glad about the scalability of the entire system.
For system state changes, pre-release: we have scripted queries for error logs, so you can search your logs excluding the known issues, which brings up only the new errors. Post-release: 1- we have error alerts (you can configure throttling, error counts, and similar conditions on the alerts); 2- we added the duration of each function call to our custom logging library, so we have rules about long-running functions: anything that runs over X ms within a microservice, or Y ms between microservices, triggers an alert (actually, maybe I should add these perf checks to the original post). Hope this helps.
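A rough sketch of that duration capture, with a made-up threshold value standing in for X:

```js
// Hypothetical duration check added to the logging wrapper
const IN_SERVICE_THRESHOLD_MS = 200; // stand-in for "X ms"

async function timed(logger, reqId, method, fn) {
  const start = Date.now();
  const result = await fn();
  const durationMs = Date.now() - start;
  logger.info({ reqId, method, durationMs });
  if (durationMs > IN_SERVICE_THRESHOLD_MS) {
    // An alert rule in the log repository keys off entries like this one.
    logger.warn({ reqId, method, durationMs, slow: true });
  }
  return result;
}
```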

Sure it does. Thanks for explaining.
Btw, what is your startup?