A full-stack infrastructure monitoring solution: my experience with InsightCat as a DevOps newcomer

Hello all!

A little bit about me. I'm a newcomer to the DevOps world, and right now I'm trying to understand what I, as a junior DevOps engineer, should know and use to monitor IT infrastructure successfully.

My first major tasks sounded frightening to me. Later, though, I understood that I just needed to find the right solution, one that comes with enough automation that I can confidently rely on it. So my goal was to find a SaaS product to monitor a big IT ecosystem, prevent downtime, analyze log data, and so on.

I went through different blogs, forums, and review websites, and the systems I came across everywhere were Datadog, New Relic, Zabbix, ELK, and so on. With my team's support, we decided to test all of them and then pick our favorite.

We started with Datadog. I have to admit that, as a person who doesn't have a strong tech background, I couldn't really understand what to do in the initial steps. The learning curve is really steep, and it's hard to stay calm when you don't understand how to set the product up. The picture was almost the same with the other products.

OK, I know that my scenario perhaps isn't one for senior DevOps or SecOps specialists. When you've spent years in IT, it's obvious to you how to write tons of code and configure tools like Datadog. But my case is more about people who are trying to figure everything out quickly and hope that the tool will cover all business-critical needs. Especially when you want your boss to be proud of you :)

So, the choice was InsightCat. I had never come across them before and was curious to try this product.

The first good sign: I set it up quickly without needing to write a lot of code. Wow! I installed the agent using Telegraf. Then I connected my infrastructure to the InsightCat server. And, finally, I started to receive data. Thanks to auto-discovery, InsightCat explored my system automatically.

My personal favorite parts are Logs and Insights. I was worried that I wouldn't find a solution that could display log data in such a user-friendly way.

[Image: log management]

And my need for downtime prevention was met by Insights. This tool lets you see the most important metrics, check whether the figures have increased, and get a root cause analysis.

[Image: detect downtime]

I'm not saying that the other solutions are bad. Once again, I'm not a deeply technical person. Yes, my skills need to get deeper. But even in my situation, there was a solution that fit.

Good job, guys from InsightCat!

Share your experience if you have also used InsightCat or any of the other products mentioned above. Was Datadog also difficult for you? Maybe New Relic?

Thank you!

