From AI/ML to Observability - An Unexpected Learning Experience

#observability #ai #machinelearning #beginners

Introduction: My AI/ML Comfort Zone

I come from an AI/ML background. For a long time, my learning revolved around data, models, and training and testing on datasets. I always assumed my path would stay within machine learning, deep learning, and related areas.

An Unplanned Shift into Observability

But recently, I got an opportunity to work on an observability-related project, and honestly, I had no prior knowledge of observability. It was not something I had planned for, and at first it felt uncomfortable. Observability was not a term I had worked with deeply. I had heard of monitoring, metrics, and logs, but I did not really know where observability came into the picture.

The Initial Discomfort: Facing Real System Failures

Suddenly I was exposed to a world where I had to face many system failures and resolve them, and I came to understand that problems do not arrive with neat error messages. Before fixing a failure, I had to find out what the actual failure was. Coming from an AI/ML background, this shift felt big.

From Datasets to Distributed Systems

Instead of thinking about datasets and model training, I had to think about metrics, containers, logs, and latency. Initially, it felt both challenging and interesting.

Understanding What Observability Really Means

As I spent more time on the project, I started exploring the core ideas behind observability. I learned that observability is not just about knowing what happened, but about understanding why it happened.

Key Concepts and Tools I Explored

Here is what I got the chance to explore:

Metrics, Logs and Traces - the three major components of observability.
Golden Signals - Latency, Traffic, Errors and Saturation.
Basic Kubernetes concepts like pods, nodes and deployments.
Monitoring tools like Prometheus and visualization tools like Grafana.
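
To make the golden signals concrete, here is a minimal Python sketch that derives all four from a small batch of request records. The `Request` record, the `golden_signals` function, the choice of CPU usage as the saturation measure, and the sample numbers are all made up for illustration; in a real setup these values would typically come from a monitoring tool such as Prometheus rather than from hand-written code.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """One handled request (hypothetical record, for illustration only)."""
    latency_ms: float  # how long the request took
    status: int        # HTTP status code returned


def golden_signals(requests, window_seconds, cpu_used_cores, cpu_capacity_cores):
    """Summarize one time window as the four golden signals."""
    if not requests:
        raise ValueError("need at least one request in the window")

    total = len(requests)

    # Latency: 95th percentile using the nearest-rank method.
    latencies = sorted(r.latency_ms for r in requests)
    rank = max(1, math.ceil(0.95 * total))
    latency_p95 = latencies[rank - 1]

    # Errors: share of requests that failed on the server side (5xx).
    errors = sum(1 for r in requests if r.status >= 500)

    return {
        "latency_p95_ms": latency_p95,
        "traffic_rps": total / window_seconds,                # Traffic
        "error_rate": errors / total,                         # Errors
        "saturation": cpu_used_cores / cpu_capacity_cores,    # Saturation (assumed: CPU)
    }


# Ten sample requests over a 5-second window; two of them failed.
sample = [Request(latency_ms=10.0 * (i + 1), status=500 if i < 2 else 200)
          for i in range(10)]
signals = golden_signals(sample, window_seconds=5,
                         cpu_used_cores=1.5, cpu_capacity_cores=2.0)
print(signals)
```

The numbers tell you what is happening (for example, a 20% error rate); logs and traces are then what help explain why.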

Why Observability Felt So Real

What stood out to me was how practical and real this domain is. Rather than working from assumptions, observability deals with live data, downtime, and failures. I also learned how observability helps teams debug faster instead of guessing what the actual failure is.

Lessons Beyond Technology

This experience taught me that learning does not always follow a straight line, and that stepping outside my comfort zone actually strengthens my core engineering skills. Observability gave me a clear picture of how real-world applications behave after deployment. Building an intelligent system with AI is not enough on its own; the system also has to be observed, maintained, and monitored properly.

Expanding My Perspective as an AI Engineer

I don’t see this shift as moving away from AI/ML, but rather as expanding my perspective. Understanding observability makes me think more responsibly about how systems are built, deployed, and maintained.