Rohan Raj

Introduction to MLOps

Suppose you have trained an accurate machine learning model and are wondering what to do next.
Well, after developing such a good model, it's time to celebrate first; the next step is to put it into production, so that real software makes actual use of the model.

So, what is MLOps?

As the name suggests, it is ML + DEV + OPS: machine learning combined with development and operations.

It covers the whole machine learning project life cycle: from scoping the project, to data, to modelling, through deployment, all the way from start to finish.

So, is developing a machine learning system for production a single-step process? Well, it isn't! Several aspects and challenges come into the picture once deployment is done. Having a system up and running is only halfway through the process. The other half is keeping that system consistent and dependable, and handling issues such as data drift: a gradual divergence between the data we used to train our model and the data we see at inference time.

D. Sculley et al., "Hidden Technical Debt in Machine Learning Systems", NIPS 2015.

A complete machine learning system involves several tools and methods. The ML model itself is a small yet essential component of that infrastructure.

But no model lasts forever. Even if the data quality is fine, the model itself can start degrading. What does this mean in practice?
The world is dynamic, and our model needs to keep up with the changes happening around it to remain accurate and dependable.

In short, having a machine learning model in production is an iterative process in which we constantly evolve and improve the model.
MLOps is the practice of using tools and methods to automate this process and make evolving the model easier.
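The data-drift idea mentioned above can be sketched in code. Below is a minimal, illustrative check on a single numeric feature using a hand-rolled two-sample Kolmogorov-Smirnov test; the function names and the 5% critical-value coefficient (1.36) are assumptions for illustration, not something prescribed by any particular MLOps tool:

```python
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the empirical CDFs of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    max_gap = 0.0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            i += 1
        else:
            j += 1
        max_gap = max(max_gap, abs(i / len(a) - j / len(b)))
    return max_gap

def drift_detected(train_sample, live_sample, coeff=1.36):
    """Flag drift when the KS statistic exceeds the large-sample
    critical value at roughly the 5% significance level."""
    n, m = len(train_sample), len(live_sample)
    critical = coeff * ((n + m) / (n * m)) ** 0.5
    return ks_statistic(train_sample, live_sample) > critical

random.seed(0)
train = [random.gauss(0.0, 1.0) for _ in range(1000)]
live_shifted = [random.gauss(1.0, 1.0) for _ in range(1000)]  # drifted inputs

print(drift_detected(train, train))         # same data: no drift
print(drift_detected(train, live_shifted))  # shifted data: drift
```

In practice you would run a check like this per feature on a schedule, and trigger retraining or an alert when drift is flagged.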

So, are you ready to dive in?

Software engineering issues in MLOps:

Let's suppose we have a system that takes "x" as input and predicts "y" as output. There are some aspects you need to keep in mind:

  1. Real-time or batch: Does the model make real-time predictions, as in speech recognition, or does it work on a batch of data whose output may or may not be instant?

  2. Cloud vs. edge/browser: Is your model deployed in the browser or on an edge device, such as a mobile phone, that does not require internet access? Suppose a critical system is running in the cloud and suddenly the internet goes down. In such cases, edge-based models are more reliable.

  3. Compute resources: Choosing the right software architecture also depends on the hardware limitations we have. A model trained on a powerful GPU or TPU may not run with the same efficacy on a CPU.

  4. Latency, throughput (QPS): The architecture also depends on the number of queries our model must handle, measured in queries per second (QPS), and on how quickly each query must be answered.

  5. Security and privacy: The required level of security and privacy differs from application to application, based on how sensitive the data is.
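The latency and throughput considerations above are easy to measure in code. Here is a rough sketch that times a stubbed `predict` function (a hypothetical stand-in for a real model call) and reports median latency and throughput in QPS:

```python
import time
import statistics

def predict(x):
    # Hypothetical stand-in for a real model call; sleeps ~1 ms.
    time.sleep(0.001)
    return 2 * x

n_queries = 50
latencies_ms = []
start = time.perf_counter()
for i in range(n_queries):
    t0 = time.perf_counter()
    predict(i)
    latencies_ms.append((time.perf_counter() - t0) * 1000)
elapsed = time.perf_counter() - start

p50 = statistics.median(latencies_ms)
p95 = sorted(latencies_ms)[int(0.95 * n_queries) - 1]
qps = n_queries / elapsed
print(f"p50={p50:.2f} ms  p95={p95:.2f} ms  throughput={qps:.0f} QPS")
```

Tail percentiles such as p95 usually matter more than the mean here, because a few slow queries dominate the user experience.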

Deployment patterns

Common deployment cases:

  1. New product/capability: If you are offering a new service, such as voice commands, the best way to roll it out is to start with a small amount of traffic and then slowly ramp it up.

  2. Automate/assist a manual task: Some tasks are currently done by humans, and we want to use ML models to automate or assist them.

  3. Replace a previous ML system: You want to replace an already running ML system with a new one.

In each case, we start with a small amount of traffic and gradually increase it.

There are several deployment patterns which can be used:

  1. Shadow mode: In this scenario, the ML system shadows the human and runs in parallel. The output of the ML system is not used for any decision during this phase. This can also be thought of as the monitoring phase in which we measure how accurate our model is.

  2. Canary deployment: In this method we pass a small part of the incoming traffic to the ML system to make real time predictions. Here we keep monitoring the system and ramp up the traffic gradually.

  3. Blue-green deployment: In this type of deployment, we have two instances of the system running. One instance is the previous version (blue) and the other is the updated version (green). The router switches the traffic from blue to green instantly to use the new system. If there is any issue, the traffic can be routed back to the previous version.
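The canary pattern above can be sketched as a simple traffic router. In this illustrative example, `route_request`, the placeholder models, and the 5% canary fraction are all hypothetical choices, not part of any specific framework:

```python
import random

def route_request(request, old_model, new_model, canary_fraction=0.05):
    """Send a small, fixed fraction of traffic to the new model (the
    canary); everything else keeps hitting the proven version."""
    if random.random() < canary_fraction:
        return "new", new_model(request)
    return "old", old_model(request)

old_model = lambda x: x        # placeholder for the current model
new_model = lambda x: x + 1    # placeholder for the candidate model

random.seed(42)
counts = {"old": 0, "new": 0}
for i in range(10_000):
    version, _ = route_request(i, old_model, new_model)
    counts[version] += 1
print(counts)  # roughly 5% of requests hit the canary
```

If the monitored metrics for the "new" slice look healthy, the fraction is ramped up gradually; if not, it is set back to zero, which is the rollback story canary deployments rely on.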

So, in general, there is a spectrum of the degree of automation we want to achieve. On one end of the spectrum are humans and on the other are machines. We can have a combination of both and decide the degree to which we want humans in the loop.


Monitoring ML systems:

The best way to monitor any system is through dashboards. We can track several metrics of our system, such as server load and latency, on these dashboards.
Using these metrics, we can brainstorm and understand whether our learning algorithm is performing well.
Just as ML modelling is iterative, so is deployment. After deployment, we keep monitoring the metrics and analysing performance to improve the system. These metrics can also be used to update the model and tune it for better accuracy.
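The metric monitoring described above can be sketched as a rolling window with an alert threshold. The `MetricMonitor` class, the window size, and the latency threshold below are hypothetical values chosen for illustration:

```python
from collections import deque
import statistics

class MetricMonitor:
    """Rolling-window monitor for one metric (e.g. latency in ms);
    fires an alert when the recent mean crosses a threshold."""

    def __init__(self, window_size=100, threshold_ms=200.0):
        self.window = deque(maxlen=window_size)
        self.threshold_ms = threshold_ms

    def record(self, latency_ms):
        self.window.append(latency_ms)

    def alert(self):
        return bool(self.window) and statistics.mean(self.window) > self.threshold_ms

monitor = MetricMonitor(window_size=10, threshold_ms=200.0)
for latency in [50, 60, 55, 48, 52]:        # healthy traffic
    monitor.record(latency)
print(monitor.alert())                      # no alert

for latency in [900, 950, 980, 1000, 990]:  # latency spike
    monitor.record(latency)
print(monitor.alert())                      # alert fires
```

A dashboard is essentially many such monitors side by side, one per metric, with the alerts wired to paging or to an automated rollback.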

I hope this gave you a brief idea of MLOps in general.

If you enjoyed reading it, please do give it a like :)
