If you've ever trained a beautiful model in a Jupyter notebook, watched the metrics shine, and then realized you have no idea how to actually put it in front of users, congratulations: you've just discovered why MLOps exists.
In this series, we are going to walk together from a notebook to a fully deployed, monitored, and self-retraining ML system, one tiny step at a time. But before we write any code, let's get the foundations straight.
So, what is MLOps?
MLOps (short for Machine Learning Operations) is the set of practices, tools, and culture that lets you ship machine learning models to production reliably and repeatedly. Think of it as DevOps' younger sibling: same spirit (automation, reproducibility, monitoring), but adapted to the weirdness of ML, where your code is not the only thing that changes: your data changes, your model changes, and the world your model lives in changes too.
A useful way to picture it is the ML lifecycle:
1. Data collection & versioning — where does the data come from, and which version did we train on?
2. Experimentation — which features, which model, which hyperparameters?
3. Training & evaluation — does it actually work, and is it better than what we had?
4. Packaging — wrap the model in something deployable
5. Deployment — serve predictions to real users (batch or real-time)
6. Monitoring — is it still working? Did the data drift?
7. Retraining — close the loop and start again
Traditional software has steps 4–6. ML has all seven, and steps 1–3 keep coming back to haunt you.
Why "it works on my machine" is worse in ML
In classical software, if your code runs locally, it has a decent chance of running in production. In ML, that's a trap, because the model's behavior depends on three moving things, not one:
- Code: the training script, the preprocessing, the inference logic
- Data: the exact dataset (and its version) you trained on
- Environment: Python version, library versions, CUDA versions, OS
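To make those three moving parts concrete, here's a minimal, stdlib-only sketch of what "pinning down" data and environment looks like. (The function names are ours for illustration; the real tools we'll meet later, like DVC and MLflow, do this far more thoroughly.)

```python
import hashlib
import platform
import sys


def fingerprint_dataset(path: str) -> str:
    """Return a SHA-256 hash of the raw dataset file, so two training
    runs can prove they used byte-identical data."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large datasets don't blow up memory.
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


def environment_snapshot() -> dict:
    """Record the interpreter and OS a model was trained under."""
    return {
        "python": sys.version.split()[0],
        "os": platform.platform(),
    }
```

Log the dataset hash and the environment snapshot next to every experiment, and "which data did we train on?" becomes a lookup instead of an archaeology project.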
Change any of these three and your "great model from Tuesday" becomes "mysterious garbage on Friday." This is why ML teams need stricter versioning, tracking, and packaging discipline than most web teams.
What problems does MLOps actually solve?
Concrete pains you'll feel without MLOps, and that we'll fix in this series:
- "Which dataset gave us that 0.94 F1 score? Nobody remembers."
- "The model works locally but crashes in the Docker container."
- "We retrained the model and accuracy dropped, but we can't roll back."
- "Production is silently degrading and we noticed two weeks later."
- "Every deploy is a hand-crafted artisanal disaster."
Each of these has a tool and a workflow that solves it, and we are going to meet (almost) all of them, one by one.
The MLOps stack we'll build
Here's a sneak peek of the tools we'll touch in the next articles:
- DVC for data versioning
- MLflow for experiment tracking and the model registry
- FastAPI for serving
- Docker for packaging (we'll lean a bit on Clelia's 1minDocker series here)
- GitHub Actions for CI/CD
- Evidently for monitoring data and model drift (we can use Prometheus and Grafana too)
- A cloud provider (we'll pick one later) for actually deploying it all
Don't worry if some of these names sound intimidating: we'll introduce them gently, one per article, and always with a working example.
What you need to follow along
Nothing fancy:
- Python 3.10+
- Git installed
- A GitHub account
- Docker installed (we highly recommend following this series: https://dev.to/astrabert/1mindocker-1-what-is-docker-3baa)
- A laptop and ~1 minute per article 😉
In the next article, we'll get our hands dirty: we'll take a small dataset, version it with DVC, and finally answer the question "which data did we train on?" without crying.
Stay tuned and have fun!