DEV Community

Muhammad Abdullah

The Unglamorous Truth About Deploying ML Models in Production

Everyone talks about building ML models. Nobody talks about what happens after.
You've trained your model, hit a decent accuracy score, and it works beautifully in your notebook. Then you try to deploy it — and everything falls apart in ways no tutorial prepared you for.
I've been building and deploying ML systems in production for the past year, and here's what I wish someone had told me earlier.

1. Your Model is the Easy Part
In my experience, the model itself is maybe 20% of the work. The other 80% is everything around it: API design, data validation, error handling, and keeping the whole thing alive under real traffic.
Lesson: Treat your deployment stack with the same rigor you treat your model.

2. Latency Will Surprise You
Your model runs in 200 ms on your machine. Now add network overhead, cold starts, and concurrent requests, and suddenly you're at 2 seconds and users are complaining.
Lesson: Benchmark early. Use async where possible. Cache aggressively.
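Here's a rough sketch of what those three habits can look like using only the standard library. Everything named here is an assumption for illustration: `fake_model_predict` is a hypothetical stand-in for your real model call, and the cache size and run count are arbitrary. Caching like this is only safe when the same input always produces the same prediction.

```python
import asyncio
import statistics
import time
from functools import lru_cache


def fake_model_predict(text: str) -> int:
    """Hypothetical stand-in for a real model call; sleeps to simulate inference."""
    time.sleep(0.01)
    return len(text)


@lru_cache(maxsize=1024)
def cached_predict(text: str) -> int:
    """Cache results for repeated inputs (assumes deterministic predictions)."""
    return fake_model_predict(text)


def benchmark(fn, inputs, runs: int = 20) -> dict:
    """Time fn over the inputs and report median and p95 latency in milliseconds."""
    samples = []
    for i in range(runs):
        start = time.perf_counter()
        fn(inputs[i % len(inputs)])
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {"p50_ms": statistics.median(samples), "p95_ms": samples[p95_index]}


async def predict_async(text: str) -> int:
    """Run the blocking call in a worker thread so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, cached_predict, text)


async def predict_many(texts):
    """Serve several requests concurrently instead of one after another."""
    return await asyncio.gather(*(predict_async(t) for t in texts))


if __name__ == "__main__":
    print(benchmark(fake_model_predict, ["a", "bb"]))  # uncached baseline
    print(benchmark(cached_predict, ["a", "bb"]))      # mostly cache hits
    print(asyncio.run(predict_many(["a", "bb", "ccc"])))
```

Look at the p95 figure, not just the median: the slow tail is what users actually complain about. And run the benchmark against the deployed endpoint too, since a local number hides the network and cold-start costs.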

3. Input Data in Production is Messy
In training, your data is clean. In production, users will send you anything: missing fields, wrong types, empty strings. Your model won't fail gracefully. It will either throw an opaque error or, worse, return a confident prediction for garbage input.
Lesson: Validate everything at the API boundary before it touches your model. Pydantic with FastAPI makes this effortless.
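As a minimal sketch of boundary validation with Pydantic: the schema below is hypothetical (the `text` and `max_tokens` fields and their limits are made up for illustration), so swap in whatever your model actually expects. It covers the three failure cases above: a missing field, a wrong type, and an empty string.

```python
from pydantic import BaseModel, Field, ValidationError


class PredictRequest(BaseModel):
    """Hypothetical request schema; field names and limits are illustrative."""

    text: str = Field(..., min_length=1)       # required, rejects empty strings
    max_tokens: int = Field(64, ge=1, le=512)  # optional, bounded, with a default


def parse_request(payload: dict):
    """Return (request, None) on success or (None, errors) on invalid input."""
    try:
        return PredictRequest(**payload), None
    except ValidationError as exc:
        # Report which field failed and why, without touching the model.
        errors = [
            {"field": ".".join(str(part) for part in err["loc"]), "problem": err["msg"]}
            for err in exc.errors()
        ]
        return None, errors


if __name__ == "__main__":
    print(parse_request({"text": "hello"}))
    print(parse_request({}))                                   # missing field
    print(parse_request({"text": ""}))                         # empty string
    print(parse_request({"text": "hi", "max_tokens": "lots"}))  # wrong type
```

In a FastAPI app you usually don't need the `parse_request` helper at all: declare the Pydantic model as the type of the request body parameter and the framework validates the payload and returns a 422 response before your handler runs.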

4. You Need Logs More Than You Think
When something breaks at 2 a.m. (and it will), you need to know what input triggered it and what the model returned. Without structured logging, you're debugging blind.
Lesson: Log before you need it, not after something breaks.
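One way to get there with just the standard library is a JSON formatter plus a thin wrapper around the model call. Treat this as a sketch under assumptions: the logger name, the `fields` key passed through `extra`, and the `logged_predict` wrapper are all my own illustrative choices, not a standard API.

```python
import json
import logging
import time
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Fields passed via extra={"fields": {...}} are merged into the entry.
        entry.update(getattr(record, "fields", {}))
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry, default=str)


logger = logging.getLogger("inference")  # logger name is an arbitrary choice
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)


def logged_predict(model_fn, payload: dict):
    """Call model_fn(payload), logging input, output, latency, and any failure."""
    request_id = str(uuid.uuid4())
    start = time.perf_counter()
    try:
        result = model_fn(payload)
    except Exception:
        # Record the triggering input and traceback, then let the error propagate.
        logger.exception(
            "prediction_failed",
            extra={"fields": {"request_id": request_id, "input": payload}},
        )
        raise
    latency_ms = round((time.perf_counter() - start) * 1000, 2)
    logger.info(
        "prediction_ok",
        extra={"fields": {"request_id": request_id, "input": payload,
                          "output": result, "latency_ms": latency_ms}},
    )
    return result


if __name__ == "__main__":
    logged_predict(lambda p: len(p["text"]), {"text": "hello"})
```

Because every line is a single JSON object with a `request_id`, you can filter for the one failing request instead of scrolling through free-form text. If your inputs can contain personal data, you'll probably want to redact or hash them before logging.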

Final Thoughts
The gap between a working notebook and a reliable deployed system is enormous, and most of that gap isn't about ML at all. It's about software engineering fundamentals applied to a non-deterministic system.
Build accordingly.
What's caught you off guard in production? Drop it in the comments.
