The Model Registry That Stopped Working
Migrating from MLflow Model Registry to a Kubernetes operator isn't just swapping one tool for another. It changes how your models get versioned, promoted, and deployed, and if you mess up the handoff, you'll discover your staging environment is serving last week's model while production points at a registry that no longer exists.
I've seen this migration go wrong in predictable ways. The mistakes aren't dramatic explosions. They're quiet: a model version mismatch that surfaces three days later, a rollback that doesn't work because the old artifact storage is gone, or a deployment that succeeds but serves inference from the wrong endpoint.
This post walks through four specific pitfalls I've hit (or watched others hit) when moving from MLflow's model registry to a Kubernetes-native operator like Seldon Core or KServe. These aren't edge cases — they're the default failure modes if you don't actively prevent them.
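One way to catch the quiet version mismatch described above is to compare what the registry says should be live against what each environment is actually serving. The following is a minimal sketch, not a definitive implementation: it assumes you have already exported both sides into plain dictionaries mapping model name to version string. The names `Drift` and `find_version_drift`, and the sample model names, are hypothetical and belong to neither MLflow nor any Kubernetes operator.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Drift:
    """One disagreement between the registry and a serving environment."""
    model: str
    registry_version: Optional[str]   # None if the registry has no entry
    deployed_version: Optional[str]   # None if nothing is deployed


def find_version_drift(
    registry: Dict[str, str], deployed: Dict[str, str]
) -> List[Drift]:
    """Return a Drift for every model whose versions differ,
    or that exists on only one side. Results are sorted by model name."""
    drifts = []
    for name in sorted(registry.keys() | deployed.keys()):
        expected = registry.get(name)
        actual = deployed.get(name)
        if expected != actual:
            drifts.append(Drift(name, expected, actual))
    return drifts


if __name__ == "__main__":
    # Hypothetical snapshot: what the registry promotes vs. what staging serves.
    registry = {"fraud-detector": "7", "churn": "3"}
    staging = {"fraud-detector": "6", "churn": "3", "legacy-ranker": "1"}
    for drift in find_version_drift(registry, staging):
        print(drift)
```

Running a check like this on a schedule during the migration, rather than once at cutover, is what turns a mismatch that surfaces three days later into one that surfaces the same day.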
Pitfall 1: Model Version Drift Between Registries
Continue reading the full article on TildAlice
