Marwan Mohammed

Posted on May 23

ML System Development and Redundancy: Stop Rebuilding the Wheel

#ai #machinelearning #softwareengineering #tutorial

Introduction

For the longest time I’ve found myself asking one question repeatedly:
“Do I really have to rewrite all of that every single time?”
My answer at the time was to create a “helper functions” repository on GitHub. It was a painkiller that worked, until it didn’t. Maintaining it was inefficient and exhausting, and it lacked the cohesion a real project needs.

Then, while hanging out with a few friends who happen to be Backend Engineers (shoutout to Youssef Tarek and Yahia Al-Touny), I saw the light. They were discussing a custom template they had built for their services. It was impressive, and it hit me: “Why don’t I create something similar for ML?”

That is when I started researching industry standards, best practices, and existing templates. Then I got to work, and in today’s article you’ll get a front-row seat to the “How” and “Why” behind the MLOps template I created.

Templates vs Boilerplates

When I first started with this, I was confused by both terms: what each one meant and how they differed from one another. If you’re like me, read on. If you already know, feel free to skip ahead.

Templates: These are architectural blueprints. They enforce a structure, support scalability, and allow teams to collaborate within a standardized environment.
Boilerplates: These are “copy-paste” solutions, meaning reusable code blocks that require little to no modification to work.

My goal is to build a boilerplate-flavored template, trying to get the best of both worlds!

The Graveyard: Finding Gold in Dead Code

I started by scavenging my own graveyard. I looked at every dead project, every “I’ll fix this later” comment, and every helper script I’d ever written. I asked one critical question: “Does this work outside of its intended project?” If the answer was no, I had to figure out how to make it modular.

The Pillars of the Template

Observability: Seeing in the Dark

Observability is arguably as important in ML as it is in standard Software Engineering. You need to be able to see how your code behaves: what succeeded, what failed, and why.
This makes debugging much easier, not just in development but in production as well. So I started by setting up my logger to track the most important information.
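
To make this concrete, here is a minimal sketch of a structured logger built only on Python’s standard logging module. It is not the template’s actual logger. The names JsonFormatter and get_logger, and the fields request_id, model_version, and latency_ms, are assumptions chosen for illustration.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        # These field names are hypothetical; pick whatever you need to track.
        for key in ("request_id", "model_version", "latency_ms"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def get_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger with one JSON handler; repeated calls add no duplicates."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler(stream)  # stream=None means stderr
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False
    return logger


# Captured in memory here so the output can be inspected;
# omit `stream` to log to stderr instead.
buffer = io.StringIO()
log = get_logger("inference", stream=buffer)
log.info("prediction served", extra={"request_id": "abc123", "latency_ms": 12.5})
print(buffer.getvalue())
```

One JSON object per line keeps the logs easy to grep locally and easy to parse once they reach a log aggregator in production.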

Schemas: The underrated warrior in ML

Having a schema for everything you don’t control is important, simply because it validates that both the inputs to the model and its outputs are clean and valid, or at the very least type safe.
So, I started setting up schemas (Pydantic) for requests, responses, app configurations, even a validator for environment variables.
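
The idea can be sketched without any third-party dependency. The example below deliberately swaps Pydantic for standard-library dataclasses so it runs anywhere. In the real template, Pydantic models would take their place. PredictRequest, Settings, MODEL_PATH, and PORT are hypothetical names, not taken from the template.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictRequest:
    """Hypothetical request schema: a non-empty list of numeric features."""

    features: list

    def __post_init__(self):
        if not isinstance(self.features, list) or not self.features:
            raise ValueError("features must be a non-empty list")
        for value in self.features:
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"non-numeric feature: {value!r}")


@dataclass(frozen=True)
class Settings:
    """Hypothetical app configuration, validated once at startup."""

    model_path: str
    port: int

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        missing = [key for key in ("MODEL_PATH", "PORT") if key not in env]
        if missing:
            raise ValueError(f"missing environment variables: {missing}")
        try:
            port = int(env["PORT"])
        except ValueError:
            raise ValueError("PORT must be an integer") from None
        return cls(model_path=env["MODEL_PATH"], port=port)


request = PredictRequest(features=[0.1, 2, 3.5])
settings = Settings.from_env({"MODEL_PATH": "models/clf.pkl", "PORT": "8000"})
print(request, settings)
```

Failing loudly at the boundary, when a request arrives or the app boots, is much cheaper than discovering a bad value deep inside the inference code.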

Stop Switching Engines, Start Changing Oil

Previously, I created a different “engine” for every task (e.g., one for binary classification, one for multiclass). This led to massive, 1,000-line modules, which honestly made no sense at all.
So I created one unified engine plus a small adapter per task. The core is the same for everything, but since each task is unique, an adapter can be fitted to accommodate its needs. This applies to both the training abstraction and the loading/inference abstraction.
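
Here is a rough sketch of what “one engine, many adapters” can look like on the inference side. It is an illustration under assumptions, not the template’s code: InferenceEngine, TaskAdapter, BinaryAdapter, and MulticlassAdapter are made-up names, and the “models” are plain functions standing in for real ones.

```python
from typing import Protocol, Sequence


class TaskAdapter(Protocol):
    """The task-specific part: turn raw model scores into a prediction."""

    def postprocess(self, scores: Sequence[float]): ...


class BinaryAdapter:
    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def postprocess(self, scores):
        # A single score, read as the probability of the positive class.
        return int(scores[0] >= self.threshold)


class MulticlassAdapter:
    def __init__(self, labels):
        self.labels = list(labels)

    def postprocess(self, scores):
        # Pick the label whose score is highest.
        best = max(range(len(scores)), key=lambda i: scores[i])
        return self.labels[best]


class InferenceEngine:
    """The shared core: the same predict path for every task."""

    def __init__(self, model, adapter: TaskAdapter):
        self.model = model
        self.adapter = adapter

    def predict(self, features):
        scores = self.model(features)
        return self.adapter.postprocess(scores)


# Stand-in "models": any callable mapping features to a list of scores.
binary_engine = InferenceEngine(lambda x: [sum(x) / len(x)], BinaryAdapter())
multi_engine = InferenceEngine(lambda x: x, MulticlassAdapter(["cat", "dog", "bird"]))

print(binary_engine.predict([0.9, 0.7]))
print(multi_engine.predict([0.1, 0.7, 0.2]))
```

Supporting a new task then means writing one small adapter class instead of copying a whole engine.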

“It works on my Machine!”

An infuriating sentence, isn’t it?
Well, that’s why I decided to start working with Docker.
Simply put, Docker packages your code together with its dependencies and runtime into an image, so the environment that works on your machine is the same one that runs wherever the code is shipped, with minimal setup.

Developer Experience: We’re Human Too

We sometimes forget that we are human as well, and so, we need to take care of ourselves while we’re writing absolute units of computer programs.

So I looked further into this and decided to implement the simplest form of it: Command Line Abstractions.
Initially I was thinking about Makefiles, but as I researched more, I came across a beautiful tool called “just”, which is basically a command runner that wraps any number of commands you want. You just create the configuration file (pun intended) called “justfile”, define each command as a recipe, with parameters if you need them and a comment above it that serves as its help text, and run it with just <command>. To see every recipe you defined, run just --list.

Breadcrumbs: Don’t lose your way in the maze

Before, I either logged everything manually or logged nothing at all, which meant that tracking my experiments (hyper-parameters, metrics, models, artifacts, and so on) was a nightmare!
Then I came across MLflow, a piece of art disguised as a tool. It lets me run as many experiments as I want without losing track of how each run performed and what it included: data signatures, datasets, metrics, hyper-parameters, models, and artifacts.
Honestly, this made life so much easier.

Notebooks: Not a liability as many think

Notebooks are notorious for making Data Scientists lazy: they just run the experiment in there, see the results, maybe log the model, and call it a day. But in reality, notebooks are much more powerful if you build them correctly!
A notebook must let you move from the experimentation environment into the system with the least amount of friction. That includes properly logging runs and models, and a proper setup for pre-processing, post-processing, and data loading.
So I created what I call Notebook-as-a-Bridge (NAAB): not a fancy tool, but a way of writing notebooks that makes the transition into a production system frictionless.

Don’t ship what you didn’t test

Writing beautiful code is meaningless if it breaks easily. In ML, this means testing:

Model loading and training
Inference logic
The API server
Proper tests don’t touch the disk or load heavy models; they use mocks to test components in isolation and in memory.
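
As an illustration of that last point, here is a small sketch using unittest.mock from the standard library. The Predictor class is hypothetical and stands in for whatever inference wrapper your project has. The point is that the test never reads a model file or loads real weights.

```python
from unittest.mock import MagicMock


class Predictor:
    """Hypothetical inference wrapper: loads its model lazily via an injected loader."""

    def __init__(self, loader):
        self._loader = loader
        self._model = None

    def predict(self, features):
        if self._model is None:
            # In production this call would read a heavy model from disk.
            self._model = self._loader()
        return self._model.predict([features])[0]


def test_predict_loads_model_once():
    # The fake model and loader live purely in memory.
    fake_model = MagicMock()
    fake_model.predict.return_value = [1]
    loader = MagicMock(return_value=fake_model)

    predictor = Predictor(loader)
    assert predictor.predict([0.2, 0.4]) == 1
    assert predictor.predict([0.9, 0.1]) == 1

    # The loader ran exactly once, and the model saw a batch of one row.
    loader.assert_called_once()
    fake_model.predict.assert_called_with([[0.9, 0.1]])


test_predict_loads_model_once()
```

Injecting the loader is the design choice that makes this possible: the test swaps in a mock, while production code passes a function that actually loads the model.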

And suddenly…You’re Ready

Now you’ll find that you have everything you need to start building any ML project: an experimentation environment, testing, and even a production-ready system.
The more complete your template is, the faster you can go from experimentation to production, and honestly, that is the most beautiful thing about it.

Now what?

This template is a living organism. It will keep evolving as I learn. Use it, fork it, and if you see an improvement, open a PR! Let’s improve the community together.

(Also, to prove this works: this template officially survived its first major project, which is now being deployed into production!)

GH Repo

You can find the repo here. If you think it is useful, do give it a star, and if you feel like you want to improve on it, feel free to fork!
