Understanding What Serverless ML Is
A machine learning system can be broken down into a 5-step process.
The steps are: identifying the problem, collecting data for training models, training the model, inference, and service/user interface generation.
Of these 5 steps, the data scientist is, more often than not, involved only in the third step. They are disconnected from finding the data (the second step) and from using the generated models to make predictions on new data (the fourth step onward).
This is where serverless machine learning (SML) steps up and allows data scientists to be a part of the entire 5-step process.
As defined by Jim Dowling, “SML is a new category of loosely coupled serverless services that provide the operational services (compute and storage) for AI-enabled products and services.”
Put another way, SML is a group of cloud services (think AWS, Google Cloud, etc.) that do the heavy lifting for each phase of a machine learning system.
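To make "loosely coupled" a bit more concrete, here is a minimal Python sketch of three of the steps as separate functions that only talk to each other through shared storage. Everything in it is my own assumption for illustration: the two dictionaries stand in for managed cloud storage, the function names are made up, and the "model" is just a one-variable least-squares line rather than anything you would ship.

```python
# Stand-ins for cloud storage (hypothetical feature store and model registry).
# In a real serverless setup these would be managed services, not dicts.
feature_store = {}
model_registry = {}


def feature_pipeline(raw_rows):
    """Data collection step: clean raw (x, y) pairs and write them to shared storage."""
    cleaned = [(float(x), float(y)) for x, y in raw_rows]
    feature_store["training_data"] = cleaned


def training_pipeline():
    """Training step: read features, fit a least-squares line, save the model."""
    data = feature_store["training_data"]
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in data)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    slope = cov_xy / var_x
    model_registry["latest"] = {
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


def inference_pipeline(x):
    """Inference step: load the latest saved model and predict for a new input."""
    model = model_registry["latest"]
    return model["slope"] * x + model["intercept"]
```

Because each function only reads from and writes to storage, any one of them could in principle run on a different service or a different schedule without the others knowing.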
Okay, so we have a system that will put the data scientist at the forefront of the production process from start to finish.
Why? Why would anyone want to undertake this endeavor?
Most machine learning models never make it to production.
I can attest to this: I’ve built numerous models while experimenting or during my time at university, but no one other than myself, a few friends, and my lecturers saw those models.
Production meant I would have to devise a way to build an application with a UI that lets people query the model, so that it can make predictions and receive new data on a regular schedule.
Not to mention the sheer number of AI infrastructure tools I would have to learn to deploy this model, let alone having the proper computing power.
I am thinking of tools such as Kubernetes, Docker, and other cloud infrastructure.
SML removes the need for all of this.
It creates a system in which machine learning models can make intelligent decisions without us having to install or operate any cumbersome AI infrastructure ourselves.
Another upside of choosing cloud over no cloud is that the deployed services all work in what I call connected silos.
Errors in any step of the process can be pinpointed to an exact component of the pipeline. Does the model need more frequent data to be trained on?
Easy.
Adjust the first step of the workflow to run more often on GitHub Actions. The beauty of this is that the data only stops coming if you turn off the automation.
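Here is a rough sketch of what pinpointing a failing component could look like in Python. It is not tied to GitHub Actions or any other scheduler; `run_pipeline` and the stage names are hypothetical, and the idea is simply that a scheduled job would call something like this and report which stage broke.

```python
def run_pipeline(stages):
    """Run (name, callable) stages in order.

    Returns (failed_stage_name, error) for the first stage that raises,
    or (None, None) if every stage completes.
    """
    for name, stage in stages:
        try:
            stage()
        except Exception as exc:  # broad on purpose: we only want to know where it broke
            return name, exc
    return None, None


# Hypothetical stages for illustration only.
def collect_data():
    pass  # e.g. fetch new rows and store them


def train_model():
    raise ValueError("not enough rows to train on")


def publish_model():
    pass  # never reached in this example


failed, error = run_pipeline([
    ("collect_data", collect_data),
    ("train_model", train_model),
    ("publish_model", publish_model),
])
```

In this run the failure is attributed to the training stage alone, so that is the only silo you would need to open up and fix.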
In addition, the data can come from many different sources, such as tabular data, unstructured data, search, JSON, APIs, web scraping, and many more.
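As a small illustration of pulling different sources into one shape, the sketch below reads a CSV string and a JSON string into the same list-of-dicts format using only the Python standard library. The field names (`city`, `temp_c`, `name`, `temperature`) are invented for the example and do not come from any particular API.

```python
import csv
import io
import json


def rows_from_csv(text):
    """Parse tabular CSV text into the common {"city", "temp_c"} format."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"city": r["city"], "temp_c": float(r["temp_c"])} for r in reader]


def rows_from_json(text):
    """Parse a JSON array (e.g. an API response body) into the same format."""
    return [
        {"city": item["name"], "temp_c": float(item["temperature"])}
        for item in json.loads(text)
    ]


# Two hypothetical sources that end up in one combined dataset.
csv_rows = rows_from_csv("city,temp_c\nOslo,3.5\n")
json_rows = rows_from_json('[{"name": "Lima", "temperature": 19}]')
all_rows = csv_rows + json_rows
```

Once every source lands in the same format, the rest of the pipeline does not need to care where a row came from.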
The cloud comes with scalability baked in. It makes it that much easier to build services that are always online, scale with user demand, and can be upgraded, backed up, and restored at a moment’s notice.
Now, as with any paid service, the main drawbacks are that you have to pay for what you use and that you don’t know what is being done with the data you send to these services.
This, in my opinion, is a fair trade. The other option is to learn a series of applications that have a far steeper learning curve (weeks to months for each application in the machine learning process).
Sure, you own your data and have more autonomy because you are not locked into a particular vendor, but that isn’t what I want.
My primary goal is to get a working model into the hands of a user.
So, we’ve discussed what serverless machine learning (SML) is: a decoupling of each step of the machine learning process, moving it from local applications to cloud infrastructure, to get a product into the hands of a user.
Thanks for reading.