Victor Isaac Oshimua

A Step-by-Step Guide to Deploying a Machine Learning Model in a Docker Container

A machine learning model becomes truly valuable only when it is deployed beyond the confines of a Jupyter Notebook. Until users can reach the model and use its predictions to make decisions, its potential remains untapped. Deploying a machine learning model into production is therefore essential to getting practical value from it.

Developing and deploying a machine learning model involves various packages, each pinned to a particular version. If a package version used during the model's development is unavailable on a user's machine, the model can fail with errors. This is the familiar "it worked on my machine" problem, also known as a version conflict.

To address this challenge, Docker provides a solution. Docker allows us to isolate all the packages used in developing a machine learning model and encapsulate them within a separate system.
This ensures that any version of a package used in the model's development remains available within this isolated environment. Importantly, this system can run on any machine, including Windows, macOS, or Linux, eliminating compatibility issues and enabling seamless deployment and usage of the machine learning model across different platforms.

In this article, you will learn how to deploy your machine learning model with Docker.

What is a Docker container?

A Docker container is a lightweight, portable, and self-sufficient unit that encapsulates an application, along with its dependencies and runtime environment, into a single package. It allows you to run applications in an isolated environment, ensuring consistency across different environments and simplifying deployment.

Docker containers provide several key benefits, including portability (applications can run on any system that supports Docker), reproducibility (consistent behaviour across different environments), efficiency (containers share the host OS, reducing resource overhead), and ease of deployment and scaling. Containers have become a fundamental technology in machine learning engineering and are widely used for deploying machine learning applications in various contexts.

Prerequisites

To get the most out of this article, you should:

  • Be familiar with machine learning, i.e., you can build a machine learning model.
  • Be familiar with the command line interface.

Setting up Docker

To get started with deploying a machine learning model in a Docker container, you need Docker installed on your machine. For this tutorial, I will be using Ubuntu Linux as my operating system.
Note that you can install and run Docker on any computer, including Windows, Mac, and Linux. Follow the instructions below to install Docker on your computer.

Install Docker on Ubuntu Linux

  • Open your command terminal and run this command: sudo apt-get install docker.io

  • Verify if Docker has been installed by starting the Docker service: sudo systemctl start docker

  • Next, confirm if Docker is running: sudo systemctl status docker
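
The installation can also be checked from Python. The sketch below is a minimal, hypothetical helper (`docker_available` is an illustrative name, not part of Docker) that only looks for the `docker` executable on the PATH; it does not confirm that the Docker daemon is running.

```python
import shutil


def docker_available() -> bool:
    """Return True if a `docker` executable is found on the PATH."""
    return shutil.which("docker") is not None


if __name__ == "__main__":
    if docker_available():
        print("docker CLI found")
    else:
        print("docker CLI not found; install it before continuing")
```

If the helper reports that the CLI is missing, repeat the installation step above before moving on.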

Install Docker on Windows

If you're using a Windows machine, follow this awesome guide by Andrew Lock.

Install Docker on macOS

There are different methods of installing Docker on macOS, depending on your Mac type (Intel or Apple silicon). Follow the steps in the Docker documentation to install Docker.

Preparing Machine Learning Model

In this article, we will build a customer churn prediction model. The objective of this model is to predict whether a customer is likely to discontinue utilising the services provided by a telecommunications company based on the customer's specific details. To develop an effective customer churn prediction model, we need training data encompassing both instances of past customers who have discontinued the service and those who have retained it. The dataset is available for download on Kaggle.

Let's train the model:

# Importing the Pandas library for data manipulation
import pandas as pd

# Importing the NumPy library for numerical operations
import numpy as np

# Importing the train_test_split function for splitting datasets
from sklearn.model_selection import train_test_split

# Importing the KFold class for K-Fold cross-validation
from sklearn.model_selection import KFold

# Importing the DictVectorizer class for feature extraction
from sklearn.feature_extraction import DictVectorizer

# Importing the LogisticRegression class for logistic regression modeling
from sklearn.linear_model import LogisticRegression

# Importing the roc_auc_score function for evaluating model performance
from sklearn.metrics import roc_auc_score


# Reading the CSV file into a DataFrame and converting column names to lowercase with underscores
telecom_data = pd.read_csv("WA_Fn-UseC_-Telco-Customer-Churn.csv")
telecom_data.columns = telecom_data.columns.str.lower().str.replace(' ', '_')

# Identifying categorical columns
categorical_columns = list(telecom_data.dtypes[telecom_data.dtypes == 'object'].index)

# Cleaning and preprocessing categorical data
for column in categorical_columns:
    telecom_data[column] = telecom_data[column].str.lower().str.replace(' ', '_')

# Converting 'totalcharges' to numeric, handling errors, and filling NaN values with 0
telecom_data.totalcharges = pd.to_numeric(telecom_data.totalcharges, errors='coerce')
telecom_data.totalcharges = telecom_data.totalcharges.fillna(0)

# Converting 'churn' to binary (0 or 1)
telecom_data.churn = (telecom_data.churn == 'yes').astype(int)

# Splitting the dataset into training and testing sets
X_train_full, X_test = train_test_split(telecom_data, test_size=0.3, random_state=1)
y_test = X_test["churn"]

# Defining numerical and categorical features
numerical_features = ['tenure', 'monthlycharges', 'totalcharges']

categorical_features = [
    'gender',
    'seniorcitizen',
    'partner',
    'dependents',
    'phoneservice',
    'multiplelines',
    'internetservice',
    'onlinesecurity',
    'onlinebackup',
    'deviceprotection',
    'techsupport',
    'streamingtv',
    'streamingmovies',
    'contract',
    'paperlessbilling',
    'paymentmethod',
]

# Function to train the logistic regression model
def train_model(X_train, y_train, C=1.0):
    dicts = X_train[categorical_features + numerical_features].to_dict(orient='records')

    dv = DictVectorizer(sparse=False)
    X_train_transformed = dv.fit_transform(dicts)

    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(X_train_transformed, y_train)

    return dv, model

# Function to predict churn using the trained model
def predict_churn(df, dv, model):
    dicts = df[categorical_features + numerical_features].to_dict(orient='records')

    X = dv.transform(dicts)
    y_pred_prob = model.predict_proba(X)[:, 1]

    return y_pred_prob

# Model training and evaluation using K-Fold cross-validation
C_value = 1.0
num_splits = 5
kfold = KFold(n_splits=num_splits, shuffle=True, random_state=1)

auc_scores = []

for train_idx, val_idx in kfold.split(X_train_full):
    X_train = X_train_full.iloc[train_idx]
    X_val = X_train_full.iloc[val_idx]

    y_train = X_train["churn"]
    y_val = X_val["churn"]

    dv, model = train_model(X_train, y_train, C=C_value)
    y_pred_prob = predict_churn(X_val, dv, model)

    auc = roc_auc_score(y_val, y_pred_prob)
    auc_scores.append(auc)

# Displaying the mean and standard deviation of AUC scores for different folds
print('C=%s %.3f +- %.3f' % (C_value, np.mean(auc_scores), np.std(auc_scores)))

# Training the final model on the full training set and evaluating on the test set
dv, model = train_model(X_train_full, X_train_full["churn"], C=C_value)
y_pred_prob_test = predict_churn(X_test, dv, model)

# Calculating and displaying the AUC score for the test set
auc_test = roc_auc_score(y_test, y_pred_prob_test)
print("AUC on Test Set: %.3f" % auc_test)
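
The training script reports AUC without saying what it measures. AUC is the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. The sketch below is a hand-rolled illustration of that definition on a tiny made-up input. `pairwise_auc` is a hypothetical helper, not scikit-learn's implementation, and `roc_auc_score` should still be used in practice.

```python
def pairwise_auc(y_true, y_score):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win. The double loop is O(n_pos * n_neg), so this
    is only meant for small illustrative inputs.
    """
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels and scores: 3 of the 4 (positive, negative) pairs are ordered correctly.
print(pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking.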



After training a model, the next step is to save this trained model so that it can be used to make predictions on new customers, determining whether they are likely to churn or not.

import joblib
# Save the model using joblib
joblib.dump((dv, model), 'churn_model.pkl')
print('The model is saved to churn_model.pkl')

After saving the model to your computer, go ahead and create a directory for this project. Creating a project directory is important because it helps in organising project files, managing dependencies, and implementing version control, among other benefits. Move the saved churn_model.pkl model file to the project directory.
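
Creating the directory and moving the file can be done by hand or scripted. The following is a minimal sketch using only the standard library. The helper name `move_model` and the directory name `churn_project` are assumptions made for illustration, so substitute your own.

```python
from pathlib import Path
import shutil


def move_model(model_file, project_dir):
    """Create project_dir if needed and move model_file into it."""
    project_path = Path(project_dir)
    project_path.mkdir(parents=True, exist_ok=True)
    destination = project_path / Path(model_file).name
    shutil.move(str(model_file), str(destination))
    return destination


if __name__ == "__main__":
    model = Path("churn_model.pkl")
    if model.exists():
        # "churn_project" is a hypothetical directory name.
        print(move_model(model, "churn_project"))
    else:
        print("churn_model.pkl not found in the current directory")
```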

Building a Web Service for the Model with Flask

Building a web service for the customer churn prediction model with Flask involves using the Flask web framework to create an API endpoint for making predictions. This facilitates integration of the churn model into applications or systems, enabling real-time predictions on customer churn.

Flask's simplicity and versatility make it an ideal choice for deploying machine learning models as web services, providing an efficient and scalable solution for delivering predictive insights.
To build a web service for your model, copy and save this Python script in your project directory and name it churn_predict.py.

import joblib
from flask import Flask, request, jsonify

# Load the model and dv from the file
dv, model = joblib.load('churn_model.pkl')

# Create a Flask application
app = Flask('churn')

# Define an endpoint for predictions
@app.route('/predict', methods=['POST'])
def predict():
    # Get customer data from the request in JSON format
    customer = request.get_json()

    # Transform the customer data using the loaded data vectorizer
    X = dv.transform([customer])

    # Make predictions using the loaded model
    y_pred = model.predict_proba(X)[0, 1]

    # Determine churn status based on the prediction probability threshold
    churn = y_pred >= 0.5

    # Prepare the result in JSON format
    result = {
        'churn_probability': float(y_pred),
        'churn': bool(churn)
    }

    # Return the result as a JSON response
    return jsonify(result)

# Run the Flask application
if __name__ == "__main__":
    # Run the app in debug mode on all available network interfaces
    app.run(debug=True, host='0.0.0.0', port=9696)



Here is a breakdown of every part of the code:

  • Import libraries
import joblib
from flask import Flask, request, jsonify

joblib: Library for efficient serialisation and deserialisation of the churn prediction model
Flask: Web framework for building web applications in Python.
request: Part of Flask to handle HTTP requests.
jsonify: Converts the prediction result (a Python dictionary) into a JSON HTTP response.

  • Load model and data Vectorizer:
dv, model = joblib.load('churn_model.pkl')

Loads the data vectorizer (dv) and the machine learning model (model) from the 'churn_model.pkl' file, where they were saved together as a tuple using joblib.dump.

  • Create Flask application:
app = Flask('churn')

Creates a Flask application named 'churn'.

  • Define prediction endpoint:
@app.route('/predict', methods=['POST'])
def predict():

Defines a route '/predict' that accepts HTTP POST requests for making predictions.

  • Retrieve customer Data:
customer = request.get_json()


Retrieves customer data from the JSON payload of the incoming POST request.

  • Transform data and make predictions:
X = dv.transform([customer])
y_pred = model.predict_proba(X)[0, 1]

Transforms the customer data using the loaded data vectorizer (dv).
Makes predictions using the loaded machine learning model (model).

  • Determine churn status:
churn = y_pred >= 0.5

Determines the churn status based on the prediction probability threshold (0.5).

  • Prepare and return result:
result = {
    'churn_probability': float(y_pred),
    'churn': bool(churn)
}
return jsonify(result)


Prepares the customer churn prediction result in JSON format.
Returns the customer churn prediction result as a JSON response.

  • Run Flask application:
if __name__ == "__main__":
    app.run(debug=True, host='0.0.0.0', port=9696)


Runs the Flask application in debug mode on all available network interfaces ('0.0.0.0') and port 9696.

In summary, this Python script builds a Flask web service that exposes a prediction endpoint for the churn model. It loads the trained model and data vectorizer, accepts customer data as JSON through a POST request, makes predictions, and returns the results in JSON format. The application is then run on a specified host and port.
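
One caveat about the endpoint is that scikit-learn's DictVectorizer silently ignores feature names it did not see during fitting and encodes absent features as zeros. A malformed payload therefore still produces a prediction. The sketch below shows one way the predict function could check the payload before calling dv.transform. The helper names `missing_features` and `unexpected_features` are hypothetical, and the feature list is copied from the training script.

```python
# Feature names used to train the model (categorical first, then numerical).
EXPECTED_FEATURES = [
    "gender", "seniorcitizen", "partner", "dependents", "phoneservice",
    "multiplelines", "internetservice", "onlinesecurity", "onlinebackup",
    "deviceprotection", "techsupport", "streamingtv", "streamingmovies",
    "contract", "paperlessbilling", "paymentmethod",
    "tenure", "monthlycharges", "totalcharges",
]


def missing_features(customer):
    """Return the expected feature names absent from the payload, in order."""
    return [name for name in EXPECTED_FEATURES if name not in customer]


def unexpected_features(customer):
    """Return payload keys the model was never trained on, sorted by name."""
    return sorted(set(customer) - set(EXPECTED_FEATURES))


payload = {"gender": "female", "tenure": 24, "favourite_colour": "blue"}
print(len(missing_features(payload)))   # 17
print(unexpected_features(payload))     # ['favourite_colour']
```

Inside the Flask view, a non-empty result from either helper could be returned to the caller as an HTTP 400 error instead of a prediction. That is a design choice rather than something the original service does.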

After building the web service, the next step is to create an inference script that will be used to make predictions. Copy and save this Python script in your project directory, naming it inference.py.

import requests
# URL for the prediction endpoint
url = 'http://localhost:9696/predict'

# Unique identifier for the customer
customer_id = 'user_x'

# Customer data for making predictions
customer = {
    "gender": "female",
    "seniorcitizen": 0,
    "partner": "yes",
    "dependents": "no",
    "phoneservice": "no",
    "multiplelines": "no_phone_service",
    "internetservice": "dsl",
    "onlinesecurity": "no",
    "onlinebackup": "yes",
    "deviceprotection": "no",
    "techsupport": "no",
    "streamingtv": "no",
    "streamingmovies": "no",
    "contract": "month-to-month",
    "paperlessbilling": "yes",
    "paymentmethod": "electronic_check",
    "tenure": 24,
    "monthlycharges": 29.85,
    "totalcharges": (24 * 29.85)
}

# Make a POST request to the prediction endpoint with customer data
response = requests.post(url, json=customer).json()

# Print the prediction response
print(response)

# Check if the predicted churn status is True and take action
if response['churn'] == True:
    print('This user will churn. Send promotional email to user with ID: %s' % customer_id)
else:
    print('This user will continue using our service. ID: %s' % customer_id)
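
The feature values in this script are lowercase with underscores because the training script normalised the column names and string values that way. A record in the dataset's original style needs the same treatment before it is sent. Here is a minimal sketch of that step. `normalise_record` is a hypothetical helper and the raw values are assumed examples.

```python
def normalise_record(raw):
    """Lowercase keys and string values and replace spaces with underscores."""
    clean = {}
    for key, value in raw.items():
        new_key = key.lower().replace(" ", "_")
        if isinstance(value, str):
            value = value.lower().replace(" ", "_")
        clean[new_key] = value
    return clean


# Assumed raw values written in the dataset's original capitalisation.
raw = {"Contract": "Month-to-month", "PaymentMethod": "Electronic check", "tenure": 24}
print(normalise_record(raw))
# {'contract': 'month-to-month', 'paymentmethod': 'electronic_check', 'tenure': 24}
```

Numeric values pass through unchanged, which matches how the training script treats tenure, monthlycharges, and totalcharges.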



Here is a breakdown of every part of the code:

  • Import requests module:
import requests

Imports the requests module, which is used for making HTTP requests.

  • Define endpoint URL and customer ID:
url = 'http://localhost:9696/predict'
customer_id = 'user_x'

Sets the URL for the prediction endpoint.
Assigns a unique identifier (customer_id) for the customer.

  • Define customer data for prediction:
customer = {
    # ... (customer feature values)
}

Defines a dictionary (customer) containing feature values for a customer, used for making predictions.

  • Make POST request to the prediction Endpoint:
response = requests.post(url, json=customer).json()

Sends a POST request to the specified URL (url) with customer data in JSON format.
Parses the JSON body of the response into a Python dictionary and stores it in the response variable.

  • Print Prediction Response:
print(response)


Prints the prediction response received from the prediction endpoint.

  • Check churn prediction and take action:
if response['churn'] == True:
    # Action if churn is predicted
    print('This user will churn. Send promotional email to user with ID: %s' % customer_id)
else:
    # Action if no churn is predicted
    print('This user will continue using our service. ID: %s' % customer_id)

Checks if the predicted churn status is True (indicating churn).
Prints a message suggesting the follow-up action for the predicted churn status.

In summary, this Python script sends a customer's data to the Flask web service endpoint for churn prediction, receives and prints the prediction response, and prints a suggested action depending on whether churn is predicted.
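
The script depends on the third-party requests package. Where that is not installed, the same POST request can be sketched with the standard library alone. This is an alternative under stated assumptions, not the author's script. `build_predict_request` and `request_prediction` are hypothetical names, and the URL is the one used above.

```python
import json
import urllib.request


def build_predict_request(url, customer):
    """Build a POST request carrying the customer record as JSON."""
    body = json.dumps(customer).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Flask's request.get_json() expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_prediction(url, customer, timeout=5.0):
    """Send the request and decode the JSON reply; raises on HTTP or network errors."""
    request = build_predict_request(url, customer)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

With the service running, `request_prediction('http://localhost:9696/predict', customer)` would return the same dictionary that `requests.post(url, json=customer).json()` produces. The explicit timeout keeps the script from hanging if the service is down.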

Setting up a virtual environment

So far, we have developed a churn prediction model and a web service for our model. When executing code for our project, it relies on the modules and library versions installed in our system. However, challenges may arise when attempting to run the same Python code on another system with different library versions.
For example, if our code was developed with NumPy 1.4 and the new system has NumPy 1.6, differences in behaviour between the two versions can cause errors.

To address this issue, it is best practice to create a virtual environment. A virtual environment acts as an isolated workspace, allowing the project to have its own set of dependencies independent of the system-wide library versions.
By doing so, we ensure consistency and prevent potential conflicts arising from varying library versions between different systems. This isolation mechanism is particularly useful for maintaining the reproducibility and portability of Python projects across diverse environments.
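
To see which versions a given system actually has, the installed packages can be queried from Python. The sketch below uses the standard library's importlib.metadata (Python 3.8+). `installed_versions` is a hypothetical helper, and the package list is an assumption based on the libraries this project uses.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    wanted = ["numpy", "scikit-learn", "flask", "gunicorn"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this on two machines and comparing the output is a quick way to spot the kind of version mismatch described above.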

To create a virtual environment, we will use pipenv, a popular Python tool for creating and managing virtual environments. It provides a streamlined and consistent approach to isolating project dependencies, ensuring that the project operates within its own environment, separate from the system-wide Python packages.
Follow these steps to create a virtual environment:

  • Open a terminal or command prompt and run this command to install Pipenv.
pip install pipenv

  • Change to the directory where your project is located.
cd path/to/your/project

  • Run this command to create a virtual environment and install project dependencies.
pipenv install numpy scikit-learn flask gunicorn


By following these steps, you create a virtual environment for your project using pipenv, isolating your project's dependencies from the system-wide Python packages. pipenv records the dependencies in two files in the project directory, Pipfile and Pipfile.lock, which helps keep the development environment clean and reproducible.

Building a Docker image

After creating a virtual environment for our project, we can proceed to deploy the churn prediction model and web service to a Docker container. To deploy to a Docker container, we first need to write a Dockerfile, the text file of instructions that Docker uses to build an image. The resulting image packages the settings and dependencies of our project.
Docker images enhance the deployment of ML models by providing a standardised and efficient way to package, distribute, and run models in various environments. They address challenges related to consistency, reproducibility, and scalability, contributing to the successful deployment and management of machine learning applications.

To create the Dockerfile, open your code editor and create a new file called Dockerfile (with no file extension) in the project directory.
Next, copy and paste this code into the file.

FROM python:3.9.17-slim

RUN pip install pipenv

WORKDIR /app

COPY ["Pipfile", "Pipfile.lock", "./"]

RUN pipenv install --system --deploy

COPY ["*.py", "churn_model.pkl", "./"]

EXPOSE 9696

ENTRYPOINT ["gunicorn", "--bind=0.0.0.0:9696", "churn_predict:app"]


Here is a breakdown of the Dockerfile:

FROM python:3.9.17-slim


This specifies the base image to use for the Docker image. In this case, it's a slim version of Python 3.9.17, which is a lightweight base image.

RUN pip install pipenv


This installs the pipenv tool within the Docker image.

WORKDIR /app


Sets the working directory inside the Docker image to /app. This is the directory where subsequent commands will be executed.

COPY ["Pipfile", "Pipfile.lock", "./"]


Copies the Pipfile and Pipfile.lock from the host machine (the directory containing the Dockerfile) to the /app directory inside the Docker image.

RUN pipenv install --system --deploy


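Installs the Python dependencies using pipenv. The --system flag installs them into the image's system Python instead of creating a virtual environment, and --deploy makes the build fail if Pipfile.lock is out of date with the Pipfile.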

COPY ["*.py", "churn_model.pkl", "./"]


This line copies all files with a .py extension and the churn_model.pkl file from the host directory to the /app directory inside the Docker image.

EXPOSE 9696


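Documents that the application inside the container listens on port 9696. EXPOSE does not publish the port by itself; the -p option of docker run, used later, maps it to the host.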

ENTRYPOINT ["gunicorn", "--bind=0.0.0.0:9696", "churn_predict:app"]

Specifies the command to run when the container starts. In this case, it uses Gunicorn to serve the Flask application (the app object in churn_predict.py, written as churn_predict:app) and binds it to 0.0.0.0:9696, making it accessible from outside the container.
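
The last ENTRYPOINT argument in the Dockerfile, churn_predict:app, follows a "module:attribute" convention: import the module named before the colon, then fetch the attribute named after it. The sketch below illustrates that idea in simplified form. It is not Gunicorn's actual loader, `resolve_target` is a hypothetical name, and a standard-library target is used so that the example runs anywhere.

```python
import importlib


def resolve_target(target):
    """Split 'module:attribute' and return the named attribute."""
    module_name, _, attribute = target.partition(":")
    if not module_name or not attribute:
        raise ValueError("expected a string of the form 'module:attribute'")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Demonstrated with a standard-library target; inside the container the
# equivalent lookup would be "churn_predict:app".
dumps = resolve_target("json:dumps")
print(dumps({"ok": True}))  # {"ok": true}
```

This is why the script's file name matters: the part before the colon must match the Python file copied into the image, minus the .py extension.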

Building and running the Docker image

After creating the Dockerfile, you need to build the Docker image for the container. Before running the following command from the project directory, make sure the Docker daemon is running:

docker build -t customer_churn_prediction .


Next, run your Docker container to use your machine learning model and prediction web service.

docker run -it -p 9696:9696 customer_churn_prediction:latest


Congratulations! You have deployed your machine learning model to a Docker container!
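
Before sending predictions, it can help to confirm that the container is accepting connections on the published port. The following is a minimal sketch using only the standard library. `wait_for_port` is a hypothetical helper, and the host and port are the ones used in the docker run command above.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

From a second terminal, `wait_for_port("localhost", 9696)` should return True once Gunicorn has started. An open port only shows that the server is listening, not that the model loaded correctly, so a real prediction request is still the definitive check.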

To use the deployed model within the Docker container to generate customer predictions, follow these steps:

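  1. Open a new terminal (the container keeps running in the first one) and navigate to the directory where the project is stored.
  2. Run the inference script using the command: python inference.py (the script imports the requests package, so install it first with pip install requests if it is missing).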

You will get the output of your model prediction: the JSON response containing churn and churn_probability, followed by the message for that customer.

To take your machine learning model deployment further and leverage the advantages of cloud computing, you can consider deploying your Dockerized ML model to a cloud platform.

Closing remarks

Thanks for reading this far. In this article, you've learned how to deploy a machine learning model in a Docker container, so that the project runs on any system with Docker installed. This approach greatly reduces environment and dependency conflicts, providing a robust and portable solution for your machine learning applications.
If you have any questions or spot any errors, please feel free to reach me via email at victorkingoshimua@gmail.com. Your feedback will be highly appreciated.

Source code

Reference:

DataTalksClub MLzoomcamp
