Mendy Kevin

Complete Guide to Deploying Machine Learning Models with Flask and Docker (No Fluff: Configure and Run Like a Pro)

Hello all! Welcome. This article covers the technical side of deploying a machine learning model (here, logistic regression, a linear model that makes predictions from trained data) as a web service. Stick around to the end and you'll be configuring and shipping models like a pro.

What You'll Learn

  • Packaging models with Pickle
  • Serving ML models with Flask
  • Containerizing apps with Docker
  • Exposing inference endpoints in Docker
  • Deploying to AWS Elastic Beanstalk


The Big Picture: Understanding ML Model Deployment

Let's understand the overall workflow of deploying a machine learning model:

  1. Save the Model: Export the trained model from your Jupyter notebook to a file (we'll use a .bin extension).

  2. Load as a Web Service: Load this model from a different process (using a Python script) in a web service—for example, a "churn service" that predicts a customer's churn rate. We'll use Flask to transform the model into a web service.

  3. Isolate Python Dependencies: Use pipenv (similar to conda or pip) to isolate the dependencies for this service and prevent interference with other services on your machine.

  4. Isolate System Dependencies: Add another layer using Docker to isolate system dependencies.

  5. Deploy to the Cloud: Once the local setup is complete, deploy the service to the cloud. You can use any cloud platform, but we'll use AWS Elastic Beanstalk (EB) for this tutorial.
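
By the end of the local steps, your project directory will look roughly like this (the folder name is up to you; the files match what we create below):

churn-service/
├── notebook.ipynb      # training notebook (name may differ)
├── Pipfile
├── Pipfile.lock
├── predict.py          # Flask app that serves the model
├── model_C=1.0.bin     # pickled (DictVectorizer, model)
└── Dockerfile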


Setting Up the Environment

Before training our model, we need to ensure our development environment has the right dependencies without interfering with other projects (which might require different versions of scikit-learn, pandas, etc.). We'll use pipenv for this.

Installing Pipenv

Note: If you have Anaconda installed and on your PATH, it automatically activates the (base) conda environment in every new shell/terminal. We don't want to install pipenv inside conda, so deactivate it first and install pipenv globally instead.

# Deactivate conda
conda deactivate

# Install uv (a faster Python package manager)
pip install uv

# Install pipenv globally
# (--system tells uv to target the global Python interpreter,
#  since no virtual environment is active after conda deactivate)
uv pip install --system pipenv

Creating a Pipenv Virtual Environment

Once you're in your project directory, manage all Python libraries and dependencies via pipenv:

# Create a pipenv virtual environment
# This automatically creates Pipfile and Pipfile.lock
pipenv --python 3.12

# Activate the virtual environment
pipenv shell

# Install all requirements for your project
# (scikit-learn is pinned to 1.5.1 because the model was trained with it;
#  gunicorn will serve the app inside Docker later)
pipenv install flask scikit-learn==1.5.1 numpy pandas requests gunicorn

# Note: pickle is built into Python, no need to install it separately

# Check dependencies
pipenv graph

# Update Pipfile.lock according to your current Pipfile
pipenv lock
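
For reference, the resulting Pipfile will look roughly like this (exact pins and the Python version may differ on your machine):

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
flask = "*"
scikit-learn = "==1.5.1"
numpy = "*"
pandas = "*"
requests = "*"
gunicorn = "*"

[requires]
python_version = "3.12"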

[Image: pipenv installation output]

Launching Jupyter Notebook

After installation, launch Jupyter notebook inside your virtual environment:

# If you prefer VS Code
code .

# Or if you have Anaconda installed
jupyter lab
# or
jupyter notebook

If done correctly, you'll see your virtual environment in the Jupyter launcher.
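
If the environment doesn't show up in the launcher, you may need to register it as a Jupyter kernel first. A quick sketch (the kernel name is arbitrary):

# Install ipykernel as a dev dependency and register the environment
pipenv install ipykernel --dev
pipenv run python -m ipykernel install --user --name=churn-env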

[Image: Jupyter launcher showing the pipenv environment]


Training the Model

Let's look at how we trained our model. (This isn't the primary focus, so I'll keep it brief.)

Making Necessary Imports

import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.model_selection import KFold

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

Data Preparation

# Read and prepare data
df = pd.read_csv('data-week-3.csv')

# Make column names homogeneous
df.columns = df.columns.str.lower().str.replace(' ', '_')

# Handle categorical columns
categorical_columns = list(df.dtypes[df.dtypes == 'object'].index)

for c in categorical_columns:
    df[c] = df[c].str.lower().str.replace(' ', '_')

# Handle numerical data
df.totalcharges = pd.to_numeric(df.totalcharges, errors='coerce')
df.totalcharges = df.totalcharges.fillna(0)

# Convert target variable to binary
df.churn = (df.churn == 'yes').astype(int)

# Define feature types
numerical = ['tenure', 'monthlycharges', 'totalcharges']

categorical = [
    'gender', 'seniorcitizen', 'partner', 'dependents',
    'phoneservice', 'multiplelines', 'internetservice',
    'onlinesecurity', 'onlinebackup', 'deviceprotection', 
    'techsupport', 'streamingtv', 'streamingmovies', 
    'contract', 'paperlessbilling', 'paymentmethod'
]

Data Splitting

# Split into training and test sets
df_full_train, df_test = train_test_split(df, test_size=0.2, random_state=1)

Define Training Function

Important: To apply the model later, we need to return both the DictVectorizer and the model. Otherwise, the function will return None.

def train(df_train, y_train, C=1.0):
    dicts = df_train[categorical + numerical].to_dict(orient='records')

    dv = DictVectorizer(sparse=False)
    X_train = dv.fit_transform(dicts)

    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(X_train, y_train)

    return dv, model

Define Prediction Function

def predict(df, dv, model):
    dicts = df[categorical + numerical].to_dict(orient='records')

    X = dv.transform(dicts)
    y_pred = model.predict_proba(X)[:, 1]

    return y_pred

K-Fold Cross Validation

C = 1.0
n_splits = 5
kfold = KFold(n_splits=n_splits, shuffle=True, random_state=1)

scores = []

for train_idx, val_idx in kfold.split(df_full_train):
    df_train = df_full_train.iloc[train_idx]
    df_val = df_full_train.iloc[val_idx]

    y_train = df_train.churn.values
    y_val = df_val.churn.values

    dv, model = train(df_train, y_train, C=C)
    y_pred = predict(df_val, dv, model)

    auc = roc_auc_score(y_val, y_pred)
    scores.append(auc)

print('C=%s %.3f +- %.3f' % (C, np.mean(scores), np.std(scores)))
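
One step worth making explicit: the loop above leaves dv and model pointing at the last validation fold. Before saving, train a final model on the full training set (reusing the train and predict functions defined earlier):

# Train the final model on all of df_full_train and check it on the held-out test set
dv, model = train(df_full_train, df_full_train.churn.values, C=1.0)
y_pred = predict(df_test, dv, model)

auc = roc_auc_score(df_test.churn.values, y_pred)
print('final model test auc: %.3f' % auc)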

Saving the Model with Pickle

We use 'wb' (Write Binary) mode to save the model. Include the DictVectorizer in your file so that when you load the model in your churn service, you can convert customer data from a dictionary into a feature matrix (which the model requires for predictions).

import pickle

output_file = 'model_C=1.0.bin'

with open(output_file, 'wb') as f_out:
    pickle.dump((dv, model), f_out)
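
Before wiring the model into a service, it's worth sanity-checking the file from a fresh Python session. A minimal sketch (a partial customer dict is enough here, since DictVectorizer fills missing features with zeros):

import pickle

# Load the (DictVectorizer, model) tuple saved above
with open('model_C=1.0.bin', 'rb') as f_in:
    dv, model = pickle.load(f_in)

# Features absent from the dict are encoded as zeros by the vectorizer
customer = {'contract': 'month-to-month', 'tenure': 1, 'monthlycharges': 90.05}
X = dv.transform([customer])
print('churn probability:', model.predict_proba(X)[0, 1])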

Creating the Churn Service

Use a Python script for this. Load the model and make sure to use the POST HTTP method since we need to send information to the web service.

predict.py:

import pickle
from flask import Flask, request, jsonify

model_file = 'model_C=1.0.bin'

# Load the model
with open(model_file, 'rb') as f_in:
    dv, model = pickle.load(f_in)

app = Flask('churn')

@app.route('/predict', methods=['POST'])
def predict():
    # Get customer data from JSON request
    customer = request.get_json()

    # Transform and predict
    X = dv.transform([customer])
    y_pred = model.predict_proba(X)[0, 1]
    churn = y_pred >= 0.5

    result = {
        'churn_probability': float(y_pred),  # Convert to native Python type
        'churn': bool(churn)  # Convert to native Python type
    }

    return jsonify(result)

if __name__ == "__main__":
    app.run(debug=True, host='0.0.0.0', port=9696)

Running the Service

Launch your app on your local server. The --host=0.0.0.0 flag makes the server publicly available. The --debug flag auto-reloads the app when you save changes.

flask --app predict.py run --debug --host=0.0.0.0

Querying the Service

Now we'll send a POST request to our server with customer details and receive a churn prediction. Use a Jupyter notebook or separate Python script:

import requests

url = 'http://localhost:9696/predict'

# Sample test data from the test dataset
customer = {
    'customerid': '4183-myfrb',
    'gender': 'female',
    'seniorcitizen': 0,
    'partner': 'no',
    'dependents': 'no',
    'tenure': 21,
    'phoneservice': 'yes',
    'multiplelines': 'no',
    'internetservice': 'fiber_optic',
    'onlinesecurity': 'no',
    'onlinebackup': 'yes',
    'deviceprotection': 'yes',
    'techsupport': 'no',
    'streamingtv': 'no',
    'streamingmovies': 'yes',
    'contract': 'month-to-month',
    'paperlessbilling': 'yes',
    'paymentmethod': 'electronic_check',
    'monthlycharges': 90.05,
    'totalcharges': 1862.9
}

response = requests.post(url, json=customer).json()
print(response)

if response['churn']:
    print(f'Sending promo email to {customer["customerid"]}')
else:
    print(f'Not sending promo email to {customer["customerid"]}')

The server responds with a 200 OK response:

[Image: server log showing the 200 OK response]

The query was successful:

[Image: query result with churn prediction]


Docker Time!

Understanding Docker Components

DOCKERFILE: A text file (usually named Dockerfile) containing a series of instructions for building Docker images. Each instruction adds a layer, and layers are cached: building the same image twice reuses the cache, while changing a line rebuilds that instruction and every one after it. This is why the Dockerfile below copies Pipfile and Pipfile.lock and installs dependencies before copying the application code; editing predict.py then only invalidates the final layers.

IMAGE: The output of building a Dockerfile. Think of it as an executable—just like clicking an icon launches an application, you start an image to launch a container. The image encapsulates your application code and all dependencies, ensuring consistency across environments.

CONTAINER: A dynamic, running instance of a Docker image. One image can spawn many containers. On Linux, containers run as processes on the host machine. On Windows/macOS, Docker runs in a VM. Containers share the kernel but have isolated file systems—they appear like VMs but are much lighter.

Creating a Dockerfile

# Create a new Dockerfile
touch Dockerfile

# Open in your editor
code Dockerfile

Important Notes:

  • Make sure the script and app names in the ENTRYPOINT layer match your actual files (ours is predict.py, so the Gunicorn target is predict:app)
  • Gunicorn only runs on Unix-based systems; use Waitress on Windows
  • Leave a blank line between instructions for readability

Dockerfile:

FROM python:3.12-slim

RUN pip install pipenv

WORKDIR /app

# Copy dependency files
COPY ["Pipfile", "Pipfile.lock", "./"]

# --system installs into the container's system Python (no virtualenv needed);
# --deploy fails the build if Pipfile.lock is out of sync with Pipfile
RUN pipenv install --system --deploy

# Copy application files
COPY ["predict.py", "model_C=1.0.bin", "./"]

EXPOSE 9696

# Start the application with Gunicorn
ENTRYPOINT ["gunicorn", "--bind=0.0.0.0:9696", "predict:app"]
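
Optionally, add a .dockerignore next to the Dockerfile so the notebook, data, and git history don't bloat the build context (a suggested starting point; adjust to your repo):

.git
__pycache__/
*.ipynb
data-week-3.csv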

CMD vs ENTRYPOINT

ENTRYPOINT defines the main command that must always run—it's like the container's executable.

ENTRYPOINT ["python", "app.py"]

Running docker run myapp executes python app.py. You can pass parameters: docker run myapp --debug becomes python app.py --debug.

CMD defines default arguments or a fallback command that can be completely overridden.

CMD ["python", "app.py"]

Running docker run myapp bash overrides CMD and runs bash instead.

Combining Both:

ENTRYPOINT ["python", "app.py"]
CMD ["--port=9696"]
  • docker run myapp runs python app.py --port=9696
  • docker run myapp --debug runs python app.py --debug (overriding CMD only)

ENTRYPOINT can only be overridden with the --entrypoint flag.
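
For example, to drop into a shell instead of launching the app (using the hypothetical myapp image from above):

docker run -it --entrypoint bash myapp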

Multistage Dockerfiles

For Machine Learning applications, you often need a large environment to build/train your model but only a small runtime environment to serve predictions. Multistage Dockerfiles help create:

  • Smaller images – no unused dependencies
  • More secure – fewer libraries = less attack surface
  • Faster deployment – smaller images push/pull faster
  • Better maintainability – clean separation of concerns
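
Here is a minimal two-stage sketch of the idea (an illustrative example rather than this project's exact setup: it exports the lock file to requirements.txt in a builder stage, then copies only the installed packages into the runtime image):

FROM python:3.12-slim AS builder
RUN pip install pipenv
WORKDIR /app
COPY ["Pipfile", "Pipfile.lock", "./"]
# Resolve the lock file and install packages into an isolated prefix
RUN pipenv requirements > requirements.txt && \
    pip install --prefix=/install -r requirements.txt

FROM python:3.12-slim
WORKDIR /app
# Bring in only the installed packages, not pipenv or build tooling
COPY --from=builder /install /usr/local
COPY ["predict.py", "model_C=1.0.bin", "./"]
EXPOSE 9696
ENTRYPOINT ["gunicorn", "--bind=0.0.0.0:9696", "predict:app"]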

Building and Running the Container

Build the Docker image:
churn-prediction is just a name for the Docker image; you can call it literally anything. Just make sure the name you build with matches the one you pass to the docker run command.

docker build -t churn-prediction .

[Image: docker build output]

Run the container:
PS: The errors you see in the terminal of the running container are because I accidentally installed scikit-learn==1.6.1 instead of 1.5.1; otherwise the model works just fine.


docker run -it --rm -p 9696:9696 churn-prediction

Docker Commands Reference

Managing Images

# List all local images
docker image ls

# Run an image
docker run churn-prediction

Docker first looks for the image in your local cache. If it isn't found locally, it pulls from Docker Hub. You can also use a custom registry by prefixing the image name with the registry's hostname:

# Pull and run from a custom registry
# (the hostname is part of the image reference; no https:// prefix)
docker run registrydomain.com/repository-server:0.1.0

Managing Containers

List containers:

# List running containers
docker container ls
# or
docker ps

# List all containers (including stopped)
docker container ls --all

Start, stop, and remove containers:

# Stop a container
docker container stop <container-id>

# Restart a container
docker container restart <container-id>

# Remove a stopped container
docker container rm <container-id>

# Kill a running container
docker kill <container-id>

Cleanup Commands

# Remove unused containers, networks, and images
docker system prune

# Remove everything including unused images
docker system prune -a

# Also remove volumes (deletes data!)
docker system prune -a --volumes

Accessing Containers

For debugging purposes:

# Access a container's shell
docker exec -it <container-id> bash
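
For example, you can check the scikit-learn version inside the running container (handy for the version-mismatch warning earlier):

docker exec -it <container-id> python -c "import sklearn; print(sklearn.__version__)"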


Deploying to AWS Elastic Beanstalk

Now that we have our containerized application, let's deploy it to AWS Elastic Beanstalk using the EB CLI.

Prerequisites

Before deploying, ensure you have:

  • An AWS account
  • AWS credentials configured
  • A credit card to verify and activate your account
  • Your Docker image working locally

Other alternatives include Render, Railway, Fly.io, and Heroku. The deployment process follows much the same logic on each; check their documentation for the equivalent commands. For now, let's use AWS Elastic Beanstalk.

Installing the EB CLI

First, install the Elastic Beanstalk CLI:

# Install EB CLI using pipenv (recommended for this project)
pipenv install awsebcli --dev

# Or install globally using pip
pip install awsebcli

Verify the installation:

eb --version

Configuring AWS Credentials

If you haven't configured your AWS credentials yet:

# Configure AWS CLI
aws configure

You'll be prompted to enter:

  • AWS Access Key ID
  • AWS Secret Access Key
  • Default region (e.g., us-east-1)
  • Default output format (e.g., json)

Initializing Elastic Beanstalk

Navigate to your project directory and initialize EB:

# Initialize Elastic Beanstalk
eb init -p docker -r us-east-1 churn-prediction

Flags explained:

  • -p docker: Specifies the platform (Docker)
  • -r us-east-1: AWS region (choose your preferred region)
  • churn-prediction: Your application name

You'll be prompted with questions:

  1. Select your region
  2. Enter application name (or accept default)
  3. Choose whether to use CodeCommit (typically select "no")
  4. Set up SSH for your instances (recommended for debugging)

Creating an Environment

Create an environment to run your application:

# Create an environment named 'churn-prediction-env'
eb create churn-prediction-env

This process takes several minutes as AWS:

  • Creates an EC2 instance
  • Sets up load balancers
  • Configures security groups
  • Deploys your Docker container

You'll see real-time logs of the deployment process.

Monitoring Deployment Status

Check the status of your environment:

# Check environment status
eb status

# View recent events
eb events

# Follow logs in real-time
eb logs --stream

Testing Your Deployed Application

Once deployment is complete, get your application's URL:

# Open your application in a browser
eb open

Or manually test the endpoint:

import requests

# Replace with your actual EB URL
url = 'http://churn-prediction-env.us-east-1.elasticbeanstalk.com/predict'

customer = {
    'customerid': '4183-myfrb',
    'gender': 'female',
    'seniorcitizen': 0,
    'partner': 'no',
    'dependents': 'no',
    'tenure': 21,
    'phoneservice': 'yes',
    'multiplelines': 'no',
    'internetservice': 'fiber_optic',
    'onlinesecurity': 'no',
    'onlinebackup': 'yes',
    'deviceprotection': 'yes',
    'techsupport': 'no',
    'streamingtv': 'no',
    'streamingmovies': 'yes',
    'contract': 'month-to-month',
    'paperlessbilling': 'yes',
    'paymentmethod': 'electronic_check',
    'monthlycharges': 90.05,
    'totalcharges': 1862.9
}

response = requests.post(url, json=customer).json()
print(response)

Updating Your Application

When you make changes to your code:

# Deploy updates
eb deploy

# Monitor deployment
eb status

Environment Configuration

You can modify environment variables and settings:

# Set environment variables
eb setenv FLASK_ENV=production MODEL_VERSION=1.0

# View current configuration
eb config
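
If you configure settings this way, the service can read them at runtime. A sketch of how predict.py might pick up the MODEL_VERSION variable set above (the variable itself is just an example):

import os

# Use the env var when present, falling back to the default version
model_version = os.environ.get('MODEL_VERSION', '1.0')
model_file = f'model_C={model_version}.bin'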

Scaling Your Application

Scale your application based on traffic:

# Set the number of running instances to 2
eb scale 2

# Auto-scaling (e.g., minimum 1, maximum 4 instances) is configured
# through the AWS Console or eb config rather than eb scale

Monitoring and Logs

Access logs and monitoring:

# Download logs
eb logs

# Stream logs in real-time
eb logs --stream

# View health status
eb health

Troubleshooting Common Issues

Issue: Deployment fails

# Check detailed logs
eb logs

# SSH into the instance for debugging
eb ssh

Issue: Health status is degraded

  • Check if the application is responding on the correct port (9696)
  • Verify the Docker container is running
  • Check environment variables
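
A common cause of a degraded status: the load balancer health check hits GET / by default, while our app only defines POST /predict, so the check sees 404s. A minimal fix is a lightweight health endpoint in predict.py (then point the health check at /ping):

@app.route('/ping', methods=['GET'])
def ping():
    # Lightweight endpoint for load balancer health checks
    return 'pong', 200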

Issue: Connection timeout

  • Ensure security groups allow inbound traffic on port 80/443
  • Verify the load balancer health check settings

Cost Management

Important: Elastic Beanstalk environments incur costs. To avoid unnecessary charges:

# Terminate the environment when not in use
eb terminate churn-prediction-env

# Note: the EB CLI has no stop/pause command; to bring the service
# back later, recreate the environment with eb create

Configuration Files

For more control, create an .ebextensions directory with configuration files. (Note: the WSGIPath setting below applies to Elastic Beanstalk's Python platform; on the Docker platform used in this tutorial, the container's ENTRYPOINT determines how the app is served.)

.ebextensions/01_flask.config:

option_settings:
  aws:elasticbeanstalk:application:environment:
    PYTHONPATH: "/var/app/current:$PYTHONPATH"
  aws:elasticbeanstalk:container:python:
    WSGIPath: predict:app

Best Practices

  1. Use environment variables for sensitive data (API keys, database credentials)
  2. Set up health checks to ensure your application is responding correctly
  3. Enable logging to CloudWatch for better monitoring
  4. Use HTTPS by configuring SSL certificates
  5. Implement auto-scaling based on your traffic patterns
  6. Tag your resources for better cost tracking

Useful EB CLI Commands Summary

# Initialize EB in your project
eb init

# Create a new environment
eb create <environment-name>

# Deploy application updates
eb deploy

# Open application in browser
eb open

# Check environment status
eb status

# View logs
eb logs

# SSH into instance
eb ssh

# Terminate environment
eb terminate

# List all environments
eb list

# Set environment variables
eb setenv KEY=VALUE

# Scale application
eb scale <number-of-instances>

Conclusion

You've successfully:

  • Trained a machine learning model
  • Packaged it with Pickle
  • Created a Flask web service
  • Containerized the application with Docker
  • Deployed it to AWS Elastic Beanstalk

Your churn prediction model is now production-ready and accessible via a public URL. The containerized approach ensures consistency across environments, and AWS Elastic Beanstalk handles scaling, monitoring, and infrastructure management automatically.

