DEV Community

teresa kungu

Deploying a Customer Lifetime Value (CLV) Prediction Model Using FastAPI

Introduction

Businesses want to know which customers are most valuable to them. Some customers spend more money, stay longer, and interact more with the business than others. If a company can predict this early, it can make better decisions about marketing, customer support, and retention.

This is where Customer Lifetime Value (CLV) becomes important. In this project, a machine learning model was built to predict CLV using customer data. The trained model was then deployed using FastAPI, allowing predictions to be made through a simple API.

Project Objectives

  • Explain what Customer Lifetime Value (CLV) is
  • Build a regression model to predict CLV
  • Compare models and choose the best one
  • Save the trained model for future use
  • Deploy the model using FastAPI
  • Predict CLV through an API
  • Test the API to confirm it works

Understanding Customer Lifetime Value (CLV)

  • Customer Lifetime Value is the total amount of money a business expects to earn from a customer during their entire relationship with the company.
  • For example, if a customer spends a small amount every month but stays for many years, their CLV can be high. On the other hand, a customer who spends a lot once but never comes back may have a low CLV.

Predicting CLV helps businesses to:

  • Identify high-value customers
  • Spend marketing money wisely
  • Improve customer retention
  • Plan better customer engagement strategies

Because CLV is a number, predicting it is a regression problem in machine learning.

Dataset and Business Problem

The dataset used in this project is called customer_lifetime.csv. Each row represents one customer, and each column describes something about that customer.

Important columns include:

  • Customer_Age – Age of the customer
  • Annual_Income – Yearly income
  • Tenure_Months – How long the customer has been active
  • Monthly_Spend – Average monthly spending
  • Visits_Per_Month – Number of visits per month
  • Avg_Basket_Value – Average value per purchase
  • Support_Tickets – Number of support issues raised
  • CLV – Customer Lifetime Value (target variable)

The main goal is to predict CLV for new customers before spending money on marketing or retention.

Data Preparation

The first step was to load and explore the dataset to understand the data types and structure. The CLV column was identified as the target, while the remaining columns were used as input features.

The data was then split into:

  1. Training data – used to teach the model
  2. Testing data – used to check how well the model performs

Splitting the data is important because the model is scored on customers it never saw during training, which gives an honest estimate of how it will perform on new, unseen customers.
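The split described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the synthetic data, the way the stand-in CLV column is generated, the 80/20 ratio, and the random seed are all assumptions made so the example runs without the CSV file.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# In the project the data would be loaded with something like:
#   df = pd.read_csv("customer_lifetime.csv")
# A synthetic stand-in is generated here so the sketch runs on its own.
rng = np.random.default_rng(42)
n_customers = 200
df = pd.DataFrame({
    "Customer_Age": rng.integers(18, 70, n_customers),
    "Annual_Income": rng.normal(50_000, 15_000, n_customers),
    "Tenure_Months": rng.integers(1, 120, n_customers),
    "Monthly_Spend": rng.normal(200, 50, n_customers),
    "Visits_Per_Month": rng.integers(1, 20, n_customers),
    "Avg_Basket_Value": rng.normal(40, 10, n_customers),
    "Support_Tickets": rng.integers(0, 10, n_customers),
})
# Made-up target, for illustration only (not how CLV is defined in the dataset).
df["CLV"] = df["Monthly_Spend"] * df["Tenure_Months"] + rng.normal(0, 500, n_customers)

# CLV is the target; every other column is an input feature.
X = df.drop(columns=["CLV"])
y = df["CLV"]

# Hold out 20% of the customers for testing (the ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```

Fixing `random_state` makes the split reproducible, so both models can later be compared on exactly the same test customers.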

Building the Regression Models

Since CLV is a continuous number, regression models were used.

Two models were built:

  1. Linear Regression

This model was used as a baseline. It is simple, fast, and easy to understand, but it may not capture complex customer behavior.

  2. Random Forest Regressor

This model averages the predictions of many decision trees. It can capture non-linear relationships and interactions between features that a linear model misses, and it often gives more accurate results on this kind of tabular data.

Regression is suitable for this problem because the goal is to predict a numeric value, not categories.
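As a rough sketch of how the two models could be trained side by side, the example below fits both on stand-in data. The generated data, the hyperparameters (`n_estimators=100`), and the seeds are assumptions for illustration; the article does not state the settings that were actually used.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Stand-in data: 7 numeric features, mirroring the 7 input columns of the dataset.
X, y = make_regression(n_samples=300, n_features=7, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Baseline and candidate model; the settings are illustrative assumptions.
models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

# Fit each model on the training split and keep its test-set predictions.
predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)
```

Because the stand-in data is linear by construction, it says nothing about which model wins on the real customer data. The point is only that both models share the same `fit` and `predict` interface, so swapping one for the other is a one-line change.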

Model Evaluation and Selection

Both models were evaluated using regression metrics such as:

  • Mean Absolute Error (MAE)
  • Mean Squared Error (MSE)
  • R-squared (R²)

The Random Forest model performed better than Linear Regression because it captured more complex patterns in the data. For this reason, it was selected as the final model.
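The three metrics can be computed with scikit-learn as in the sketch below. The helper name `evaluate` and the three sample values are made up for illustration and are not results from this project.

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def evaluate(y_true, y_pred):
    """Return MAE, MSE and R-squared for one set of predictions."""
    return {
        "mae": mean_absolute_error(y_true, y_pred),
        "mse": mean_squared_error(y_true, y_pred),
        "r2": r2_score(y_true, y_pred),
    }


# Tiny hand-made example: the errors are 10, 10 and 30.
y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 330.0]
scores = evaluate(y_true, y_pred)
# MAE = (10 + 10 + 30) / 3        ≈ 16.67
# MSE = (100 + 100 + 900) / 3     ≈ 366.67
# R²  = 1 - 1100 / 20000          = 0.945
```

Calling a helper like this once per model on the same test set gives directly comparable numbers: lower MAE and MSE are better, and an R² closer to 1 is better.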

Saving the Model

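The final trained model was saved using joblib, along with the list of features used during training. Keeping the feature list lets the API pass inputs to the model with the same columns, in the same order, that the model saw during training.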

Saving the model is important because:

  • The model does not need to be retrained every time
  • Predictions are consistent, because every request is served by the same fitted model
  • The API only has to load the model and run predictions, which keeps it fast

The saved files are stored in a saved_model folder.
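A minimal sketch of saving and reloading with joblib is shown below. The file names `clv_model.joblib` and `feature_names.joblib` are hypothetical (the article only names the `saved_model` folder), and the small model trained here on generated data stands in for the real one.

```python
from pathlib import Path

import joblib
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

feature_names = [
    "Customer_Age", "Annual_Income", "Tenure_Months", "Monthly_Spend",
    "Visits_Per_Month", "Avg_Basket_Value", "Support_Tickets",
]

# Stand-in model so the sketch runs on its own.
X, y = make_regression(n_samples=100, n_features=len(feature_names), random_state=0)
model = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

# Save the model and the feature list side by side (file names are assumptions).
save_dir = Path("saved_model")
save_dir.mkdir(exist_ok=True)
joblib.dump(model, save_dir / "clv_model.joblib")
joblib.dump(feature_names, save_dir / "feature_names.joblib")

# Later, for example at API startup, load both back.
loaded_model = joblib.load(save_dir / "clv_model.joblib")
loaded_features = joblib.load(save_dir / "feature_names.joblib")
```

joblib files are pickle-based, so they should be loaded only from a trusted source and ideally with the same scikit-learn version that was used to save them.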

Deploying the Model Using FastAPI

FastAPI was used to deploy the model as a web API. FastAPI is a Python framework that makes it easy to create APIs and automatically validates incoming data against the types declared for each endpoint, using Pydantic.

The model is loaded once when the API starts and is never retrained inside the API. Keeping training and serving separate is a common practice for production systems.

How the API Works

The API has two main endpoints:

  1. Health Check Endpoint

GET /
This endpoint confirms that the API is running correctly.

  2. CLV Prediction Endpoint

POST /predict-clv

This endpoint:

  • Accepts customer data in JSON format
  • Checks if the input data is valid
  • Sends the data to the trained model
  • Returns the predicted CLV value

Testing the API

The API was tested using:

  • FastAPI Swagger UI
  • Postman

Successful testing showed that the API correctly accepts input data and returns CLV predictions.
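The same check can also be scripted. The sketch below is an alternative to clicking through Swagger UI or Postman, not what the article used: it assumes the API is running locally on uvicorn's default address `http://127.0.0.1:8000`, and the customer values are made up.

```python
import json
import urllib.request

# Assumption: the API is running locally on uvicorn's default port.
API_URL = "http://127.0.0.1:8000/predict-clv"


def build_request(customer: dict, url: str = API_URL) -> urllib.request.Request:
    """Build a JSON POST request for the prediction endpoint."""
    body = json.dumps(customer).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(customer: dict) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(customer), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Made-up example customer, for illustration only.
sample_customer = {
    "Customer_Age": 35,
    "Annual_Income": 60000,
    "Tenure_Months": 24,
    "Monthly_Spend": 150,
    "Visits_Per_Month": 4,
    "Avg_Basket_Value": 37.5,
    "Support_Tickets": 1,
}

request = build_request(sample_customer)
# With the server running: result = predict(sample_customer)
```

A script like this is easy to rerun after every model update, which makes it a useful complement to manual testing.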

Conclusion and Future Improvements

This project shows a complete machine learning process, from understanding customer data to deploying a working prediction API. Predicting CLV helps businesses make smarter and more data-driven decisions.

In a real business setting, the model could be improved by:

  • Adding more customer behavior data
  • Monitoring predictions over time
  • Retraining the model with new data

Overall, this project demonstrates how machine learning models can be moved from development into real-world applications using FastAPI.
