Most Machine Learning tutorials have a fatal flaw: They stop at the Notebook.
You train a model, get a nice accuracy score, and then... nothing. The model sits in a .ipynb file gathering digital dust.
I wanted to change that. I recently built an end-to-end Customer Conversion System that takes raw data, predicts purchasing behavior, and triggers automated marketing actions via a live API.
Here is the journey from "Localhost" to "Production", including how I accidentally built a 3.3GB Docker container and how I slashed it by 65%.
The Tech Stack
We aren't just fitting curves; we are shipping code.
Model: XGBoost (Classification + Regression)
Backend: Flask (Python)
Container: Docker
Cloud: Google Cloud Run (Serverless)
Frontend: Streamlit
Phase 1: The Logic (Beyond "0.85 Accuracy")
A raw probability score isn't actionable. Marketing teams don't want to know "User 123 has a 0.82 score." They want to know what to do.
So I wrapped my XGBoost model's outputs in a "Decision Engine" function in Python:
Python
def determine_action(prob, days_to_buy, value):
    # High probability, high spender
    if prob > 0.8 and value > 2000:
        return f"VIP ALERT: Send Early Access Catalog. (Expected buy in {int(days_to_buy)} days)"
    # High probability, low spender
    elif prob > 0.8:
        return "PROMO: Send 'Bundle Discount' to increase basket size."
    # Low probability, high historic value (churn risk)
    elif prob < 0.3 and value > 2000:
        return "RISK: Trigger Personal Outreach Call."
    else:
        return "NURTURE: Add to General Newsletter."
Now the API returns business strategy, not just math.
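To give a feel for how that plugs into the Flask API, here is a minimal sketch, assuming determine_action from above lives in the same module. The /predict route, the feature handling, and the model filenames are illustrative guesses, not the exact production code.
Python
# Minimal sketch of the Flask wrapper (route name, feature handling, and
# model filenames are illustrative assumptions, not the production code).
import joblib
import pandas as pd
from flask import Flask, request, jsonify

app = Flask(__name__)

# Pre-trained XGBoost models, loaded once at startup
clf = joblib.load("models/conversion_classifier.pkl")   # predicts P(purchase)
reg = joblib.load("models/days_to_buy_regressor.pkl")   # predicts days until purchase

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    features = pd.DataFrame([payload["features"]])

    prob = float(clf.predict_proba(features)[0, 1])
    days_to_buy = float(reg.predict(features)[0])
    value = payload.get("historic_value", 0)

    return jsonify({
        "probability": round(prob, 3),
        "days_to_buy": round(days_to_buy, 1),
        "action": determine_action(prob, days_to_buy, value),
    })

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)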
Phase 2: The Docker Nightmare 🐳
This was the biggest hurdle. I wrote a standard Dockerfile to wrap up my Flask API.
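For context, that first Dockerfile was about as "standard" as it gets. Roughly this (a sketch reconstructed from the description; the entrypoint name is a guess):
Dockerfile
# Roughly what my first attempt looked like (a sketch, not the exact file)
FROM python:3.9
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Without a .dockerignore, this copies EVERYTHING: .venv, .git, data/ ...
COPY . .
CMD ["python", "app.py"]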
I ran docker build, went to grab coffee, came back, and saw this:
Bash
Successfully built...
Image size: 3.36 GB
3.36 GB. For a simple API? That's unacceptable. It makes deployment slow and storage expensive.
🕵️‍♂️ The Investigation
I ran a deep scan inside the container to see where the fat was hiding:
Bash
docker run --rm my-app du -ah /usr/local/lib/python3.9/site-packages | sort -rh | head -n 10
The output was shocking:
- 900MB+ of NVIDIA CUDA libraries under nvidia/.
- 1GB+ from my local .venv folder that I had accidentally copied into the image.
🛠️ The Fixes
- The .dockerignore file. I was lazy and didn't create a .dockerignore, so Docker copied my local virtual environment (.venv), git history, and raw data into the image.
Fix: Added .venv, .git, and data/ to .dockerignore (see the example below).
- The XGBoost/NVIDIA trap. It turns out that pip install xgboost (recent versions) often bundles massive NVIDIA CUDA libraries, even if you are only running on a CPU.
Fix: I pinned the version to a lighter release in requirements.txt:
xgboost==1.7.6
The Result: The image dropped from 3.36GB -> 1.2GB. Much better.
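For reference, a minimal .dockerignore covering exactly the offenders mentioned above looks like this:
.dockerignore
# keep local environments, git history, and raw data out of the build context
.venv/
.git/
data/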
Phase 3: Serverless Deployment (Google Cloud Run)
I love Cloud Run for side projects. You give it a container, and it gives you an HTTPS URL. It scales to zero when no one is using it, meaning it costs essentially $0/month for low traffic.
Deploying was just three commands:
Bash
# 1. Tag the image
docker tag conversion-api gcr.io/my-project/conversion-api

# 2. Push to Google Container Registry
docker push gcr.io/my-project/conversion-api

# 3. Deploy
gcloud run deploy conversion-service --image gcr.io/my-project/conversion-api --platform managed
Boom. A live API endpoint accessible from anywhere in the world.
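To sanity-check the live endpoint, a few lines of Python are enough. This reuses the illustrative /predict contract from the sketch in Phase 1; the URL and field names are placeholders, not the real service:
Python
import requests

# Placeholder URL: Cloud Run prints the real one after `gcloud run deploy`
API_URL = "https://conversion-service-xxxxx-uc.a.run.app/predict"

payload = {
    "features": {"recency_days": 12, "num_sessions": 8, "avg_basket": 240.0},
    "historic_value": 2600,
}

response = requests.post(API_URL, json=payload, timeout=10)
print(response.json())
# e.g. {"action": "VIP ALERT: ...", "days_to_buy": 4.0, "probability": 0.87}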
Phase 4: The Frontend
To make this usable for non-technical users, I threw together a Streamlit dashboard in about 50 lines of Python.
It connects to the Cloud Run API and provides a UI for testing customer profiles.
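For the curious, the heart of such a dashboard fits in a screenful. A sketch, assuming the same illustrative API contract as above (the URL and feature names are placeholders):
Python
import requests
import streamlit as st

# Placeholder: swap in the real Cloud Run URL
API_URL = "https://conversion-service-xxxxx-uc.a.run.app/predict"

st.title("Customer Conversion Predictor")

# Inputs for a single customer profile (feature names are illustrative)
recency = st.number_input("Days since last visit", min_value=0, value=12)
sessions = st.number_input("Sessions in the last 30 days", min_value=0, value=8)
basket = st.number_input("Average basket value", min_value=0.0, value=240.0)
historic_value = st.number_input("Historic customer value", min_value=0.0, value=2600.0)

if st.button("Get recommended action"):
    payload = {
        "features": {"recency_days": recency, "num_sessions": sessions, "avg_basket": basket},
        "historic_value": historic_value,
    }
    result = requests.post(API_URL, json=payload, timeout=10).json()
    st.metric("Purchase probability", f"{result['probability']:.0%}")
    st.success(result["action"])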
Key Takeaways
ML isn't done until it's deployed. A model in a notebook delivers zero value.
Watch your dependencies. pip install is dangerous if you don't check what's being installed. That single xgboost line cost me nearly a gigabyte of space.
Context matters. Transforming a probability score into a "Next Best Action" makes your model 10x more valuable to stakeholders.
Have you ever struggled with massive Docker images in Python? Let me know in the comments!