Shavon Harris

The Complete Guide to Deploying AI Models: From Notebook to Production

Deploying a Sentiment Analysis Model with FastAPI and Docker 🚀

You’ve trained a machine learning model. ✅

It works on your test set. ✅

And now... you want to make it useful in the real world.

This post walks through how I deployed distilbert-base-uncased-finetuned-sst-2-english using:

  • 🧠 Hugging Face transformers
  • ⚡ FastAPI + Uvicorn
  • 📦 Docker (multi-stage build!)

The result is an API that can analyze sentiment in real time or in batches, with health checks, usage metrics, and a production-ready container.


Wait... What Is "Model Deployment"?

Deployment just means you made your model callable.

It’s no longer stuck in your notebook; now other apps (or people) can send it text and get predictions back.

In this case, the model predicts sentiment (positive or negative), and I added extra flavor to guess whether the person is feeling:

  • 😤 frustrated
  • 😍 excited
  • 😎 confident
  • 😕 uncertain

Imagine plugging this into a GitHub bot that flags angry PRs. Or tracking customer sentiment over time. Or piping it into a Slackbot for fun. Lots of ways to use this.
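The actual `detect_emotions` helper isn't shown in this post, so here's a hypothetical sketch of how that extra tagging could work: a keyword match per emotion, weighted by the model's confidence. The keyword lists and the 0.2 weighting are my assumptions, not the real implementation.

```python
# Hypothetical emotion tagger: scores each emotion by keyword hits,
# weighted by the model's sentiment confidence. The keyword lists are
# illustrative placeholders, not the post's actual implementation.
EMOTION_KEYWORDS = {
    "frustrated": {"annoying", "broken", "hate", "stuck", "ugh"},
    "excited": {"love", "awesome", "amazing", "great"},
    "confident": {"sure", "definitely", "certainly", "easily"},
    "uncertain": {"maybe", "perhaps", "unsure", "confused"},
}

def detect_emotions(text: str, confidence: float) -> dict:
    words = set(text.lower().split())
    return {
        emotion: round(min(len(words & keywords) * 0.2 * confidence, 1.0), 2)
        for emotion, keywords in EMOTION_KEYWORDS.items()
    }

print(detect_emotions("I love this project!", 0.9987))
# {'frustrated': 0.0, 'excited': 0.2, 'confident': 0.0, 'uncertain': 0.0}
```

Anything smarter (a second classifier, zero-shot labels) would slot into the same function signature.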


Two Ways to Use the API

1. /predict — Real-Time

Send in one piece of text. Get a response like:

{
  "sentiment": "positive",
  "confidence": 0.9987,
  "emotions": {
    "excited": 0.2,
    "frustrated": 0.0
  }
}
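The `TextInput` and `SentimentResponse` Pydantic schemas aren't shown in the post; inferred from the JSON above, they might look something like this (field names and types are my guesses, not the actual code):

```python
# Assumed Pydantic schemas inferred from the example JSON responses;
# the real app/main.py may define them differently.
from pydantic import BaseModel

class TextInput(BaseModel):
    text: str

class SentimentResponse(BaseModel):
    text: str
    sentiment: str
    confidence: float
    emotions: dict
    timestamp: str

resp = SentimentResponse(
    text="I love this project!",
    sentiment="positive",
    confidence=0.9987,
    emotions={"excited": 0.2, "frustrated": 0.0},
    timestamp="2025-09-08T12:00:00",
)
print(resp.sentiment)
```

FastAPI uses these models both to validate incoming JSON and to document the response shape in the auto-generated OpenAPI docs.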

2. /predict-batch — Batch Mode

Send up to 100 texts at once.

Great for processing reviews, survey responses, Slack logs, etc.


Core Endpoint Logic (FastAPI + Transformers)

from datetime import datetime
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()

# Load the model once at startup, not on every request
sentiment_pipeline = pipeline("sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english")

@app.post("/predict", response_model=SentimentResponse)
async def predict_sentiment(input_data: TextInput):
    result = sentiment_pipeline(input_data.text)[0]  # pipeline returns a list
    return SentimentResponse(
        text=input_data.text,
        sentiment=result['label'].lower(),
        confidence=round(result['score'], 4),
        emotions=detect_emotions(input_data.text, result['score']),
        timestamp=datetime.now().isoformat()
    )

Running Locally

Start the server:

uvicorn app.main:app --host 0.0.0.0 --port 8000

Test It Out

Try it with curl:

curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"text":"I love this project!"}'

Get back something like:

{
  "text": "I love this project!",
  "sentiment": "positive",
  "confidence": 0.9987,
  "emotions": {
    "frustrated": 0,
    "excited": 0.2,
    "confident": 0,
    "uncertain": 0
  },
  "timestamp": "2025-09-08T12:00:00"
}

Dockerize It 🐳

Multi-stage Dockerfile = smaller, cleaner container:

FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt

FROM python:3.11-slim
# Create a non-root user so the app doesn't run as root
RUN useradd --create-home appuser
WORKDIR /app
COPY --from=builder /root/.local /home/appuser/.local
COPY . .
RUN chown -R appuser:appuser /app /home/appuser/.local
USER appuser
ENV PATH=/home/appuser/.local/bin:$PATH
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

Build and run:

docker build -t sentiment-api .
docker run -p 8080:8000 sentiment-api

Now your app is running inside a container; host port 8080 maps to the container's port 8000.


Health Checks + Monitoring

I added two simple extras that make this feel like a real service:

  • /health → returns model status and timestamp
  • /metrics → counts how many times each endpoint has been used

Great for dashboards, uptime checks, or just curiosity.


Scaling Ideas

If you want to scale this:

  • Run 2+ containers behind a load balancer
  • Add CPU-based autoscaling
  • Rate-limit requests
  • Add API key auth
  • Log usage data to a database or S3

Why I Chose These Tools

  • Hugging Face Transformers → pretrained sentiment model
  • FastAPI → fast, async, auto-validates input
  • Uvicorn → ASGI server with great performance
  • Docker → portable, clean environments

What It's Doing Now

✅ Loads the Hugging Face model once

✅ Serves real-time and batch requests

✅ Gives back structured JSON with extra emotion tagging

✅ Tracks usage

✅ Runs in a single Docker container


What’s Next

🔒 Add API key auth

🧾 Add structured logging

🔁 Add CI/CD to auto-deploy updates


If you end up using this API for something cool — let me know! Always curious how people remix small projects like this 🙌🏾
