Shavon Harris

The Complete Guide to Deploying AI Models: From Notebook to Production

Deploying a Sentiment Analysis Model with FastAPI and Docker 🚀

You’ve trained a machine learning model. ✅

It works on your test set. ✅

And now... you want to make it useful in the real world.

This post walks through how I deployed distilbert-base-uncased-finetuned-sst-2-english using:

  • 🧠 Hugging Face transformers
  • ⚡ FastAPI + Uvicorn
  • 📦 Docker (multi-stage build!)

The result is an API that can analyze sentiment in real time or in batches, with health checks, usage metrics, and a production-ready container.


Wait... What Is "Model Deployment"?

Deployment just means you made your model callable.

It’s no longer stuck in your notebook; now other apps (or people) can send it text and get predictions back.

In this case, the model predicts sentiment (positive or negative), and I added extra flavor to guess whether the person is feeling:

  • 😤 frustrated
  • 😍 excited
  • 😎 confident
  • 😕 uncertain

Imagine plugging this into a GitHub bot that flags angry PRs. Or tracking customer sentiment over time. Or piping it into a Slackbot for fun. Lots of ways to use this.
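The actual `detect_emotions` helper isn't shown in this post, so here's a hypothetical sketch of how that extra tagging could work: a keyword match per emotion, weighted by the model's confidence. The keyword lists and the 0.2 weighting are my assumptions, not the real implementation.

```python
# Hypothetical emotion tagger: scores each emotion by keyword hits,
# weighted by the model's sentiment confidence. The keyword lists are
# illustrative placeholders, not the post's actual implementation.
EMOTION_KEYWORDS = {
    "frustrated": {"annoying", "broken", "hate", "stuck", "ugh"},
    "excited": {"love", "awesome", "amazing", "great"},
    "confident": {"sure", "definitely", "certainly", "easily"},
    "uncertain": {"maybe", "perhaps", "unsure", "confused"},
}

def detect_emotions(text: str, confidence: float) -> dict:
    words = set(text.lower().split())
    return {
        emotion: round(min(len(words & keywords) * 0.2 * confidence, 1.0), 2)
        for emotion, keywords in EMOTION_KEYWORDS.items()
    }

print(detect_emotions("I love this project!", 0.9987))
# {'frustrated': 0.0, 'excited': 0.2, 'confident': 0.0, 'uncertain': 0.0}
```

Anything smarter (a second classifier, zero-shot labels) would slot into the same function signature.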


Two Ways to Use the API

1. /predict — Real-Time

Send in one piece of text. Get a response like:

{
  "sentiment": "positive",
  "confidence": 0.9987,
  "emotions": {
    "excited": 0.2,
    "frustrated": 0.0
  }
}
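The `TextInput` and `SentimentResponse` Pydantic schemas aren't shown in the post; inferred from the JSON above, they might look something like this (field names and types are my guesses, not the actual code):

```python
# Assumed Pydantic schemas inferred from the example JSON responses;
# the real app/main.py may define them differently.
from pydantic import BaseModel

class TextInput(BaseModel):
    text: str

class SentimentResponse(BaseModel):
    text: str
    sentiment: str
    confidence: float
    emotions: dict
    timestamp: str

resp = SentimentResponse(
    text="I love this project!",
    sentiment="positive",
    confidence=0.9987,
    emotions={"excited": 0.2, "frustrated": 0.0},
    timestamp="2025-09-08T12:00:00",
)
print(resp.sentiment)
```

FastAPI uses these models both to validate incoming JSON and to document the response shape in the auto-generated OpenAPI docs.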

2. /predict-batch — Batch Mode

Send up to 100 texts at once.

Great for processing reviews, survey responses, Slack logs, etc.


Core Endpoint Logic (FastAPI + Transformers)

from datetime import datetime
from fastapi import FastAPI
from transformers import pipeline

app = FastAPI()

# Load the model once at startup, not on every request
sentiment_pipeline = pipeline("sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english")

@app.post("/predict", response_model=SentimentResponse)
async def predict_sentiment(input_data: TextInput):
    result = sentiment_pipeline(input_data.text)[0]  # pipeline returns a list
    return SentimentResponse(
        text=input_data.text,
        sentiment=result['label'].lower(),
        confidence=round(result['score'], 4),
        emotions=detect_emotions(input_data.text, result['score']),
        timestamp=datetime.now().isoformat()
    )

Running Locally

Start the server:

uvicorn app.main:app --host 0.0.0.0 --port 8000

Test It Out

Try it with curl:

curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"text":"I love this project!"}'

Get back something like:

{
  "text": "I love this project!",
  "sentiment": "positive",
  "confidence": 0.9987,
  "emotions": {
    "frustrated": 0,
    "excited": 0.2,
    "confident": 0,
    "uncertain": 0
  },
  "timestamp": "2025-09-08T12:00:00"
}

Dockerize It 🐳

Multi-stage Dockerfile = smaller, cleaner container:

FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt

FROM python:3.11-slim
# Create a non-root user so the app doesn't run as root
RUN useradd --create-home appuser
WORKDIR /app
COPY --from=builder /root/.local /home/appuser/.local
COPY . .
RUN chown -R appuser:appuser /app /home/appuser/.local
USER appuser
ENV PATH=/home/appuser/.local/bin:$PATH
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

Build and run:

docker build -t sentiment-api .
docker run -p 8080:8000 sentiment-api

Now your app is running inside a container; host port 8080 maps to the container's port 8000.


Health Checks + Monitoring

I added two simple extras that make this feel like a real service:

  • /health → returns model status and timestamp
  • /metrics → counts how many times each endpoint has been used

Great for dashboards, uptime checks, or just curiosity.


Scaling Ideas

If you want to scale this:

  • Run 2+ containers behind a load balancer
  • Add CPU-based autoscaling
  • Rate-limit requests
  • Add API key auth
  • Log usage data to a database or S3

Why I Chose These Tools

  • Hugging Face Transformers → pretrained sentiment model
  • FastAPI → fast, async, auto-validates input
  • Uvicorn → ASGI server with great performance
  • Docker → portable, clean environments

What It's Doing Now

✅ Loads the Hugging Face model once

✅ Serves real-time and batch requests

✅ Gives back structured JSON with extra emotion tagging

✅ Tracks usage

✅ Runs in a single Docker container


What’s Next

🔒 Add API key auth

🧾 Add structured logging

🔁 Add CI/CD to auto-deploy updates


If you end up using this API for something cool — let me know! Always curious how people remix small projects like this 🙌🏾
