<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aniket Singh</title>
    <description>The latest articles on DEV Community by Aniket Singh (@aniket_singh17).</description>
    <link>https://dev.to/aniket_singh17</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3895375%2F7296af23-86f5-4bf8-970e-e6276616ba76.png</url>
      <title>DEV Community: Aniket Singh</title>
      <link>https://dev.to/aniket_singh17</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aniket_singh17"/>
    <language>en</language>
    <item>
      <title>🌸 Iris Classifier ML Pipeline — Complete Tutorial &amp; Instructions Manual</title>
      <dc:creator>Aniket Singh</dc:creator>
      <pubDate>Fri, 24 Apr 2026 06:07:05 +0000</pubDate>
      <link>https://dev.to/aniket_singh17/iris-classifier-ml-pipeline-complete-tutorial-instructions-manual-2j9j</link>
      <guid>https://dev.to/aniket_singh17/iris-classifier-ml-pipeline-complete-tutorial-instructions-manual-2j9j</guid>
      <description>&lt;h1&gt;
  
  
  Iris Classifier ML Pipeline — Complete Tutorial &amp;amp; Instructions Manual
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Who this is for:&lt;/strong&gt; Beginners and intermediate developers who want to understand how a real-world ML project is structured and run — from a cloned repository to a fully running system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What you'll learn:&lt;/strong&gt; Virtual environments, dependency management, project structure, MLflow experiment tracking, FastAPI inference servers, Docker containerisation, and automated CI/CD.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prerequisites:&lt;/strong&gt; Python installed, VS Code installed, internet connection. That's it.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  📋 Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What This Project Does&lt;/li&gt;
&lt;li&gt;Understanding the Project Structure&lt;/li&gt;
&lt;li&gt;One-Time Setup: Install Required Tools&lt;/li&gt;
&lt;li&gt;Step 1 — Clone the Project&lt;/li&gt;
&lt;li&gt;Step 2 — Create a Python Virtual Environment&lt;/li&gt;
&lt;li&gt;Step 3 — Install Dependencies&lt;/li&gt;
&lt;li&gt;Step 4 — Configure Environment Variables&lt;/li&gt;
&lt;li&gt;Step 5 — Run the Training Pipeline&lt;/li&gt;
&lt;li&gt;Step 6 — Explore MLflow Experiment Tracking&lt;/li&gt;
&lt;li&gt;Step 7 — Start the FastAPI Inference Server&lt;/li&gt;
&lt;li&gt;Step 8 — Make Predictions via the API&lt;/li&gt;
&lt;li&gt;Step 9 — Run the Test Suite&lt;/li&gt;
&lt;li&gt;Step 10 — Run Everything with Docker&lt;/li&gt;
&lt;li&gt;Step 11 — Schedule Training with Cron&lt;/li&gt;
&lt;li&gt;Architecture Deep Dive&lt;/li&gt;
&lt;li&gt;How the Code Flows Together&lt;/li&gt;
&lt;li&gt;Common Errors &amp;amp; How to Fix Them&lt;/li&gt;
&lt;li&gt;Extending the Project&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  1. What This Project Does
&lt;/h2&gt;

&lt;p&gt;This project simulates a &lt;strong&gt;production-grade machine learning system&lt;/strong&gt;. Here is the high-level picture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────────────┐
│                        ML PIPELINE OVERVIEW                         │
│                                                                     │
│  [Cron / CLI]                                                       │
│       │                                                             │
│       ▼                                                             │
│  ┌─────────────┐    trains    ┌─────────────────┐                  │
│  │  Training   │ ──────────►  │  Saved Model    │                  │
│  │  Pipeline   │              │  (models/*.pkl) │                  │
│  └─────────────┘              └────────┬────────┘                  │
│       │                                │                            │
│       │ logs everything                │ loaded by                  │
│       ▼                                ▼                            │
│  ┌─────────────┐              ┌─────────────────┐                  │
│  │   MLflow    │              │   FastAPI       │◄── HTTP requests │
│  │  Tracking   │              │   Inference     │                  │
│  │     UI      │              │   Server        │──► predictions   │
│  └─────────────┘              └─────────────────┘                  │
│                                                                     │
│  All three services run together inside Docker Compose             │
└─────────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The dataset:&lt;/strong&gt; &lt;a href="https://en.wikipedia.org/wiki/Iris_flower_data_set" rel="noopener noreferrer"&gt;Iris&lt;/a&gt; — 150 flower measurements, 3 species (setosa, versicolor, virginica). A classic beginner dataset that's perfect for showcasing pipeline architecture without the model itself becoming the focus.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The model:&lt;/strong&gt; &lt;code&gt;RandomForestClassifier&lt;/code&gt; inside a &lt;code&gt;sklearn Pipeline&lt;/code&gt; with &lt;code&gt;StandardScaler&lt;/code&gt;. A &lt;code&gt;GridSearchCV&lt;/code&gt; automatically searches for the best hyperparameters.&lt;/p&gt;
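&lt;p&gt;In code, that setup plausibly reduces to a few scikit-learn lines (a hedged sketch; the grid values here are illustrative and may not match the repo's &lt;code&gt;pipeline.py&lt;/code&gt;):&lt;/p&gt;

```python
# Hedged sketch of the model described above; the repository's
# pipeline.py / trainer.py may use a different grid and structure.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([
    ("scaler", StandardScaler()),                             # normalise features
    ("classifier", RandomForestClassifier(random_state=42)),  # the model
])

# Grid keys are prefixed with the pipeline step name, hence "classifier__..."
grid = GridSearchCV(
    pipe,
    param_grid={
        "classifier__n_estimators": [50, 100],
        "classifier__max_depth": [None, 5],
    },
    cv=5,
)

X, y = load_iris(return_X_y=True)
grid.fit(X, y)
print(grid.best_params_)
```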




&lt;h2&gt;
  
  
  2. Understanding the Project Structure
&lt;/h2&gt;

&lt;p&gt;Before touching any code, read this section. Understanding &lt;em&gt;why&lt;/em&gt; the project is structured this way is what separates a portfolio project from a "notebook dumped into a repo."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ml-pipeline/
│
├── src/                    ← The core Python library (business logic)
│   ├── config.py           ← ALL settings live here (paths, hyperparams, env vars)
│   ├── data/
│   │   └── loader.py       ← Loads Iris dataset, splits train/test
│   ├── training/
│   │   ├── pipeline.py     ← Builds the sklearn Pipeline object
│   │   └── trainer.py      ← Orchestrates training: GridSearchCV + MLflow logging
│   ├── evaluation/
│   │   └── metrics.py      ← Accuracy, F1, confusion matrix — pure functions
│   └── inference/
│       └── predictor.py    ← Loads saved model, exposes predict() method
│
├── api/                    ← The FastAPI web application
│   ├── main.py             ← Creates and configures the FastAPI app
│   ├── schemas.py          ← Defines the shape of API requests and responses
│   └── routers/
│       └── predict.py      ← The actual /predict HTTP endpoints
│
├── scripts/
│   ├── train.py            ← CLI entry point: `python scripts/train.py`
│   └── run_pipeline.sh     ← Bash wrapper used by cron
│
├── tests/                  ← Automated tests
│   ├── test_data_loader.py
│   ├── test_metrics.py
│   ├── test_predictor.py
│   └── test_api.py
│
├── docker/
│   ├── Dockerfile.train    ← Container for the training job
│   └── Dockerfile.api      ← Container for the FastAPI server
│
├── models/                 ← Created automatically — stores .pkl files
├── mlruns/                 ← Created automatically — stores MLflow data
├── logs/                   ← Created automatically — stores training logs
│
├── docker-compose.yml      ← Wires all Docker services together
├── requirements.txt        ← Python dependencies
├── Makefile                ← Shortcuts: `make train`, `make serve`, etc.
├── crontab.txt             ← Cron schedule definition
├── pyproject.toml          ← Tool config (pytest, ruff, mypy)
└── .env.example            ← Template for environment variables
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key design principle:&lt;/strong&gt; &lt;code&gt;src/&lt;/code&gt; contains zero web framework code. &lt;code&gt;api/&lt;/code&gt; contains zero ML logic. They communicate through &lt;code&gt;src/inference/predictor.py&lt;/code&gt;. This makes every layer independently testable.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  3. One-Time Setup: Install Required Tools
&lt;/h2&gt;

&lt;p&gt;You need three tools installed on your machine. Do this before anything else.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Verify Python is installed
&lt;/h3&gt;

&lt;p&gt;Open a terminal (on Linux/Mac) or Command Prompt / PowerShell (on Windows):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;span class="c"&gt;# or on some systems:&lt;/span&gt;
python3 &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see &lt;code&gt;Python 3.10.x&lt;/code&gt; or higher. If not, download it from &lt;a href="https://www.python.org/downloads/" rel="noopener noreferrer"&gt;python.org&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;On Linux/Ubuntu&lt;/strong&gt;, you may need: &lt;code&gt;sudo apt update &amp;amp;&amp;amp; sudo apt install python3 python3-pip python3-venv&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  3.2 Install Git (to push to GitHub later)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If not installed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ubuntu/Debian:&lt;/strong&gt; &lt;code&gt;sudo apt install git&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mac:&lt;/strong&gt; &lt;code&gt;xcode-select --install&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Windows:&lt;/strong&gt; Download from &lt;a href="https://git-scm.com/" rel="noopener noreferrer"&gt;git-scm.com&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3.3 Install Docker Desktop (for the containerised workflow)
&lt;/h3&gt;

&lt;p&gt;Docker lets you run the entire stack — API + MLflow UI + trainer — with a single command, without installing anything else on your machine.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://docs.docker.com/get-docker/" rel="noopener noreferrer"&gt;docs.docker.com/get-docker&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Download and install &lt;strong&gt;Docker Desktop&lt;/strong&gt; for your OS&lt;/li&gt;
&lt;li&gt;Open Docker Desktop and wait for it to show "Docker is running"&lt;/li&gt;
&lt;li&gt;Verify in the terminal:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nt"&gt;--version&lt;/span&gt;
docker compose version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;On Linux&lt;/strong&gt;, after installing Docker Engine, add your user to the docker group so you don't need &lt;code&gt;sudo&lt;/code&gt;:&lt;/p&gt;


&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;usermod &lt;span class="nt"&gt;-aG&lt;/span&gt; docker &lt;span class="nv"&gt;$USER&lt;/span&gt;
newgrp docker
&lt;/code&gt;&lt;/pre&gt;

&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  4. Step 1 — Clone the Project
&lt;/h2&gt;

&lt;h3&gt;
  
  
  4.1 Clone the repository from GitHub
&lt;/h3&gt;

&lt;p&gt;Clone the &lt;code&gt;Iris-Classifier-ML-Pipeline&lt;/code&gt; repository to a location of your choice, for example &lt;code&gt;~/projects/&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/aniket-1177/Iris-Classifier-ML-Pipeline.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4.2 Open in VS Code
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;code &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or open VS Code manually → &lt;strong&gt;File → Open Folder&lt;/strong&gt; → select &lt;code&gt;Iris-Classifier-ML-Pipeline&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Install the recommended VS Code extension for Python: when VS Code prompts you, click &lt;strong&gt;Install&lt;/strong&gt;. If it doesn't prompt, press &lt;code&gt;Ctrl+Shift+X&lt;/code&gt;, search &lt;strong&gt;Python&lt;/strong&gt;, and install the Microsoft extension.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Step 2 — Create a Python Virtual Environment
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is a virtual environment and why do we need one?
&lt;/h3&gt;

&lt;p&gt;A virtual environment is an isolated Python installation just for this project. Without it, every project on your machine would share the same packages — which leads to version conflicts. With a venv, installing &lt;code&gt;scikit-learn==1.4.0&lt;/code&gt; here won't affect any other project.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your machine
│
├── System Python (don't touch this)
│
└── projects/
    └── ml-pipeline/
        └── .venv/          ← A private Python just for this project
            ├── bin/python
            └── lib/
                ├── scikit-learn
                ├── fastapi
                ├── mlflow
                └── ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Create the virtual environment
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Make sure you are inside the ml-pipeline directory&lt;/span&gt;
&lt;span class="nb"&gt;pwd&lt;/span&gt;
&lt;span class="c"&gt;# Should print something like: /home/yourname/projects/ml-pipeline&lt;/span&gt;

&lt;span class="c"&gt;# Create the venv (this creates a .venv folder)&lt;/span&gt;
python &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Activate the virtual environment
&lt;/h3&gt;

&lt;p&gt;You must activate the venv every time you open a new terminal window.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Linux / Mac:&lt;/span&gt;
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate

&lt;span class="c"&gt;# Windows (Command Prompt):&lt;/span&gt;
.venv&lt;span class="se"&gt;\S&lt;/span&gt;cripts&lt;span class="se"&gt;\a&lt;/span&gt;ctivate.bat

&lt;span class="c"&gt;# Windows (PowerShell):&lt;/span&gt;
.venv&lt;span class="se"&gt;\S&lt;/span&gt;cripts&lt;span class="se"&gt;\A&lt;/span&gt;ctivate.ps1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After activation, your terminal prompt changes to show &lt;code&gt;(.venv)&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;(.venv) username@os:~/projects/Iris-Classifier-ML-Pipeline$&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;VS Code tip:&lt;/strong&gt; Press &lt;code&gt;Ctrl+Shift+P&lt;/code&gt; → type "Python: Select Interpreter" → choose the one that says &lt;code&gt;.venv&lt;/code&gt;. VS Code will now automatically activate the venv in all new integrated terminals.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  6. Step 3 — Install Dependencies
&lt;/h2&gt;

&lt;p&gt;With your venv activated, install all required packages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--upgrade&lt;/span&gt; pip
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will install approximately 15 packages. It may take 2–5 minutes on the first run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's being installed:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;scikit-learn&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Machine learning — our model, pipeline, and grid search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;pandas&lt;/code&gt; / &lt;code&gt;numpy&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Data manipulation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;mlflow&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Experiment tracking and model registry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;fastapi&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The web framework for our inference API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;uvicorn&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The ASGI web server that runs FastAPI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;pydantic&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Data validation for API requests/responses&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Verify installation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"import sklearn, mlflow, fastapi; print('All good!')"&lt;/span&gt;
&lt;span class="c"&gt;# Should print: All good!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  7. Step 4 — Configure Environment Variables
&lt;/h2&gt;

&lt;p&gt;Environment variables let you change settings (like which MLflow server to use) without editing code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create your &lt;code&gt;.env&lt;/code&gt; file
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;.env&lt;/code&gt; in VS Code. For local development, the defaults are fine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;MLFLOW_TRACKING_URI&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;file://./mlruns&lt;/span&gt;
&lt;span class="py"&gt;MLFLOW_EXPERIMENT_NAME&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;iris-classifier&lt;/span&gt;
&lt;span class="py"&gt;API_HOST&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;0.0.0.0&lt;/span&gt;
&lt;span class="py"&gt;API_PORT&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;8000&lt;/span&gt;
&lt;span class="py"&gt;LOG_LEVEL&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;INFO&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What does &lt;code&gt;file://./mlruns&lt;/code&gt; mean?&lt;/strong&gt; It tells MLflow to store all experiment data in a local folder called &lt;code&gt;mlruns/&lt;/code&gt; instead of connecting to a remote server. Perfect for development.&lt;/p&gt;
&lt;/blockquote&gt;
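&lt;p&gt;Inside the code, &lt;code&gt;src/config.py&lt;/code&gt; plausibly reads these variables with fallbacks to the defaults above (hypothetical sketch; the real module may use a settings class instead of a dict):&lt;/p&gt;

```python
# Hypothetical sketch of how src/config.py could consume the .env values.
import os

def load_settings() -> dict:
    """Read each setting from the environment, falling back to the defaults."""
    return {
        "mlflow_tracking_uri": os.getenv("MLFLOW_TRACKING_URI", "file://./mlruns"),
        "experiment_name": os.getenv("MLFLOW_EXPERIMENT_NAME", "iris-classifier"),
        "api_host": os.getenv("API_HOST", "0.0.0.0"),
        "api_port": int(os.getenv("API_PORT", "8000")),
        "log_level": os.getenv("LOG_LEVEL", "INFO"),
    }
```

A helper such as python-dotenv typically loads the &lt;code&gt;.env&lt;/code&gt; file into the process environment before this runs.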




&lt;h2&gt;
  
  
  8. Step 5 — Run the Training Pipeline
&lt;/h2&gt;

&lt;p&gt;This is the core of the project. Let's run it and understand what happens at each step.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python scripts/train.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will see output like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2024-06-01 10:23:15 | INFO     | __main__ | =======================================================
2024-06-01 10:23:15 | INFO     | __main__ |   ML Pipeline Training Run
2024-06-01 10:23:15 | INFO     | __main__ |   Experiment : iris-classifier
2024-06-01 10:23:15 | INFO     | __main__ | =======================================================
2024-06-01 10:23:15 | INFO     | src.data.loader | Loading Iris dataset...
2024-06-01 10:23:15 | INFO     | src.data.loader | Dataset loaded | samples=150 | features=4 | classes=['setosa', 'versicolor', 'virginica']
2024-06-01 10:23:15 | INFO     | src.data.loader | Data split | train=120 | test=30
2024-06-01 10:23:16 | INFO     | src.training.trainer | MLflow run started | run_id=abc123...
2024-06-01 10:23:18 | INFO     | src.training.trainer | Best params: {'classifier__max_depth': None, 'classifier__n_estimators': 100}
2024-06-01 10:23:18 | INFO     | src.training.trainer | ─────────────────────────────────────────────
2024-06-01 10:23:18 | INFO     | src.training.trainer | accuracy                       0.9667
2024-06-01 10:23:18 | INFO     | src.training.trainer | macro_f1                       0.9667
...
2024-06-01 10:23:19 | INFO     | __main__ | Training finished successfully.
2024-06-01 10:23:19 | INFO     | __main__ |   Accuracy   : 0.9667
2024-06-01 10:23:19 | INFO     | __main__ |   Model path : /home/.../models/iris_classifier.pkl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What just happened internally?
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scripts/train.py
    └── calls run_training() in src/training/trainer.py
            │
            ├── 1. load_dataset()       → loads 150 Iris rows from scikit-learn
            ├── 2. split_data()         → 120 train, 30 test (stratified)
            ├── 3. build_pipeline()     → StandardScaler + RandomForestClassifier
            ├── 4. GridSearchCV.fit()   → tries 18 hyperparameter combinations (5-fold CV each)
            ├── 5. compute_metrics()    → accuracy, F1, etc. on held-out test set
            ├── 6. mlflow.log_*()       → saves params + metrics + model to mlruns/
            └── 7. pickle.dump()        → saves best model to models/iris_classifier.pkl
                                           saves label encoder to models/label_encoder.pkl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
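&lt;p&gt;Steps 1 and 2 of that flow can be sketched as follows (assumed shape of &lt;code&gt;loader.py&lt;/code&gt;; the real module may differ):&lt;/p&gt;

```python
# Assumed sketch of src/data/loader.py: load the 150 Iris rows and
# make a stratified 80/20 split.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

def load_and_split(test_size: float = 0.2, seed: int = 42):
    X, y = load_iris(return_X_y=True)  # 150 samples, 4 features
    # stratify=y keeps the 50/50/50 class balance in both splits
    return train_test_split(X, y, test_size=test_size, stratify=y, random_state=seed)

X_train, X_test, y_train, y_test = load_and_split()
print(len(X_train), len(X_test))  # 120 30
```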



&lt;h3&gt;
  
  
  Verify the output artifacts
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls &lt;/span&gt;models/
&lt;span class="c"&gt;# iris_classifier.pkl   label_encoder.pkl&lt;/span&gt;

&lt;span class="nb"&gt;ls &lt;/span&gt;mlruns/
&lt;span class="c"&gt;# 0/   (experiment folder)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  CLI flags
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Custom experiment name&lt;/span&gt;
python scripts/train.py &lt;span class="nt"&gt;--experiment&lt;/span&gt; my-experiment-v2

&lt;span class="c"&gt;# Save results to a JSON file&lt;/span&gt;
python scripts/train.py &lt;span class="nt"&gt;--output-json&lt;/span&gt; results/run1.json

&lt;span class="c"&gt;# See all options&lt;/span&gt;
python scripts/train.py &lt;span class="nt"&gt;--help&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  9. Step 6 — Explore MLflow Experiment Tracking
&lt;/h2&gt;

&lt;p&gt;MLflow automatically captured everything about the training run. Let's view it.&lt;/p&gt;

&lt;p&gt;Open a &lt;strong&gt;new terminal&lt;/strong&gt; (keep your first terminal free for the API later). Activate the venv:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
mlflow ui &lt;span class="nt"&gt;--backend-store-uri&lt;/span&gt; mlruns &lt;span class="nt"&gt;--port&lt;/span&gt; 5000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open your browser and go to &lt;strong&gt;&lt;a href="http://localhost:5000" rel="noopener noreferrer"&gt;http://localhost:5000&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What you'll see in the MLflow UI
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Experiments list:&lt;/strong&gt; You'll see &lt;code&gt;iris-classifier&lt;/code&gt; with one run logged.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inside the run, explore:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Parameters tab&lt;/strong&gt; — the hyperparameter values GridSearchCV chose as best:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  cv_folds              5
  test_size             0.2
  classifier__n_estimators    100
  classifier__max_depth       None
  classifier__min_samples_split  2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Metrics tab&lt;/strong&gt; — all evaluation scores:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  cv_best_score         0.9583
  test_accuracy         0.9667
  test_macro_f1         0.9667
  test_macro_precision  0.9683
  test_macro_recall     0.9667
  f1_setosa             1.0000
  f1_versicolor         0.9333
  f1_virginica          0.9667
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Artifacts tab&lt;/strong&gt; — the saved model files and a preview of the input schema&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Models tab&lt;/strong&gt; (top menu) — the &lt;code&gt;IrisClassifier&lt;/code&gt; registered model with version history&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why does this matter for a portfolio?&lt;/strong&gt; In a real company, dozens of engineers run hundreds of experiments. MLflow lets you compare them all — which model was best? What were its settings? What data was it trained on? This is how ML teams avoid the "I don't know which model is in production" problem.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Run training a second time and compare
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Back in your first terminal:&lt;/span&gt;
python scripts/train.py &lt;span class="nt"&gt;--experiment&lt;/span&gt; iris-classifier
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now refresh the MLflow UI. You'll see two runs side by side. Click the checkboxes on both and hit &lt;strong&gt;Compare&lt;/strong&gt; to see a diff of parameters and metrics.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. Step 7 — Start the FastAPI Inference Server
&lt;/h2&gt;

&lt;p&gt;The trained model is now saved to disk. Let's serve it as a REST API.&lt;/p&gt;

&lt;p&gt;In your first terminal (with venv activated):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uvicorn api.main:app &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="nt"&gt;--port&lt;/span&gt; 8000 &lt;span class="nt"&gt;--reload&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--reload&lt;/code&gt; flag means the server restarts automatically when you edit code — great for development.&lt;/p&gt;

&lt;p&gt;You should see:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INFO:     Started server process [12345]
INFO:     Waiting for application startup.
INFO:     Starting Iris Classifier API v1.0.0
INFO:     Model ready | classes=['setosa', 'versicolor', 'virginica']
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open your browser at &lt;strong&gt;&lt;a href="http://localhost:8000/docs" rel="noopener noreferrer"&gt;http://localhost:8000/docs&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Interactive API Docs (Swagger UI)
&lt;/h3&gt;

&lt;p&gt;FastAPI automatically generates an interactive documentation page from your code. You don't write this HTML — it's created from your Pydantic schemas and route definitions.&lt;/p&gt;

&lt;p&gt;You'll see three endpoints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;POST /predict/&lt;/code&gt; — single flower prediction&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;POST /predict/batch&lt;/code&gt; — multiple flowers at once&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GET /health&lt;/code&gt; — is the model loaded?&lt;/li&gt;
&lt;/ul&gt;
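&lt;p&gt;The request body those endpoints accept comes from a Pydantic schema in &lt;code&gt;api/schemas.py&lt;/code&gt;, plausibly shaped like this (field names inferred from the request examples later in this section; the real file may add value ranges and response models):&lt;/p&gt;

```python
# Assumed sketch of api/schemas.py; FastAPI renders the Swagger form
# and validates incoming JSON from this class automatically.
from pydantic import BaseModel

class IrisFeatures(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

sample = IrisFeatures(sepal_length=5.1, sepal_width=3.5,
                      petal_length=1.4, petal_width=0.2)
print(sample.petal_width)
```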

&lt;h3&gt;
  
  
  Understanding the URL &lt;code&gt;http://0.0.0.0:8000&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://  0.0.0.0  :  8000  /docs
  │         │          │       │
  │         │          │       └── Path (Swagger UI page)
  │         │          └────────── Port number
  │         └───────────────────── "All network interfaces" = accessible from anywhere on this machine
  └─────────────────────────────── Protocol
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;0.0.0.0&lt;/code&gt; as the host means "listen on all network interfaces." When you open it in a browser, you use &lt;code&gt;localhost&lt;/code&gt; or &lt;code&gt;127.0.0.1&lt;/code&gt; instead.&lt;/p&gt;




&lt;h2&gt;
  
  
  11. Step 8 — Make Predictions via the API
&lt;/h2&gt;

&lt;p&gt;You have three ways to call the API. Try all three — each is used in different real-world scenarios.&lt;/p&gt;

&lt;h3&gt;
  
  
  Method A — Swagger UI (browser)
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;strong&gt;&lt;a href="http://localhost:8000/docs" rel="noopener noreferrer"&gt;http://localhost:8000/docs&lt;/a&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click on &lt;code&gt;POST /predict/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Try it out&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Replace the request body with:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"sepal_length"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;5.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"sepal_width"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"petal_length"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"petal_width"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Click &lt;strong&gt;Execute&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Scroll down to see the response&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Method B — curl (terminal)
&lt;/h3&gt;

&lt;p&gt;Open a third terminal and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/predict/ &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"predicted_class"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"setosa"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.98&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"class_probabilities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"setosa"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.98&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"versicolor"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"virginica"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Method C — Python requests (script)
&lt;/h3&gt;

&lt;p&gt;Create a quick test script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# test_request.py  (create this in the project root)
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8000/predict/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sepal_length&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;6.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sepal_width&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;3.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;petal_length&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;6.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;petal_width&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;2.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;requests   &lt;span class="c"&gt;# if not already installed&lt;/span&gt;
python test_request.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Method D — Batch prediction
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/predict/batch &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "samples": [
      {"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2},
      {"sepal_length": 6.3, "sepal_width": 3.3, "petal_length": 6.0, "petal_width": 2.5},
      {"sepal_length": 7.0, "sepal_width": 3.2, "petal_length": 4.7, "petal_width": 1.4}
    ]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding what happens on each request
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HTTP POST /predict/
         │
         ▼ api/routers/predict.py
         │  Pydantic validates the JSON (correct types? in range?)
         │  If invalid → 422 Unprocessable Entity (automatic)
         │
         ▼ Depends(get_predictor)
         │  FastAPI calls get_predictor() to inject the Predictor object
         │  lru_cache means the model is NOT reloaded on every request
         │
         ▼ predictor.predict([5.1, 3.5, 1.4, 0.2])
         │  Builds a pandas DataFrame with the correct column names
         │  Runs pipeline.predict() — scaler transforms, then RF predicts
         │  Runs pipeline.predict_proba() — gets probability per class
         │
         ▼ Returns PredictResponse
            FastAPI serializes it to JSON and sends HTTP 200
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Testing input validation
&lt;/h3&gt;

&lt;p&gt;FastAPI + Pydantic automatically validates every request. Try sending a bad value:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/predict/ &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"sepal_length": -5, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll get a &lt;code&gt;422 Unprocessable Entity&lt;/code&gt; with a clear error message — no custom error handling code needed. This is the power of Pydantic.&lt;/p&gt;
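&lt;p&gt;For reference, the 422 body follows FastAPI's standard error envelope: a &lt;code&gt;detail&lt;/code&gt; list with a &lt;code&gt;loc&lt;/code&gt;, &lt;code&gt;msg&lt;/code&gt;, and &lt;code&gt;type&lt;/code&gt; entry per failed field. The body below is abridged and the exact &lt;code&gt;msg&lt;/code&gt;/&lt;code&gt;type&lt;/code&gt; strings vary across FastAPI/Pydantic versions, but the shape is stable enough to parse programmatically:&lt;/p&gt;

```python
import json

# Abridged example of a 422 body; exact msg/type wording varies by version
body = """
{
  "detail": [
    {"loc": ["body", "sepal_length"],
     "msg": "Input should be greater than or equal to 0",
     "type": "greater_than_equal"}
  ]
}
"""

error = json.loads(body)["detail"][0]
print(error["loc"])  # which field failed: ['body', 'sepal_length']
```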




&lt;h2&gt;
  
  
  12. Step 9 — Run the Test Suite
&lt;/h2&gt;

&lt;p&gt;Stop the API server for now (&lt;code&gt;Ctrl+C&lt;/code&gt;). Let's run the automated tests.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run all tests with verbose output&lt;/span&gt;
pytest tests/ &lt;span class="nt"&gt;-v&lt;/span&gt;

&lt;span class="c"&gt;# Run with a coverage report&lt;/span&gt;
pytest tests/ &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="nt"&gt;--cov&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;src &lt;span class="nt"&gt;--cov&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;api &lt;span class="nt"&gt;--cov-report&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;term-missing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the test output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tests/test_data_loader.py::TestLoadDataset::test_returns_dataframe_and_series PASSED
tests/test_data_loader.py::TestLoadDataset::test_correct_shape PASSED
tests/test_data_loader.py::TestSplitData::test_split_sizes PASSED
...
tests/test_api.py::TestPredictEndpoint::test_valid_request_200 PASSED
tests/test_api.py::TestPredictEndpoint::test_missing_field_422 PASSED
tests/test_api.py::TestPredictEndpoint::test_negative_value_422 PASSED
...

---------- coverage: src ----------
Name                     Stmts  Miss  Cover
src/config.py              28     3    89%
src/data/loader.py         32     2    94%
src/training/trainer.py    58    12    79%
src/inference/predictor.py 55     4    93%
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What each test file covers
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File&lt;/th&gt;
&lt;th&gt;What it tests&lt;/th&gt;
&lt;th&gt;Key technique&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;test_data_loader.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Shape, columns, no nulls, stratification&lt;/td&gt;
&lt;td&gt;Direct assertion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;test_metrics.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Perfect vs imperfect predictions, rounding&lt;/td&gt;
&lt;td&gt;Parametrized fixtures&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;test_predictor.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Model loading, predict output, error cases&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;unittest.mock.patch&lt;/code&gt; to fake disk paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;test_api.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;HTTP status codes, response schema, validation&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;FastAPI TestClient&lt;/code&gt; — no real server needed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why &lt;code&gt;test_predictor.py&lt;/code&gt; uses mock patches:&lt;/strong&gt;&lt;br&gt;
The &lt;code&gt;Predictor&lt;/code&gt; class loads &lt;code&gt;.pkl&lt;/code&gt; files from disk. In tests, we don't want to depend on a pre-trained model existing. Instead, we use &lt;code&gt;unittest.mock.patch&lt;/code&gt; to replace the file paths with a temp directory containing a freshly trained mini-model. This makes the tests fast, isolated, and runnable in CI.&lt;/p&gt;
&lt;/blockquote&gt;
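&lt;p&gt;The names below are illustrative stand-ins (the real tests live in &lt;code&gt;tests/test_predictor.py&lt;/code&gt; and patch the actual &lt;code&gt;src.config&lt;/code&gt; module), but the technique looks roughly like this: dump a throwaway artifact into a temp directory, then patch the path constant that the loader reads.&lt;/p&gt;

```python
import pickle
import tempfile
from pathlib import Path
from unittest.mock import patch

# Stand-in for src.config; in the real tests you'd patch the actual module
class config:
    MODEL_PATH = Path("/nonexistent/iris_classifier.pkl")

def load_model():
    """Stand-in for the Predictor's loading step: read whatever config points at."""
    with open(config.MODEL_PATH, "rb") as f:
        return pickle.load(f)

with tempfile.TemporaryDirectory() as tmp:
    # Write a throwaway artifact where the "real" model would normally live
    fake_path = Path(tmp) / "iris_classifier.pkl"
    fake_path.write_bytes(pickle.dumps({"classes": ["setosa", "versicolor", "virginica"]}))

    # Redirect the path constant only for the duration of the test
    with patch.object(config, "MODEL_PATH", fake_path):
        model = load_model()
        print(model["classes"][0])  # setosa
```

&lt;p&gt;When the &lt;code&gt;patch.object&lt;/code&gt; context exits, &lt;code&gt;config.MODEL_PATH&lt;/code&gt; is restored automatically, so tests never leak state into each other.&lt;/p&gt;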

&lt;h3&gt;
  
  
  Run a single test file
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pytest tests/test_api.py &lt;span class="nt"&gt;-v&lt;/span&gt;
pytest tests/test_data_loader.py &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Run tests matching a pattern
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pytest tests/ &lt;span class="nt"&gt;-k&lt;/span&gt; &lt;span class="s2"&gt;"test_valid_request"&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt;
pytest tests/ &lt;span class="nt"&gt;-k&lt;/span&gt; &lt;span class="s2"&gt;"batch"&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  13. Step 10 — Run Everything with Docker
&lt;/h2&gt;

&lt;p&gt;So far we've been running services manually in separate terminals. Docker Compose lets you run the &lt;strong&gt;entire stack with one command&lt;/strong&gt; and tear it all down just as easily.&lt;/p&gt;

&lt;h3&gt;
  
  
  Make sure Docker Desktop is running
&lt;/h3&gt;

&lt;p&gt;Check the Docker Desktop taskbar icon — it should say "Docker Desktop is running."&lt;/p&gt;

&lt;h3&gt;
  
  
  Build and start all services
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up &lt;span class="nt"&gt;--build&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first build takes 3–5 minutes (it downloads base images and installs packages). Subsequent starts are fast.&lt;/p&gt;

&lt;p&gt;Watch the output — you'll see three services starting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;mlflow_server  | [INFO] Starting MLflow server...
mlflow_server  | [INFO] Listening on http://0.0.0.0:5000
ml_trainer     | [INFO] Loading Iris dataset...
ml_trainer     | [INFO] Training finished. Accuracy: 0.9667
ml_trainer     | [INFO] Model saved to /app/models/iris_classifier.pkl
ml_trainer exited with code 0       ← trainer exits after one run (this is normal)
ml_api         | [INFO] Model ready | classes=['setosa', 'versicolor', 'virginica']
ml_api         | [INFO] Uvicorn running on http://0.0.0.0:8000
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now open:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API docs:&lt;/strong&gt; &lt;a href="http://localhost:8000/docs" rel="noopener noreferrer"&gt;http://localhost:8000/docs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MLflow UI:&lt;/strong&gt; &lt;a href="http://localhost:5000" rel="noopener noreferrer"&gt;http://localhost:5000&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why the trainer exits
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;trainer&lt;/code&gt; service is configured with &lt;code&gt;restart: "no"&lt;/code&gt; — it runs the training job once and exits. This is intentional. In production, you'd trigger retraining on a schedule (via cron or a CI job), not keep a process running forever.&lt;/p&gt;

&lt;h3&gt;
  
  
  Re-run training inside Docker (without rebuilding)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose run &lt;span class="nt"&gt;--rm&lt;/span&gt; trainer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This spins up a fresh trainer container, trains the model, saves it to the shared volume, and exits.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stop all services
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose down
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding Docker volumes
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;models/&lt;/code&gt; directory is shared between the trainer and the API using a Docker &lt;strong&gt;named volume&lt;/strong&gt; called &lt;code&gt;models_vol&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────┐         ┌──────────────────┐
│  trainer       │ writes  │  models_vol      │
│  container     │────────►│  (Docker volume) │
└────────────────┘         └────────┬─────────┘
                                    │ reads
                           ┌────────▼─────────┐
                           │  api             │
                           │  container       │
                           └──────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means you can retrain the model and the running API picks up the new model &lt;strong&gt;without rebuilding or redeploying the API image.&lt;/strong&gt;&lt;/p&gt;
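&lt;p&gt;In &lt;code&gt;docker-compose.yml&lt;/code&gt; this wiring looks roughly like the following (an abridged sketch; see the repo's actual file for the full service definitions):&lt;/p&gt;

```yaml
services:
  trainer:
    volumes:
      - models_vol:/app/models   # trainer writes the trained .pkl here
  api:
    volumes:
      - models_vol:/app/models   # api reads the same files

volumes:
  models_vol:                    # named volume shared by both services
```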

&lt;h3&gt;
  
  
  Useful Docker commands
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# See running containers&lt;/span&gt;
docker ps

&lt;span class="c"&gt;# See logs from the API container&lt;/span&gt;
docker logs ml_api &lt;span class="nt"&gt;-f&lt;/span&gt;

&lt;span class="c"&gt;# Open a shell inside the API container (for debugging)&lt;/span&gt;
docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; ml_api bash

&lt;span class="c"&gt;# Remove everything including volumes (full reset)&lt;/span&gt;
docker compose down &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  14. Step 11 — Schedule Training with Cron
&lt;/h2&gt;

&lt;p&gt;Cron is a Unix tool that runs commands on a schedule. We've included a &lt;code&gt;crontab.txt&lt;/code&gt; that runs the training pipeline every Monday at 2 AM.&lt;/p&gt;

&lt;h3&gt;
  
  
  View the schedule
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;crontab.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Retrain model every Monday at 02:00 AM&lt;/span&gt;
0 2 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; 1 &lt;span class="nb"&gt;cd&lt;/span&gt; /app &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; bash scripts/run_pipeline.sh &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /var/log/ml_pipeline_cron.log 2&amp;gt;&amp;amp;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding cron syntax
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0   2   *   *   1
│   │   │   │   │
│   │   │   │   └── Day of week: 1 = Monday (0=Sun, 6=Sat)
│   │   │   └────── Month: * = every month
│   │   └────────── Day of month: * = every day
│   └────────────── Hour: 2 = 2 AM
└────────────────── Minute: 0 = on the hour
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
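&lt;p&gt;For comparison, a few other common schedules (illustrative only, not part of the repo's &lt;code&gt;crontab.txt&lt;/code&gt;):&lt;/p&gt;

```plaintext
0 0 * * *      # every day at midnight
*/15 * * * *   # every 15 minutes
0 2 1 * *      # 02:00 on the 1st of every month
```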



&lt;h3&gt;
  
  
  Install the crontab (Linux/Mac only)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crontab crontab.txt

&lt;span class="c"&gt;# Verify it's installed&lt;/span&gt;
crontab &lt;span class="nt"&gt;-l&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Test the pipeline script manually
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bash scripts/run_pipeline.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This produces timestamped log files in &lt;code&gt;logs/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;logs/
├── train_20240601_102315.log
└── results_20240601_102315.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Cron inside Docker
&lt;/h3&gt;

&lt;p&gt;To run cron &lt;em&gt;inside&lt;/em&gt; the Docker trainer container instead of on the host machine, override the trainer's &lt;code&gt;command&lt;/code&gt; in &lt;code&gt;docker-compose.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;trainer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cron -f&lt;/span&gt;   &lt;span class="c1"&gt;# runs cron daemon in foreground (keeps container alive)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  15. Architecture Deep Dive
&lt;/h2&gt;

&lt;p&gt;This section explains the key architectural decisions — the "why" behind the code. This is exactly what interviewers and video viewers want to understand.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why &lt;code&gt;src/config.py&lt;/code&gt; is the single source of truth
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# src/config.py
&lt;/span&gt;&lt;span class="n"&gt;MODEL_PATH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MODELS_DIR&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.pkl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;MLFLOW_TRACKING_URI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MLFLOW_TRACKING_URI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file://&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MLRUNS_DIR&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every path, every setting, every environment variable lives here. No other file hardcodes a path or reads an env var. If you need to change where models are stored, you change &lt;strong&gt;one line&lt;/strong&gt; in &lt;code&gt;config.py&lt;/code&gt; and it propagates everywhere.&lt;/p&gt;
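&lt;p&gt;A minimal sketch of the pattern (the names mirror the excerpt above; how &lt;code&gt;PROJECT_ROOT&lt;/code&gt; is derived here is an assumption, and the real file defines more settings):&lt;/p&gt;

```python
import os
from pathlib import Path

# Minimal sketch of src/config.py; the real file derives PROJECT_ROOT differently
PROJECT_ROOT = Path(os.getenv("PROJECT_ROOT", "."))
MODELS_DIR = PROJECT_ROOT / "models"
MODEL_NAME = "iris_classifier"
MODEL_PATH = MODELS_DIR / f"{MODEL_NAME}.pkl"
MLFLOW_TRACKING_URI = os.getenv("MLFLOW_TRACKING_URI", f"file://{PROJECT_ROOT / 'mlruns'}")

# Everywhere else: from src.config import MODEL_PATH  (never a hardcoded string)
print(MODEL_PATH)
```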

&lt;h3&gt;
  
  
  Why the &lt;code&gt;sklearn Pipeline&lt;/code&gt; prevents data leakage
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# src/training/pipeline.py
&lt;/span&gt;&lt;span class="nc"&gt;Pipeline&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;scaler&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;StandardScaler&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;classifier&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;RandomForestClassifier&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without a Pipeline, you might do this (which is wrong):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ❌ WRONG — data leakage
&lt;/span&gt;&lt;span class="n"&gt;scaler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;X_train_scaled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;scaler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;X_test_scaled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;scaler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# GridSearchCV also scales training data, but it already "saw" the test fold stats
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With a Pipeline inside GridSearchCV, the scaler is fit &lt;em&gt;only on the training portion of each fold&lt;/em&gt;, never on the validation data. This gives you an honest estimate of real-world performance.&lt;/p&gt;
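&lt;p&gt;Here is the leakage-free version end to end: the Pipeline goes &lt;em&gt;inside&lt;/em&gt; GridSearchCV, so each CV fold re-fits the scaler on its own training portion only. The tiny grid below is illustrative, not the project's &lt;code&gt;HYPERPARAMETER_GRID&lt;/code&gt;:&lt;/p&gt;

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=42
)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(random_state=42)),
])

# The scaler is re-fit inside every CV fold, never on that fold's validation data
grid = GridSearchCV(pipe, {"classifier__n_estimators": [50, 100]}, cv=5)
grid.fit(X_train, y_train)
print(round(grid.score(X_test, y_test), 3))
```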

&lt;h3&gt;
  
  
  Why &lt;code&gt;lru_cache&lt;/code&gt; on &lt;code&gt;get_predictor()&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# src/inference/predictor.py
&lt;/span&gt;&lt;span class="nd"&gt;@lru_cache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_predictor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Predictor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Predictor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;lru_cache&lt;/code&gt; memoises the function — after the first call, it returns the cached result without calling the function again.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without it:&lt;/strong&gt; Every HTTP request would load &lt;code&gt;iris_classifier.pkl&lt;/code&gt; from disk → slow.&lt;br&gt;
&lt;strong&gt;With it:&lt;/strong&gt; The model loads once at startup, all requests share the same in-memory instance → fast.&lt;/p&gt;
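&lt;p&gt;You can see the effect with a stand-in class (the real &lt;code&gt;Predictor&lt;/code&gt; does expensive pickle loading in &lt;code&gt;__init__&lt;/code&gt;; here a counter takes its place):&lt;/p&gt;

```python
from functools import lru_cache

LOADS = 0

class Predictor:
    """Stand-in; the real class unpickles the model from disk here."""
    def __init__(self):
        global LOADS
        LOADS += 1  # count how many times the "model" gets loaded

@lru_cache(maxsize=1)
def get_predictor() -> "Predictor":
    return Predictor()

a = get_predictor()
b = get_predictor()
print(a is b, LOADS)  # True 1 -- one shared instance, loaded exactly once
```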
&lt;h3&gt;
  
  
  Why the Predictor is separate from the API
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;src/inference/predictor.py&lt;/code&gt; has zero imports from &lt;code&gt;fastapi&lt;/code&gt;. This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can import and use &lt;code&gt;Predictor&lt;/code&gt; in a Celery worker, a CLI script, or a Jupyter notebook without FastAPI&lt;/li&gt;
&lt;li&gt;You can unit-test it with &lt;code&gt;pytest&lt;/code&gt; without starting a web server&lt;/li&gt;
&lt;li&gt;You could swap FastAPI for Flask or gRPC and the predictor code would be unchanged&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Why Pydantic schemas are worth the boilerplate
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# api/schemas.py
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PredictRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sepal_length&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;20.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;code&gt;ge=0.0&lt;/code&gt; means "greater than or equal to 0." &lt;code&gt;le=20.0&lt;/code&gt; means "less than or equal to 20."&lt;/p&gt;

&lt;p&gt;For free, you get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatic HTTP 422 if a user sends &lt;code&gt;"sepal_length": "hello"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Automatic HTTP 422 if a user sends &lt;code&gt;"sepal_length": -5&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Auto-generated OpenAPI documentation at &lt;code&gt;/docs&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Type hints that your IDE understands&lt;/li&gt;
&lt;/ul&gt;
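&lt;p&gt;A self-contained taste of that behaviour, using only the one field shown above (the real &lt;code&gt;api/schemas.py&lt;/code&gt; declares all four features):&lt;/p&gt;

```python
from pydantic import BaseModel, Field, ValidationError

class PredictRequest(BaseModel):
    # Mirrors the single field from the excerpt above
    sepal_length: float = Field(..., ge=0.0, le=20.0)

ok = PredictRequest(sepal_length=5.1)   # passes validation
print(ok.sepal_length)

try:
    PredictRequest(sepal_length=-5)     # ge=0.0 violated
except ValidationError as err:
    print("rejected:", err.errors()[0]["loc"])  # the offending field
```

&lt;p&gt;Inside FastAPI, that same &lt;code&gt;ValidationError&lt;/code&gt; is what gets translated into the HTTP 422 response automatically.&lt;/p&gt;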


&lt;h2&gt;
  
  
  16. How the Code Flows Together
&lt;/h2&gt;

&lt;p&gt;Here is a complete trace of what happens when you run &lt;code&gt;python scripts/train.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scripts/train.py
│
│  parse_args() — reads --experiment, --tracking-uri from CLI
│  sets os.environ for config.py to pick up
│
└── run_training()                               [src/training/trainer.py]
    │
    ├── _configure_mlflow()
    │     mlflow.set_tracking_uri(...)
    │     mlflow.set_experiment("iris-classifier")
    │
    ├── load_dataset()                           [src/data/loader.py]
    │     load_iris(as_frame=True)
    │     map integer targets → "setosa", "versicolor", "virginica"
    │     returns X: DataFrame(150×4), y: Series(150,)
    │
    ├── split_data(X, y)
    │     train_test_split(stratify=y, test_size=0.2)
    │     returns X_train(120×4), X_test(30×4), y_train, y_test
    │
    ├── get_label_encoder(y)
    │     LabelEncoder().fit(["setosa","versicolor","virginica"])
    │
    ├── build_pipeline()                         [src/training/pipeline.py]
    │     Pipeline([StandardScaler(), RandomForestClassifier()])
    │
    ├── GridSearchCV(pipeline, HYPERPARAMETER_GRID, cv=5)
    │
    ├── with mlflow.start_run():
    │     │
    │     ├── grid_search.fit(X_train, y_train)
    │     │     Tries 18 combinations × 5 folds = 90 model fits
    │     │     Retrains best params on full X_train
    │     │
    │     ├── mlflow.log_params(best_params)
    │     │
    │     ├── best_model.predict(X_test) → y_pred
    │     │
    │     ├── compute_metrics(y_test, y_pred)    [src/evaluation/metrics.py]
    │     │     accuracy_score, f1_score, precision_score, recall_score
    │     │
    │     ├── mlflow.log_metrics(metrics)
    │     │
    │     ├── mlflow.sklearn.log_model(best_model, registered_model_name="IrisClassifier")
    │     │     Saves model to mlruns/&amp;lt;experiment_id&amp;gt;/&amp;lt;run_id&amp;gt;/artifacts/model/
    │     │
    │     └── pickle.dump(best_model, "models/iris_classifier.pkl")
    │         pickle.dump(label_encoder, "models/label_encoder.pkl")
    │
    └── returns { run_id, best_params, metrics, model_path }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And when you call &lt;code&gt;POST /predict/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HTTP POST /predict/  {"sepal_length": 5.1, ...}
│
└── api/routers/predict.py: predict()
    │
    ├── Pydantic validates the request body
    │   PredictRequest(sepal_length=5.1, sepal_width=3.5, ...)
    │
    ├── Depends(get_predictor) → returns cached Predictor instance
    │
    ├── request.to_feature_list() → [5.1, 3.5, 1.4, 0.2]
    │
    └── predictor.predict([5.1, 3.5, 1.4, 0.2])
        │                                       [src/inference/predictor.py]
        ├── pd.DataFrame([[5.1, 3.5, 1.4, 0.2]], columns=FEATURE_NAMES)
        ├── pipeline.predict(X) → ["setosa"]
        ├── pipeline.predict_proba(X) → [[0.98, 0.01, 0.01]]
        └── return {
              "predicted_class": "setosa",
              "confidence": 0.98,
              "class_probabilities": {"setosa": 0.98, ...}
            }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  17. Common Errors &amp;amp; How to Fix Them
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ❌ &lt;code&gt;ModuleNotFoundError: No module named 'src'&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; Running Python from the wrong directory, or the venv is not activated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Make sure you are in the project root&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; /path/to/ml-pipeline

&lt;span class="c"&gt;# Make sure the venv is activated (you should see (.venv) in your prompt)&lt;/span&gt;
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate

&lt;span class="c"&gt;# Then run again&lt;/span&gt;
python scripts/train.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  ❌ &lt;code&gt;ModelNotFoundError: Model not found at '.../models/iris_classifier.pkl'&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; You started the API before running training. The model doesn't exist yet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run training first&lt;/span&gt;
python scripts/train.py

&lt;span class="c"&gt;# Then start the API&lt;/span&gt;
uvicorn api.main:app &lt;span class="nt"&gt;--reload&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
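
&lt;p&gt;If you want the API to fail with a clearer hint when this happens, a startup check along these lines helps. This is only a sketch: the path mirrors the error message above, and it is not the project's actual startup code:&lt;/p&gt;

```python
from pathlib import Path

# Path taken from the error message; adjust if your layout differs.
MODEL_PATH = Path("models/iris_classifier.pkl")

def ensure_model_exists(path: Path = MODEL_PATH) -> None:
    # Fail fast with an actionable hint instead of a bare traceback.
    if not path.exists():
        raise FileNotFoundError(
            f"Model not found at '{path}'. Run 'python scripts/train.py' first."
        )
```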






&lt;h3&gt;
  
  
  ❌ &lt;code&gt;Address already in use&lt;/code&gt; (port 8000 or 5000)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; Something else is already using that port (possibly a previous server you didn't stop).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Find what's using port 8000&lt;/span&gt;
lsof &lt;span class="nt"&gt;-i&lt;/span&gt; :8000        &lt;span class="c"&gt;# Linux/Mac&lt;/span&gt;
netstat &lt;span class="nt"&gt;-ano&lt;/span&gt; | findstr :8000   &lt;span class="c"&gt;# Windows&lt;/span&gt;

&lt;span class="c"&gt;# Kill it (replace PID with the process ID from above)&lt;/span&gt;
&lt;span class="nb"&gt;kill&lt;/span&gt; &lt;span class="nt"&gt;-9&lt;/span&gt; &amp;lt;PID&amp;gt;

&lt;span class="c"&gt;# Or use a different port&lt;/span&gt;
uvicorn api.main:app &lt;span class="nt"&gt;--port&lt;/span&gt; 8001
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
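
&lt;p&gt;If you just need &lt;em&gt;any&lt;/em&gt; free port rather than hunting down the offending process, you can ask the OS for one with the standard library:&lt;/p&gt;

```python
import socket

def find_free_port() -> int:
    # Binding to port 0 asks the OS for any unused ephemeral port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

port = find_free_port()
print(f"uvicorn api.main:app --port {port}")
```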






&lt;h3&gt;
  
  
  ❌ &lt;code&gt;docker: command not found&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; Docker is not installed, or Docker Desktop is not running.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Install Docker Desktop if it's missing; otherwise open it and wait for it to say "Docker is running."&lt;/p&gt;




&lt;h3&gt;
  
  
  ❌ &lt;code&gt;Permission denied&lt;/code&gt; when running &lt;code&gt;run_pipeline.sh&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; The script is not marked as executable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x scripts/run_pipeline.sh
bash scripts/run_pipeline.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  ❌ MLflow UI shows nothing / empty experiments
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; The &lt;code&gt;mlruns/&lt;/code&gt; folder doesn't exist yet (training hasn't been run), or you're pointing at the wrong URI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Make sure you train first&lt;/span&gt;
python scripts/train.py

&lt;span class="c"&gt;# Then start MLflow pointing at the right folder&lt;/span&gt;
mlflow ui &lt;span class="nt"&gt;--backend-store-uri&lt;/span&gt; mlruns &lt;span class="nt"&gt;--port&lt;/span&gt; 5000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  ❌ &lt;code&gt;422 Unprocessable Entity&lt;/code&gt; from the API
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; Your request body is missing a field or has an invalid value (e.g., a negative measurement).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Check the error response body — FastAPI tells you exactly which field is wrong:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"detail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"loc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sepal_length"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"msg"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ensure this value is greater than or equal to 0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"value_error.number.not_ge"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
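
&lt;p&gt;To see why the API answers this way, here is a hand-rolled sketch of the validation Pydantic performs for you. The real rules live in the project's request schema; this only mimics the required-fields and non-negative checks:&lt;/p&gt;

```python
# Hand-rolled sketch of the checks Pydantic runs automatically: required
# fields plus a ge=0 constraint on every measurement.
REQUIRED_FIELDS = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

def validate_body(body: dict) -> list:
    # Returns a 422-style "detail" list; empty means the body is valid.
    errors = []
    for field in REQUIRED_FIELDS:
        if field not in body:
            errors.append({"loc": ["body", field], "msg": "field required"})
        elif not isinstance(body[field], (int, float)) or body[field] < 0:
            errors.append({
                "loc": ["body", field],
                "msg": "ensure this value is greater than or equal to 0",
            })
    return errors
```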






&lt;h2&gt;
  
  
  18. Extending the Project
&lt;/h2&gt;

&lt;p&gt;Once you're comfortable with the project, here are ways to make it even more impressive:&lt;/p&gt;

&lt;h3&gt;
  
  
  Swap in a different dataset
&lt;/h3&gt;

&lt;p&gt;Replace the Iris loader in &lt;code&gt;src/data/loader.py&lt;/code&gt; with any CSV:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_dataset&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data/raw/your_dataset.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;target&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;target&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Everything else — training, MLflow logging, FastAPI — works unchanged.&lt;/p&gt;

&lt;h3&gt;
  
  
  Add a new model (XGBoost)
&lt;/h3&gt;

&lt;p&gt;In &lt;code&gt;src/training/pipeline.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;xgboost&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;XGBClassifier&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_pipeline&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Pipeline&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;scaler&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;StandardScaler&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;classifier&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;XGBClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;use_label_encoder&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eval_metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;logloss&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Update &lt;code&gt;HYPERPARAMETER_GRID&lt;/code&gt; in &lt;code&gt;src/config.py&lt;/code&gt; to match XGBoost params.&lt;/p&gt;
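
&lt;p&gt;For example, a grid along these lines (the values are illustrative, not tuned; the &lt;code&gt;classifier__&lt;/code&gt; prefix assumes the pipeline step is named &lt;code&gt;classifier&lt;/code&gt;, as in the snippet above):&lt;/p&gt;

```python
# Illustrative XGBoost grid only; values are not tuned. The "classifier__"
# prefix routes each parameter to the pipeline step named "classifier".
HYPERPARAMETER_GRID = {
    "classifier__n_estimators": [100, 200],
    "classifier__max_depth": [3, 5],
    "classifier__learning_rate": [0.05, 0.1],
}
```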

&lt;h3&gt;
  
  
  Promote the best model in MLflow
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# scripts/promote_best_model.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;mlflow&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mlflow.tracking&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MlflowClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MlflowClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;runs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search_runs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metrics.test_accuracy DESC&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;best_run_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;runs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transition_model_version_stage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IrisClassifier&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;stage&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Production&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Add GitHub Actions CI
&lt;/h3&gt;

&lt;p&gt;Push your project to GitHub — the &lt;code&gt;.github/workflows/ci.yml&lt;/code&gt; file is already written. It will automatically run lint → tests → training smoke test → Docker build on every push.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git init
git add &lt;span class="nb"&gt;.&lt;/span&gt;
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"feat: initial ML pipeline"&lt;/span&gt;
git remote add origin https://github.com/YOUR_USERNAME/ml-pipeline.git
git push &lt;span class="nt"&gt;-u&lt;/span&gt; origin main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Quick Reference Card
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────────┐
│                    QUICK REFERENCE                              │
├────────────────────────────┬────────────────────────────────────┤
│  Setup                     │  source .venv/bin/activate         │
│  Train model               │  python scripts/train.py           │
│  Start API                 │  uvicorn api.main:app --reload      │
│  MLflow UI                 │  mlflow ui --backend-store-uri mlruns│
│  Run tests                 │  pytest tests/ -v                  │
│  Run tests + coverage      │  pytest tests/ --cov=src --cov=api │
│  Docker (all services)     │  docker compose up --build         │
│  Docker (retrain only)     │  docker compose run --rm trainer   │
│  Docker (stop)             │  docker compose down               │
│  Install cron              │  crontab crontab.txt               │
├────────────────────────────┼────────────────────────────────────┤
│  API docs                  │  http://localhost:8000/docs        │
│  API health check          │  http://localhost:8000/health      │
│  MLflow UI                 │  http://localhost:5000             │
└────────────────────────────┴────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;em&gt;Manual version 1.0 — Iris Classifier ML Pipeline&lt;/em&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>machinelearning</category>
      <category>python</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
