Fine-Tuning YOLO with Colab MCP × Claude Code — No Local GPU Required
TL;DR
- Used Google's official Colab MCP Server to give Claude Code direct access to Colab's GPU for YOLO fine-tuning
- Ran the entire ML pipeline — data preprocessing, training config, training execution, evaluation, and model conversion — without leaving the terminal
- Built a custom on-device model for a mobile traffic counting app using nothing but a Mac with no GPU
Introduction
In March 2026, Google officially released the Colab MCP Server.
It's an open-source bridge that lets MCP-compatible AI agents like Claude Code and Gemini CLI programmatically control Google Colab's GPU runtimes. In practice, this means you can issue commands from your local terminal, and Claude Code will create cells in a Colab notebook, write code, execute it on a GPU, and return the results — all without touching a browser.
I used this to fine-tune a YOLO model for a traffic counting app I'm building.
On a Mac with no GPU.
The Problem with the Old Workflow
Before Colab MCP, the typical indie developer's Colab workflow looked like this:
1. Prepare data locally
2. Upload to Google Drive
3. Open Colab in browser
4. Copy-paste code into cells
5. Run → Error → Fix → Re-run (manual loop)
6. Download trained model from Drive
7. Convert and evaluate locally
The problem is constant context switching. Terminal → Browser → Drive → Browser → Terminal. Your focus breaks every time, and the manual error-fix-rerun loop quietly eats hours.
Fine-tuning is fundamentally about iteration — adjust parameters → train → evaluate → adjust again. The more friction in this loop, the fewer iterations you run, and the worse your model ends up.
How Colab MCP Changes Everything
With Colab MCP Server, the workflow becomes:
1. Give Claude Code an instruction
2. Claude Code creates a cell in Colab → writes code → executes it
3. Results come back to your terminal
4. Claude Code interprets the results and proposes the next action
5. Approve, and it immediately runs the next step
No browser needed. You can run the entire fine-tuning pipeline from your terminal, on Colab's GPU.
The Fine-Tuning Pipeline
Step 1: Environment Setup
First, set up the training environment on Colab.
Prompt to Claude Code:
Create a new cell in the Colab notebook and install:
- ultralytics (YOLOv8)
- albumentations (for data augmentation)
Verify GPU availability and show nvidia-smi output.
Claude Code creates the cell via Colab MCP, writes the code, and executes it. Once it confirms a Tesla T4 GPU is available, we move on.
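The GPU check it runs boils down to a few lines of PyTorch, which ships preinstalled on Colab runtimes. A minimal sketch (the deferred import is just so the function can be defined on machines without torch):

```python
def verify_gpu():
    """Return the visible CUDA device name, or None when no GPU is available."""
    try:
        import torch  # preinstalled on Colab runtimes
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_name(0)  # e.g. "Tesla T4" on the free tier
```

If this returns None on Colab, the runtime type is set to CPU and needs to be switched before training.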
Step 2: Data Preprocessing
Have Claude Code handle annotation format conversion.
There's a COCO-format annotation JSON in /content/drive/MyDrive/dataset/.
Convert it to YOLOv8 format and run the script on Colab.
Class mapping:
- car → 0, truck → 1, bus → 2, motorcycle → 3, bicycle → 4, person → 5
Split into train/val at 80:20.
Claude Code writes the conversion script into a Colab cell and executes it — including bounding box clipping and format validation.
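The core of that script is the per-box coordinate math: COCO stores `[x_min, y_min, width, height]` in pixels, while YOLO wants `[cx, cy, w, h]` normalized to 0–1. A sketch of the conversion with the clipping step, using the class mapping from the prompt:

```python
# Class mapping from the prompt above
CLASS_MAP = {"car": 0, "truck": 1, "bus": 2, "motorcycle": 3, "bicycle": 4, "person": 5}


def coco_box_to_yolo(bbox, img_w, img_h):
    """Convert a COCO [x_min, y_min, w, h] box (pixels) to a YOLO
    [cx, cy, w, h] box normalized to 0-1, clipped to the image bounds."""
    x, y, w, h = bbox
    # Clip to the image before normalizing, so out-of-frame annotations stay valid
    x1 = max(0.0, x)
    y1 = max(0.0, y)
    x2 = min(float(img_w), x + w)
    y2 = min(float(img_h), y + h)
    cx = (x1 + x2) / 2.0 / img_w
    cy = (y1 + y2) / 2.0 / img_h
    return cx, cy, (x2 - x1) / img_w, (y2 - y1) / img_h
```

Each image then gets a `.txt` label file with one `class cx cy w h` line per box, and the file list is split 80:20 into train/val.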
Step 3: Data Augmentation
Real-world traffic counting happens in all conditions — clear sky, overcast, backlit, nighttime.
Run an albumentations augmentation pipeline on Colab:
- Random brightness adjustment (±30%)
- Contrast adjustment
- Light motion blur
- Horizontal flip
Sync annotation coordinates with transformations.
Original:augmented ratio = 1:3.
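The pipeline Claude Code produced for this looks roughly like the following sketch. The transform parameters here are illustrative, not the exact values it chose; `BboxParams` is what keeps YOLO-format boxes in sync with flips and crops:

```python
AUG_PER_IMAGE = 3  # original:augmented = 1:3


def build_augmenter():
    """Augmentation pipeline matching the prompt above (parameters illustrative)."""
    import albumentations as A  # deferred: installed on the Colab runtime

    return A.Compose(
        [
            A.RandomBrightnessContrast(brightness_limit=0.3, contrast_limit=0.2, p=0.8),
            A.MotionBlur(blur_limit=5, p=0.3),  # light motion blur
            A.HorizontalFlip(p=0.5),
        ],
        # Keeps the YOLO-format boxes synced with each geometric transform
        bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
    )
```

Calling the composed transform with `image=`, `bboxes=`, and `class_labels=` keyword arguments returns the augmented image together with the adjusted boxes.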
Step 4: Fine-Tuning
The main event.
Fine-tune a YOLOv8s model for 6 classes.
Config:
- Input size: 640
- Epochs: 100
- Batch size: 16 (to fit in T4 VRAM)
- Learning rate: 0.01 with cosine scheduler
- Early stopping after 10 epochs without improvement
- Save training curve plots
Claude Code generates the YAML config and runs model.train() in Colab cells.
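In ultralytics terms, that prompt maps onto a `model.train()` call roughly like this (the dataset YAML path is a placeholder for whatever Step 2 generated):

```python
TRAIN_ARGS = dict(
    data="/content/dataset/data.yaml",  # placeholder: the YAML generated in Step 2
    imgsz=640,
    epochs=100,
    batch=16,        # sized to fit T4 VRAM
    lr0=0.01,
    cos_lr=True,     # cosine LR scheduler
    patience=10,     # early stopping after 10 epochs without improvement
    plots=True,      # save training curve plots
)


def run_training(weights="yolov8s.pt"):
    from ultralytics import YOLO  # deferred: installed on the Colab runtime
    return YOLO(weights).train(**TRAIN_ARGS)
```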
Training progress is relayed back through Claude Code as it fetches cell outputs. A 100-epoch run on a T4 takes a few hours, but you can check progress along the way.
Step 5: Evaluation
After training, evaluate the model.
Run validation on the trained model.
Output mAP@0.5, per-class Precision/Recall, and Confusion Matrix.
Pay special attention to truck vs car confusion rate.
Claude Code fetches the results and analyzes weak points: "Truck Recall is low — likely insufficient truck samples in the training data."
Based on this, you decide whether to add more data and retrain or adjust parameters. Thanks to Colab MCP, this entire iteration loop stays in your terminal.
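The validation call itself is short; a sketch of how the headline numbers are pulled from ultralytics (the weights path is the default `runs/` location, which may differ per run):

```python
def evaluate(weights="runs/detect/train/weights/best.pt"):
    """Run validation and return the headline metrics."""
    from ultralytics import YOLO  # deferred: installed on the Colab runtime

    metrics = YOLO(weights).val()
    return {
        "mAP50": metrics.box.map50,          # mAP@0.5 across all classes
        "per_class_AP50": metrics.box.ap50,  # one AP@0.5 value per class
    }
```

Comparing the per-class values against the class mapping is what surfaces problems like the low truck Recall mentioned above.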
Step 6: Mobile Model Conversion
Finally, convert the model for on-device inference.
Convert best.pt to TFLite and CoreML formats.
Apply int8 quantization.
Report model size and mAP difference before/after conversion.
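Both exports go through the same ultralytics `export()` call; a sketch (note that int8 TFLite export needs a calibration dataset, which ultralytics takes from the training data YAML):

```python
def export_for_mobile(weights="runs/detect/train/weights/best.pt"):
    """Export the trained checkpoint to the two mobile formats."""
    from ultralytics import YOLO  # deferred: installed on the Colab runtime

    model = YOLO(weights)
    tflite_path = model.export(format="tflite", int8=True)  # Android
    coreml_path = model.export(format="coreml", int8=True)  # iOS
    return tflite_path, coreml_path
```

Comparing `evaluate()` on the exported models against the original `best.pt` gives the before/after mAP difference the prompt asks for.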
What I Liked About Colab MCP
1. Zero Context Switching
Never leaving the terminal is the biggest win. Without the friction of switching between browser and terminal, your train of thought stays intact and iteration speed goes way up.
2. Automated Error Handling
When an error occurs in Colab, Claude Code fetches the cell output → analyzes the cause → proposes a fix → re-executes on approval. No more manually reading stack traces and Googling.
3. Claude Code Knows ML Best Practices
Fine-tuning has a lot of "conventions" — directory structure, config file formats, data splitting strategies. Claude Code handles these automatically, following best practices without you having to look them up.
4. No Local GPU Needed
Whether you're on an M1 Mac or a Windows laptop, you can access Colab's T4/A100 GPUs remotely. For indie developers, not having to buy expensive GPU hardware is a major advantage.
Gotchas
Colab Session Limits
Free Colab disconnects after 90 minutes idle or 12 hours max. For long training runs, consider Colab Pro. If your session drops mid-training, you can ask Claude Code to resume from the last checkpoint.
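Resuming is a one-liner in ultralytics, assuming the run directory (and its `last.pt`) lives on Drive and survives the disconnect:

```python
def resume_training(last="runs/detect/train/weights/last.pt"):
    """Pick up an interrupted run from its most recent checkpoint."""
    from ultralytics import YOLO  # deferred: installed on the Colab runtime
    return YOLO(last).train(resume=True)
```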
File Transfer
As of March 2026, direct file upload via Colab MCP Server is still limited. The reliable approach is to use Google Drive for data transfer.
Data Quality Is Still on You
Claude Code writes scripts perfectly, but whether your annotations are accurate and your training data is representative — that's your responsibility. Garbage in, garbage out applies to AI-driven development too.
What I Built
The fine-tuned model powers AI Foot Traffic Survey, a mobile app that counts vehicles and pedestrians in real time using your smartphone camera.
It detects five vehicle types (car, truck, bus, motorcycle, bicycle) plus pedestrians, for six classes in total, runs all AI inference on-device, and never saves any images — privacy-first by design.
Free for up to 10 minutes per session.
- iOS: App Store
- Android: Google Play
Q&A
Q: Does Colab MCP cost anything?
A: The Colab MCP Server itself is free and open source. You can use Colab's free tier for T4 GPU access, but for longer training sessions, Colab Pro is recommended due to session limits. Claude Code requires an Anthropic subscription.
Q: What do I need besides Colab MCP?
A: Just uv (Python package manager) and git installed locally. No GPU required on your machine.
Q: How long does fine-tuning take?
A: Depends on dataset size and epochs. With ~5,000 images and 100 epochs, expect a few hours on a Colab T4 GPU.
Q: Can I use Colab MCP without Claude Code?
A: Yes — it works with Gemini CLI and any other MCP-compatible agent. That said, the experience of building an ML pipeline entirely through natural language instructions was smoothest with Claude Code.
Q: How much does fine-tuning improve accuracy over a pretrained model?
A: The improvement was most noticeable for domain-specific cases — distinguishing small trucks from vans, handling backlit conditions, and reducing bicycle/motorcycle confusion. Specific mAP numbers vary by dataset and app version, but real-world counting accuracy improved significantly.
Q: How did you collect training data?
A: Started with public datasets (COCO, etc.) and supplemented with annotated footage from real traffic environments. Data augmentation (brightness, blur, flipping) helped the model generalize across varied conditions.
Q: What happens if the Colab session disconnects?
A: Training is interrupted, but checkpoints are saved to Google Drive. After reconnecting, you can tell Claude Code to "resume from the last checkpoint" and training picks up where it left off.
Q: Where can I download the app?
A: Available on both iOS (App Store) and Android (Google Play). Search for "AI Foot Traffic Survey". Free for up to 10 minutes per session.
Takeaways
- Colab MCP × Claude Code let me run the full YOLO fine-tuning pipeline from a Mac with no GPU
- No browser needed — everything from data preprocessing to model conversion happened in the terminal
- Colab MCP was just released in March 2026 — it's applicable far beyond ML, to any workflow involving Colab
- Fine-tuning ML models is no longer exclusive to ML engineers