TK Lin

Posted on Jan 25

🎯 YOLO_Training

#yolo #computervision #deeplearning #python

YOLO Animal Recognition Training: A Complete Guide from 0 to 80% Accuracy

Washin Village AI Director Tech Notes #2

🎯 Goal: Teaching AI to Recognize Our Animals

At Washin Village, we have 29 different animals to identify. Here's our hands-on experience taking a pretrained YOLOv8 model from zero knowledge of our animals to roughly 80% top-1 accuracy (79.5% in the final run).

📊 Data Preparation

Step 1: Collecting Photos

Category	Photo Count	Source
Cats (17 types)	5000+	Daily captures
Goats (8 types)	2000+	Farm records
Other animals	500+	Various records

Step 2: Labeling Data

Using Label Studio for annotation:

Label Format: YOLO
File Structure:
  images/
    train/
    val/
  labels/
    train/
    val/

Labeling Tips:

Box the entire animal body
Ensure no body parts are cut off
Multiple labels per image are okay
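
For reference, each line of a YOLO-format label file describes one box as: class index, box center x, box center y, box width, box height, with the last four values normalized to the image size. Here is a minimal sketch of turning one such line into pixel corners, which is the step you would need to crop a labeled animal out of a photo. The function name and the sample numbers are illustrative assumptions, not code from our pipeline.

```python
def yolo_line_to_pixel_box(line, img_w, img_h):
    """Convert 'class cx cy w h' (normalized) to (class_id, x1, y1, x2, y2) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:5])

    # Center/size -> corner coordinates, scaled to the image dimensions
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h

    # Round to whole pixels and clamp to the image bounds
    x1, x2 = max(0, round(x1)), min(img_w, round(x2))
    y1, y2 = max(0, round(y1)), min(img_h, round(y2))
    return class_id, x1, y1, x2, y2


# Hypothetical label line for a 640x480 photo
box = yolo_line_to_pixel_box("3 0.5 0.5 0.25 0.5", img_w=640, img_h=480)
print(box)  # (3, 240, 120, 400, 360)
```

Clamping matters for boxes drawn near the image edge, where rounding can push a corner slightly outside the frame.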

🔧 Model Selection

YOLOv8 Series Comparison

Model	Parameters	Speed	Accuracy
Nano (n)	3.2M	Fastest	Lower
Small (s)	11.2M	Fast	Medium
Medium (m)	25.9M	Medium	Higher

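We chose Small (s) as our main model because it balances speed and accuracy. Note that the parameter counts above are the published figures for the YOLOv8 detection variants; the -cls classification weights used in the training code are smaller.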

💻 Training Code

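from ultralytics import YOLO

# Load the pretrained YOLOv8-Small classification weights (note the -cls suffix)
model = YOLO('yolov8s-cls.pt')

# Training configuration
model.train(
    data='washin_dataset',  # Dataset root: train/ and val/, one subfolder per class
    epochs=100,             # Maximum number of training epochs
    imgsz=224,              # Input image size in pixels
    batch=32,               # Images per batch
    patience=10,            # Stop early after 10 epochs without improvement
    augment=True,

    # Data augmentation
    hsv_h=0.015,      # Hue variation
    hsv_s=0.4,        # Saturation variation
    hsv_v=0.3,        # Value (brightness) variation
    degrees=15,       # Maximum rotation angle in degrees
    translate=0.1,    # Translation as a fraction of image size
    scale=0.3,        # Scale variation
    flipud=0.2,       # Vertical flip probability
    fliplr=0.5,       # Horizontal flip probability
)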
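
The patience=10 setting asks the trainer to stop once the validation score has gone 10 epochs without improving. The sketch below illustrates that idea in plain Python. It is not Ultralytics' actual implementation; the class name and the sample scores are made up for illustration, and patience is shortened to 3 so the example stays small.

```python
class EarlyStopper:
    """Tracks the best score seen so far and signals when to stop training."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best_score = float("-inf")
        self.best_epoch = 0

    def should_stop(self, epoch, score):
        # A strictly better score resets the waiting period
        if score > self.best_score:
            self.best_score = score
            self.best_epoch = epoch
        # Stop once `patience` epochs have passed since the best epoch
        return (epoch - self.best_epoch) >= self.patience


# Hypothetical validation accuracies, one per epoch
scores = [0.40, 0.55, 0.60, 0.59, 0.60, 0.58, 0.61]

stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, score in enumerate(scores, start=1):
    if stopper.should_stop(epoch, score):
        stopped_at = epoch
        break

print(stopped_at, stopper.best_epoch)  # 6 3
```

In this toy run the best score arrives at epoch 3, so training stops at epoch 6 and never sees the later 0.61. That is the trade-off of a small patience value.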

📈 Training Progress

Accuracy Curve

Epoch 10:  Top-1: 45%  Top-5: 78%
Epoch 30:  Top-1: 62%  Top-5: 89%
Epoch 50:  Top-1: 71%  Top-5: 93%
Epoch 80:  Top-1: 78%  Top-5: 95%
Epoch 100: Top-1: 79.5% Top-5: 96.2%
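
Top-1 counts a prediction as correct only when the highest-scoring class is the true one. Top-5 counts it as correct when the true class appears anywhere among the five highest scores. Here is a small sketch of that calculation, using made-up scores for a four-class toy problem (so it reports top-2 instead of top-5):

```python
def top_k_accuracy(scores, labels, k):
    """scores: one list of per-class scores per image; labels: true class indices."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k highest-scoring classes for this image
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Hypothetical model outputs for four images
scores = [
    [0.70, 0.10, 0.10, 0.10],  # true class 0: ranked first
    [0.40, 0.50, 0.05, 0.05],  # true class 0: ranked second
    [0.10, 0.20, 0.30, 0.40],  # true class 0: ranked last
    [0.10, 0.10, 0.60, 0.20],  # true class 2: ranked first
]
labels = [0, 0, 0, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)  # 0.5 0.75
```

A large gap between the two numbers, like our 79.5% versus 96.2%, suggests the model often has the right animal near the top of its list but picks a lookalike first.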

Key Findings

Fastest progress comes early: Top-1 climbs from 45% at epoch 10 to 62% at epoch 30
Gains slow after epoch 50: Top-1 rises from 71% to 79.5% over the final 50 epochs
Data augmentation is crucial: Without it, accuracy topped out at 65%

🐛 Common Issues & Solutions

Issue	Cause	Solution
Overfitting	Too little training data	Increase data augmentation
Low accuracy for specific class	Sample imbalance	Use weighted sampling
Similar class confusion	Visual features too similar	Add more samples for that class
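
One common way to implement weighted sampling is to give every image a weight inversely proportional to the size of its class, so that rare classes are drawn about as often as common ones. The sketch below shows the idea with the standard library only. The class names and counts are hypothetical, and wiring this into an actual training data loader is not shown.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one sampling weight per item: 1 / (number of items in its class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


# Hypothetical imbalanced dataset: 90 photos of one cat, 10 of one goat
labels = ["cat_a"] * 90 + ["goat_b"] * 10
weights = inverse_frequency_weights(labels)

# Draw 1000 samples with replacement using those weights (fixed seed for repeatability)
rng = random.Random(0)
sample_counts = Counter(rng.choices(labels, weights=weights, k=1000))
print(sample_counts)
```

Each class's weights sum to 1, so both classes are drawn with roughly equal probability even though one has nine times more photos. In PyTorch, the same per-item weights can be passed to torch.utils.data.WeightedRandomSampler.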

🎯 Final Results

Metric	Value
Top-1 Accuracy	79.5%
Top-5 Accuracy	96.2%
Inference Speed	15ms/image
Model Size	21MB

💡 Lessons Learned

Data quality > Data quantity: A smaller set of clear photos beats a large set of blurry ones
Class balance matters: Aim for at least 100 photos per class
Iterate continuously: The first version doesn't need to be perfect
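
The 100-photos-per-class rule of thumb is easy to check automatically before training. Below is a minimal sketch that assumes a classification-style layout with one subfolder per class. The function names, file extensions, and example counts are assumptions for illustration.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(train_dir):
    """Count image files in each class subfolder of train_dir."""
    counts = {}
    for class_dir in sorted(Path(train_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def under_represented(counts, minimum=100):
    """Return the class names that have fewer than `minimum` images."""
    return sorted(name for name, n in counts.items() if n < minimum)


# Hypothetical counts; in practice use count_images_per_class("washin_dataset/train")
example_counts = {"cat_a": 320, "goat_b": 45, "goose_c": 99, "cat_d": 100}
flagged = under_represented(example_counts)
print(flagged)  # ['goat_b', 'goose_c']
```

Classes flagged here are the first candidates for collecting more photos or for weighted sampling.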

Washin Village 🏡 by AI Director
