SAM Zero-Shot Medical Segmentation: 14% mIoU vs CNN 68%

#sam #medicalimaging #cnn #transferlearning

SAM's Medical Image Problem

The Segment Anything Model (SAM) works beautifully on everyday photos. Point at a cat, it segments the cat. Point at a building, it segments the building. Point at a lung nodule in a CT scan, and it segments... roughly half the nodule plus some random rib fragments.

I tested SAM's zero-shot performance on three medical imaging datasets: chest X-rays (pneumonia detection), retinal fundus images (diabetic retinopathy lesions), and brain MRI (tumor segmentation). Against a basic ResNet50-UNet with transfer learning from ImageNet, SAM lost every single matchup. The average mIoU gap was 54 percentage points: 14% for SAM versus 68% for the CNN.
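For readers who have not worked with the metric, here is a minimal sketch of how an mIoU score can be computed for binary masks. It is an illustration, not the evaluation script behind the numbers in this post: the function names `iou` and `mean_iou` are hypothetical, masks are assumed to be same-shaped 2D grids of 0/1 values, and mIoU is assumed to mean the per-image foreground IoU averaged over the dataset (some papers average over classes instead).

```python
from typing import Sequence

# Assumed representation: a 2D grid of 0/1 values, where 1 marks foreground.
Mask = Sequence[Sequence[int]]


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two same-shaped binary masks."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average the per-image IoU over a set of prediction/target pairs."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Toy example: the prediction covers 2 of 4 target pixels plus 1 stray pixel.
target = [[0, 1, 1, 0],
          [0, 1, 1, 0]]
pred = [[0, 1, 0, 0],
        [0, 1, 0, 1]]

print(f"{iou(pred, target):.2f}")  # 2 overlapping pixels / 5 in the union -> 0.40
```

The stray pixel matters: it does not reduce the overlap, but it grows the union, so over-segmentation (like those rib fragments) is penalized just as missed regions are. In practice the same logic is usually vectorized over array masks rather than looped in pure Python.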

This surprised me. SAM was trained on 11 million images and 1.1 billion masks. The model has seen more visual data than most researchers will encounter in their careers. Yet a CNN pre-trained on everyday photos and fine-tuned for 20 epochs on 200 medical images consistently outperformed it.