
TildAlice

Posted on • Originally published at tildalice.io

SAM Zero-Shot Medical Segmentation: 14% mIoU vs CNN 68%

SAM's Medical Image Problem

Segment Anything Model (SAM) works beautifully on everyday photos. Point at a cat, it segments the cat. Point at a building, it segments the building. Point at a lung nodule in a CT scan, and it segments... roughly half the nodule plus some random rib fragments.

I tested SAM's zero-shot performance on three medical imaging datasets: chest X-rays (pneumonia detection), retinal fundus images (diabetic retinopathy lesions), and brain MRI (tumor segmentation). Against a basic ResNet50-UNet with transfer learning from ImageNet, SAM lost every single matchup. The average gap in mean intersection over union (mIoU) was 54 percentage points.
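For readers who haven't worked with the metric, here is a minimal sketch of how mIoU can be computed for binary masks. It assumes a per-image IoU on the foreground class, averaged over images; the exact averaging used for the numbers in this post may differ. The function names `iou` and `mean_iou` and the toy masks are illustrative only, not taken from the experiments.

```python
import numpy as np


def iou(pred, target):
    """Intersection over union of two binary masks with the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)


def mean_iou(preds, targets):
    """Average the per-image IoU over a dataset (one possible mIoU definition)."""
    return float(np.mean([iou(p, t) for p, t in zip(preds, targets)]))


# Toy example: a 2x2 ground-truth region and a prediction that spills
# one column to the right (4 pixels overlap, 6 pixels in the union).
target = np.zeros((4, 4), dtype=bool)
target[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True

print(f"IoU: {iou(pred, target):.3f}")
```

A gap of 54 percentage points on this scale means one model's masks overlap the ground truth far more than the other's, which is a larger difference than small boundary errors would explain.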

This surprised me. SAM was trained on 11 million images and 1.1 billion masks. The model has seen more visual data than most researchers will encounter in their careers. Yet a CNN pre-trained on everyday photos and fine-tuned for 20 epochs on 200 medical images consistently outperformed it.

Intravenous fluid bags hanging in front of medical imaging screens in a hospital setting.

Photo by Furkan İnce on Pexels

Why Foundation Models Struggle on Medical Data


Continue reading the full article on TildAlice
