SAM 2 should be faster than SAM. So why is your inference running at 4 FPS instead of 12?
Meta's Segment Anything Model 2 (SAM 2) promises better accuracy and efficiency over the original SAM. The paper reports 6x faster inference on video tasks, lower memory usage, and improved segmentation quality. But when I benchmarked SAM 2 on a batch of 500 production images (1024×1024 medical scans), I got 3.8 FPS versus SAM's 11.2 FPS on the same RTX 3090.
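The article doesn't show the timing harness behind these FPS numbers, so here is a minimal sketch of how such a throughput figure can be measured. The `run_once` callable is an assumption: a zero-argument wrapper around one full per-image pass (e.g. `set_image` plus `predict`).

```python
import time

def measure_fps(run_once, n_warmup: int = 10, n_iters: int = 100) -> float:
    """Average frames/sec for `run_once`, a zero-arg callable that
    processes one image end to end."""
    for _ in range(n_warmup):   # warm-up: CUDA kernel compilation, allocator, caches
        run_once()
    start = time.perf_counter()
    for _ in range(n_iters):
        run_once()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed
```

When timing GPU code, call `torch.cuda.synchronize()` inside `run_once` (or just before reading the clock); otherwise you measure kernel launch overhead rather than the kernels themselves.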
It turns out that most SAM 2 inference pipelines cargo-cult preprocessing steps from SAM tutorials without accounting for SAM 2's different encoder architecture and default predictor modes. Three config changes took SAM 2 from 3.8 FPS to 14.1 FPS on that workload, faster than the original SAM.
Here's what actually matters: predictor multimask mode, image preprocessing caching, and memory layout for the new Hiera encoder. I'll show you the bottlenecks, the fixes, and real benchmark numbers.
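As a taste of the second fix: SAM 2's `SAM2ImagePredictor` computes the image embedding once in `set_image()` and reuses it across `predict()` calls, so the costly anti-pattern is re-running `set_image()` for every prompt on the same image. Below is a minimal, generic sketch of that reuse idea; `encode` is a hypothetical stand-in for the expensive Hiera encoder forward pass, not the real SAM 2 API.

```python
import hashlib
from typing import Any, Callable, Dict

class EmbeddingCache:
    """Compute an image embedding once and reuse it across prompts.
    `encode` stands in for the expensive encoder forward pass."""

    def __init__(self, encode: Callable[[bytes], Any]):
        self._encode = encode
        self._store: Dict[str, Any] = {}
        self.misses = 0  # number of actual encoder runs

    def get(self, image_bytes: bytes) -> Any:
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._store:
            self.misses += 1  # encoder runs only on a cache miss
            self._store[key] = self._encode(image_bytes)
        return self._store[key]
```

For the first fix, pass `multimask_output=False` to `predictor.predict()` when you only need a single mask per prompt, which skips the extra candidate-mask decoder outputs.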
The Default SAM 2 Setup Is Configured for Video, Not Images