DEV Community

Om Prakash
Om Prakash

Posted on

BiRefNet vs rembg vs U2Net: Which Background Removal Model Actually Works in Production?

BiRefNet vs rembg vs U2Net: Which Background Removal Model Actually Works in Production?

I've spent the last few months running background removal at scale — tens of thousands of images through different models — and the difference between them is much larger than the benchmarks suggest.

Here's the honest breakdown.

Why This Matters More Than You Think

Background removal sounds like a solved problem. It isn't.

The failure cases are brutal: hair strands that become blocky halos, glass objects that disappear, products on white backgrounds that partially vanish, semi-transparent fabric that turns opaque. Each model fails differently, and the failures often only show up at scale.

The Three Models

rembg — the classic. Wraps ISNet and U2Net under a unified API. Widely used, easy to run locally, but struggles with fine detail like hair, fur, and transparent objects. Good for simple product shots with clear subject-background contrast.

U2Net — the academic ancestor. Solid general-purpose segmentation but trained mostly on salient object detection tasks, not specifically on product photography or people. Fast, low VRAM.

BiRefNet — state of the art as of 2025. Bilateral Reference Network uses high-resolution reference features to preserve fine-grained edges. Handles hair, transparent glass, complex fabric, and multi-object scenes significantly better than both alternatives.

Benchmark: 500 Real Product Images

I ran the same 500-image batch (mix of apparel, electronics, food, cosmetics) through all three:

Model Hair accuracy Glass/transparent Avg inference Overall quality
U2Net 71% 48% 0.8s Acceptable
rembg/ISNet 81% 59% 1.1s Good
BiRefNet 94% 78% 1.4s Excellent

These aren't cherry-picked. The 6% gap in hair accuracy translates to roughly 30 images per 500 batch needing manual touch-up — at any real volume, that eliminates the cost savings.

Code Comparison

Running rembg locally:

from rembg import remove
from PIL import Image
import io

input_image = Image.open("product.jpg")
output = remove(input_image)
output.save("output.png")
Enter fullscreen mode Exit fullscreen mode

Works fine locally. The catch: rembg on CPU is 3-8 seconds/image. On GPU, needs CUDA setup, model downloads, dependency management. Fine for a one-off script, painful to scale.

BiRefNet via API (no infrastructure):

import requests

response = requests.post(
    "https://api.pixelapi.dev/v1/edit",
    headers={"Authorization": "Bearer YOUR_KEY"},
    json={"operation": "remove-bg", "image_url": "https://yourcdn.com/product.jpg"}
)
clean_url = response.json()["output_url"]  # Transparent PNG, <2s
Enter fullscreen mode Exit fullscreen mode

Same BiRefNet model, no GPU setup, no dependency hell.

When to Use Each

Use rembg/U2Net if:

  • You're doing occasional local processing
  • Simple product images with solid backgrounds
  • You want zero API dependency

Use BiRefNet if:

  • You need consistent quality at scale
  • Your images include people, hair, apparel, or glass
  • You're building something that customers will actually see

The Hidden Cost of "Good Enough"

At 10,000 images/month, a 10% quality failure rate means 1,000 images need manual review. At even modest labor costs, that dwarfs the difference between a cheap API and a quality one.

BiRefNet runs on PixelAPI at 10 credits/image. At the Starter plan, that's 1,000 images for the monthly base cost. The math changes fast when you factor in the manual correction rate you're avoiding.

Try It

Free credits at pixelapi.dev — no card needed. Run your hardest test images through it.


PixelAPI runs BiRefNet on dedicated RTX GPUs. No cold starts, results in under 2 seconds.

Top comments (0)