Part of an ongoing build-in-public series on Convertify, a free image/file converter I build solo. This week: background removal. The honest version, with the walls I hit.
Most "remove image background" tutorials end with pip install rembg and a happy screenshot. Mine started with a constraint: my whole backend is Rust, and I did not want to bolt a Python process onto it just to run one model.
Here is how the week went. The good parts, and the three or four times I stared at a compiler error wondering if the constraint was worth it.
The starting point
Convertify is a free image converter. The backend is Rust + Axum + libvips, the model has to run CPU-only on a modest VPS, and there is no GPU anywhere in the budget. The obvious path for background removal is rembg, which is excellent, but it is Python and ships as a separate server process. Adding it would mean a second runtime, a second thing to deploy, a second thing to crash at 3am.
So the question for the week was simple: can I run the same models rembg uses, but natively in Rust?
Short answer: yes. rembg is, under the hood, just ONNX models plus some image pre- and post-processing. The models (u2net, isnet, silueta) are all .onnx files. If I can run ONNX in Rust and do the image work in libvips (which I already have), there is no Python in the picture at all.
The plan
The pipeline for background removal is not magic, it is five boring steps:
- Decode the image, resize a copy to the model input size
- Normalize the pixels into a tensor
- Run inference, get a mask (one value per pixel: subject or background)
- Normalize the mask, resize it back to the original size
- Composite the mask onto the original as an alpha channel, export a transparent PNG
Steps 1, 4, 5 are libvips, which I already use everywhere. Step 3 is ONNX Runtime via the ort crate. Step 2 is a tight Rust loop. No Python anywhere.
[dependencies]
ort = { version = "=2.0.0-rc.12", features = ["download-binaries"] }
The download-binaries feature pulls a CPU build of ONNX Runtime at build time, so there is nothing to install on the box. That alone deleted half the "deploy a Python service" anxiety.
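Steps 2 and 4 of the plan are the only parts that are plain Rust, so they are easy to sketch in isolation. This is a minimal version, not the production code: the function names are hypothetical, the input is assumed to be tightly packed 8-bit RGB, the mean and std values are parameters you would have to check against the model you load, and the mask step assumes a simple min-max rescale.

```rust
/// Step 2 (sketch): interleaved 8-bit RGB (HWC) -> planar f32 tensor (CHW).
/// `mean` and `std` are per-channel and are assumptions, not values from ort.
pub fn rgb_to_chw(
    pixels: &[u8],
    width: usize,
    height: usize,
    mean: [f32; 3],
    std: [f32; 3],
) -> Vec<f32> {
    assert_eq!(pixels.len(), width * height * 3, "expected tightly packed RGB");
    let plane = width * height;
    let mut tensor = vec![0.0f32; 3 * plane];
    for (i, px) in pixels.chunks_exact(3).enumerate() {
        for c in 0..3 {
            // Scale to 0..1 first, then apply the per-channel normalization.
            let scaled = px[c] as f32 / 255.0;
            tensor[c * plane + i] = (scaled - mean[c]) / std[c];
        }
    }
    tensor
}

/// Step 4 (sketch): min-max rescale the raw mask into 0..=255 alpha bytes.
pub fn mask_to_alpha(mask: &[f32]) -> Vec<u8> {
    let min = mask.iter().copied().fold(f32::INFINITY, f32::min);
    let max = mask.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    // Guard against a flat mask so the division below never hits zero.
    let range = (max - min).max(f32::EPSILON);
    mask.iter()
        .map(|&v| (((v - min) / range) * 255.0).round() as u8)
        .collect()
}
```

The CHW layout is what makes the loop "tight": one pass over the pixels, three writes per pixel, no intermediate allocations. The alpha bytes that come out of the second function are what gets resized and attached to the original image in step 5.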
Wall #1: the model name lied about its size
I grabbed isnet-general-use.onnx from the rembg releases, expecting ~44 MB. What landed was 171 MB. My first thought was a broken download or an HTML error page renamed to .onnx. Quick check:
file models/isnet-general-use.onnx
head -c 200 models/isnet-general-use.onnx | xxd | head
The header showed a real pytorch 1.13.1 signature and tensor names like input_image and conv_in.weight. So it was a valid ONNX model, just heavier than the name suggested. Lesson: verify the file is actually what you think before you spend an hour debugging "why is RAM so high."
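The same "is this file what I think it is" check can also run at startup, before the model is handed to the runtime. This is a heuristic sketch only: the function names and the minimum-size threshold are hypothetical, and all it catches is the "HTML or JSON error page saved as .onnx" case, not a corrupt model.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// True when the first non-whitespace byte suggests markup or JSON
/// rather than a binary model file. Heuristic, not a format validator.
pub fn looks_like_markup(head: &[u8]) -> bool {
    let first = head.iter().copied().find(|b| !b.is_ascii_whitespace());
    matches!(first, Some(b'<') | Some(b'{'))
}

/// Fail fast if the file is suspiciously small or looks like a text response.
pub fn sanity_check_model(path: &Path, min_bytes: u64) -> io::Result<()> {
    let mut file = File::open(path)?;
    let size = file.metadata()?.len();
    if size < min_bytes {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("model file is only {size} bytes"),
        ));
    }
    // Only the first few bytes are needed for the markup check.
    let mut head = [0u8; 64];
    let n = file.read(&mut head)?;
    if looks_like_markup(&head[..n]) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "model file looks like an HTML or JSON response",
        ));
    }
    Ok(())
}
```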
Wall #2: ort errors are not Send + Sync
First compile against anyhow and I get hit with this:
the trait `Sync` is not implemented for `NonNull<OrtSessionOptions>`
required for `anyhow::Error` to implement `From<ort::Error<SessionBuilder>>`
anyhow::Error wants Send + Sync. The ort::Error<SessionBuilder> type holds a raw NonNull<OrtSessionOptions> pointer into ONNX Runtime's native session options, and raw pointers like that are not Sync. So ? straight into anyhow does not compile.
The fix is to stringify the error at the boundary. Display gives you a String, and String is Send + Sync:
let session = build(model_path, intra_threads)
    .map_err(|e| anyhow!("ort session init: {e}"))?;
The pointer never leaves, only the message does. Once I understood why, the pattern was mechanical: every ort ? that crosses into anyhow gets a .map_err(|e| anyhow!("...: {e}"))?.
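Here is the same pattern with standard-library stand-ins, so it compiles without ort or anyhow. Everything in it is hypothetical: NativeError plays the role of the ort error (it holds a NonNull, so it is neither Send nor Sync), and Box<dyn Error + Send + Sync> plays the role of anyhow::Error.

```rust
use std::error::Error;
use std::fmt;
use std::ptr::NonNull;

/// Hypothetical stand-in for an error that carries a raw native pointer.
/// The NonNull field makes the whole type !Send and !Sync.
pub struct NativeError {
    message: String,
    _handle: NonNull<u8>,
}

impl fmt::Display for NativeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.message)
    }
}

/// Stand-in for anyhow::Error: it also demands Send + Sync.
pub type SharedError = Box<dyn Error + Send + Sync>;

/// Pretend native call that always fails, to exercise the error path.
fn init_native() -> Result<(), NativeError> {
    Err(NativeError {
        message: "bad model path".to_string(),
        _handle: NonNull::dangling(),
    })
}

pub fn init() -> Result<(), SharedError> {
    // Stringify at the boundary: only the message crosses, never the pointer.
    init_native().map_err(|e| format!("session init: {e}"))?;
    Ok(())
}
```

Removing the map_err line and using ? directly on init_native() fails to compile for the same reason the ort call did: there is no conversion from a non-Send, non-Sync error into the shared error type.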
Wall #3: run takes &mut self
This one actually changed my architecture. In this ort version, Session::run takes &mut self. I had the session behind an Arc in my Axum app state so it could be shared. You cannot get &mut through a shared Arc without some form of interior mutability.
cannot borrow `self.session` as mutable, as it is behind a `&` reference
Options were a session pool, or a Mutex. Since my traffic is low and I gate inference to one at a time anyway, I wrapped the session in a Mutex:
pub struct BgRemover {
    session: Mutex<Session>,
}
remove(&self) stays &self, so Arc<BgRemover> still works in app state. The Mutex hands out the &mut for the single inference call. With a one-permit semaphore in front, the mutex never even contends. When traffic grows, the upgrade path is a pool of sessions, but that is a future-me problem.
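A self-contained sketch of that shape, using only the standard library. FakeSession is a hypothetical stand-in for the ort session (all that matters is that run takes &mut self), and the byte-reversing "inference" is a placeholder.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a session whose `run` needs `&mut self`.
pub struct FakeSession {
    runs: u32,
}

impl FakeSession {
    pub fn run(&mut self, input: &[u8]) -> Vec<u8> {
        self.runs += 1;
        // Placeholder work: reverse the bytes.
        input.iter().rev().copied().collect()
    }
}

pub struct BgRemover {
    session: Mutex<FakeSession>,
}

impl BgRemover {
    pub fn new() -> Self {
        Self { session: Mutex::new(FakeSession { runs: 0 }) }
    }

    /// Takes `&self`, so it can be called through an `Arc<BgRemover>`.
    pub fn remove(&self, input: &[u8]) -> Result<Vec<u8>, String> {
        // The guard derefs to `&mut FakeSession` for exactly one call.
        let mut session = self
            .session
            .lock()
            .map_err(|_| "session mutex poisoned".to_string())?;
        Ok(session.run(input))
    }

    /// Number of completed runs, for illustration only.
    pub fn runs(&self) -> u32 {
        self.session.lock().map(|s| s.runs).unwrap_or(0)
    }
}

/// What would live in app state: one remover, cloned by handle.
pub fn shared() -> Arc<BgRemover> {
    Arc::new(BgRemover::new())
}
```

The mutex serializes calls, which is exactly the trade-off described above: correct and simple at low traffic, a bottleneck once several inferences could usefully run in parallel.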
Wall #4: *mut VipsImage is not Send
libvips image pointers are not Send, which means they cannot be held across an .await in a future that has to be Send, and Axum requires handler futures to be Send. If I ran inference directly in the async handler, the compiler would stop me, and even if it did not, a multi-second CPU inference on an async worker thread would stall every other task scheduled on that worker.
The answer is spawn_blocking. The entire libvips + inference chain runs on a dedicated blocking thread and returns finished PNG bytes (which are Send):
let png = tokio::task::spawn_blocking(move || {
    let _permit = permit; // hold the semaphore for the whole job
    remover.remove(&bytes)
}).await??;
Every VipsImage is created and dropped inside that closure, never crossing an await point. The async runtime stays free to serve everything else while one image is being cut out.
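The ownership rule is easier to see without tokio in the way. In this sketch std::thread::spawn stands in for tokio::task::spawn_blocking (both require the closure and its return value to be Send), and LocalImage is a hypothetical stand-in for a VipsImage: it wraps an Rc, so it is not Send.

```rust
use std::rc::Rc;
use std::thread;

/// Hypothetical stand-in for a non-Send image handle (Rc is !Send).
struct LocalImage {
    pixels: Rc<Vec<u8>>,
}

/// All non-Send state is created and dropped inside this function.
fn process(bytes: &[u8]) -> Vec<u8> {
    let image = LocalImage { pixels: Rc::new(bytes.to_vec()) };
    // Placeholder work: invert every byte.
    let out: Vec<u8> = image.pixels.iter().map(|p| 255 - *p).collect();
    out
    // `image` is dropped here, on the worker thread that created it.
}

/// Only Send values cross the thread boundary: Vec<u8> in, Vec<u8> out.
pub fn process_off_thread(bytes: Vec<u8>) -> Result<Vec<u8>, String> {
    thread::spawn(move || process(&bytes))
        .join()
        .map_err(|_| "worker panicked".to_string())
}
```

Trying to build the LocalImage outside the closure and move it in is a compile error, which is the same guarantee the real handler leans on: the non-Send handle physically cannot escape the blocking thread.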
The part that surprised me: privacy came for free
Because the handler returns the PNG straight in the HTTP response, the image is never written to disk. It comes in as multipart bytes, gets processed in memory, and the result streams back. Nothing is stored, nothing is queued, nothing to clean up.
I did not plan that as a feature, it fell out of the architecture. But "your photo is processed in memory and never saved" is a genuinely strong thing to be able to say, and it is true, not marketing.
Does it actually work?
Yes. First real test through Postman with a HEIC photo: 200 OK, transparent PNG out. The model is ISNet (the IS-Net dichotomous segmentation architecture), and on clean subjects (products, people, logos) the cutout is sharp.
What I would tell past-me
- rembg is "just" ONNX + image ops. If you already have an image library, you can skip the Python entirely with ort.
- The ort 2.0 API churns between rc versions. Pin the exact version and expect to fix one or two method names.
- spawn_blocking is not optional for CPU-heavy, non-Send work. It is the whole reason the server stays responsive.
- Constraints ("no Python") are annoying in the moment and clarifying in hindsight. The Rust-native version is one binary, one deploy, nothing to babysit.
If you want to see the result, background removal is live and free (no signup, no watermark) on Convertify. Upload a photo, get a transparent PNG. It runs the exact pipeline above.
Next week: turning one tool into a set of use-case pages (passport photos, product shots) without drowning in duplicate content. That one is more SEO than Rust, but the build-in-public log continues.
What would you have done differently on the &mut self session problem? A pool, a mutex, something smarter? Curious how others handle shared ONNX sessions under load.