Most background removal tools work like this: upload your photo to a server, wait for an AI model to process it, download the result. Your image sits on someone else's infrastructure. You hope they delete it.
I built one that works differently. The AI model runs in your browser tab. Your image never leaves your device. And I just open-sourced the core logic — two files, zero dependencies beyond a CDN import.
Here's how it works under the hood.
The Pipeline
The full flow from "user drops an image" to "transparent PNG download" goes through five stages:
Upload → ONNX Model Load → WebAssembly Inference → Mask Generation → Canvas Compositing
Each stage runs entirely client-side. Let me walk through them.
Stage 1: Loading the AI Model in the Browser
The backbone is @imgly/background-removal, an open-source library that bundles an ONNX segmentation model with ONNX Runtime Web (WebAssembly backend).
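const LIB_CDN = 'https://cdn.jsdelivr.net/npm/@imgly/background-removal@1.5.5';
let removeBackgroundFn; // set once the library has loaded

async function loadLibrary() {
  // jsDelivr's /+esm endpoint serves the package as an ES module
  const module = await import(LIB_CDN + '/+esm');
  removeBackgroundFn = module.removeBackground;
}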
The first call downloads ~40MB of model weights. That sounds heavy, but:
- The browser caches it automatically
- Subsequent uses load instantly from cache
- No server round-trip on any future use
This is the same trade-off FFmpeg.wasm makes — big initial download, but then your browser becomes a local processing powerhouse.
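Because that first download is large, it helps to make sure it is only ever started once, even if the user drops several images before loading finishes. The following is a minimal sketch of that idea, not the code from the repository: `createLazyLoader` is a hypothetical helper name, and the loader function passed in stands for whatever performs the dynamic import.

```javascript
// Hypothetical helper: wraps a loader so it runs at most once while a
// load is in flight or has already succeeded.
function createLazyLoader(loadFn) {
  let promise = null;
  return function load() {
    if (!promise) {
      // The executor runs loadFn immediately; a failure (sync or async)
      // clears the cached promise so a later call can retry.
      promise = new Promise((resolve) => resolve(loadFn())).catch((err) => {
        promise = null;
        throw err;
      });
    }
    return promise;
  };
}

// Possible use in the page (assumes LIB_CDN from the snippet above):
// const loadRemoveBackground = createLazyLoader(() =>
//   import(LIB_CDN + '/+esm').then((m) => m.removeBackground));
```

Every caller awaits the same promise, so concurrent requests share a single download instead of each starting its own.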
Stage 2: Running AI Inference Locally
Once the model is loaded, inference is straightforward:
const imageBlob = await new Promise(r => canvas.toBlob(r, 'image/png'));
const resultBlob = await removeBackgroundFn(imageBlob, {
  model: 'medium',
  output: { format: 'image/png' },
  progress: (key, current, total) => {
    // Update loading UI
  }
});
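The `progress` callback above is left as a stub. One way to fill it in is to turn each `(key, current, total)` event into a percentage and a label for the loading UI. This is a sketch only: `formatProgress` is a hypothetical helper, and the assumption that download-related keys start with `fetch` should be checked against the events the library actually emits.

```javascript
// Hypothetical helper: converts a (key, current, total) progress event
// into a clamped integer percentage plus a display label.
function formatProgress(key, current, total) {
  // Guard against a zero or missing total to avoid dividing by zero
  const ratio = total > 0 ? current / total : 0;
  const percent = Math.round(Math.min(1, Math.max(0, ratio)) * 100);
  // Assumption: keys for download events begin with "fetch"
  const phase = String(key).startsWith('fetch') ? 'Downloading model' : 'Processing';
  return { percent, label: `${phase}: ${percent}%` };
}
```

Inside the callback, the returned `label` could be written to a status element and `percent` to a progress bar.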
What's happening behind the scenes:
- The library resizes your image to the model's input dimensions
- Pixel data is converted to a tensor
- ONNX Runtime Web runs the segmentation model via WebAssembly
- The output tensor (a per-pixel foreground probability map) is converted back to an image with transparent background
The medium model balances quality and speed. On a decent laptop, inference takes 2-5 seconds for a typical photo. On a phone, maybe 8-15 seconds. Acceptable for a free, private tool.
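To make the "pixel data is converted to a tensor" step concrete, here is a rough sketch of what such a conversion can look like. It is not the library's actual preprocessing: the function name `rgbaToPlanarFloat`, the [0, 1] scaling, and the channel-planar layout are assumptions, and a real model may also expect mean/std normalization.

```javascript
// Convert interleaved RGBA bytes (as returned by getImageData) into a
// planar Float32Array laid out as [R plane, G plane, B plane].
// Assumption: values are scaled to [0, 1]; alpha is dropped.
function rgbaToPlanarFloat(rgba, width, height) {
  const pixels = width * height;
  if (rgba.length !== pixels * 4) {
    throw new RangeError('rgba length does not match width * height * 4');
  }
  const tensor = new Float32Array(pixels * 3);
  for (let p = 0; p < pixels; p++) {
    tensor[p] = rgba[p * 4] / 255;                  // R plane
    tensor[pixels + p] = rgba[p * 4 + 1] / 255;     // G plane
    tensor[2 * pixels + p] = rgba[p * 4 + 2] / 255; // B plane
  }
  return tensor;
}

// Usage: a 2x1 image with one red pixel and one blue pixel
const demoTensor = rgbaToPlanarFloat(
  new Uint8ClampedArray([255, 0, 0, 255, 0, 0, 255, 255]), 2, 1
);
```

The model's output goes through the reverse trip: a per-pixel probability becomes the alpha byte of each output pixel.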
Stage 3: Building the Editable Mask
Here's where it gets interesting. The AI output isn't final — it's a starting point. I extract the alpha channel from the AI result and build an editable grayscale mask:
async function buildMaskFromResult() {
  const w = originalImage.naturalWidth;
  const h = originalImage.naturalHeight;
  // Draw AI result to a temporary canvas
  const resultCanvas = document.createElement('canvas');
  resultCanvas.width = w;
  resultCanvas.height = h;
  const rCtx = resultCanvas.getContext('2d');
  rCtx.drawImage(resultImg, 0, 0);
  const resultData = rCtx.getImageData(0, 0, w, h);
  // Extract alpha channel → grayscale mask
  // White = foreground (keep), Black = background (remove)
  maskCanvas = document.createElement('canvas');
  maskCanvas.width = w;
  maskCanvas.height = h;
  maskCtx = maskCanvas.getContext('2d');
  const maskData = maskCtx.createImageData(w, h);
  for (let i = 0; i < resultData.data.length; i += 4) {
    const alpha = resultData.data[i + 3];
    maskData.data[i] = alpha; // R
    maskData.data[i + 1] = alpha; // G
    maskData.data[i + 2] = alpha; // B
    maskData.data[i + 3] = 255; // A (mask itself is always opaque)
  }
  maskCtx.putImageData(maskData, 0, 0);
}
Why a separate mask canvas?
Because users need to fix the AI's mistakes. Hair edges, transparent objects, similar-colored backgrounds — no AI gets these perfect 100% of the time. The mask canvas becomes a paintable surface.
Stage 4: Manual Refinement with Brush & Eraser
This is the feature that separates a toy demo from a usable tool. Users can:
- Brush (paint white on mask) → restore foreground areas the AI removed
- Eraser (paint black on mask) → remove background areas the AI missed
function paintOnMask(e) {
  const rect = editCanvas.getBoundingClientRect();
  // Map display (CSS) coordinates to full-resolution mask coordinates
  const x = (e.clientX - rect.left) / rect.width * maskCanvas.width;
  const y = (e.clientY - rect.top) / rect.height * maskCanvas.height;
  const brushSize = parseInt(brushSizeEl.value, 10);
  const softness = parseInt(brushSoftEl.value, 10) / 100;
  maskCtx.lineCap = 'round';
  maskCtx.lineWidth = brushSize;
  // Softness = CSS-style blur filter on the mask canvas context.
  // Reset to 'none' at zero softness so an earlier blur does not persist.
  maskCtx.filter = softness > 0
    ? `blur(${Math.round(brushSize * softness * 0.3)}px)`
    : 'none';
  if (currentTool === 'brush') {
    maskCtx.globalCompositeOperation = 'lighter';
    maskCtx.strokeStyle = '#ffffff';
  } else {
    maskCtx.globalCompositeOperation = 'source-over';
    maskCtx.strokeStyle = '#000000';
  }
  maskCtx.beginPath();
  maskCtx.moveTo(lastX, lastY);
  maskCtx.lineTo(x, y);
  maskCtx.stroke();
}
Key details:
- Coordinate mapping: The edit canvas is CSS-scaled to fit the viewport, but the mask operates at full image resolution. Every mouse position gets mapped from display coordinates to mask coordinates.
- Edge softness: Uses Canvas 2D filter: blur() on the stroke, which creates feathered edges instead of hard cuts.
- Undo stack: Each mousedown saves a full ImageData snapshot of the mask. Up to 20 undo levels.
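The undo behavior described in the list above can be sketched as a small bounded stack. This is an illustration under assumptions rather than the repository's code: `createUndoStack` and its method names are hypothetical, and the snapshots are treated as opaque values (in the tool they would be `ImageData` objects).

```javascript
// Hypothetical bounded undo stack; names are illustrative.
function createUndoStack(limit = 20) {
  const snapshots = [];
  return {
    // Save a snapshot, dropping the oldest once the limit is exceeded
    push(snapshot) {
      snapshots.push(snapshot);
      if (snapshots.length > limit) snapshots.shift();
    },
    // Return the most recent snapshot, or undefined when the stack is empty
    pop() {
      return snapshots.pop();
    },
    get size() {
      return snapshots.length;
    }
  };
}

// Possible use in the page:
// on mousedown: undo.push(maskCtx.getImageData(0, 0, maskCanvas.width, maskCanvas.height));
// on undo:      const prev = undo.pop(); if (prev) maskCtx.putImageData(prev, 0, 0);
```

One thing to keep in mind with full snapshots: a 4000×3000 mask is 48 MB per `ImageData`, so a full stack of 20 could hold up to about 960 MB. Lowering the limit for large images is one possible mitigation.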
The brush cursor is a position: fixed div that follows the mouse, sized to match the display-scaled brush diameter. The actual canvas cursor is set to none.
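The same scale factor drives both the pointer mapping and the cursor size, which is easier to see when the arithmetic is pulled out into pure functions. A minimal sketch, with hypothetical helper names `displayToMask` and `cursorDiameter`:

```javascript
// rect: { left, top, width, height } as returned by getBoundingClientRect()
// maskSize: { width, height } of the full-resolution mask
function displayToMask(clientX, clientY, rect, maskSize) {
  return {
    x: (clientX - rect.left) / rect.width * maskSize.width,
    y: (clientY - rect.top) / rect.height * maskSize.height
  };
}

// Brush diameter in CSS pixels for a brush size given in mask pixels
function cursorDiameter(brushSize, rect, maskSize) {
  return brushSize * rect.width / maskSize.width;
}

// Usage: a 2000x1000 mask displayed at 500x250 CSS pixels
const demoRect = { left: 100, top: 50, width: 500, height: 250 };
const demoMask = { width: 2000, height: 1000 };
const demoPoint = displayToMask(350, 175, demoRect, demoMask); // { x: 1000, y: 500 }
const demoCursor = cursorDiameter(40, demoRect, demoMask);     // 10
```

In this example the display is a quarter of the mask's size, so a 40-pixel brush on the mask appears as a 10-pixel cursor on screen.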
Stage 5: Compositing the Final Output
To generate the downloadable PNG, the mask is applied to the original image:
function applyMaskToOriginal() {
  const origData = origCtx.getImageData(0, 0, w, h);
  const mData = maskCtx.getImageData(0, 0, w, h);
  const outData = oCtx.createImageData(w, h);
  for (let i = 0; i < origData.data.length; i += 4) {
    outData.data[i] = origData.data[i]; // R — original
    outData.data[i + 1] = origData.data[i + 1]; // G — original
    outData.data[i + 2] = origData.data[i + 2]; // B — original
    outData.data[i + 3] = mData.data[i]; // A — from mask R channel
  }
  oCtx.putImageData(outData, 0, 0);
  return outCanvas;
}
The mask's R channel (which equals G and B since it's grayscale) becomes the alpha channel of the output. White mask pixels → fully opaque. Black → fully transparent. Gray → semi-transparent (useful for hair and soft edges).
The Refine Mode Overlay
In refine mode, users see the original image with a semi-transparent red overlay on removed areas:
function renderMaskOverlay() {
  editCtx.drawImage(maskCanvas, 0, 0, dw, dh);
  const overlayData = editCtx.getImageData(0, 0, dw, dh);
  for (let i = 0; i < overlayData.data.length; i += 4) {
    const maskVal = overlayData.data[i];
    if (maskVal < 128) {
      // Removed area → semi-transparent red
      overlayData.data[i] = 220; // R
      overlayData.data[i + 1] = 50; // G
      overlayData.data[i + 2] = 50; // B
      overlayData.data[i + 3] = 120; // A
    } else {
      // Kept area → fully transparent (show original underneath)
      overlayData.data[i + 3] = 0;
    }
  }
  editCtx.putImageData(overlayData, 0, 0);
}
This gives immediate visual feedback — you can see exactly what the AI removed and paint corrections in real time.
Performance Considerations
- Memory: Three full-resolution canvases live in memory (original, mask, output). For a 4000×3000 photo, that's ~144MB of pixel data. Mobile devices with <4GB RAM may struggle.
- Real-time rendering: Every brush stroke triggers renderPreview() via requestAnimationFrame. This redraws the preview canvas + overlay from the mask. On large images, there's a noticeable lag.
- Touch support: Full touch event handling with passive: false to prevent scroll interference.
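One common way to keep per-stroke rendering from piling up is to coalesce requests so that at most one render is queued per frame. The sketch below shows that pattern. It is not taken from the repository: `createFrameScheduler` is a hypothetical name, and the scheduling function is passed in so the example also runs outside a browser.

```javascript
// Coalesce many render requests into at most one render per frame.
// `schedule` is injected; in a page it could be
// (cb) => requestAnimationFrame(cb).
function createFrameScheduler(render, schedule) {
  let pending = false;
  return function requestRender() {
    if (pending) return; // a frame is already queued
    pending = true;
    schedule(() => {
      pending = false; // allow the next request to queue a new frame
      render();
    });
  };
}

// Possible use in the page (assumes renderPreview exists as described above):
// const requestRender = createFrameScheduler(renderPreview, (cb) => requestAnimationFrame(cb));
// editCanvas.addEventListener('mousemove', requestRender);
```

With this in place, a burst of pointer events between two frames results in a single redraw rather than one per event.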
What I Stripped for the Open-Source Version
The production version on ToolKnit includes:
- Daily usage limits (fair-use throttling)
- Analytics tracking
- Self-hosted model weights (faster loading from our CDN)
- Sound effects on completion
- Site navigation and SEO shell
The open-source version strips all of that down to two files:
- index.html: standalone UI (~250 lines)
- app.js: core logic (~380 lines)
You can clone it, run npx serve ., and have a working background remover in 30 seconds.
What's Next
Some ideas for anyone who wants to fork and extend:
- Background replacement — solid color or custom image behind the subject
- Batch processing — drop multiple images, process all sequentially
- WebGPU acceleration — ONNX Runtime Web supports WebGPU; inference could be 3-5x faster
- Edge feathering controls — post-process the mask with adjustable blur radius
- Before/after slider — drag to compare original and result
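As a starting point for the first idea in that list, a solid-color background can be added by alpha-compositing the cut-out pixels over the chosen color. This is a sketch under assumptions, not part of the current code: `compositeOverColor` is a hypothetical name, and it expects straight (non-premultiplied) RGBA data such as `getImageData` returns.

```javascript
// Flatten RGBA pixels (straight alpha) onto a solid background color.
// Returns a new, fully opaque RGBA array of the same length.
function compositeOverColor(rgba, [bgR, bgG, bgB]) {
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i < rgba.length; i += 4) {
    const a = rgba[i + 3] / 255; // foreground coverage in [0, 1]
    out[i] = rgba[i] * a + bgR * (1 - a);
    out[i + 1] = rgba[i + 1] * a + bgG * (1 - a);
    out[i + 2] = rgba[i + 2] * a + bgB * (1 - a);
    out[i + 3] = 255; // result is opaque
  }
  return out;
}

// Usage: an opaque red pixel and a fully transparent pixel over white
const demoFlat = compositeOverColor(
  new Uint8ClampedArray([255, 0, 0, 255, 10, 20, 30, 0]), [255, 255, 255]
);
```

Semi-transparent mask values blend smoothly into the new color, which is what keeps hair and soft edges from looking cut out.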
Try It
- Live tool: toolknit.com/tools/background-remover.html
- Open source: github.com/2645149786-dotcom/toolknit
- All 61 tools: toolknit.com
If you've ever needed to remove a background without uploading your photo to a random website — this is it. Clone it, use it, break it, improve it.
Built by Zihang Dong. Building browser-first tools at ToolKnit.