I launched IDPhotoSnap on Product Hunt today. It's a free passport, visa, and ID photo maker for 85+ countries. Here's the technical writeup of why it runs 100% in the browser, what that bought me, and where the tradeoffs were.
Why client-side
The straightforward way to build this product would have been:
- User uploads photo → server
- Server runs image processing (crop, resize, background)
- Server sends back result
As far as I can tell, this is the architecture the paid competitors use. It also costs money to run, usually requires user accounts to manage abuse, and creates a privacy concern: somewhere on a server is a database of passport photos.
Client-side processing flips all three:
- Cost: $0 compute. The user's phone does the work.
- Abuse model: there's almost nothing to abuse. There's no processing server to overload and no API to rate-limit.
- Privacy: the photo never leaves the device. This is verifiable — open DevTools and watch the network tab.
What's actually running in the browser
The core processing pipeline:
async function processPhoto(file, countrySpec) {
  const img = await loadImage(file)                // decode the selected file
  const faceBox = await detectFace(img)            // bounding box of the face
  const cropBox = computeCrop(faceBox, countrySpec) // region that satisfies the spec
  const canvas = renderCrop(img, cropBox, countrySpec.dimensions)
  return canvas.toDataURL('image/jpeg', 0.92)      // JPEG data URL at quality 0.92
}
Three steps that matter:
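1. Face detection uses the FaceDetector API (part of the experimental Shape Detection API) where the browser exposes it, and falls back to a small TensorFlow.js model on browsers that don't. Native support is limited and often sits behind a flag, so availability is checked at runtime instead of by browser name. The fallback is about a 4MB download, but it is fetched only on demand, when the native API is missing.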
2. Crop computation is country-specific. Each country has documented requirements like "face must occupy 70-80% of the frame, vertically centered, eyes at 60% from the bottom." These are encoded as JSON specs:
{
  "country": "US",
  "dimensions": { "width_mm": 51, "height_mm": 51, "dpi": 300 },
  "face": { "min_height_pct": 50, "max_height_pct": 69, "vertical_center_pct": 56 },
  "background": "#FFFFFF",
  "head_position": "centered"
}
Getting these specs right was the actual hard work. Some governments publish them well. Most don't. I ended up cross-referencing consulate PDFs in the local language for about half the countries.
3. Canvas rendering is just drawImage with the computed bounding box, then toDataURL for export. No magic.
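To make steps 2 and 3 concrete, here is a minimal sketch of how a spec like the US example could be turned into output pixel dimensions and a crop rectangle. It is not the code IDPhotoSnap ships. The names `mmToPx`, `targetSize`, and `computeCropBox` are hypothetical, and two things are assumptions: the crop aims for the midpoint of the allowed face-height range, and `vertical_center_pct` is read as the distance of the face center from the bottom edge of the frame.

```javascript
// Convert millimetres to pixels at a given print resolution (25.4 mm per inch).
function mmToPx(mm, dpi) {
  return Math.round((mm / 25.4) * dpi)
}

// Output size in pixels for a spec's "dimensions" object.
function targetSize(dimensions) {
  return {
    width: mmToPx(dimensions.width_mm, dimensions.dpi),
    height: mmToPx(dimensions.height_mm, dimensions.dpi),
  }
}

// Assumption: target the midpoint of the allowed face-height range, and treat
// vertical_center_pct as the face center's distance from the bottom edge.
function computeCropBox(faceBox, spec) {
  const { width_mm, height_mm } = spec.dimensions
  const facePct = (spec.face.min_height_pct + spec.face.max_height_pct) / 2
  // Crop height in source pixels so the face fills facePct of the frame.
  const height = faceBox.height / (facePct / 100)
  // Keep the aspect ratio required by the spec.
  const width = height * (width_mm / height_mm)
  const faceCenterX = faceBox.x + faceBox.width / 2
  const faceCenterY = faceBox.y + faceBox.height / 2
  const fromTop = 1 - spec.face.vertical_center_pct / 100
  return {
    x: faceCenterX - width / 2,
    y: faceCenterY - height * fromTop,
    width,
    height,
  }
}
```

For the US spec, 51 mm at 300 dpi comes to 602 pixels per side. The rectangle can then go straight into the nine-argument form of drawImage: `ctx.drawImage(img, box.x, box.y, box.width, box.height, 0, 0, size.width, size.height)`. This sketch does not clamp the rectangle to the image bounds, so a face too close to the edge of the photo would need extra handling.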
Background removal
This was the part where I almost gave up on client-side. Background removal historically meant a U-Net or similar segmentation model — too heavy for the browser.
The answer was the MediaPipe Selfie Segmentation model. It's about 256KB, runs at 30fps on a mid-range phone, and produces a soft alpha mask good enough for passport photo backgrounds. After segmentation, I composite over a white canvas. Done.
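The compositing step comes down to a per-pixel alpha blend. A minimal sketch, assuming the segmentation mask has already been converted to one value between 0 and 1 per pixel and the photo is available as RGBA bytes (the layout of `ImageData.data`). The function name `compositeOverColor` is hypothetical, and none of this is MediaPipe's own API:

```javascript
// Blend foreground pixels over a solid background colour using a soft mask.
// rgba: Uint8ClampedArray of RGBA bytes (same layout as ImageData.data).
// mask: Float32Array with one value per pixel, 0 = background, 1 = person.
// bg:   [r, g, b] background colour, white by default.
function compositeOverColor(rgba, mask, bg = [255, 255, 255]) {
  if (rgba.length !== mask.length * 4) {
    throw new Error('mask must have exactly one entry per pixel')
  }
  const out = new Uint8ClampedArray(rgba.length)
  for (let p = 0; p < mask.length; p++) {
    // Clamp in case the model emits values slightly outside [0, 1].
    const a = Math.min(1, Math.max(0, mask[p]))
    const i = p * 4
    for (let c = 0; c < 3; c++) {
      out[i + c] = rgba[i + c] * a + bg[c] * (1 - a)
    }
    out[i + 3] = 255 // the composited result is fully opaque
  }
  return out
}
```

Because the mask is soft, hair and shoulder edges fade into the background instead of ending in a hard cut-out line. The `bg` parameter could be fed from the `background` field of the country spec for countries that require something other than white.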
What I lost by going client-side
Three real tradeoffs:
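- No analytics on uploaded photos. Seeing which photos fail would help with debugging, but that isn't possible when photos never leave the device.
- Initial load is heavier. First use fetches the segmentation model, plus the face detection fallback on browsers that need it. Total: up to ~5MB. After that it's served from cache.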
- No batch processing. Can't queue 1000 photos through. But this is a passport photo tool — one photo at a time is the use case.
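The on-demand loading behind the second tradeoff can be sketched as a small memoizing wrapper, so a heavy model is requested at most once per page session no matter how many photos are processed. The helper name `once` is hypothetical, and this illustrates the general pattern, not the tool's actual loader:

```javascript
// Wrap an async loader so the expensive work runs at most once per page
// session; later callers share the same promise.
function once(loader) {
  let pending = null
  return () => {
    if (pending === null) {
      // Reset on failure so a flaky network request can be retried.
      pending = loader().catch((err) => {
        pending = null
        throw err
      })
    }
    return pending
  }
}
```

A loader wrapped this way only starts downloading when the first photo actually needs the model. The browser's HTTP cache then covers repeat visits.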
What I gained
- Hosting cost is $0. Just a static site on Vercel.
- Far less GDPR exposure. No photo data is collected because none is transmitted.
- Genuinely free. Because there's no compute cost, the product can stay free without subscriptions or tier-locking.
On Product Hunt
If you want to see the result, the launch is here: https://www.producthunt.com/products/idphotosnap
The site itself is at https://idphotosnap.com.
FAQ
Q: How does this make money?
It doesn't yet. If traffic grows, I'll add unobtrusive ads. No subscription tier planned.
Q: Why not WebAssembly for face detection?
The FaceDetector API is fast enough on modern phones and the TF.js fallback handles older browsers. WASM would be a reasonable optimization later but isn't blocking anything.
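The native-first, fallback-second pattern can be sketched as below. This is a hedged illustration, not the tool's actual code: `getFaceDetector`, `detectFaceBox`, and `loadFallback` are hypothetical names, and the sketch assumes the fallback is wrapped so that its `detect(image)` resolves to the same shape as the native API, an array of objects with a `boundingBox`.

```javascript
// Pick a face detector: the built-in Shape Detection API when the environment
// exposes it, otherwise a lazily loaded fallback.
// env:          object to feature-test (globalThis in a browser).
// loadFallback: hypothetical async function resolving to { detect(image) }.
async function getFaceDetector(env, loadFallback) {
  if (typeof env.FaceDetector === 'function') {
    try {
      return new env.FaceDetector({ fastMode: true, maxDetectedFaces: 1 })
    } catch (err) {
      // The constructor exists but is unusable here; fall through.
    }
  }
  return loadFallback()
}

// Return the first face's bounding box as a plain object, or null if none.
async function detectFaceBox(detector, image) {
  const faces = await detector.detect(image)
  if (faces.length === 0) return null
  const { x, y, width, height } = faces[0].boundingBox
  return { x, y, width, height }
}
```

In a page, `env` would be `globalThis`, and `loadFallback` would dynamically import the TensorFlow.js wrapper, so the roughly 4MB fallback is only fetched when the native path is unavailable. Passing `env` in as a parameter also makes the selection logic easy to exercise outside a browser.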
Q: What if I want to verify the photo doesn't leave my device?
Open DevTools → Network tab → upload a photo. You'll see no requests carrying image data.
Q: Can I self-host?
Not open source yet. Possibly in the future after the codebase stabilizes.
Feedback welcome. The hardest thing right now isn't the code, it's getting the country specs right — if you've used the tool for a country and the result was rejected, I want to know.