How I Built Browser-Based Face Analysis With Zero Server Costs
Every face analysis tool on the internet works the same way: upload your photo to a server, wait for results, hope they don't store your image.
I wanted to build something different — an AI face analyzer that runs entirely in the browser. No server processing, no cloud uploads, no data collection. Your photos never leave your device.
Here's how I built RealSmile, what I learned, and why browser-based AI is the future for privacy-sensitive tools.
The Problem With Server-Side Face Analysis
Most face analysis apps follow this flow:
- User uploads photo
- Photo sent to server
- Server runs ML model (TensorFlow, PyTorch, etc.)
- Results sent back
- Photo sits on server... forever?
This creates three problems:
- Privacy — users don't know what happens to their photos
- Cost — GPU instances are expensive ($50-500/mo depending on traffic)
- Latency — upload time + processing time + download time
The Browser-Based Alternative
What if the AI model ran on the user's device instead?
User uploads photo → Browser loads ML model → Processing on user's CPU/GPU → Results displayed
No round trip. No server. No privacy concerns. And crucially — no compute costs for me.
The Tech Stack
Here's what powers RealSmile:
- face-api.js (@vladmandic/face-api) — Pre-trained models for face detection, landmark detection, and expression recognition
- Next.js — React framework for the app shell
- 68-point facial landmark detection — Maps facial geometry for proportion analysis
- TinyFaceDetector — Lightweight model optimized for browser performance
The key insight: face-api.js models are ~5MB total and load from a CDN (jsdelivr). Once loaded, all processing happens locally in the browser via TensorFlow.js, using WebGL acceleration when available and falling back to CPU otherwise.
Loading Models Efficiently
The biggest UX challenge is model loading. 5MB is nothing for a server, but noticeable for a first-time user. Here's how I handle it:
```javascript
let modelsLoaded = false
let modelsLoading = false

async function ensureModels() {
  if (modelsLoaded) return // Already loaded, instant
  if (modelsLoading) {
    // Another call is loading, wait for it
    while (modelsLoading) {
      await new Promise(r => setTimeout(r, 200))
    }
    return
  }
  modelsLoading = true
  try {
    const MODEL_URL = 'https://cdn.jsdelivr.net/npm/@vladmandic/face-api/model'
    await Promise.all([
      faceapi.nets.tinyFaceDetector.loadFromUri(MODEL_URL),
      faceapi.nets.faceLandmark68Net.loadFromUri(MODEL_URL),
      faceapi.nets.faceExpressionNet.loadFromUri(MODEL_URL),
    ])
    modelsLoaded = true
  } finally {
    // Reset the flag even if loading fails, so a retry isn't deadlocked
    modelsLoading = false
  }
}
```
Key decisions:
- Lazy loading — models only load when the user actually uploads a photo
- Singleton pattern — prevents duplicate loading if multiple components request models
- CDN delivery — jsdelivr caches globally, so load times are fast worldwide
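An alternative to the polling flag is to cache the in-flight promise itself, so concurrent callers all await the same load. This is a generic sketch of that pattern, not the RealSmile code; the `loadOnce` helper is illustrative:

```javascript
// Cache the in-flight promise so concurrent callers share one load.
// On failure the cache is cleared, so the next call can retry.
function loadOnce(loader) {
  let pending = null
  return function () {
    if (!pending) {
      pending = Promise.resolve(loader()).catch(err => {
        pending = null // allow a retry after a failed load
        throw err
      })
    }
    return pending
  }
}

// Usage sketch: wrap the model loading once at module scope
// const ensureModels = loadOnce(() => Promise.all([/* loadFromUri calls */]))
```

The upside over the boolean-flag version is that waiters resolve the instant loading finishes, with no 200ms polling interval.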
Smile Detection: The Duchenne Method
One of RealSmile's features is detecting whether a smile is genuine. This is based on the Duchenne smile — a 19th-century observation by neurologist Guillaume Duchenne that genuine smiles activate the muscles around the eyes (orbicularis oculi), not just the mouth.
```javascript
const detection = await faceapi
  .detectSingleFace(canvas, new faceapi.TinyFaceDetectorOptions({
    inputSize: 512,
    scoreThreshold: 0.3
  }))
  .withFaceLandmarks()
  .withFaceExpressions()

// detectSingleFace returns undefined when no face is found
if (!detection) throw new Error('No face detected')

const expressions = detection.expressions
const happyScore = expressions.happy || 0
const isGenuine = happyScore > 0.6
```
The expression model returns probability scores for 7 emotions. A high "happy" score combined with landmark analysis around the eye region gives a reasonable approximation of Duchenne smile detection.
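One way to bring the eye region into the check is the eye aspect ratio (EAR) over the six landmarks the 68-point model provides per eye: genuine smiles tend to narrow the eyes. This is a heuristic sketch, not the exact RealSmile scoring — the 0.85 narrowing margin and the 0.6 happy threshold are assumed values:

```javascript
const dist = (a, b) => Math.hypot(a.x - b.x, a.y - b.y)

// Eye aspect ratio from 6 eye landmarks (p1..p6 in the usual 68-point
// ordering): average vertical opening divided by horizontal eye width.
function eyeAspectRatio([p1, p2, p3, p4, p5, p6]) {
  return (dist(p2, p6) + dist(p3, p5)) / (2 * dist(p1, p4))
}

// Heuristic Duchenne check: high happy probability AND eyes noticeably
// narrowed relative to a neutral-expression baseline EAR.
function looksDuchenne(happyScore, neutralEar, smilingEar) {
  const eyesNarrowed = smilingEar < neutralEar * 0.85 // assumed margin
  return happyScore > 0.6 && eyesNarrowed
}
```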
Golden Ratio Face Analysis
The more interesting technical challenge was measuring facial proportions against the golden ratio (φ = 1.618).
With 68 facial landmarks, you can calculate ratios between key measurements:
```javascript
const PHI = 1.618
const distance = (a, b) => Math.hypot(a.x - b.x, a.y - b.y)

// Key landmark points (from the 68-point model)
const faceTop = landmarks[19]   // Between eyebrows
const chin = landmarks[8]       // Bottom of chin
const leftFace = landmarks[0]   // Left jawline
const rightFace = landmarks[16] // Right jawline

const faceHeight = distance(faceTop, chin)
const faceWidth = distance(leftFace, rightFace)

// How close is this ratio to the golden ratio?
const ratio = faceHeight / faceWidth
const score = Math.max(0, 1 - Math.abs(ratio - PHI) / PHI) * 100
```
I measure 6 ratios total: face height/width, eye spacing, mouth/nose ratio, nose width, eye width, and nose shape. The overall score is an average of how close each ratio is to its ideal phi-based value.
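The per-ratio scoring and the averaging can be sketched as a pair of pure functions. The `ideal` parameter is an assumption on my part: some proportions would target φ directly while others target a different phi-derived value, and the measured `value`s would come from landmark distances like the ones above:

```javascript
const PHI = 1.618

// Score one measured ratio against its ideal: 100 at a perfect match,
// falling off linearly with relative error, clamped at 0.
function phiScore(value, ideal = PHI) {
  return Math.max(0, 1 - Math.abs(value - ideal) / ideal) * 100
}

// Overall score: plain average of the individual ratio scores.
function overallScore(ratios) {
  const scores = ratios.map(({ value, ideal }) => phiScore(value, ideal))
  return scores.reduce((sum, s) => sum + s, 0) / scores.length
}
```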
Making It Embeddable
The architecture that makes this free and private also makes it embeddable. Since there's no server dependency, anyone can add the analyzer to their site with one line of code:
```html
<iframe
  src="https://realsmile.online/embed/smile"
  width="380"
  height="400"
  style="border:none;border-radius:16px;"
  loading="lazy"
></iframe>
```
The iframe loads a compact version of the analyzer. Models load from jsdelivr's CDN (not my server). Processing happens in the visitor's browser. My server just serves a static page.
This means even if 10,000 sites embed the widget, my hosting costs stay near zero.
Check out all 4 embeddable widgets at realsmile.online/widget.
Performance Numbers
On a modern laptop:
- Model load: ~2-3 seconds (first time), instant after (cached)
- Face detection: ~33ms
- Full analysis (landmarks + expressions): ~80ms
- Total from upload to results: < 1 second
On mobile (iPhone 13):
- Model load: ~4-5 seconds first time
- Full analysis: ~150ms
Perfectly usable on both platforms without any server infrastructure.
Cost Breakdown
Here's what it costs to run:
| Component | Cost |
|---|---|
| Vercel hosting (static pages) | $0 (free tier) |
| jsdelivr CDN (model files) | $0 (free) |
| AI processing | $0 (runs on user's device) |
| Domain | ~$12/year |
| Total | ~$1/month |
Compare that to running TensorFlow Serving on a GPU instance: $50-500/month depending on traffic. The browser-based approach is essentially free at any scale.
Tradeoffs
It's not all sunshine:
- Model size — 5MB is acceptable but not tiny. Users on slow connections notice.
- Device dependent — Old phones or low-end laptops will be slower.
- Model limitations — Browser-compatible models are smaller and less accurate than server-side alternatives.
- No training — You can't collect data to improve models (which is the privacy point, but it limits improvement).
For this use case — entertainment-grade face analysis — the tradeoffs are worth it. For medical imaging or security applications, you'd want server-side processing.
What I'd Do Differently
- WebAssembly — ONNX Runtime Web or TFLite WASM would be faster than pure JS inference
- WebGPU — When browser support improves, GPU acceleration will close the gap with server-side
- Progressive loading — Load the smallest model first for instant basic results, then upgrade
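The progressive-loading idea could start with something as simple as picking a model tier from the browser's connection estimate (`navigator.connection.downlink`, where supported), serving the tiny model immediately, and upgrading in the background. The tier names and the 2 Mbps cutoff are assumptions for illustration:

```javascript
// Choose a model tier from an estimated downlink in Mbps.
// The 2 Mbps cutoff is an assumed threshold, not a measured one.
function chooseModelTier(downlinkMbps) {
  if (!Number.isFinite(downlinkMbps) || downlinkMbps <= 0) return 'tiny'
  return downlinkMbps < 2 ? 'tiny' : 'full'
}

// Usage sketch in the browser (navigator.connection is not universal,
// so fall back to the tiny tier when it's unavailable):
// const downlink = navigator.connection?.downlink ?? 0
// loadModels(chooseModelTier(downlink))
```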
Try It
- Full tools: realsmile.online — smile analyzer, golden ratio test, face score, photo ranker
- Embeddable widgets: realsmile.online/widget — add AI face analysis to your site with one line of code
- All tools are free, no signup required, and 100% private
If you're building privacy-sensitive tools, consider browser-based ML. The models are good enough for many use cases, the cost savings are dramatic, and users genuinely appreciate knowing their data stays on their device.
Happy to answer questions in the comments!