alphadisjunkt
I Built AI Face Analysis That Runs Entirely in the Browser — Here's How (Zero Server Costs)

Every face analysis tool on the internet works the same way: upload your photo to a server, wait for results, hope they don't store your image.

I wanted to build something different — an AI face analyzer that runs entirely in the browser. No server processing, no cloud uploads, no data collection. Your photos never leave your device.

Here's how I built RealSmile, what I learned, and why browser-based AI is the future for privacy-sensitive tools.

The Problem With Server-Side Face Analysis

Most face analysis apps follow this flow:

  1. User uploads photo
  2. Photo sent to server
  3. Server runs ML model (TensorFlow, PyTorch, etc.)
  4. Results sent back
  5. Photo sits on server... forever?

This creates three problems:

  • Privacy — users don't know what happens to their photos
  • Cost — GPU instances are expensive ($50-500/mo depending on traffic)
  • Latency — upload time + processing time + download time

The Browser-Based Alternative

What if the AI model ran on the user's device instead?

User uploads photo → Browser loads ML model → Processing on user's CPU/GPU → Results displayed

No round trip. No server. No privacy concerns. And crucially — no compute costs for me.

The Tech Stack

Here's what powers RealSmile:

  • face-api.js (@vladmandic/face-api) — Pre-trained models for face detection, landmark detection, and expression recognition
  • Next.js — React framework for the app shell
  • 68-point facial landmark detection — Maps facial geometry for proportion analysis
  • TinyFaceDetector — Lightweight model optimized for browser performance

The key insight: face-api.js models are ~5MB total and load from a CDN (jsdelivr). Once loaded, all processing happens in the browser's JavaScript engine.

Loading Models Efficiently

The biggest UX challenge is model loading. 5MB is nothing for a server, but noticeable for a first-time user. Here's how I handle it:

let modelsLoaded = false
let modelsLoading = false

async function ensureModels() {
  if (modelsLoaded) return // Already loaded, instant
  if (modelsLoading) {
    // Another call is loading, wait for it
    while (modelsLoading) {
      await new Promise(r => setTimeout(r, 200))
    }
    return
  }

  modelsLoading = true
  const MODEL_URL = 'https://cdn.jsdelivr.net/npm/@vladmandic/face-api/model'

  try {
    await Promise.all([
      faceapi.nets.tinyFaceDetector.loadFromUri(MODEL_URL),
      faceapi.nets.faceLandmark68Net.loadFromUri(MODEL_URL),
      faceapi.nets.faceExpressionNet.loadFromUri(MODEL_URL),
    ])
    modelsLoaded = true
  } finally {
    // Reset even if loading fails, so waiting callers don't spin forever
    modelsLoading = false
  }
}

Key decisions:

  • Lazy loading — models only load when the user actually uploads a photo
  • Singleton pattern — prevents duplicate loading if multiple components request models
  • CDN delivery — jsdelivr caches globally, so load times are fast worldwide

Smile Detection: The Duchenne Method

One of RealSmile's features is detecting whether a smile is genuine. This is based on the Duchenne smile — Guillaume Duchenne's 19th-century observation that genuine smiles activate the muscles around the eyes (orbicularis oculi), not just the mouth.

const detection = await faceapi
  .detectSingleFace(canvas, new faceapi.TinyFaceDetectorOptions({
    inputSize: 512,
    scoreThreshold: 0.3
  }))
  .withFaceLandmarks()
  .withFaceExpressions()

// detectSingleFace resolves to undefined when no face is found
if (!detection) throw new Error('No face detected')

const expressions = detection.expressions
const happyScore = expressions.happy || 0
const isGenuine = happyScore > 0.6

The expression model returns probability scores for 7 emotions. A high "happy" score combined with landmark analysis around the eye region gives a reasonable approximation of Duchenne smile detection.
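The eye-region part isn't shown above, but the idea can be sketched with the eye aspect ratio (EAR), a standard measure of eye openness computed from the six landmarks per eye (points 36-41 and 42-47 in the 68-point scheme). The 0.25 threshold and function names here are illustrative choices, not values from RealSmile's code:

```javascript
// Euclidean distance between two landmark points
function dist(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y)
}

// EAR = (|p1-p5| + |p2-p4|) / (2 * |p0-p3|) for the six eye points p0..p5.
// Lower values mean a narrower eye.
function eyeAspectRatio(eye) {
  const vertical = dist(eye[1], eye[5]) + dist(eye[2], eye[4])
  const horizontal = 2 * dist(eye[0], eye[3])
  return vertical / horizontal
}

// A Duchenne smile narrows the eyes slightly: a high "happy" probability
// combined with a reduced EAR suggests the orbicularis oculi is engaged.
// Both thresholds are illustrative and would need tuning per model.
function isDuchenne(happyScore, leftEar, rightEar) {
  const avgEar = (leftEar + rightEar) / 2
  return happyScore > 0.6 && avgEar < 0.25
}
```

The appeal of this approach is that it needs no extra model — the 68 landmarks you already have carry enough geometry to estimate eye narrowing.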

Golden Ratio Face Analysis

The more interesting technical challenge was measuring facial proportions against the golden ratio (φ = 1.618).

With 68 facial landmarks, you can calculate ratios between key measurements:

const PHI = 1.618

// Euclidean distance between two landmark points
const distance = (a, b) => Math.hypot(a.x - b.x, a.y - b.y)

// Key landmark points — face-api.js exposes the 68 points as
// detection.landmarks.positions
const points = detection.landmarks.positions
const faceTop = points[19]     // Brow line
const chin = points[8]         // Bottom of chin
const leftFace = points[0]     // Left jawline
const rightFace = points[16]   // Right jawline

const faceHeight = distance(faceTop, chin)
const faceWidth = distance(leftFace, rightFace)

// How close is this ratio to the golden ratio?
const ratio = faceHeight / faceWidth
const score = Math.max(0, 1 - Math.abs(ratio - PHI) / PHI) * 100

I measure 6 ratios total: face height/width, eye spacing, mouth/nose ratio, nose width, eye width, and nose shape. The overall score is an average of how close each ratio is to its ideal phi-based value.
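The averaging step can be sketched like this — `ratioScore` generalizes the formula above to any ideal value, since not every measurement targets φ itself. The helper names and the shape of the `ratios` input are my own, not RealSmile's exact code:

```javascript
const PHI = 1.618

// Score one measured ratio by its relative distance to an ideal value,
// mapped to 0-100 (same formula as the height/width example above)
function ratioScore(measured, ideal = PHI) {
  return Math.max(0, 1 - Math.abs(measured - ideal) / ideal) * 100
}

// Overall score: plain average of the per-ratio scores.
// Each entry supplies its own ideal; omitted ideals default to PHI.
function overallScore(ratios) {
  const scores = ratios.map(({ measured, ideal }) => ratioScore(measured, ideal))
  return scores.reduce((sum, s) => sum + s, 0) / scores.length
}
```

A perfect match on every ratio yields 100; a ratio twice (or half of zero distance from) its ideal bottoms out at 0, so no single bad measurement can push the average negative.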

Making It Embeddable

The architecture that makes this free and private also makes it embeddable. Since there's no server dependency, anyone can add the analyzer to their site with one line of code:

<iframe
  src="https://realsmile.online/embed/smile"
  width="380"
  height="400"
  style="border:none;border-radius:16px;"
  loading="lazy"
></iframe>

The iframe loads a compact version of the analyzer. Models load from jsdelivr's CDN (not my server). Processing happens in the visitor's browser. My server just serves a static page.

This means even if 10,000 sites embed the widget, my hosting costs stay near zero.

Check out all 4 embeddable widgets at realsmile.online/widget.

Performance Numbers

On a modern laptop:

  • Model load: ~2-3 seconds (first time), instant after (cached)
  • Face detection: ~33ms
  • Full analysis (landmarks + expressions): ~80ms
  • Total from upload to results: < 1 second (once models are cached)

On mobile (iPhone 13):

  • Model load: ~4-5 seconds first time
  • Full analysis: ~150ms

Perfectly usable on both platforms without any server infrastructure.
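If you want to collect numbers like these yourself, a small timing wrapper around each stage is enough. This is a generic sketch — the `timed` helper is my own, not part of face-api.js; `performance.now()` gives sub-millisecond resolution in browsers and Node alike:

```javascript
// Wrap any async stage (detection, landmark fit, full analysis) and
// report how long it took alongside its result
async function timed(label, fn) {
  const start = performance.now()
  const result = await fn()
  const ms = performance.now() - start
  console.log(`${label}: ${ms.toFixed(1)}ms`)
  return { result, ms }
}

// Hypothetical usage:
// const { result } = await timed('full analysis', () =>
//   faceapi.detectSingleFace(canvas).withFaceLandmarks().withFaceExpressions()
// )
```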

Cost Breakdown

Here's what it costs to run:

Component                        Cost
Vercel hosting (static pages)    $0 (free tier)
jsdelivr CDN (model files)       $0 (free)
AI processing                    $0 (runs on user's device)
Domain                           ~$12/year
Total                            ~$1/month

Compare that to running TensorFlow Serving on a GPU instance: $50-500/month depending on traffic. The browser-based approach is essentially free at any scale.

Tradeoffs

It's not all sunshine:

  • Model size — 5MB is acceptable but not tiny. Users on slow connections notice.
  • Device dependent — Old phones or low-end laptops will be slower.
  • Model limitations — Browser-compatible models are smaller and less accurate than server-side alternatives.
  • No training — You can't collect data to improve models (which is the privacy point, but it limits improvement).

For this use case — entertainment-grade face analysis — the tradeoffs are worth it. For medical imaging or security applications, you'd want server-side processing.

What I'd Do Differently

  1. WebAssembly — ONNX Runtime Web or TFLite WASM would be faster than pure JS inference
  2. WebGPU — When browser support improves, GPU acceleration will close the gap with server-side
  3. Progressive loading — Load the smallest model first for instant basic results, then upgrade
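Idea 3 can be sketched as a generic pattern — `loadTiny` and `loadExtras` here are placeholders for the actual `loadFromUri` calls, and the two-tier split is an assumption about how the models would be grouped:

```javascript
// Progressive loading: resolve the small detector first so basic results
// can render immediately, then pull the heavier nets in the background.
// Returns both promises so the UI can react to each tier separately.
function progressiveLoad(loadTiny, loadExtras) {
  const basicReady = loadTiny()                 // blocks only the first result
  const fullReady = basicReady.then(loadExtras) // landmarks/expressions follow
  return { basicReady, fullReady }
}

// Hypothetical usage: show a bounding box on basicReady,
// then upgrade to full analysis once fullReady resolves.
```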

Try It

  • Full tools: realsmile.online — smile analyzer, golden ratio test, face score, photo ranker
  • Embeddable widgets: realsmile.online/widget — add AI face analysis to your site with one line of code
  • All tools are free, no signup required, and 100% private

If you're building privacy-sensitive tools, consider browser-based ML. The models are good enough for many use cases, the cost savings are dramatic, and users genuinely appreciate knowing their data stays on their device.

Happy to answer questions in the comments!
