How I built a browser-only face-rating app with Next.js + MediaPipe (no upload, $0 per scan)

#computervision #nextjs #privacy #showdev

Most "rate my face" tools do one of two sketchy things: they upload your selfie to a server, or they bill an AI vision API on every single scan. I wanted neither. So when I built PSLRate, the rule was: the score has to be computed entirely in the browser — no upload, no per-scan cost.

Turns out the part of facial-attractiveness scoring that's actually reproducible — geometry — maps perfectly to that constraint. Here's how the whole thing runs client-side, and where I drew the line between "free and infinite" and "costs money."

The key insight: geometry is client-side, opinions are server-side

When people say a face is "harmonious," a big chunk of it is measurable proportion: symmetry, the rule of facial thirds, the rule of fifths, eye tilt, facial width-to-height ratio. None of that needs a model on a server — it's just math on landmark coordinates.

So I split the product in two:

	Free tier	Paid tier
Where it runs	Browser (your device)	Server
What it does	Facial geometry → a 0–10 score + per-feature breakdown	An LLM writes a personality/"glow-up" reading
Cost per use	$0	An API call
Your photo	Never leaves the browser	Sent once, on unlock

The free tier being pure geometry is what makes it infinitely free. There's no API meter ticking. That single architectural decision is the whole trick.

Step 1: 478 face landmarks, in the browser

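MediaPipe Tasks Vision ships a FaceLandmarker that runs as WebAssembly in the browser (WebGL acceleration is available through the optional GPU delegate, which the snippet here does not enable) and returns 478 3D landmarks per face, entirely client-side. I self-host the model + wasm so there's no third-party CDN call (and it loads fine from regions where Google's CDN is flaky):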

import { FaceLandmarker, FilesetResolver } from "@mediapipe/tasks-vision";

const vision = await FilesetResolver.forVisionTasks("/mediapipe/wasm");
const landmarker = await FaceLandmarker.createFromOptions(vision, {
  baseOptions: { modelAssetPath: "/mediapipe/face_landmarker.task" },
  runningMode: "IMAGE",
  numFaces: 1,
});

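const result = landmarker.detect(imageElement);
// 478 points; x and y are normalized to 0..1 by image width and height, z is relative depth
const pts = result.faceLandmarks[0];
if (!pts) throw new Error("No face detected"); // faceLandmarks is empty when no face is found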

That pts array is everything I need. The image element is read straight from an <input type="file"> and a canvas — it never gets POSTed anywhere.
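
One detail matters for the metrics: MediaPipe normalizes x by the image width and y by the image height, so on a non-square photo raw landmark values distort any angle or width-to-height ratio. A minimal sketch of converting to pixel space first; `toPixels` and the `Landmark`/`Point` types are hypothetical names for illustration, not part of MediaPipe or the app:

```typescript
// Hypothetical helper (not a MediaPipe API): scale normalized landmarks to pixels.
type Landmark = { x: number; y: number; z: number };
type Point = { x: number; y: number };

function toPixels(pts: Landmark[], width: number, height: number): Point[] {
  // x is scaled by image width, y by image height; z is dropped for 2D metrics.
  return pts.map((p) => ({ x: p.x * width, y: p.y * height }));
}

// Usage with one made-up landmark on an 800x600 image.
const sample: Landmark[] = [{ x: 0.5, y: 0.25, z: -0.01 }];
const px = toPixels(sample, 800, 600);
console.log(px[0]); // { x: 400, y: 150 }
```

Metrics that only compare distances along one axis (like thirds or fifths) are unaffected, because the common scale factor cancels out.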

Step 2: turn coordinates into metrics

Every metric is "how close is this to an ideal, within a tolerance." So the core helper is just a clamped closeness function:

const clamp01 = (x: number) => Math.max(0, Math.min(1, x));
// 1.0 = exactly ideal, 0.0 = a full tolerance away or worse
const closeness = (deviation: number, tolerance: number) =>
  clamp01(1 - Math.abs(deviation) / tolerance);

Symmetry — reflect the left-side landmarks across the face's vertical midline and measure how far they miss their right-side partners:

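// MIRROR_PAIRS ([leftIndex, rightIndex][]) and SYMMETRY_TOL are constants not shown in this post
function symmetry(pts: { x: number; y: number; z: number }[]): number {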
  const midX = (pts[33].x + pts[263].x) / 2; // midpoint between outer eye corners
  let totalDev = 0;
  for (const [l, r] of MIRROR_PAIRS) {
    const reflectedLeftX = 2 * midX - pts[l].x;
    totalDev += Math.abs(reflectedLeftX - pts[r].x);
  }
  return closeness(totalDev / MIRROR_PAIRS.length, SYMMETRY_TOL);
}
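
The post doesn't list the mirror pairs or the tolerance, so here is a self-contained sketch of one possible setup. The pair indices, the tolerance value, and the division by eye span (so the score doesn't change with how large the face appears in the frame) are all assumptions for illustration, not necessarily what PSLRate uses:

```typescript
type Landmark = { x: number; y: number; z: number };

const clamp01 = (x: number) => Math.max(0, Math.min(1, x));
const closeness = (deviation: number, tolerance: number) =>
  clamp01(1 - Math.abs(deviation) / tolerance);

// Assumed landmark pairs (image-left index, image-right index).
const MIRROR_PAIRS: [number, number][] = [
  [133, 362], // inner eye corners
  [61, 291],  // mouth corners
  [234, 454], // face edges near the cheekbones
];
const SYMMETRY_TOL = 0.1; // assumed: a mean miss of 10% of the eye span scores 0

function symmetry(pts: Landmark[]): number {
  const midX = (pts[33].x + pts[263].x) / 2; // midline from outer eye corners
  const eyeSpan = Math.abs(pts[263].x - pts[33].x); // scale reference
  if (eyeSpan === 0) return 0;
  let totalDev = 0;
  for (const [l, r] of MIRROR_PAIRS) {
    const reflectedLeftX = 2 * midX - pts[l].x;
    totalDev += Math.abs(reflectedLeftX - pts[r].x);
  }
  return closeness(totalDev / MIRROR_PAIRS.length / eyeSpan, SYMMETRY_TOL);
}

// Synthetic face: 478 blank points with only the x values we need filled in.
function makeFace(xs: { [index: number]: number }): Landmark[] {
  const pts: Landmark[] = [];
  for (let i = 0; i < 478; i++) pts.push({ x: 0, y: 0, z: 0 });
  Object.keys(xs).forEach((k) => { pts[Number(k)].x = xs[Number(k)]; });
  return pts;
}

const mirrored = makeFace({ 33: 0.25, 263: 0.75, 133: 0.375, 362: 0.625, 61: 0.375, 291: 0.625, 234: 0.125, 454: 0.875 });
const skewed = makeFace({ 33: 0.25, 263: 0.75, 133: 0.375, 362: 0.625, 61: 0.375, 291: 0.625, 234: 0.125, 454: 0.9375 });
const mirroredScore = symmetry(mirrored); // perfectly mirrored input
const skewedScore = symmetry(skewed);     // one edge pushed outward, so lower
console.log(mirroredScore, skewedScore);
```

Note that this only measures horizontal mismatch and assumes the head is not rolled; a tilted head would shift the midline and lower the score for reasons unrelated to the face itself.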

Facial thirds — forehead-to-brow, brow-to-nose-base, nose-base-to-chin should be roughly equal. Score the deviation from a perfect 1:1:1.
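
A sketch of how the thirds score could be computed, reusing the closeness helper. The landmark indices (10 for the top of the mesh, 9 between the brows, 2 at the nose base, 152 at the chin) and the 0.15 tolerance are assumptions for illustration; the mesh also appears to stop short of the hairline, so the upper third is likely underestimated this way:

```typescript
type Landmark = { x: number; y: number; z: number };

const clamp01 = (x: number) => Math.max(0, Math.min(1, x));
const closeness = (deviation: number, tolerance: number) =>
  clamp01(1 - Math.abs(deviation) / tolerance);

// Assumed indices: 10 forehead top, 9 between brows, 2 nose base, 152 chin.
function thirds(pts: Landmark[], tolerance = 0.15): number {
  const upper = Math.abs(pts[9].y - pts[10].y);
  const middle = Math.abs(pts[2].y - pts[9].y);
  const lower = Math.abs(pts[152].y - pts[2].y);
  const total = upper + middle + lower;
  if (total === 0) return 0;
  // Mean absolute deviation of each segment's share from the ideal 1/3.
  const meanDev =
    [upper, middle, lower]
      .map((s) => Math.abs(s / total - 1 / 3))
      .reduce((a, b) => a + b, 0) / 3;
  return closeness(meanDev, tolerance);
}

// Synthetic face with only the four y values filled in.
function faceWithY(ys: { [index: number]: number }): Landmark[] {
  const pts: Landmark[] = [];
  for (let i = 0; i < 478; i++) pts.push({ x: 0, y: 0, z: 0 });
  Object.keys(ys).forEach((k) => { pts[Number(k)].y = ys[Number(k)]; });
  return pts;
}

const evenThirds = thirds(faceWithY({ 10: 0.125, 9: 0.375, 2: 0.625, 152: 0.875 }));
const shortForehead = thirds(faceWithY({ 10: 0.125, 9: 0.25, 2: 0.625, 152: 0.875 }));
console.log(evenThirds, shortForehead); // equal segments score higher
```

Because every segment is a vertical distance, the shares are the same in normalized and pixel coordinates.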

Facial fifths — the face should be about five eye-widths wide. Score the variance across those five segments.
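
One way to score the fifths, sketched under assumptions: the six x positions come from assumed landmark indices (face edge, outer eye corner, inner eye corner, then the mirror of those), and the spread is measured as the standard deviation of the five segment shares around the ideal 0.2. The 0.08 tolerance is a placeholder:

```typescript
type Landmark = { x: number; y: number; z: number };

const clamp01 = (x: number) => Math.max(0, Math.min(1, x));
const closeness = (deviation: number, tolerance: number) =>
  clamp01(1 - Math.abs(deviation) / tolerance);

// Assumed indices, ordered left to right across the image.
const FIFTHS_INDICES = [234, 33, 133, 362, 263, 454];

function fifths(pts: Landmark[], tolerance = 0.08): number {
  const xs = FIFTHS_INDICES.map((i) => pts[i].x);
  const total = Math.abs(xs[5] - xs[0]);
  if (total === 0) return 0;
  let sumSq = 0;
  for (let i = 0; i < 5; i++) {
    const share = Math.abs(xs[i + 1] - xs[i]) / total;
    sumSq += (share - 0.2) * (share - 0.2);
  }
  const stdDev = Math.sqrt(sumSq / 5); // spread of shares around 1/5
  return closeness(stdDev, tolerance);
}

// Synthetic face with x values assigned to the six indices in order.
function faceWithX(values: number[]): Landmark[] {
  const pts: Landmark[] = [];
  for (let i = 0; i < 478; i++) pts.push({ x: 0, y: 0, z: 0 });
  FIFTHS_INDICES.forEach((idx, n) => { pts[idx].x = values[n]; });
  return pts;
}

const evenFifths = fifths(faceWithX([0.125, 0.25, 0.375, 0.5, 0.625, 0.75]));
const unevenFifths = fifths(faceWithX([0.125, 0.3125, 0.4375, 0.5625, 0.6875, 0.75]));
console.log(evenFifths, unevenFifths); // equal segments score higher
```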

Canthal tilt — the angle from inner to outer eye corner (a positive/upward tilt reads as "more attractive" in most studies). Computed per eye, then I penalize asymmetry between the two.
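
A rough sketch of the tilt calculation. Angles need pixel-space coordinates, since normalized x and y are scaled differently on non-square images. The landmark indices, the ideal angle, both tolerances, and the way asymmetry is folded in are all placeholders chosen for illustration, not values from the app:

```typescript
type Landmark = { x: number; y: number; z: number };

const clamp01 = (x: number) => Math.max(0, Math.min(1, x));
const closeness = (deviation: number, tolerance: number) =>
  clamp01(1 - Math.abs(deviation) / tolerance);

// Tilt of one eye in degrees; positive means the outer corner sits higher.
function eyeTiltDeg(inner: Landmark, outer: Landmark, width: number, height: number): number {
  const dx = Math.abs(outer.x - inner.x) * width; // horizontal run in pixels
  const dy = (inner.y - outer.y) * height;        // image y grows downward
  return (Math.atan2(dy, dx) * 180) / Math.PI;
}

// All default values are placeholders, not numbers from the post.
function canthalTilt(
  pts: Landmark[],
  width: number,
  height: number,
  idealDeg = 5,
  tolDeg = 10,
  asymTolDeg = 5
): number {
  const a = eyeTiltDeg(pts[133], pts[33], width, height);  // assumed indices, one eye
  const b = eyeTiltDeg(pts[362], pts[263], width, height); // assumed indices, other eye
  const base = closeness((a + b) / 2 - idealDeg, tolDeg);
  const evenness = closeness(a - b, asymTolDeg); // 1 when both eyes match
  return base * (0.5 + 0.5 * evenness); // asymmetry can cost up to half the score
}

const innerCorner = { x: 0.5, y: 0.5, z: 0 };
const outerCorner = { x: 0.25, y: 0.25, z: 0 };
const squareTilt = eyeTiltDeg(innerCorner, outerCorner, 100, 100); // about 45 degrees
const wideTilt = eyeTiltDeg(innerCorner, outerCorner, 200, 100);   // about 26.57 degrees
console.log(squareTilt, wideTilt);
```

The two printed angles come from the same normalized points, which is why the image dimensions are passed in. A rolled head would bias both eyes in opposite directions, so subtracting the angle of the line between the eyes first is probably worth considering.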

FWHR — facial width-to-height ratio, bizygomatic width over upper-face height.
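
A sketch of the ratio in pixel space. The indices are assumptions (234 and 454 as the face edges near the cheekbones, 9 between the brows, 0 at the top of the upper lip), and the ideal ratio and tolerance are left as caller-supplied parameters because the post does not state them:

```typescript
type Landmark = { x: number; y: number; z: number };

const clamp01 = (x: number) => Math.max(0, Math.min(1, x));
const closeness = (deviation: number, tolerance: number) =>
  clamp01(1 - Math.abs(deviation) / tolerance);

// Width over upper-face height, both measured in pixels.
function fwhr(pts: Landmark[], width: number, height: number): number {
  const faceWidth = Math.abs(pts[454].x - pts[234].x) * width;
  const upperFaceHeight = Math.abs(pts[0].y - pts[9].y) * height;
  return upperFaceHeight === 0 ? 0 : faceWidth / upperFaceHeight;
}

// idealRatio and tolerance are hypothetical inputs, not values from the post.
const fwhrScore = (ratio: number, idealRatio: number, tolerance: number) =>
  closeness(ratio - idealRatio, tolerance);

// Synthetic face on a 400x400 image: 200 px wide, 100 px upper-face height.
const fwhrFace: Landmark[] = [];
for (let i = 0; i < 478; i++) fwhrFace.push({ x: 0, y: 0, z: 0 });
fwhrFace[234].x = 0.25;
fwhrFace[454].x = 0.75;
fwhrFace[9].y = 0.25;
fwhrFace[0].y = 0.5;

const ratio = fwhr(fwhrFace, 400, 400);
console.log(ratio); // 2
console.log(fwhrScore(ratio, 1.75, 0.5)); // 0.5 with these made-up targets
```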

Step 3: weight them into one score

Different metrics matter differently, so it's a weighted blend, then scaled to 0–10:

const dims = [
  { key: "symmetry", weight: 0.30 },
  { key: "thirds",   weight: 0.25 },
  { key: "fifths",   weight: 0.20 },
  { key: "tilt",     weight: 0.15 },
  { key: "fwhr",     weight: 0.10 },
];

const score10 =
  dims.reduce((sum, d) => sum + metrics[d.key] * d.weight, 0) * 10;
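
As a worked example of the blend, here is a typed version with made-up metric values. The weights are the ones from the post; the `MetricKey` type, the `score10For` wrapper, and the sample numbers are illustrative only:

```typescript
type MetricKey = "symmetry" | "thirds" | "fifths" | "tilt" | "fwhr";

const dims: { key: MetricKey; weight: number }[] = [
  { key: "symmetry", weight: 0.30 },
  { key: "thirds",   weight: 0.25 },
  { key: "fifths",   weight: 0.20 },
  { key: "tilt",     weight: 0.15 },
  { key: "fwhr",     weight: 0.10 },
];

// Weighted sum of per-metric scores (each in 0..1), scaled to 0..10.
function score10For(metrics: Record<MetricKey, number>): number {
  return dims.reduce((sum, d) => sum + metrics[d.key] * d.weight, 0) * 10;
}

// The weights should sum to 1 so a perfect face maps to exactly 10.
const weightSum = dims.reduce((sum, d) => sum + d.weight, 0);

// Made-up metric values for illustration.
const sampleMetrics: Record<MetricKey, number> = {
  symmetry: 0.9,
  thirds: 0.8,
  fifths: 0.7,
  tilt: 0.6,
  fwhr: 0.5,
};

// 0.27 + 0.20 + 0.14 + 0.09 + 0.05 = 0.75, so the score is 7.5
const sampleScore = score10For(sampleMetrics);
console.log(sampleScore.toFixed(1)); // "7.5"
```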

Because it's deterministic, the same photo always yields the same score — which, for a "rate me" tool, matters a lot for trust. (Vision-LLM scores drift between runs; geometry doesn't.)

Step 4: the part that does cost money

Numbers alone are a bit cold, so the paid tier sends the photo to a vision LLM once and gets back a written reading — archetype, personality notes, and a constructive, non-surgical "glow-up" direction. That's the only path where the image leaves the device, and it's gated behind a credit so the cost is covered. I keep the prompt firmly on the "fun + self-reflection" side and refuse anything that infers race/age or pushes surgery.

Lessons

Push the deterministic, cheap work to the client. It made my free tier genuinely $0 and privacy-preserving instead of a loss leader.
Self-host MediaPipe assets. Saved me from CDN reliability issues and a third-party request on every visit.
Reproducibility builds trust more than a fancier-but-wobbly score.

If you want to see the finished thing, it's live at PSLRate: pick a selfie from your device and the geometry score comes back in a couple of seconds, all in your browser. Happy to answer questions about the MediaPipe or scoring bits in the comments.