<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jorge Mora</title>
    <description>The latest articles on DEV Community by Jorge Mora (@moraxh).</description>
    <link>https://dev.to/moraxh</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3919092%2Fe9d72805-3460-44f0-b24c-f3493e835376.jpeg</url>
      <title>DEV Community: Jorge Mora</title>
      <link>https://dev.to/moraxh</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/moraxh"/>
    <language>en</language>
    <item>
      <title>I built real-time glasses detection that runs entirely in the browser (ONNX + WebGPU)</title>
      <dc:creator>Jorge Mora</dc:creator>
      <pubDate>Fri, 08 May 2026 03:46:27 +0000</pubDate>
      <link>https://dev.to/moraxh/i-built-real-time-glasses-detection-that-runs-entirely-in-the-browser-onnx-webgpu-19a5</link>
      <guid>https://dev.to/moraxh/i-built-real-time-glasses-detection-that-runs-entirely-in-the-browser-onnx-webgpu-19a5</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5o2f78ipn13ik1vn7bce.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5o2f78ipn13ik1vn7bce.png" alt=" " width="800" height="403"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I’ve been playing with browser-based computer vision for a while, and I ended up building something I didn’t expect to feel this fast in practice.&lt;/p&gt;

&lt;p&gt;It’s called FrameFind.&lt;/p&gt;

&lt;p&gt;The first module detects whether someone is wearing glasses in real time, but the interesting part isn’t the feature itself — it’s how it runs.&lt;/p&gt;

&lt;p&gt;Everything executes locally in the browser using ONNX Runtime Web. No backend, no uploads, no API calls. Just a camera feed and a model running on-device.&lt;/p&gt;

&lt;p&gt;What surprised me most was how much it helped to stop running inference on full frames. I now use MediaPipe FaceMesh landmarks to isolate just the eye region, so the model only sees a 112x112 crop focused on the relevant area. That small change keeps things fast and stable.&lt;/p&gt;
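
&lt;p&gt;For illustration, the eye-region crop could be computed roughly like this. It is a minimal sketch, not the FrameFind code: the &lt;code&gt;Point&lt;/code&gt; and &lt;code&gt;Rect&lt;/code&gt; types, the &lt;code&gt;eyeRegion&lt;/code&gt; helper, and the 25% padding are all assumptions, and it expects the caller to pass whichever normalized eye landmarks it picked from the FaceMesh output.&lt;/p&gt;

```typescript
// Hypothetical types and helper for illustration; not taken from the FrameFind source.
interface Point { x: number; y: number }               // landmark, normalized to [0, 1]
interface Rect { x: number; y: number; size: number }  // square crop, in pixels

// Build a square crop around a set of eye landmarks, padded and clamped to the frame.
function eyeRegion(points: Point[], frameW: number, frameH: number, padding = 0.25): Rect {
  const xs = points.map((p) => p.x * frameW);
  const ys = points.map((p) => p.y * frameH);
  const minX = Math.min(...xs);
  const maxX = Math.max(...xs);
  const minY = Math.min(...ys);
  const maxY = Math.max(...ys);

  // Use the longer side so the crop stays square, then pad it on every side.
  const side = Math.max(maxX - minX, maxY - minY) * (1 + 2 * padding);
  const size = Math.round(Math.min(side, frameW, frameH));

  // Center the square on the landmarks and clamp it so it never leaves the frame.
  const cx = (minX + maxX) / 2;
  const cy = (minY + maxY) / 2;
  const x = Math.round(Math.min(Math.max(cx - size / 2, 0), frameW - size));
  const y = Math.round(Math.min(Math.max(cy - size / 2, 0), frameH - size));
  return { x, y, size };
}
```

&lt;p&gt;The resulting square can then be drawn into a 112x112 canvas with the nine-argument form of &lt;code&gt;drawImage&lt;/code&gt;, which resizes the crop to the model input size in the same call.&lt;/p&gt;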

&lt;p&gt;The current model is around 6.2 MB and takes roughly 27 ms per inference on my machine. It’s small enough that it loads quickly and can be cached for near-instant startup on repeat visits.&lt;/p&gt;

&lt;p&gt;The pipeline ends up looking something like:&lt;/p&gt;

&lt;p&gt;FaceMesh → eye ROI crop → tensor normalization → ONNX inference → smoothing over time&lt;/p&gt;
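
&lt;p&gt;As a rough sketch of the tensor normalization step, the function below turns RGBA canvas pixels into a planar float array. The NCHW layout, the RGB channel order, and the 0.5 mean and standard deviation are assumptions made for the example; the real values depend on how the model was trained.&lt;/p&gt;

```typescript
// Convert RGBA pixels (the layout of ImageData.data) into planar float data.
// Assumptions: channel-first (NCHW) output, RGB order, per-channel mean/std of 0.5.
function toTensorData(
  rgba: Uint8ClampedArray,
  width: number,
  height: number,
  mean: number[] = [0.5, 0.5, 0.5],
  std: number[] = [0.5, 0.5, 0.5],
): Float32Array {
  const plane = width * height;
  const out = new Float32Array(3 * plane);
  for (let i = 0; i !== plane; i++) {
    for (let c = 0; c !== 3; c++) {
      // Scale to [0, 1], then standardize; the alpha byte at offset 3 is skipped.
      out[c * plane + i] = (rgba[i * 4 + c] / 255 - mean[c]) / std[c];
    }
  }
  return out;
}
```

&lt;p&gt;With ONNX Runtime Web, the returned array can be wrapped as &lt;code&gt;new ort.Tensor("float32", data, [1, 3, 112, 112])&lt;/code&gt; before it is passed to the session.&lt;/p&gt;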

&lt;p&gt;Smoothing was necessary because raw predictions flicker a bit frame-to-frame, especially when lighting changes or the face is partially occluded.&lt;/p&gt;
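
&lt;p&gt;One common way to do this kind of smoothing is an exponential moving average with two thresholds, so the label only flips once the average clearly crosses a boundary. The sketch below shows that technique; the class name, the default weights, and the thresholds are assumptions, and the smoothing FrameFind actually uses may differ.&lt;/p&gt;

```typescript
// Hypothetical smoother: exponential moving average plus on/off thresholds (hysteresis).
class PredictionSmoother {
  private ema: number | null = null;
  private state = false;
  private alpha: number;    // weight given to the newest frame
  private onAbove: number;  // switch to "glasses" when the average rises above this
  private offBelow: number; // switch back when the average drops below this

  constructor(alpha = 0.3, onAbove = 0.6, offBelow = 0.4) {
    this.alpha = alpha;
    this.onAbove = onAbove;
    this.offBelow = offBelow;
  }

  // Feed one per-frame probability and get the stabilized decision back.
  update(probability: number): boolean {
    this.ema = this.ema === null
      ? probability
      : this.alpha * probability + (1 - this.alpha) * this.ema;
    if (this.ema > this.onAbove) {
      this.state = true;
    } else if (this.offBelow > this.ema) {
      this.state = false;
    }
    // Between the two thresholds the previous decision is kept.
    return this.state;
  }
}
```

&lt;p&gt;A single noisy frame moves the average only part of the way, so the reported state changes only after several frames agree.&lt;/p&gt;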

&lt;p&gt;The stack behind it is fairly simple:&lt;br&gt;
ONNX Runtime Web for inference, MediaPipe for landmarks, and optional WebGPU acceleration depending on the environment. It also falls back gracefully when WebGPU isn’t available.&lt;/p&gt;
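
&lt;p&gt;A fallback like this can be expressed as an ordered list of execution providers. The sketch below assumes the fallback target is the WASM backend, which the post does not state, and both helper names are hypothetical.&lt;/p&gt;

```typescript
// Feature check that also works outside a browser (for example in Node.js).
function detectWebGpu(): boolean {
  const nav = (globalThis as unknown as { navigator?: object }).navigator;
  return nav !== undefined ? "gpu" in nav : false;
}

// Prefer WebGPU when it is available; otherwise use WASM only (assumed fallback).
function pickExecutionProviders(hasWebGpu: boolean): string[] {
  return hasWebGpu ? ["webgpu", "wasm"] : ["wasm"];
}

const executionProviders = pickExecutionProviders(detectWebGpu());
```

&lt;p&gt;The list can be passed as the &lt;code&gt;executionProviders&lt;/code&gt; session option of &lt;code&gt;ort.InferenceSession.create&lt;/code&gt;, which takes the providers in order of preference.&lt;/p&gt;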

&lt;p&gt;I built a React hook on top of it because I wanted something you could drop into a UI without thinking too much about the underlying pipeline. There’s also a Node.js version for server-side image processing, but the browser version is the main focus.&lt;/p&gt;

&lt;p&gt;What I’m trying to explore with this project is less “glasses detection” and more whether small, specialized vision models can make real-time UI interactions more practical in the browser.&lt;/p&gt;

&lt;p&gt;Instead of sending frames to a server or relying on heavy cloud APIs, the idea is that more of this kind of computation can just live inside the client.&lt;/p&gt;

&lt;p&gt;There are obvious tradeoffs, but the latency and privacy advantages are hard to ignore when everything stays on-device.&lt;/p&gt;

&lt;p&gt;Live demo:&lt;br&gt;
&lt;a href="https://framefind.moraxh.dev/" rel="noopener noreferrer"&gt;https://framefind.moraxh.dev/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Source:&lt;br&gt;
&lt;a href="https://github.com/moraxh/FrameFind" rel="noopener noreferrer"&gt;https://github.com/moraxh/FrameFind&lt;/a&gt;&lt;/p&gt;

</description>
      <category>onnx</category>
      <category>machinelearning</category>
      <category>browserai</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
