DEV Community

Duron Epps

How I Built a Local ONNX AI Detector Inside a Chrome MV3 Extension

Chrome's Manifest V3 killed a lot of the tricks extension developers relied on. No persistent background pages. No remote code execution. Strict CSP. And if you want to run a machine learning model inside a service worker, have fun, because nothing in the docs tells you how.

Here's how I got ONNX Runtime Web running inside a Chrome MV3 service worker for Faux Spy, an AI image detection extension.

Why Local Inference?

The naive approach: send every image to an API, get a result back. That works, but it's slow (network round trip), costs money for every scan, and fails when the API is down.

Our approach: run a local ONNX model as a pre-filter. If the model is confident (AI probability above 88% or below 12%), we return its result immediately. Only ambiguous cases go to the API. This reduces API calls by ~60% and makes obvious cases instant.
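
The gating rule can be sketched as a small pure function. The 0.88 and 0.12 cutoffs come from the thresholds above; the function name and the shape of the return value are hypothetical, not the extension's actual code.

```javascript
// Hypothetical helper: decide whether the local result is confident enough
// to return directly, or whether the image should go to the remote API.
const AI_THRESHOLD = 0.88;    // above this, treat the image as AI-generated
const HUMAN_THRESHOLD = 0.12; // below this, treat the image as human-made

function routeScan(aiProbability) {
  if (aiProbability > AI_THRESHOLD) {
    return { verdict: 'ai', source: 'local' };
  }
  if (aiProbability < HUMAN_THRESHOLD) {
    return { verdict: 'human', source: 'local' };
  }
  // Ambiguous score: defer to the API for a final answer.
  return { verdict: null, source: 'api' };
}
```

Keeping the thresholds as named constants makes it easy to tune how many scans fall through to the API.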

The Stack

  • ONNX Runtime Web (ort.min.js): runs ONNX models in browser JS
  • SMOGY AI Image Detector: a 50MB binary classifier (class 0 = artificial, class 1 = human)
  • Chrome Cache API: caches the model file so it only downloads once
  • OffscreenCanvas: renders images in the service worker without DOM
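
The download-once behavior from the Cache API bullet might look roughly like this. It is a minimal sketch: the cache name, the function name, and the injectable `deps` object (there only so the helper can be exercised outside a browser) are all assumptions.

```javascript
// Sketch: fetch the model once, then serve it from the Cache API.
// `deps` and the default cache name are illustrative, not from the extension.
async function loadModelBytes(modelUrl, deps = {}) {
  const cacheStorage = deps.cacheStorage || globalThis.caches;
  const fetchFn = deps.fetchFn || ((url) => fetch(url));
  const cacheName = deps.cacheName || 'model-cache-v1';

  const cache = await cacheStorage.open(cacheName);
  let response = await cache.match(modelUrl);
  if (!response) {
    response = await fetchFn(modelUrl);
    if (!response.ok) {
      throw new Error(`Model download failed: ${response.status}`);
    }
    // Store a clone; the original body is consumed below.
    await cache.put(modelUrl, response.clone());
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

Bumping the cache name is one simple way to force a fresh download when the model file changes.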

The Hard Parts

  1. Loading the model in a service worker

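Service workers can't load ORT and its WASM files the way regular pages can, and importScripts is unavailable once the service worker is a module. We had to import ORT as a module (which requires "type": "module" on the background entry in manifest.json) and configure the WASM binary path explicitly: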

import * as ort from './lib/ort.min.js';
ort.env.wasm.wasmPaths = chrome.runtime.getURL('lib/');
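
Once ORT is importable, the session itself can be created once and reused, which is where the first-run loading cost goes. This is a sketch only: `getSession` is a hypothetical helper, `ort` is passed in as a parameter so the snippet does not assume a particular import style, and the choice of the `wasm` execution provider is an assumption.

```javascript
// Sketch: create the inference session once and reuse it across scans.
let sessionPromise = null;

function getSession(ort, modelBytes) {
  if (!sessionPromise) {
    sessionPromise = ort.InferenceSession.create(modelBytes, {
      executionProviders: ['wasm'],
    }).catch((err) => {
      sessionPromise = null; // allow a retry after a failed load
      throw err;
    });
  }
  return sessionPromise;
}
```

Caching the promise rather than the resolved session means two scans that arrive at the same time share a single load.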

  2. Getting image pixels without a DOM canvas

Service workers have no DOM, so there is no <canvas> element to draw on. We use OffscreenCanvas instead:

const bitmap = await createImageBitmap(blob, { resizeWidth: 224, resizeHeight: 224 });
const canvas = new OffscreenCanvas(224, 224);
const ctx = canvas.getContext('2d');
ctx.drawImage(bitmap, 0, 0);
const imageData = ctx.getImageData(0, 0, 224, 224);

  3. ImageNet normalization in CHW format

Most ONNX vision models expect CHW (channels-height-width) format with ImageNet normalization. Converting from the interleaved RGBA pixel array returned by getImageData:

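const pixels = imageData.data; // interleaved RGBA bytes from getImageData
const mean = [0.485, 0.456, 0.406]; // ImageNet per-channel mean (R, G, B)
const std = [0.229, 0.224, 0.225]; // ImageNet per-channel std (R, G, B)
const planeSize = 224 * 224;
const float32 = new Float32Array(3 * planeSize);

for (let i = 0; i < planeSize; i++) {
  // Scale each byte to [0, 1], normalize, and write it into its channel plane.
  float32[i] = (pixels[i * 4] / 255 - mean[0]) / std[0]; // R
  float32[i + planeSize] = (pixels[i * 4 + 1] / 255 - mean[1]) / std[1]; // G
  float32[i + planeSize * 2] = (pixels[i * 4 + 2] / 255 - mean[2]) / std[2]; // B
}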
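
From there, the buffer still has to be wrapped in a tensor and passed to the session. A rough sketch follows. `runModel` is a hypothetical helper, the batch dimension of 1 is an assumption, and the input and output names are read from the session because the model's actual names are not given here.

```javascript
// Sketch: wrap the CHW buffer in a tensor and run the model.
async function runModel(ort, session, float32) {
  // Shape is [batch, channels, height, width].
  const input = new ort.Tensor('float32', float32, [1, 3, 224, 224]);
  const outputs = await session.run({ [session.inputNames[0]]: input });
  const logits = outputs[session.outputNames[0]].data;
  return Array.from(logits);
}
```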

  4. Label order: a bug that cost us a week

The SMOGY model outputs [artificial_logit, human_logit]: class 0 is AI, class 1 is real. We had them swapped. Every real image was getting "Definitely Faux" and every AI image was "No AI Detected." Check your label order before shipping.
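
One way to guard against this kind of swap is to keep the label order in a single place and look up the index by name. The sketch below assumes the two logits are turned into probabilities with a softmax, which is not stated above; the helper names are hypothetical.

```javascript
// Label order as described above: index 0 = artificial, index 1 = human.
const LABELS = ['artificial', 'human'];

// Numerically stable softmax over a small array of logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Probability that the image is AI-generated, looked up by label name.
function aiProbability(logits) {
  return softmax(logits)[LABELS.indexOf('artificial')];
}
```

A quick sanity check before shipping is to run a handful of known AI images and known real photos through `aiProbability` and confirm the scores land on the expected side of 0.5.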

The Result

Local inference adds ~200ms for the first run (model loading) and ~50ms for subsequent runs. The model achieves ~85% accuracy on easy cases, which is enough to filter before the API handles the hard ones. End-to-end for cached scans: under 100ms.

The extension is on the Chrome Web Store as Faux Spy. I'm happy to answer questions about the detection architecture, so drop a comment if you're building something similar.
