byeval

Posted on Apr 22 • Originally published at happyimg.com

How To Auto-Detect QR Codes, Signatures, and License Plates In The Browser

#javascript #webdev #privacy #computervision

One of the easiest mistakes in privacy tooling is trying to solve every target type with one detector.

QR codes, signatures, and license plates all end up as "regions to hide," but technically they are different problems:

QR codes are machine-readable symbols
license plates are structured text with strong visual constraints
signatures are image shapes more than readable words

If you force all three through generic OCR, the output gets noisy fast: OCR is built to read text, and only one of these three targets is really text.

The companion guide for this piece is here:

https://happyimg.com/guides/how-to-auto-detect-qr-codes-signatures-and-license-plates-in-the-browser

Mixed detection works better than one universal pipeline

From the product side, these features all look related. The user wants the tool to suggest privacy-sensitive regions automatically.

From the engineering side, they need different signals.

The more useful architecture is:

multiple detector functions
one normalized region format
one editor surface
one review step before export

That keeps the interaction model simple without pretending the detection problem is simple.
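One way to picture the "one normalized region format" idea is a small helper that every detector's output passes through before it reaches the editor. This is a minimal sketch, not the article's actual schema: the function name toRegion, the kind and status fields, and the clamping behaviour are all assumptions made for illustration.

```javascript
// Hypothetical shared region shape: { kind, x, y, width, height, status }.
// Coordinates are image pixels. None of these names come from the original code.
function toRegion(kind, box, imageWidth, imageHeight) {
  // Clamp the box to the image so the editor never receives out-of-bounds regions.
  const left = Math.max(0, Math.min(box.x, imageWidth));
  const top = Math.max(0, Math.min(box.y, imageHeight));
  const right = Math.max(left, Math.min(box.x + box.width, imageWidth));
  const bottom = Math.max(top, Math.min(box.y + box.height, imageHeight));

  return {
    kind, // e.g. "qr", "plate", "signature"
    x: Math.round(left),
    y: Math.round(top),
    width: Math.round(right - left),
    height: Math.round(bottom - top),
    status: "suggested", // assumed flag: waiting for the user to accept or reject
  };
}
```

Each detector can then stay as messy as it needs to internally, as long as the last thing it does is hand back regions in this one shape.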

QR codes and barcodes: use the browser when the browser already knows

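For QR codes and barcodes, the cleanest path is usually the native BarcodeDetector API (part of the Shape Detection API) when the browser supports it. Support is not universal, so it has to be feature-detected rather than assumed.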

const detector = new BarcodeDetector({
  formats: ["qr_code", "code_128", "ean_13", "pdf417"],
});

// source can be any ImageBitmapSource, e.g. an <img>, <canvas>, ImageBitmap, or Blob.
const results = await detector.detect(source);

That gives you native symbol detection plus bounding boxes you can pad into safer blur or redaction regions.

The product lesson here is mostly about failure modes. If BarcodeDetector is unavailable, the UI should say so explicitly. Silent failure is worse than no feature because it makes the user trust an absence of results that may be false.
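The "say so explicitly" point can be sketched as a wrapper that returns a status alongside the regions, so the UI can tell "nothing found" apart from "could not look". The function name detectBarcodes, the returned object shape, and the scope parameter (there only so the sketch can be exercised outside a browser) are assumptions. BarcodeDetector.getSupportedFormats() and detect() are the standard Shape Detection API calls.

```javascript
// Hypothetical wrapper: reports "unsupported" instead of silently returning nothing.
async function detectBarcodes(source, scope = globalThis) {
  if (!("BarcodeDetector" in scope)) {
    return { status: "unsupported", regions: [] };
  }

  // Only request formats this browser can actually detect.
  const wanted = ["qr_code", "code_128", "ean_13", "pdf417"];
  const supported = await scope.BarcodeDetector.getSupportedFormats();
  const formats = wanted.filter((format) => supported.includes(format));
  if (formats.length === 0) {
    return { status: "unsupported", regions: [] };
  }

  const detector = new scope.BarcodeDetector({ formats });
  const results = await detector.detect(source);

  // Keep just the bounding box and format; padding is left to post-processing.
  const regions = results.map((result) => ({
    x: result.boundingBox.x,
    y: result.boundingBox.y,
    width: result.boundingBox.width,
    height: result.boundingBox.height,
    format: result.format,
  }));
  return { status: "ok", regions };
}
```

With this shape, an "unsupported" status can drive a visible message, while "ok" with an empty list genuinely means no symbols were detected.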

License plates: OCR is useful, but only as a candidate generator

License plates are text, but not ordinary text. A raw OCR pass usually gives too much junk unless you filter aggressively.

The pattern we used is:

start from OCR blocks and lines
normalize candidate text to uppercase alphanumeric characters
require both letters and digits
filter by plausible string length
reject impossible aspect ratios
ignore text in unlikely vertical positions

That turns OCR into a candidate generator instead of pretending it understands the full context of a vehicle image.
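The filter steps above can be sketched as one predicate over OCR lines. This is an illustration under stated assumptions, not the production filter: the input shape ({ text, bbox: { x0, y0, x1, y1 } }), the function name isPlateCandidate, and every numeric threshold are hypothetical and would need tuning per region and camera setup.

```javascript
// Hypothetical filter over OCR lines shaped like { text, bbox: { x0, y0, x1, y1 } }.
// All thresholds below are illustrative guesses, not values from the article.
function isPlateCandidate(line, imageHeight) {
  // Normalize to uppercase alphanumeric characters.
  const text = line.text.toUpperCase().replace(/[^A-Z0-9]/g, "");

  // Require both letters and digits.
  if (!/[A-Z]/.test(text) || !/[0-9]/.test(text)) return false;

  // Filter by plausible string length.
  if (text.length < 4 || text.length > 9) return false;

  // Reject impossible aspect ratios (plates are wider than tall).
  const width = line.bbox.x1 - line.bbox.x0;
  const height = line.bbox.y1 - line.bbox.y0;
  if (width <= 0 || height <= 0) return false;
  const aspect = width / height;
  if (aspect < 1.5 || aspect > 7) return false;

  // Ignore text in unlikely vertical positions (assumed: top fifth of the frame).
  const centerY = (line.bbox.y0 + line.bbox.y1) / 2 / imageHeight;
  return centerY >= 0.2;
}
```

Anything that passes is still only a candidate: it becomes a suggested region for review, not an automatic redaction.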

This is often the right level of ambition for privacy tooling: narrow heuristics on top of a broad detector.

Signatures: image analysis beats text recognition

Signatures are the opposite case. OCR often performs poorly because handwriting is inconsistent and the goal is not to read the text anyway. The goal is to find the signed region.

So the better signal was image analysis on a scaled canvas:

const imageData = context.getImageData(0, 0, canvas.width, canvas.height);
// imageData.data is the RGBA pixel array of the scaled canvas.
const threshold = estimateSignatureThreshold(imageData.data);

From there, the detector walks connected dark components, measures each region, and filters by heuristics like:

width
height
fill ratio
relative position on the page

Then nearby candidates can be merged into one more useful region.

This is not a universal signature model, and that is exactly the point. It is a practical heuristic for one narrow job.
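To make the "walk connected dark components" step concrete, here is a minimal sketch using a flood fill over thresholded pixels. It assumes an ImageData-like object and a threshold supplied by the caller. The function names, the 8-connected neighbourhood, the luma formula, and all filter thresholds are illustrative assumptions rather than the article's implementation.

```javascript
// Find 8-connected groups of dark pixels in an ImageData-like { data, width, height }.
function findDarkComponents(imageData, threshold) {
  const { data, width, height } = imageData;
  const seen = new Uint8Array(width * height);
  const isDark = (index) => {
    const o = index * 4;
    // Approximate luma from RGB; alpha is ignored in this sketch.
    return 0.299 * data[o] + 0.587 * data[o + 1] + 0.114 * data[o + 2] < threshold;
  };

  const components = [];
  for (let start = 0; start < width * height; start++) {
    if (seen[start] || !isDark(start)) continue;
    let minX = width, minY = height, maxX = 0, maxY = 0, pixels = 0;
    const stack = [start];
    seen[start] = 1;
    while (stack.length > 0) {
      const index = stack.pop();
      const x = index % width;
      const y = (index - x) / width;
      pixels++;
      minX = Math.min(minX, x); maxX = Math.max(maxX, x);
      minY = Math.min(minY, y); maxY = Math.max(maxY, y);
      for (let dy = -1; dy <= 1; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          const nx = x + dx, ny = y + dy;
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
          const n = ny * width + nx;
          if (!seen[n] && isDark(n)) { seen[n] = 1; stack.push(n); }
        }
      }
    }
    const w = maxX - minX + 1, h = maxY - minY + 1;
    components.push({ x: minX, y: minY, width: w, height: h, fillRatio: pixels / (w * h) });
  }
  return components;
}

// Hypothetical filter: sparse, wide-ish marks in the lower part of the page.
function looksLikeSignature(c, pageWidth, pageHeight) {
  return (
    c.width >= pageWidth * 0.08 && c.width <= pageWidth * 0.6 &&
    c.height >= pageHeight * 0.01 && c.height <= pageHeight * 0.2 &&
    c.fillRatio >= 0.03 && c.fillRatio <= 0.5 && // pen strokes are not solid blocks
    c.y >= pageHeight * 0.4
  );
}
```

The fill ratio is what separates pen strokes from solid shapes such as logos or stamps: a signature's bounding box is mostly empty space.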

Detection quality depends on what happens after detection

Even when the detector logic is correct, the raw output is usually not ready for users.

The post-processing layer matters a lot:

add padding so the target is fully covered
merge nearby fragments
deduplicate overlapping results
normalize everything into the same region shape the editor understands

If you skip those steps, the result is usually a screen full of tiny, fragmented boxes that nobody trusts.
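The padding and merging steps can be sketched with a few rectangle helpers. This assumes regions are plain { x, y, width, height } objects with an optional kind field. The helper names and the gap parameter are hypothetical, and the quadratic merge loop is chosen for clarity, not speed. Because overlapping boxes also count as "close", the merge pass doubles as deduplication here.

```javascript
// Grow a region by `pad` pixels on every side, clamped to the image bounds.
function padRegion(region, pad, imageWidth, imageHeight) {
  const x = Math.max(0, region.x - pad);
  const y = Math.max(0, region.y - pad);
  const right = Math.min(imageWidth, region.x + region.width + pad);
  const bottom = Math.min(imageHeight, region.y + region.height + pad);
  return { ...region, x, y, width: right - x, height: bottom - y };
}

// True when two boxes overlap or sit within `gap` pixels of each other.
function areClose(a, b, gap) {
  return (
    a.x <= b.x + b.width + gap &&
    b.x <= a.x + a.width + gap &&
    a.y <= b.y + b.height + gap &&
    b.y <= a.y + a.height + gap
  );
}

// Smallest box covering both inputs; keeps the other fields of `a`.
function union(a, b) {
  const x = Math.min(a.x, b.x);
  const y = Math.min(a.y, b.y);
  const right = Math.max(a.x + a.width, b.x + b.width);
  const bottom = Math.max(a.y + a.height, b.y + b.height);
  return { ...a, x, y, width: right - x, height: bottom - y };
}

// Repeatedly merge same-kind regions that are close until nothing changes.
function mergeRegions(regions, gap) {
  const merged = regions.map((region) => ({ ...region }));
  let changed = true;
  while (changed) {
    changed = false;
    outer: for (let i = 0; i < merged.length; i++) {
      for (let j = i + 1; j < merged.length; j++) {
        if (merged[i].kind === merged[j].kind && areClose(merged[i], merged[j], gap)) {
          merged[i] = union(merged[i], merged[j]);
          merged.splice(j, 1);
          changed = true;
          break outer;
        }
      }
    }
  }
  return merged;
}
```

Merging only within the same kind is a design choice in this sketch: a QR code printed next to a signature probably deserves two separate reviewable boxes.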

One review model for many detectors

The biggest architecture win was not in the detectors themselves. It was in the shared output model.

Every detector returns the same kind of region object, and every region is inserted into the same editor as a reviewable overlay.

That gives the product a stable interaction model:

QR detection can suggest a region
signature detection can suggest another
plate detection can suggest blur regions
the user still reviews all of them the same way

That is much easier to maintain than building a special-case UX for every detector type.
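A shared output model also makes the orchestration small. The sketch below runs every detector, tags each result with the detector's name, and keeps failures visible instead of swallowing them. The detector shape ({ name, run }), the function name collectSuggestions, and the status field are assumptions for illustration. Promise.allSettled is the standard JavaScript API.

```javascript
// Hypothetical orchestrator: `detectors` is an array of { name, run(source) },
// where run returns (or resolves to) an array of { x, y, width, height }.
async function collectSuggestions(detectors, source) {
  const settled = await Promise.allSettled(
    // Wrapping in a promise chain turns synchronous throws into rejections.
    detectors.map((detector) => Promise.resolve().then(() => detector.run(source)))
  );

  const regions = [];
  const failures = [];
  settled.forEach((outcome, index) => {
    const name = detectors[index].name;
    if (outcome.status === "fulfilled") {
      for (const region of outcome.value) {
        regions.push({ ...region, kind: name, status: "suggested" });
      }
    } else {
      // Surface the failure so the UI can say which detector did not run.
      failures.push({ name, reason: String(outcome.reason) });
    }
  });
  return { regions, failures };
}
```

One failing detector does not block the others, and the editor only ever sees one list of suggested regions plus a list of detectors that need an explicit warning.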

The practical lesson

Privacy-sensitive detection gets better when you stop looking for one perfect detector and start using the right signal for each target.

The useful stack is often not:

one model
one answer

It is:

multiple detectors
narrow heuristics
normalized region output
explicit human review before export

That combination is usually much more reliable than a single generalized pass.

More implementation details: