How QR Code Scanning Works: From Camera Pixels to Decoded Data

#javascript #webdev #mobile #algorithms

Your phone camera captures an image. Typically within about 100 milliseconds, it identifies a QR code in the frame, corrects for perspective distortion, decodes binary data, applies error correction, and presents you with a URL. The speed makes it seem simple. The algorithm behind it is elegant.

Step 1: Finding the QR code

A QR code has a finder pattern in three of its four corners. Each one is a set of large concentric squares (dark, light, dark) that produces a 1:1:3:1:1 ratio of dark and light run widths when scanned along any line through its center. This ratio is distinctive enough that it rarely shows up by accident elsewhere in an image.

The scanner sweeps horizontal and vertical scan lines across the image, looking for this 1:1:3:1:1 ratio in pixel brightness transitions. When it finds three instances arranged in the correct geometric relationship (roughly a right angle, with the two shorter sides of similar length), it has located a QR code.
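To make the ratio check concrete, here is a minimal sketch of a single-row scan. It assumes the row has already been thresholded into booleans (true = dark), and the function names and the 50% tolerance are illustrative choices, not taken from any particular library.

```javascript
// Checks whether five run lengths (dark, light, dark, light, dark)
// approximate the 1:1:3:1:1 finder pattern ratio.
// `tolerance` is an assumed value: allowed deviation as a fraction of one module.
function looksLikeFinderPattern(runs, tolerance = 0.5) {
  if (runs.length !== 5 || runs.some((r) => r <= 0)) return false;
  const total = runs.reduce((sum, r) => sum + r, 0);
  if (total < 7) return false; // the pattern is 7 modules wide, so need >= 7 pixels
  const moduleSize = total / 7;
  const maxDeviation = moduleSize * tolerance;
  const expected = [1, 1, 3, 1, 1];
  return runs.every(
    (run, i) => Math.abs(run - expected[i] * moduleSize) <= maxDeviation * expected[i]
  );
}

// Splits one row of booleans (true = dark) into runs of equal color.
function runLengths(row) {
  const runs = [];
  let start = 0;
  for (let x = 1; x <= row.length; x++) {
    if (x === row.length || row[x] !== row[start]) {
      runs.push({ dark: row[start], start, length: x - start });
      start = x;
    }
  }
  return runs;
}

// Returns the center x positions of candidate finder patterns in one row.
function findCandidatesInRow(row) {
  const runs = runLengths(row);
  const centers = [];
  for (let i = 0; i + 5 <= runs.length; i++) {
    const group = runs.slice(i, i + 5);
    if (!group[0].dark) continue; // the pattern must start on a dark run
    if (looksLikeFinderPattern(group.map((r) => r.length))) {
      const end = group[4].start + group[4].length;
      centers.push((group[0].start + end) / 2);
    }
  }
  return centers;
}
```

A real scanner would run the same check vertically through each candidate center and then cluster the hits, since a single matching row is not enough to confirm a finder pattern.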

The fourth corner (which has no finder pattern) is estimated from the positions of the other three, then refined using the alignment pattern (a smaller concentric square present in Version 2 and later QR codes).
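As a rough illustration of that estimate, the sketch below orders three finder pattern centers and completes the parallelogram to guess the fourth point. The function names are hypothetical, points are plain `{ x, y }` objects in image coordinates (y grows downward), and the parallelogram guess is only exact when there is no perspective distortion, which is why the alignment pattern is needed to refine it.

```javascript
function distance(p, q) {
  return Math.hypot(p.x - q.x, p.y - q.y);
}

// Works out which of three finder centers is top-left, top-right, bottom-left.
// The top-left center is the one opposite the longest side of the triangle.
function orderFinderPatterns(a, b, c) {
  const ab = distance(a, b);
  const bc = distance(b, c);
  const ac = distance(a, c);
  let topLeft, p, q;
  if (bc >= ab && bc >= ac) {
    [topLeft, p, q] = [a, b, c];
  } else if (ac >= ab && ac >= bc) {
    [topLeft, p, q] = [b, a, c];
  } else {
    [topLeft, p, q] = [c, a, b];
  }
  // The sign of the cross product tells us which neighbor is clockwise
  // from the other in image coordinates.
  const cross =
    (p.x - topLeft.x) * (q.y - topLeft.y) - (p.y - topLeft.y) * (q.x - topLeft.x);
  return cross >= 0
    ? { topLeft, topRight: p, bottomLeft: q }
    : { topLeft, topRight: q, bottomLeft: p };
}

// Parallelogram completion: a first guess for the missing bottom-right point.
function estimateFourthCorner({ topLeft, topRight, bottomLeft }) {
  return {
    x: topRight.x + bottomLeft.x - topLeft.x,
    y: topRight.y + bottomLeft.y - topLeft.y,
  };
}
```

For example, centers at (10, 110), (110, 10), and (10, 10) passed in any order give an estimated fourth point of (110, 110).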

Step 2: Perspective correction

You rarely scan a QR code perfectly straight-on. The camera angle, surface curvature, and distance all distort the image. The scanner uses the three finder patterns and the calculated fourth corner to define a perspective transform that maps the distorted quadrilateral back to a square grid.

This is a standard computer vision operation: compute the 3x3 homography matrix from four point correspondences (the four corners), then apply the inverse transform to rectify the image. The result is a clean grid where each cell corresponds to a module (black or white square).
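The homography can be sketched with the closed-form mapping from a unit square to an arbitrary quadrilateral. Decoders often avoid warping the whole image and instead map each grid position into the camera image and sample there, which is the direction shown here. The function names are illustrative, and for simplicity the four points are assumed to be the outer corners of the code rather than the finder pattern centers a real scanner would work from.

```javascript
// Builds a function that maps unit-square coordinates (u, v) in [0, 1]
// to image coordinates inside the quadrilateral
// p0 (top-left), p1 (top-right), p2 (bottom-right), p3 (bottom-left).
function unitSquareToQuad(p0, p1, p2, p3) {
  const dx1 = p1.x - p2.x;
  const dy1 = p1.y - p2.y;
  const dx2 = p3.x - p2.x;
  const dy2 = p3.y - p2.y;
  const dx3 = p0.x - p1.x + p2.x - p3.x;
  const dy3 = p0.y - p1.y + p2.y - p3.y;
  const det = dx1 * dy2 - dx2 * dy1;
  if (det === 0) throw new Error('Degenerate quadrilateral');

  // g and h are the projective terms; both are 0 for a pure affine mapping.
  const g = (dx3 * dy2 - dx2 * dy3) / det;
  const h = (dx1 * dy3 - dx3 * dy1) / det;
  const a = p1.x - p0.x + g * p1.x;
  const b = p3.x - p0.x + h * p3.x;
  const c = p0.x;
  const d = p1.y - p0.y + g * p1.y;
  const e = p3.y - p0.y + h * p3.y;
  const f = p0.y;

  return (u, v) => {
    const w = g * u + h * v + 1; // homogeneous divide
    return { x: (a * u + b * v + c) / w, y: (d * u + e * v + f) / w };
  };
}

// Image position of the center of module (col, row) in a size x size grid.
function moduleCenterInImage(transform, col, row, size) {
  return transform((col + 0.5) / size, (row + 0.5) / size);
}
```

With a transform built from the four corners, sampling a Version 1 code (21 x 21 modules) comes down to calling `moduleCenterInImage(transform, col, row, 21)` for each module and reading the pixel at the returned position.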

Step 3: Reading the grid

The rectified image is sampled at each module center to determine if it's dark or light. This produces a binary matrix. The scanner then reads the format information (encoded around the finder patterns) to determine the error correction level and data mask pattern.
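Here is a small sketch of how the format information can be unpacked once its 15 bits have been read. The function name is my own, and the sketch skips the BCH error correction that a real decoder applies to these bits first, so it assumes they were read without errors.

```javascript
// Fixed XOR mask applied to the 15 format bits by the QR specification.
const FORMAT_MASK = 0b101010000010010;

// Error correction level indexed by its 2-bit code: 00 = M, 01 = L, 10 = H, 11 = Q.
const EC_LEVELS = ['M', 'L', 'H', 'Q'];

// Assumes `rawBits` is the 15-bit format value, most significant bit first,
// and that it contains no bit errors (BCH correction is omitted in this sketch).
function parseFormatInfo(rawBits) {
  const unmasked = rawBits ^ FORMAT_MASK;
  const dataBits = unmasked >> 10; // top 5 bits; the low 10 bits are BCH check bits
  return {
    errorCorrectionLevel: EC_LEVELS[dataBits >> 3],
    maskPattern: dataBits & 0b111,
  };
}
```

For instance, `parseFormatInfo(0b111011111000100)` yields level L with mask pattern 0.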

QR codes apply one of eight mask patterns to the data to ensure a good balance of dark and light modules (which improves scannability). The scanner XORs the mask pattern with the data to remove it, revealing the raw data bits.
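The unmasking step might look like the sketch below. The eight conditions follow the QR specification (with `r` as the row and `c` as the column), while `removeMask` and the `isFunctionModule` callback are hypothetical names. The callback is needed because finder patterns, timing patterns, and format bits are not masked, so only data modules get flipped.

```javascript
// The eight mask conditions, indexed by mask pattern number.
// A data module is flipped wherever its condition is true.
const MASK_CONDITIONS = [
  (r, c) => (r + c) % 2 === 0,
  (r, c) => r % 2 === 0,
  (r, c) => c % 3 === 0,
  (r, c) => (r + c) % 3 === 0,
  (r, c) => (Math.floor(r / 2) + Math.floor(c / 3)) % 2 === 0,
  (r, c) => ((r * c) % 2) + ((r * c) % 3) === 0,
  (r, c) => (((r * c) % 2) + ((r * c) % 3)) % 2 === 0,
  (r, c) => (((r + c) % 2) + ((r * c) % 3)) % 2 === 0,
];

// `matrix` is a 2D array of booleans (true = dark).
// `isFunctionModule(r, c)` is an assumed callback that reports whether a
// module belongs to a function pattern and must be left untouched.
function removeMask(matrix, maskPattern, isFunctionModule) {
  const condition = MASK_CONDITIONS[maskPattern];
  return matrix.map((row, r) =>
    row.map((dark, c) =>
      // For booleans, `!==` behaves as XOR.
      isFunctionModule(r, c) ? dark : dark !== condition(r, c)
    )
  );
}
```

Because XOR is its own inverse, the same function both applies and removes a mask: running it twice with the same pattern returns the original matrix.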

Step 4: Error correction

The raw bits are divided into data codewords and error correction codewords. Reed-Solomon decoding checks for and corrects errors. At level H, roughly 30% of codewords can be damaged and the data is still recoverable.

Reed-Solomon codes are the same family of error correction used in CDs, DVDs, and deep-space communication. They are mathematically elegant and computationally efficient, which is why QR codes can be scanned from business cards with coffee stains on them.
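To show what "checking for errors" means, here is a sketch of the first stage of Reed-Solomon decoding: computing syndromes over GF(256) with the field polynomial QR codes use (0x11D). A small encoder is included so the example has a valid codeword block to work with. The function names are my own, and the sketch stops at detection; locating and fixing the bad codewords (for example with the Berlekamp-Massey algorithm) is left out.

```javascript
// Exponent and logarithm tables for GF(256), generated from the
// primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D).
const GF_EXP = new Uint8Array(512);
const GF_LOG = new Uint8Array(256);
for (let i = 0, x = 1; i < 255; i++) {
  GF_EXP[i] = x;
  GF_LOG[x] = i;
  x <<= 1;
  if (x & 0x100) x ^= 0x11d;
}
for (let i = 255; i < 512; i++) GF_EXP[i] = GF_EXP[i - 255];

function gfMul(a, b) {
  return a === 0 || b === 0 ? 0 : GF_EXP[GF_LOG[a] + GF_LOG[b]];
}

// Appends `ecCount` error correction codewords to `data`.
function rsEncode(data, ecCount) {
  // Generator polynomial: product of (x + alpha^i) for i = 0..ecCount-1,
  // stored with the highest-degree coefficient first.
  let gen = [1];
  for (let i = 0; i < ecCount; i++) {
    const next = new Array(gen.length + 1).fill(0);
    for (let j = 0; j < gen.length; j++) {
      next[j] ^= gen[j];
      next[j + 1] ^= gfMul(gen[j], GF_EXP[i]);
    }
    gen = next;
  }
  // Polynomial long division; the remainder becomes the EC codewords.
  const remainder = new Array(ecCount).fill(0);
  for (const byte of data) {
    const factor = byte ^ remainder.shift();
    remainder.push(0);
    for (let j = 0; j < ecCount; j++) remainder[j] ^= gfMul(gen[j + 1], factor);
  }
  return [...data, ...remainder];
}

// Evaluates the codeword polynomial at alpha^0..alpha^(ecCount-1).
// All zeros means no detectable error.
function syndromes(codewords, ecCount) {
  const result = [];
  for (let i = 0; i < ecCount; i++) {
    let value = 0;
    for (const c of codewords) value = gfMul(value, GF_EXP[i]) ^ c; // Horner's rule
    result.push(value);
  }
  return result;
}
```

A Version 1 code at level H carries 9 data codewords and 17 error correction codewords, and 17 error correction codewords can repair up to 8 damaged ones, which is where the "roughly 30%" figure comes from (8 of 26). Encoding 9 bytes with `rsEncode(data, 17)` gives a block whose syndromes are all zero, and changing any single byte makes them non-zero.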

Step 5: Decoding

The corrected data bits are decoded according to the mode indicators embedded in the data stream. The result is the original string: a URL, a Wi-Fi configuration, a phone number, or whatever was encoded.
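As an illustration, the sketch below decodes a single byte-mode segment. It assumes a Version 1 to 9 code (where the byte-mode character count is 8 bits long), a bit stream given as an array of 0s and 1s, and UTF-8 payload text, which is common in practice even though the specification's default is ISO-8859-1. The function names are hypothetical, and a full decoder would also handle the numeric, alphanumeric, and kanji modes and loop over multiple segments.

```javascript
// Reads `count` bits starting at `offset` as an unsigned integer.
function readBits(bits, offset, count) {
  let value = 0;
  for (let i = 0; i < count; i++) value = (value << 1) | bits[offset + i];
  return value;
}

// 4-bit mode indicators: 0001 numeric, 0010 alphanumeric, 0100 byte, 1000 kanji.
const BYTE_MODE = 0b0100;

// Decodes one byte-mode segment from the start of the bit stream.
// Assumes Version 1-9 (8-bit character count) and UTF-8 text.
function decodeByteSegment(bits) {
  const mode = readBits(bits, 0, 4);
  if (mode !== BYTE_MODE) {
    throw new Error('Only byte mode is handled in this sketch, got mode ' + mode);
  }
  const length = readBits(bits, 4, 8);
  const bytes = new Uint8Array(length);
  for (let i = 0; i < length; i++) {
    bytes[i] = readBits(bits, 12 + i * 8, 8);
  }
  return new TextDecoder('utf-8').decode(bytes);
}
```

A stream starting with `0100` (byte mode), `00000010` (two characters), then `01001000 01101001` decodes to the string "Hi".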

Browser-based scanning

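Modern browsers support QR scanning through the getUserMedia API (for camera access) combined with either a JavaScript library such as jsQR or the newer BarcodeDetector API. BarcodeDetector is available in Chromium-based browsers such as Chrome and Edge, but support depends on the platform, so check for it before use. Where it is available, it delegates to the platform's native barcode scanning, which is optimized and fast.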

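// Returns the decoded text of the first QR code in the frame, or null.
async function scanFrame(videoFrame) {
  // BarcodeDetector is not available everywhere; fall back to a library such as jsQR.
  if (!('BarcodeDetector' in globalThis)) return null;
  const detector = new BarcodeDetector({ formats: ['qr_code'] });
  // detect() accepts image sources such as a <video> element, canvas, or ImageBitmap.
  const barcodes = await detector.detect(videoFrame);
  return barcodes.length > 0 ? barcodes[0].rawValue : null;
}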

I built a browser-based QR reader at zovo.one/free-tools/qr-reader that uses your camera or accepts uploaded images. No app install required. It handles standard and custom QR codes, and decodes all common data types.

I'm Michael Lip. I build free developer tools at zovo.one. 500+ tools, all private, all free.