From analyzing pixelmatch's bottlenecks to creating a faster algorithm with zero allocations and dynamic block sizing.
The Spark: Visual Testing Performance Pain
It started during a typical day of visual regression testing. I was watching a CI pipeline churn through hundreds of screenshot comparisons, each one taking precious seconds.
I was using pixelmatch: the gold standard for pixel-level image comparison in JavaScript. It's an excellent library that's served the community well for years. But as my test suite grew and image resolutions increased, those milliseconds started adding up to minutes. I thought: "There has to be a better way."
Diving Deep: Analyzing pixelmatch's Architecture
Before jumping into optimization, I needed to understand what pixelmatch was actually doing. I dove into the source code and found a beautifully simple algorithm:
```javascript
// Simplified pixelmatch flow
function pixelmatch(img1, img2, output, width, height, options) {
  // 1. Check if images are completely identical (fast path)
  const len = width * height;
  const a32 = new Uint32Array(img1.buffer, img1.byteOffset, len);
  const b32 = new Uint32Array(img2.buffer, img2.byteOffset, len);

  let identical = true;
  for (let i = 0; i < len; i++) {
    if (a32[i] !== b32[i]) {
      identical = false;
      break;
    }
  }
  if (identical) return 0; // Early exit

  // 2. Pixel-by-pixel comparison using YIQ color space
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      // Complex color difference calculation...
    }
  }
}
```
Pixelmatch had a clever optimization for completely identical images. It would quickly scan through 32-bit chunks and exit early. But what about partially identical images?
In real-world visual testing scenarios:
- Screenshots often have large unchanged regions (headers, sidebars, backgrounds)
- Only small portions typically change (content areas, buttons, text)
- We were still processing every single pixel even when 80% of the image was identical
The "Aha!" Moment: Coarse-to-Fine Processing
The breakthrough came when I realized we could apply the "identical check" concept at a block level, rather than just the entire image.
What if we could:
- Divide the image into blocks (16x16, 32x32, etc.)
- Quickly identify which blocks are identical using the same 32-bit comparison trick
- Skip pixel-level processing entirely for identical blocks
- Only do the expensive YIQ color analysis on blocks that actually changed
This is a classic coarse-to-fine approach used in computer vision, but I hadn't seen it applied to web-based image diffing libraries.
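The core trick can be sketched in a few lines. `blockIsIdentical` here is an illustrative helper of my own naming, not BlazeDiff's actual API; it applies the same 32-bit word comparison, but only over one block's rows:

```javascript
// Sketch of a block-level identical check. Both images are viewed as
// Uint32Arrays, so each RGBA pixel is compared in a single operation.
function blockIsIdentical(a32, b32, width, startX, startY, endX, endY) {
  for (let y = startY; y < endY; y++) {
    const rowStart = y * width;
    for (let x = startX; x < endX; x++) {
      if (a32[rowStart + x] !== b32[rowStart + x]) return false;
    }
  }
  return true;
}
```

A block that passes this check never reaches the expensive YIQ path at all.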
From Idea to Implementation
Challenge #1: Dynamic Block Sizing
Fixed block sizes would be suboptimal – small images need fine granularity, large images can use bigger blocks for better cache performance.
```typescript
function calculateOptimalBlockSize(width: number, height: number): number {
  const area = width * height;
  // Block size grows roughly with the square root of the image area
  const base = 16;
  const scale = Math.sqrt(area) / 100; // 100 is a tuning constant
  const rawSize = base * Math.sqrt(scale);
  // Snap to the nearest power-of-two block size
  return Math.pow(2, Math.round(Math.log2(rawSize)));
}
```
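As a quick sanity check, here is the same sizing formula restated in plain JavaScript (the function name is mine) with a few representative resolutions; the outputs follow directly from the formula:

```javascript
// Plain-JS restatement of the sizing formula above, for illustration only.
function optimalBlockSize(width, height) {
  const base = 16;
  const scale = Math.sqrt(width * height) / 100; // 100 is a tuning constant
  const rawSize = base * Math.sqrt(scale);
  return 2 ** Math.round(Math.log2(rawSize)); // snap to a power of two
}

console.log(optimalBlockSize(100, 100));   // → 16 (small image, fine granularity)
console.log(optimalBlockSize(800, 600));   // → 32
console.log(optimalBlockSize(1920, 1080)); // → 64 (large image, bigger blocks)
```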
Challenge #2: Efficient Memory Management
Initial prototypes were actually slower than pixelmatch because I was creating dynamic arrays and objects to store block information. The allocation overhead was killing performance.
The solution? Store changed coordinates only and preprocess blocks:
```javascript
// Instead of storing block info in dynamically grown arrays...
const changedBlocks = [];   // allocation overhead on every push
const identicalBlocks = [];

// ...preallocate one flat Int32Array, sized for the worst case:
const maxBlocks = Math.ceil(width / 8) * Math.ceil(height / 8);
const changedBlockCoords = new Int32Array(maxBlocks * 4); // x, y, endX, endY
let changedCount = 0;

// Pass 1: classify blocks
for (let by = 0; by < blocksY; by++) {
  for (let bx = 0; bx < blocksX; bx++) {
    if (blockIsIdentical(startX, startY, endX, endY)) {
      processIdenticalBlock(); // Draw gray pixels immediately
    } else {
      processChangedBlock(); // Record coords in changedBlockCoords
    }
  }
}

// Pass 2: run pixel-level comparison on changed blocks only
for (let blockIdx = 0; blockIdx < changedCount; blockIdx++) {
  // read x, y, endX, endY from changedBlockCoords[blockIdx * 4 ...]
}
```
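To make the pattern concrete, here is a self-contained sketch of the zero-allocation classification pass (all names are mine, not BlazeDiff's): it scans two images block by block, records changed-block coordinates in one preallocated flat Int32Array, and allocates nothing inside the loops:

```javascript
// Self-contained sketch of the two-pass, zero-allocation pattern.
// Images are Uint32Array views: one 32-bit word per RGBA pixel.
function findChangedBlocks(a32, b32, width, height, blockSize) {
  const blocksX = Math.ceil(width / blockSize);
  const blocksY = Math.ceil(height / blockSize);
  // Single up-front allocation sized for the worst case: every block changed.
  const coords = new Int32Array(blocksX * blocksY * 4);
  let count = 0;

  for (let by = 0; by < blocksY; by++) {
    for (let bx = 0; bx < blocksX; bx++) {
      const startX = bx * blockSize;
      const startY = by * blockSize;
      const endX = Math.min(startX + blockSize, width);
      const endY = Math.min(startY + blockSize, height);

      let identical = true;
      for (let y = startY; y < endY && identical; y++) {
        for (let x = startX; x < endX; x++) {
          if (a32[y * width + x] !== b32[y * width + x]) {
            identical = false;
            break;
          }
        }
      }
      if (!identical) {
        const o = count * 4;
        coords[o] = startX;
        coords[o + 1] = startY;
        coords[o + 2] = endX;
        coords[o + 3] = endY;
        count++;
      }
    }
  }
  return { coords, count };
}
```

Flat coordinates plus a count are all the second pass needs; no objects, no GC pressure.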
Challenge #3: Maintaining Pixel-Perfect Accuracy
The block optimization couldn't change the final result – it had to produce identical output to pixelmatch. This meant:
- Same YIQ color space calculations
- Same anti-aliasing detection
- Same output pixel colors
- Only the processing order could change
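For reference, the per-pixel metric both libraries share looks roughly like this: convert each RGB value to the YIQ color space and take a weighted squared distance. The coefficients below are taken from pixelmatch's source; alpha blending is omitted here for brevity:

```javascript
// Weighted YIQ color difference, approximately as in pixelmatch's source
// (alpha handling left out to keep the sketch short).
function colorDelta(r1, g1, b1, r2, g2, b2) {
  const y1 = r1 * 0.29889531 + g1 * 0.58662247 + b1 * 0.11448223;
  const y2 = r2 * 0.29889531 + g2 * 0.58662247 + b2 * 0.11448223;
  const i1 = r1 * 0.59597799 - g1 * 0.27417610 - b1 * 0.32180189;
  const i2 = r2 * 0.59597799 - g2 * 0.27417610 - b2 * 0.32180189;
  const q1 = r1 * 0.21147017 - g1 * 0.52261711 + b1 * 0.31114694;
  const q2 = r2 * 0.21147017 - g2 * 0.52261711 + b2 * 0.31114694;
  const dy = y1 - y2, di = i1 - i2, dq = q1 - q2;
  // Brightness (Y) differences are weighted most heavily
  return 0.5053 * dy * dy + 0.299 * di * di + 0.1957 * dq * dq;
}
```

Since identical-block skipping never touches this function's inputs or outputs, the diff image stays bit-for-bit the same.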
Going Live
After solving the core algorithmic challenges, I released BlazeDiff under the MIT license as a minimal ecosystem:
- `@blazediff/core` – the core algorithm
- `@blazediff/pngjs-transformer` – PNG support using pngjs
- `@blazediff/sharp-transformer` – high-performance image processing with Sharp
- `@blazediff/bin` – transformers + core algorithm via CLI or programmatically
This modular approach means you only install what you need, keeping bundle sizes minimal.
Built for Today's Ecosystem
While analyzing pixelmatch, I noticed it was built with the tooling standards of its era. Don't get me wrong – it works perfectly. But I saw an opportunity to leverage modern development practices that could improve both developer experience and performance.
pixelmatch approach:
- JavaScript with JSDoc comments
- npm for package management
- Basic build setup
- Single package architecture
BlazeDiff approach:
- TypeScript for type safety and better DX
- pnpm for efficient dependency management
- tsup for lightning-fast builds
- Monorepo architecture with multiple focused packages
Why does it matter?
- Crystal clear APIs and complete IntelliSense with TypeScript
- TypeScript's compile-time safety prevents common mistakes
- Bundle size wins with composable monorepo packages: `@blazediff/core` is only 8.59 kB, compared to pixelmatch's 19.4 kB
The Results: 20-60% Performance Improvement
After weeks of optimization and benchmarking:
- Around a 20-25% boost when using the `@blazediff/core` package
- Around a 60-99% boost when using `@blazediff/bin` with `@blazediff/sharp-transformer`
You can find benchmark results in the GitHub Action of the BlazeDiff repository.
Top comments (2)
Thanks for sharing—bookmarking this for later.
Hey great work, gonna try it out for my CI.
Do you plan on documenting how you envision integrating it into regression-testing environments?