Teimur Gasanov
Building BlazeDiff: How I Made The Fastest Image Diff up to 60% Faster with Block-Level Optimization

From analyzing pixelmatch's bottlenecks to creating a faster algorithm with zero allocations and dynamic block sizing.

The Spark: Visual Testing Performance Pain

It started on a routine day of visual regression testing. I was watching a CI pipeline churn through hundreds of screenshot comparisons, each one taking precious seconds.

I was using pixelmatch: the gold standard for pixel-level image comparison in JavaScript. It's an excellent library that's served the community well for years. But as my test suite grew and image resolutions increased, those milliseconds started adding up to minutes. I thought: "There has to be a better way."

Diving Deep: Analyzing pixelmatch's Architecture

Before jumping into optimization, I needed to understand what pixelmatch was actually doing. I dove into the source code and found a beautifully simple algorithm:

// Simplified pixelmatch flow
function pixelmatch(img1, img2, output, width, height, options) {
  // 1. Check if images are completely identical (fast path)
  const len = width * height;
  const a32 = new Uint32Array(img1.buffer, img1.byteOffset, len);
  const b32 = new Uint32Array(img2.buffer, img2.byteOffset, len);

  let identical = true;
  for (let i = 0; i < len; i++) {
    if (a32[i] !== b32[i]) { 
      identical = false; 
      break; 
    }
  }

  if (identical) return 0; // Early exit

  // 2. Pixel-by-pixel comparison using YIQ color space
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      // Complex color difference calculation...
    }
  }
}

pixelmatch has a clever optimization for completely identical images: it scans the data in 32-bit chunks and exits early at the first mismatch. But what about partially identical images?

In real-world visual testing scenarios:

  • Screenshots often have large unchanged regions (headers, sidebars, backgrounds)
  • Only small portions typically change (content areas, buttons, text)
  • We were still processing every single pixel even when 80% of the image was identical

The "Aha!" Moment: Coarse-to-Fine Processing

The breakthrough came when I realized the "identical check" could be applied at the block level, not just to the entire image.

What if we could:

  1. Divide the image into blocks (16x16, 32x32, etc.)
  2. Quickly identify which blocks are identical using the same 32-bit comparison trick
  3. Skip pixel-level processing entirely for identical blocks
  4. Only do the expensive YIQ color analysis on blocks that actually changed

This is a classic coarse-to-fine approach used in computer vision, but I hadn't seen it applied to web-based image diffing libraries.
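The four steps above can be sketched roughly as follows. Note that the function and variable names here are illustrative, not BlazeDiff's actual API: this is just the same 32-bit comparison trick, scoped to one block's rows at a time.

```typescript
// Illustrative coarse-to-fine pass; names and signatures are hypothetical,
// not BlazeDiff's real API.
function findChangedBlocks(
  a32: Uint32Array, // image A, one 32-bit word per RGBA pixel
  b32: Uint32Array, // image B, same dimensions
  width: number,
  height: number,
  blockSize: number
): Array<[number, number]> {
  const changed: Array<[number, number]> = [];
  for (let by = 0; by < height; by += blockSize) {
    for (let bx = 0; bx < width; bx += blockSize) {
      const endX = Math.min(bx + blockSize, width);
      const endY = Math.min(by + blockSize, height);
      let identical = true;
      // Same 32-bit comparison trick, limited to this block's rows
      outer: for (let y = by; y < endY; y++) {
        const rowStart = y * width;
        for (let x = bx; x < endX; x++) {
          if (a32[rowStart + x] !== b32[rowStart + x]) {
            identical = false;
            break outer;
          }
        }
      }
      // Only these blocks need the expensive YIQ analysis later
      if (!identical) changed.push([bx, by]);
    }
  }
  return changed;
}
```

If most blocks are identical, the expensive per-pixel path runs over only a small fraction of the image.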

From Idea to Implementation

Challenge #1: Dynamic Block Sizing

Fixed block sizes would be suboptimal – small images need fine granularity, large images can use bigger blocks for better cache performance.

function calculateOptimalBlockSize(width: number, height: number): number {
  const area = width * height;

  // Block size grows with the fourth root of the image area:
  // scale = sqrt(area) / 100, rawSize = base * sqrt(scale)
  const base = 16;
  const scale = Math.sqrt(area) / 100; // 100 is a tuning constant

  // Round to the nearest power-of-two block size
  const rawSize = base * Math.sqrt(scale);
  return Math.pow(2, Math.round(Math.log2(rawSize)));
}
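For intuition, here is how the formula above maps a few common image sizes to block sizes (the function is repeated so the snippet is self-contained; the values are computed from the formula exactly as written):

```typescript
// calculateOptimalBlockSize as defined above, repeated for a runnable example
function calculateOptimalBlockSize(width: number, height: number): number {
  const area = width * height;
  const base = 16;
  const scale = Math.sqrt(area) / 100;
  const rawSize = base * Math.sqrt(scale);
  return Math.pow(2, Math.round(Math.log2(rawSize)));
}

// Small images keep fine granularity; large screenshots get bigger blocks
console.log(calculateOptimalBlockSize(100, 100));   // 16
console.log(calculateOptimalBlockSize(800, 600));   // 32
console.log(calculateOptimalBlockSize(1920, 1080)); // 64
```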

Challenge #2: Efficient Memory Management

Initial prototypes were actually slower than pixelmatch because I was creating dynamic arrays and objects to store block information. The allocation overhead was killing performance.

The solution? Store changed coordinates only and preprocess blocks:

// Instead of storing full block objects in dynamically grown arrays...
// const changedBlocks = [];
// const identicalBlocks = [];

// ...preallocate a single flat Int32Array for changed-block coordinates
const maxBlocks = Math.ceil(width / 8) * Math.ceil(height / 8); // worst case
const changedBlockCoords = new Int32Array(maxBlocks * 4); // x, y, endX, endY
let changedCount = 0;

// Pass 1: classify blocks
for (let by = 0; by < blocksY; by++) {
  for (let bx = 0; bx < blocksX; bx++) {
    if (blockIsIdentical(startX, startY, endX, endY)) {
      processIdenticalBlock(); // Draw gray pixels immediately
    } else {
      processChangedBlock();   // Record coords in changedBlockCoords
    }
  }
}

// Pass 2: run the expensive per-pixel comparison on changed blocks only
for (let blockIdx = 0; blockIdx < changedCount; blockIdx++) {
  // YIQ analysis over changedBlockCoords[blockIdx * 4 .. blockIdx * 4 + 3]
}
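The packed-coordinate scheme itself is just index arithmetic: four int32 slots per changed block, written and read with no per-block objects. A minimal sketch, assuming the x, y, endX, endY layout described above (the helper name is mine, not BlazeDiff's):

```typescript
// Sketch of the zero-allocation coordinate packing: four int32 slots per
// changed block, laid out as x, y, endX, endY (layout assumed from the text).
const changedBlockCoords = new Int32Array(8); // room for 2 blocks
let changedCount = 0;

// Hypothetical helper: records one changed block into the flat array
function recordChangedBlock(x: number, y: number, endX: number, endY: number): void {
  const base = changedCount * 4;
  changedBlockCoords[base] = x;
  changedBlockCoords[base + 1] = y;
  changedBlockCoords[base + 2] = endX;
  changedBlockCoords[base + 3] = endY;
  changedCount++;
}

recordChangedBlock(16, 0, 32, 16);
recordChangedBlock(48, 16, 64, 32);

// Pass 2 walks the flat array; no objects or arrays are allocated per block
for (let i = 0; i < changedCount; i++) {
  const base = i * 4;
  const x = changedBlockCoords[base];
  const y = changedBlockCoords[base + 1];
  console.log(`changed block at (${x}, ${y})`);
}
```

Because the Int32Array is sized once for the worst case, the hot path performs no allocations at all, which is what finally pushed the prototype past pixelmatch.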

Challenge #3: Maintaining Pixel-Perfect Accuracy

The block optimization couldn't change the final result – it had to produce identical output to pixelmatch. This meant:

  • Same YIQ color space calculations
  • Same anti-aliasing detection
  • Same output pixel colors
  • Only the processing order could change
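For reference, the per-pixel metric both libraries compute is a weighted color difference in YIQ space. The sketch below uses the coefficients pixelmatch takes from the Kotsarenko-Ramos paper; it is a simplified version (the real library also alpha-blends pixels against white first), shown here only to illustrate why skipping it for identical blocks pays off:

```typescript
// Perceptual color difference in YIQ space, as used by pixelmatch
// (coefficients from the Kotsarenko-Ramos paper). Simplified sketch:
// alpha blending is omitted.
function rgb2y(r: number, g: number, b: number): number {
  return r * 0.29889531 + g * 0.58662247 + b * 0.11448223;
}
function rgb2i(r: number, g: number, b: number): number {
  return r * 0.59597799 - g * 0.2741761 - b * 0.32180189;
}
function rgb2q(r: number, g: number, b: number): number {
  return r * 0.21147017 - g * 0.52261711 + b * 0.31114694;
}

function colorDelta(
  r1: number, g1: number, b1: number,
  r2: number, g2: number, b2: number
): number {
  const y = rgb2y(r1, g1, b1) - rgb2y(r2, g2, b2);
  const i = rgb2i(r1, g1, b1) - rgb2i(r2, g2, b2);
  const q = rgb2q(r1, g1, b1) - rgb2q(r2, g2, b2);
  // Weighted squared distance; larger means more perceptually different
  return 0.5053 * y * y + 0.299 * i * i + 0.1957 * q * q;
}
```

Three multiply-heavy conversions per pixel pair is exactly the cost the block-level skip avoids for unchanged regions.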

Going Live

After solving the core algorithmic challenges, I released BlazeDiff under the MIT license as a small ecosystem of focused packages, including @blazediff/core, @blazediff/bin, and @blazediff/sharp-transformer.

This modular approach means you only install what you need, keeping bundle sizes minimal.

Built for Today's Ecosystem

While analyzing pixelmatch, I noticed it was built with the tooling standards of its era. Don't get me wrong – it works perfectly. But I saw an opportunity to leverage modern development practices that could improve both developer experience and performance.

pixelmatch approach:

  • JavaScript with JSDoc comments
  • npm for package management
  • Basic build setup
  • Single package architecture

BlazeDiff approach:

  • TypeScript for type safety and better DX
  • pnpm for efficient dependency management
  • tsup for lightning-fast builds
  • Monorepo architecture with multiple focused packages

Why does it matter?

  • Crystal clear APIs and complete IntelliSense with TypeScript
  • TypeScript's compile-time safety prevents common mistakes
  • Bundle size wins with composable monorepo packages
    • @blazediff/core is only 8.59 kB compared to pixelmatch's 19.4 kB

The Results: 20-60% Performance Improvement

After weeks of optimization and benchmarking:

  • Around a 20-25% speedup with the @blazediff/core package
  • Around a 60-99% speedup with @blazediff/bin combined with @blazediff/sharp-transformer

You can find benchmark results in the GitHub Actions runs of the BlazeDiff repository.

Top comments (2)

Dan Berdikulov

Thanks for sharing—bookmarking this for later.

Esen S

Hey great work, gonna try it out for my CI.
Do you plan on documenting how you envision integrating it into regression testing environments?