DEV Community

Nijil

Building a Visual Regression Engine in Python with Playwright

Modern frontend applications are complex, responsive, and constantly evolving. Small CSS or layout changes can introduce subtle UI regressions that are hard to detect during manual review.

I wanted a deterministic way to:

  • Capture responsive screenshots across breakpoints
  • Compare changes against an approved baseline
  • Fail CI if layout drift exceeds a threshold

So I built PixelFrame, a CLI-based visual regression engine powered by Python and Playwright.

The Problem

Visual regressions are tricky because:

  • They may only appear on specific breakpoints
  • They may not break functionality
  • They are hard to catch in code review

Many teams rely on manual checks or paid SaaS tools.

I wanted something:

  • Scriptable
  • CI-friendly
  • Deterministic
  • Open-source

Architecture Overview

PixelFrame is built around a few core components:

1. Screenshot Engine

Using Playwright's Chromium engine, PixelFrame:

  • Launches a headless browser
  • Emulates multiple breakpoints or device presets
  • Captures full-page screenshots
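
The three steps above can be sketched with Playwright's synchronous Python API. This is a minimal illustration, not PixelFrame's actual source: the `BREAKPOINTS` table, the `screenshot_path` naming scheme, and the `capture` function are all assumptions made for the example.

```python
from pathlib import Path

# Hypothetical breakpoints (name -> viewport in CSS pixels); not PixelFrame's defaults.
BREAKPOINTS = {
    "mobile": (390, 844),
    "tablet": (820, 1180),
    "desktop": (1440, 900),
}


def screenshot_path(out_dir: Path, name: str, width: int, height: int) -> Path:
    """Build a deterministic file name so repeated runs write to the same paths."""
    return out_dir / "screenshots" / f"{name}_{width}x{height}.png"


def capture(url: str, out_dir: Path, breakpoints: dict = BREAKPOINTS) -> list:
    """Capture one full-page screenshot per breakpoint with headless Chromium."""
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    saved = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        for name, (width, height) in breakpoints.items():
            page = browser.new_page(viewport={"width": width, "height": height})
            page.goto(url, wait_until="networkidle")
            target = screenshot_path(out_dir, name, width, height)
            target.parent.mkdir(parents=True, exist_ok=True)
            page.screenshot(path=str(target), full_page=True)
            page.close()
            saved.append(target)
        browser.close()
    return saved
```

Encoding the breakpoint name and viewport size in the file name keeps baseline and current runs directly comparable, since the same input always maps to the same path.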

Each run produces a structured directory:

pixelframe-output/
  ├── screenshots/
  ├── composite/
  ├── diff/
  └── report/

2. Visual Diff Engine

To detect regressions, PixelFrame:

  • Compares each current screenshot against its baseline
  • Calculates a similarity percentage
  • Generates red-highlighted diff overlays
  • Exits with code 1 if similarity falls below the threshold
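
To make the similarity and overlay steps concrete, here is a small pure-Python sketch. It assumes both images have already been decoded into flat lists of RGB tuples of equal length (for example with an imaging library such as Pillow), and it uses an exact per-pixel match as the metric. That metric is an assumption for illustration; PixelFrame's actual comparison may differ.

```python
RED = (255, 0, 0)


def similarity_percent(baseline, current):
    """Percentage of pixel positions whose values match exactly."""
    if len(baseline) != len(current):
        raise ValueError("images must have the same number of pixels")
    if not baseline:
        return 100.0
    matching = sum(1 for a, b in zip(baseline, current) if a == b)
    return 100.0 * matching / len(baseline)


def diff_overlay(baseline, current, highlight=RED):
    """Copy of the current image with every changed pixel painted in `highlight`."""
    return [highlight if a != b else b for a, b in zip(baseline, current)]


# A 2x2 image in which one of four pixels changed.
baseline = [(255, 255, 255)] * 4
current = [(255, 255, 255), (255, 255, 255), (0, 0, 0), (255, 255, 255)]
print(similarity_percent(baseline, current))  # 75.0
```

Exact matching is strict: anti-aliasing or font rendering differences between machines can lower the score, which is one reason to run captures in a pinned browser and environment.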

Example:

pixelframe diff run ./baseline ./current --fail-under 99.0

Because the result is reported through the process exit code, the command can gate a CI job directly.
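
The exit-code behaviour can be sketched as follows. This is not PixelFrame's parser: the `gate` and `parse_args` names, the `diff-gate` program name, and the default of 99.0 are assumptions. Only the two positional directories and the `--fail-under` flag mirror the command shown above.

```python
import argparse


def gate(similarities: dict, fail_under: float) -> int:
    """Return 0 when every screenshot meets the threshold, 1 otherwise."""
    failures = {name: s for name, s in similarities.items() if s < fail_under}
    for name, score in sorted(failures.items()):
        print(f"FAIL {name}: {score:.2f}% < {fail_under:.2f}%")
    return 1 if failures else 0


def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="diff-gate")  # hypothetical name
    parser.add_argument("baseline")
    parser.add_argument("current")
    # The default value is an assumption made for this sketch.
    parser.add_argument("--fail-under", type=float, default=99.0)
    return parser.parse_args(argv)
```

In an entry point, the return value of `gate` would be passed to `sys.exit`, so any non-zero result fails the surrounding CI step.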

3. Structured Reporting

Each run generates:

  • High-resolution PNGs
  • Composite grid preview
  • Self-contained HTML report
  • Optional PDF export

This makes regression review shareable and portable.
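
One common way to make an HTML report self-contained is to embed each PNG as a base64 data URI, so the file has no external assets. The sketch below shows that technique under the assumption that screenshots are available as raw bytes. `data_uri` and `build_report` are hypothetical helpers, not PixelFrame functions.

```python
import base64
import html


def data_uri(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a data URI usable in an <img> src attribute."""
    return "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")


def build_report(title: str, images: dict) -> str:
    """`images` maps a label to raw PNG bytes; returns a single HTML document."""
    figures = "\n".join(
        f'<figure><img src="{data_uri(data)}" alt="{html.escape(label)}">'
        f"<figcaption>{html.escape(label)}</figcaption></figure>"
        for label, data in images.items()
    )
    return (
        "<!doctype html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body><h1>{html.escape(title)}</h1>\n{figures}\n</body></html>\n"
    )
```

The trade-off is file size: base64 adds roughly a third to each image, which is usually acceptable for a report that has to travel as a single attachment.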

CI Integration

One of the main goals was CI gating.

In GitHub Actions:

- name: Visual Threshold Check
  run: |
    pixelframe diff run ./baseline ./current --fail-under 99.0

If similarity falls below the 99.0% threshold, the command exits with a non-zero code and the workflow fails.

This turns visual regression into a first-class quality gate.

Device Emulation & Configuration

PixelFrame supports:

  • Manual breakpoints
  • Named device presets (iPhone, iPad, MacBook, 4K desktop)
  • YAML configuration files

Example:

url: https://example.com
full_page: true

devices:
  - "iPhone 15 Pro Max"
  - "MacBook Pro 14"

This keeps regression suites version-controlled and reproducible.
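
A configuration like the one above could be validated before any browser is launched. The sketch below is an assumption about how that might look: `RunConfig` and `load_config` are hypothetical names, only the three keys from the example are accepted, and parsing relies on PyYAML being installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    url: str
    full_page: bool = True
    devices: tuple = ()

    @classmethod
    def from_dict(cls, raw: dict) -> "RunConfig":
        # Reject typos early instead of silently ignoring them.
        unknown = set(raw) - {"url", "full_page", "devices"}
        if unknown:
            raise ValueError(f"unknown config keys: {sorted(unknown)}")
        if not raw.get("url"):
            raise ValueError("config needs a 'url'")
        return cls(
            url=raw["url"],
            full_page=bool(raw.get("full_page", True)),
            devices=tuple(raw.get("devices", ())),
        )


def load_config(path: str) -> RunConfig:
    import yaml  # PyYAML, assumed to be installed

    with open(path, encoding="utf-8") as fh:
        return RunConfig.from_dict(yaml.safe_load(fh) or {})
```

Failing on unknown keys means a misspelled option surfaces as an error rather than as a screenshot set that quietly differs from what the file appears to request.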

Lessons Learned

Building a CLI devtool taught me a few things:

  • Deterministic output structure matters
  • CI-first design changes architecture decisions
  • Exit codes are critical for automation
  • Clear reporting dramatically improves usability

I initially underestimated baseline management. My first CI setup regenerated the baseline on every run, which completely defeated the purpose of regression testing. Fixing that forced me to rethink the workflow structure.
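
One way to avoid that trap is to treat the baseline as an approved artifact that is only rewritten on explicit request. The helper below is a hypothetical sketch of that rule, not PixelFrame's workflow. The `ensure_baseline` name and the `update` flag are assumptions.

```python
import shutil
from pathlib import Path


def ensure_baseline(current: Path, baseline: Path, update: bool = False) -> bool:
    """Copy `current` to `baseline` only if no baseline exists or `update` is set.

    Returns True when the baseline directory was (re)written.
    """
    if baseline.exists() and not update:
        return False  # keep the approved baseline untouched
    if baseline.exists():
        shutil.rmtree(baseline)
    shutil.copytree(current, baseline)
    return True
```

In CI the function would normally be called with `update=False`, with baseline refreshes reserved for a deliberate, reviewed step.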

PixelFrame currently works best with publicly accessible sites. Handling authenticated flows is something I plan to improve.

Final Thoughts

Visual regression doesn't have to be complex or SaaS-dependent.

With Python + Playwright, it's possible to build a reproducible, CI-integrated testing engine that keeps UI drift under control.

PixelFrame is available on PyPI:

pip install pixelframe

Source code:
https://github.com/nijil71/PixelFrame
