Bala Paranj

Posted on May 18

Visual Regression Testing for CLIs with VHS

#go #cli #testing #tooling

How to use Charm's VHS to create GIF-based visual regression tests for your CLI's terminal output — catching formatting bugs that unit tests miss.

Your CLI's unit tests verify that the right data comes out. But they don't test what the user actually sees.

A missing newline. A table column that wraps at 80 characters. A progress spinner that bleeds into the output. An ANSI color code that renders as garbage on a light terminal theme. These are visual bugs that pass every unit test but make your CLI look broken.

VHS by Charm solves this by recording your terminal as a GIF from a script — and you can use those GIFs as visual regression tests.

Using VHS

VHS reads a .tape file that describes terminal interactions:

# demo.tape
Output demo.gif
Set Width 1200    # pixels, not columns
Set Height 600    # pixels, not rows
Set Theme "Monokai"

Type "stave apply --controls ./controls --observations ./obs --format text"
Enter
Sleep 2s

Run it:

vhs demo.tape

Output: demo.gif — a pixel-perfect recording of what the terminal looks like when that command runs.

How This Differs from Asciinema

	Asciinema (.cast)	VHS (.gif/.png)
Output	Text-based replay (NDJSON)	Pixel-based image or video (GIF/PNG/WebM/MP4)
Renders	In a terminal or JavaScript player	As an image file, viewable anywhere
Tests	Text content correctness	Visual formatting correctness
Use case	Documentation, interactive replay	README demos, visual regression
File size	Small (text)	Large (image)
Searchable	Yes (it's text)	No (it's pixels)

Asciinema answers: "What text does the CLI produce?"
VHS answers: "What does the CLI look like?"

Both are useful. They test different things.

Visual Regression Testing Pattern

Step 1: Create a `.tape` file per workflow

# tapes/apply-violation.tape
Output testdata/screenshots/apply-violation.gif
Set Width 1200
Set Height 600
Set FontSize 14
Set Theme "Catppuccin Mocha"

Type "stave apply --controls controls/s3 --observations observations --now 2026-01-15T00:00:00Z --format text"
Enter
Sleep 3s
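
If you end up with many workflows, one way to keep the settings identical across tapes is to generate them. The sketch below is only an illustration: the workflow type, its field names, and renderTape are hypothetical helpers, not part of VHS or Stave. It emits the same tape commands used above and assumes Width and Height are pixel dimensions and that the command contains no double quotes.

```go
package main

import (
	"fmt"
	"strings"
)

// workflow describes one recording. The type and field names are illustrative.
type workflow struct {
	Name    string // base name of the output GIF
	Command string // command typed into the recorded terminal
	Sleep   string // VHS duration to wait for output, e.g. "3s"
}

// renderTape returns the text of a .tape file for one workflow.
// It rejects commands containing a double quote because the command is
// wrapped in double quotes in the generated Type line.
func renderTape(w workflow) (string, error) {
	if strings.Contains(w.Command, `"`) {
		return "", fmt.Errorf("command %q contains a double quote", w.Command)
	}
	var b strings.Builder
	fmt.Fprintf(&b, "Output testdata/screenshots/%s.gif\n", w.Name)
	b.WriteString("Set Width 1200\n") // assumed to be pixels
	b.WriteString("Set Height 600\n") // assumed to be pixels
	b.WriteString("Set FontSize 14\n")
	b.WriteString("Set Theme \"Catppuccin Mocha\"\n")
	fmt.Fprintf(&b, "Type \"%s\"\n", w.Command)
	b.WriteString("Enter\n")
	fmt.Fprintf(&b, "Sleep %s\n", w.Sleep)
	return b.String(), nil
}

func main() {
	tape, err := renderTape(workflow{
		Name:    "apply-violation",
		Command: "stave apply --controls controls/s3 --observations observations --now 2026-01-15T00:00:00Z --format text",
		Sleep:   "3s",
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(tape)
}
```

Hand-written tapes are perfectly fine for a handful of workflows; a generator mainly helps once theme or size changes would otherwise have to be repeated in every file.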

Step 2: Generate the baseline

vhs tapes/apply-violation.tape

Commit testdata/screenshots/apply-violation.gif as the golden file.

Step 3: Compare in CI

# .github/workflows/visual.yml
- name: Generate screenshots
  run: |
    for tape in tapes/*.tape; do
      vhs "$tape"
    done

- name: Check for visual changes
  run: |
    git diff --exit-code testdata/screenshots/

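If any GIF changes, the diff fails the job. The developer reviews the visual change and either updates the golden file or fixes the formatting bug. Note that `git diff` compares bytes, so the check is only reliable when rendering is reproducible: pin the VHS version, font, and theme in CI, and keep inputs fixed (the --now flag in the tape above does this for timestamps).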
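
A byte-level diff tells you that something changed, not how much. As a rough alternative, you could decode the golden and current images and count the pixels that differ. The sketch below assumes both images have already been decoded (for example with image/png or image/gif from the Go standard library). countDiffPixels and solid are hypothetical helper names, and solid only exists to fabricate small test images.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// solid builds a w-by-h image filled with one color.
// It stands in for a decoded screenshot in this example.
func solid(w, h int, r, g, b uint8) *image.RGBA {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			img.Set(x, y, color.RGBA{R: r, G: g, B: b, A: 255})
		}
	}
	return img
}

// countDiffPixels returns how many pixels differ between two images.
// It returns an error if the images do not have the same dimensions.
func countDiffPixels(a, b image.Image) (int, error) {
	ab, bb := a.Bounds(), b.Bounds()
	if ab.Dx() != bb.Dx() || ab.Dy() != bb.Dy() {
		return 0, fmt.Errorf("size mismatch: %v vs %v", ab.Size(), bb.Size())
	}
	diff := 0
	for y := 0; y < ab.Dy(); y++ {
		for x := 0; x < ab.Dx(); x++ {
			r1, g1, b1, a1 := a.At(ab.Min.X+x, ab.Min.Y+y).RGBA()
			r2, g2, b2, a2 := b.At(bb.Min.X+x, bb.Min.Y+y).RGBA()
			if r1 != r2 || g1 != g2 || b1 != b2 || a1 != a2 {
				diff++
			}
		}
	}
	return diff, nil
}

func main() {
	golden := solid(4, 4, 0, 0, 0)
	current := solid(4, 4, 0, 0, 0)
	current.Set(1, 1, color.RGBA{R: 255, A: 255}) // simulate one changed pixel
	n, err := countDiffPixels(golden, current)
	fmt.Println(n, err) // prints: 1 <nil>
}
```

A pixel count lets you set a tolerance instead of failing on any byte change, at the cost of possibly hiding small but real regressions. Which trade-off fits depends on how stable your rendering is in CI.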

Step 4: Review with PR comments

For GitHub PRs, you can post the before/after GIF directly in a comment:

- name: Post visual diff
  if: failure()
  run: |
    echo "Visual regression detected. See the updated screenshots below."
    # Upload artifacts or post to PR

What Visual Tests Catch That Unit Tests Miss

Table alignment

CONTROL_ID          ASSET_ID              STATUS
CTL.S3.PUBLIC.001   my-very-long-bucket   NON_COMPLIANT
                    -name-that-wraps

A unit test checks that the data is correct. A visual test catches that the column wraps and breaks the alignment.
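
A cheap text-level check can complement the visual test here: flag any output line that is wider than the terminal you expect. The sketch below is an assumption-laden approximation. overflowingLines is a hypothetical helper, and it counts runes, which is not the same as display width for wide characters or for lines containing ANSI escape codes.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// overflowingLines returns the 1-based numbers of lines whose rune count
// exceeds cols. Rune count only approximates on-screen width.
func overflowingLines(output string, cols int) []int {
	var bad []int
	for i, line := range strings.Split(output, "\n") {
		if utf8.RuneCountInString(line) > cols {
			bad = append(bad, i+1)
		}
	}
	return bad
}

func main() {
	table := "CONTROL_ID          ASSET_ID              STATUS\n" +
		"CTL.S3.PUBLIC.001   my-very-long-bucket-name-that-wraps   NON_COMPLIANT"
	fmt.Println(overflowingLines(table, 60)) // prints [2]
}
```

This catches lines that will wrap, but not how the wrap looks. That part still needs the recording.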

Color and formatting

[PASS] CTL.S3.ENCRYPT.001 — Server-Side Encryption
[FAIL] CTL.S3.PUBLIC.001 — No Public Read Access

A unit test sees [PASS] and [FAIL]. A visual test sees whether the ANSI color codes render correctly — green for pass, red for fail — or whether they produce \033[32m[PASS]\033[0m garbage.
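
To see why a text assertion misses this, consider the small Go sketch below. The substring check passes whether or not the escape codes are present, so it cannot tell you how they render. stripANSI is a hypothetical helper that only handles SGR (color and style) sequences.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sgr matches ANSI SGR escape sequences such as "\x1b[32m" and "\x1b[0m".
var sgr = regexp.MustCompile(`\x1b\[[0-9;]*m`)

// stripANSI removes SGR color and style sequences from s.
func stripANSI(s string) string {
	return sgr.ReplaceAllString(s, "")
}

func main() {
	plain := "[PASS] CTL.S3.ENCRYPT.001"
	colored := "\x1b[32m[PASS]\x1b[0m CTL.S3.ENCRYPT.001"

	// A typical unit-test assertion passes for both variants.
	fmt.Println(strings.Contains(plain, "[PASS]"), strings.Contains(colored, "[PASS]")) // prints: true true

	// Once the escapes are stripped, the two strings are identical.
	fmt.Println(stripANSI(colored) == plain) // prints: true
}
```

Both outputs look the same to the test. Only a rendered frame shows whether the user gets green text or raw escape characters.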

Progress indicators

Running: evaluating controls... ⠋

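A spinner can work in a real terminal and still leave stray frames behind or bleed into piped output. A visual test with a fixed terminal size catches the leftover frames; to cover the piped case, the tape has to type the command with a pipe (for example, ending in | cat).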
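
The usual cause of spinner bleed is drawing the spinner without checking whether stdout is a terminal. A minimal sketch of that check using only the Go standard library follows. isTerminal is a hypothetical helper, and the character-device test is a rough heuristic (for example, /dev/null is also a character device), so treat it as an approximation rather than how Stave does it.

```go
package main

import (
	"fmt"
	"os"
)

// isTerminal reports whether f looks like a terminal, using the
// character-device mode bit as a rough heuristic.
func isTerminal(f *os.File) bool {
	info, err := f.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}

func main() {
	if isTerminal(os.Stdout) {
		fmt.Println("spinner enabled")
	} else {
		fmt.Println("spinner disabled: stdout is not a terminal")
	}
}
```

Running the program directly and then piping it through cat should show both branches.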

Help text layout

Usage:
  stave apply [flags]

Flags:
  -i, --controls string   Path to control definitions (default "controls/s3")
  -o, --observations string
                          Path to observation snapshots (default "observations")

Does the flag help wrap correctly? Are the defaults aligned? Is the long description properly indented? Unit tests don't check layout. VHS checks layout.

VHS `.tape` Cheat Sheet

Output file.gif              # Output file (.gif, .mp4, .webm; a directory gives PNG frames)
Set Width 1200               # Terminal width in pixels
Set Height 600               # Terminal height in pixels
Set FontSize 14              # Font size in pixels
Set Theme "Dracula"          # Terminal theme
Set TypingSpeed 50ms         # Delay between keystrokes

Type "command"               # Type text (simulated keystrokes)
Enter                        # Press Enter
Sleep 2s                     # Wait for output
Ctrl+C                       # Send interrupt
Tab                          # Press Tab (for completion testing)
Backspace 5                  # Delete 5 characters

Hide                         # Stop recording (for setup commands)
Show                         # Resume recording

Combining Both Tools

For a complete CLI testing strategy:

Layer	Tool	Tests
Unit tests	`go test`	Data correctness, error handling, exit codes
E2E golden files	`go test` + JSON comparison	Full output correctness, determinism
Text recordings	Custom asciicast generator	Documentation accuracy, demo freshness
Visual regression	VHS	Formatting, alignment, colors, layout

Each layer catches different bugs. Unit tests catch logic errors. Golden files catch output regressions. Asciicast recordings catch documentation drift. VHS catches visual formatting bugs.
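
For the golden-file layer, the core of the pattern is small. The sketch below is a generic version, not Stave's actual test code: checkGolden is a hypothetical helper, and the update parameter would typically be wired to a test flag so you can regenerate baselines on purpose.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// checkGolden compares got against the golden file at path.
// When update is true it rewrites the golden file instead of comparing.
func checkGolden(path string, got []byte, update bool) error {
	if update {
		return os.WriteFile(path, got, 0o644)
	}
	want, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if !bytes.Equal(got, want) {
		return fmt.Errorf("output differs from %s", path)
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "golden")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "apply.json")
	out := []byte(`{"status":"NON_COMPLIANT"}`)

	fmt.Println(checkGolden(path, out, true))                 // write baseline: <nil>
	fmt.Println(checkGolden(path, out, false))                // same output: <nil>
	fmt.Println(checkGolden(path, []byte(`{}`), false) != nil) // changed output: true
}
```

The VHS layer follows the same idea with an image as the golden file, which is why reproducible rendering matters so much there.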

Getting Started

# Install VHS (macOS)
brew install charmbracelet/tap/vhs

# Install VHS (Linux, via Go; ttyd and ffmpeg must also be installed and on your PATH)
go install github.com/charmbracelet/vhs@latest

# Create your first tape
cat > hello.tape << 'EOF'
Output hello.gif
Set Width 1200
Set Height 600
Type "echo 'Hello from VHS'"
Enter
Sleep 1s
EOF

# Record
vhs hello.tape

The GIF is your visual test. Commit it, compare it in CI, review it in PRs.

Stave uses programmatic asciicast generation for documentation recordings and Go-based golden file testing for output correctness. VHS is the natural next step for visual regression testing of the text-formatted output.
