How to use Charm's VHS to create GIF-based visual regression tests for your CLI's terminal output — catching formatting bugs that unit tests miss.
Your CLI's unit tests verify that the right data comes out. But they don't test what the user actually sees.
A missing newline. A table column that wraps at 80 characters. A progress spinner that bleeds into the output. An ANSI color code that renders as garbage on a light terminal theme. These are visual bugs that pass every unit test but make your CLI look broken.
VHS by Charm solves this by recording your terminal as a GIF from a script — and you can use those GIFs as visual regression tests.
Using VHS
VHS reads a .tape file that describes terminal interactions:
# demo.tape
Output demo.gif
# Width and Height are in pixels
Set Width 1200
Set Height 600
Set Theme "Monokai"
Type "stave apply --controls ./controls --observations ./obs --format text"
Enter
Sleep 2s
Run it:
vhs demo.tape
Output: demo.gif — a pixel-perfect recording of what the terminal looks like when that command runs.
How This Differs from Asciinema
| | Asciinema (.cast) | VHS (.gif/.png) |
|---|---|---|
| Output | Text-based replay (NDJSON) | Pixel-based image (GIF/PNG/WebM) |
| Renders | In a JavaScript player | As a static image anywhere |
| Tests | Text content correctness | Visual formatting correctness |
| Use case | Documentation, interactive replay | README badges, visual regression |
| File size | Small (text) | Large (image) |
| Searchable | Yes (it's text) | No (it's pixels) |
Asciinema answers: "What text does the CLI produce?"
VHS answers: "What does the CLI look like?"
Both are useful. They test different things.
Visual Regression Testing Pattern
Step 1: Create a .tape file per workflow
# tapes/apply-violation.tape
Output testdata/screenshots/apply-violation.gif
Set Width 1200
Set Height 600
Set FontSize 14
Set Theme "Catppuccin Mocha"
Type "stave apply --controls controls/s3 --observations observations --now 2026-01-15T00:00:00Z --format text"
Enter
Sleep 3s
Step 2: Generate the baseline
vhs tapes/apply-violation.tape
Commit testdata/screenshots/apply-violation.gif as the golden file.
Step 3: Compare in CI
# .github/workflows/visual.yml
- name: Generate screenshots
  run: |
    for tape in tapes/*.tape; do
      vhs "$tape"
    done
- name: Check for visual changes
  run: |
    git diff --exit-code testdata/screenshots/
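If any tracked GIF differs from the committed baseline, git diff --exit-code returns a non-zero status and the job fails. The developer then reviews the visual change and either commits the new GIF as the updated golden file or fixes the formatting bug. The comparison is byte-for-byte, so anything that varies between runs (timestamps, spinners, timing-dependent output) should be pinned, as the --now flag does in the tape above.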
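Because git diff only looks at tracked files, a brand-new recording would not be flagged by the check above. One possible complement, sketched here on the assumption that fresh recordings are written to a separate directory, is a small shell helper that compares them against the baselines with cmp. The function name and directory layout are hypothetical and not part of VHS.

```shell
#!/bin/sh
# Hypothetical helper: report recordings in new_dir that are missing from,
# or differ byte-for-byte from, the baselines in baseline_dir.
changed_screenshots() {
  baseline_dir=$1
  new_dir=$2
  for new in "$new_dir"/*.gif; do
    [ -e "$new" ] || continue            # the glob matched nothing
    name=$(basename "$new")
    if [ ! -f "$baseline_dir/$name" ]; then
      echo "added: $name"
    elif ! cmp -s "$baseline_dir/$name" "$new"; then
      echo "changed: $name"
    fi
  done
}
```

A CI step could fail the job whenever the function prints anything, for example by testing whether its output is non-empty.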
Step 4: Review with PR comments
For GitHub PRs, you can post the before/after GIF directly in a comment:
- name: Post visual diff
  if: failure()
  run: |
    echo "Visual regression detected. See the updated screenshots below."
    # Upload artifacts or post to PR
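The placeholder step leaves the comment body unspecified. As a rough sketch, a shell function could turn a list of changed recordings into a markdown body. The function name is hypothetical, and how the body gets posted (for instance with the GitHub CLI, if it is available on the runner) is an assumption left out of the code.

```shell
#!/bin/sh
# Hypothetical helper: print a markdown comment body listing the
# recordings passed as arguments.
render_comment() {
  echo "### Visual regression detected"
  echo
  echo "The following recordings differ from their golden files:"
  echo
  for name in "$@"; do
    echo "- \`$name\`"
  done
}

# Example: write the body to a file for a later posting step.
# render_comment apply-violation.gif > comment.md
```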
What Visual Tests Catch That Unit Tests Miss
Table alignment
CONTROL_ID         ASSET_ID             STATUS
CTL.S3.PUBLIC.001  my-very-long-bucket  NON_COMPLIANT
-name-that-wraps
A unit test checks that the data is correct. A visual test catches that the column wraps and breaks the alignment.
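A cheap text-level check can flag overlong lines before a recording is made, although it cannot show how the wrap actually looks. The sketch below assumes plain output with no ANSI escapes or wide characters, and the helper name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read text on stdin and print every line longer than
# the given column limit, prefixed with its line number.
overlong_lines() {
  awk -v max="$1" 'length($0) > max { printf "%d: %s\n", NR, $0 }'
}

# Example: check the CLI's text output against an 80-column limit.
# stave apply --format text | overlong_lines 80
```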
Color and formatting
[PASS] CTL.S3.ENCRYPT.001 — Server-Side Encryption
[FAIL] CTL.S3.PUBLIC.001 — No Public Read Access
A unit test sees [PASS] and [FAIL]. A visual test sees whether the ANSI color codes render correctly — green for pass, red for fail — or whether they produce \033[32m[PASS]\033[0m garbage.
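A text-level guard can complement the recording here: if output that is meant to be plain still contains escape sequences, that can be detected without looking at a GIF. A minimal sketch, with a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if stdin contains an ANSI escape
# sequence introducer, i.e. the ESC byte followed by "[".
contains_ansi() {
  esc=$(printf '\033')
  grep -q "${esc}\\["
}

# Example: warn when piped output still carries color codes.
# if stave apply --format text | contains_ansi; then echo "raw ANSI in piped output"; fi
```

This only says whether escape codes are present. Whether they render as the intended colors on a given theme is still something only the recording shows.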
Progress indicators
Running: evaluating controls... ⠋
A spinner that works in a real terminal but bleeds into piped output. A visual test with a fixed terminal size catches this.
Help text layout
Usage:
  stave apply [flags]

Flags:
  -i, --controls string       Path to control definitions (default "controls/s3")
  -o, --observations string
                              Path to observation snapshots (default "observations")
Does the flag help wrap correctly? Are the defaults aligned? Is the long description properly indented? Unit tests don't check layout. VHS checks layout.
VHS .tape Cheat Sheet
Output file.gif # Output file (.gif, .mp4, .webm, or a directory of PNG frames)
Set Width 1200 # Frame width in pixels
Set Height 600 # Frame height in pixels
Set FontSize 14 # Font size in pixels
Set Theme "Dracula" # Terminal theme
Set TypingSpeed 50ms # Delay between keystrokes
Type "command" # Type text (simulated keystrokes)
Enter # Press Enter
Sleep 2s # Wait for output
Ctrl+C # Send interrupt
Tab # Press Tab (for completion testing)
Backspace 5 # Delete 5 characters
Hide # Stop recording (for setup commands)
Show # Resume recording
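Hide and Show are the least obvious entries, so here is a sketch of how they might wrap a setup command so that only the command under test appears in the recording. It follows the heredoc style used elsewhere in this post. The function name, the fixtures directory, and the output path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a tape to the given path. The setup command
# runs between Hide and Show, so it is not part of the recording.
write_setup_tape() {
  cat > "$1" << 'EOF'
Output testdata/screenshots/with-setup.gif
Hide
Type "cd fixtures && clear"
Enter
Show
Type "stave apply --controls controls/s3 --observations observations --format text"
Enter
Sleep 3s
EOF
}

# Example:
# write_setup_tape tapes/with-setup.tape && vhs tapes/with-setup.tape
```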
Combining Both Tools
For a complete CLI testing strategy:
| Layer | Tool | Tests |
|---|---|---|
| Unit tests | go test | Data correctness, error handling, exit codes |
| E2E golden files | go test + JSON comparison | Full output correctness, determinism |
| Text recordings | Custom asciicast generator | Documentation accuracy, demo freshness |
| Visual regression | VHS | Formatting, alignment, colors, layout |
Each layer catches different bugs. Unit tests catch logic errors. Golden files catch output regressions. Asciicast recordings catch documentation drift. VHS catches visual formatting bugs.
Getting Started
# Install VHS (macOS)
brew install charmbracelet/tap/vhs
# Install VHS (Linux; also requires ttyd and ffmpeg on your PATH)
go install github.com/charmbracelet/vhs@latest
# Create your first tape
cat > hello.tape << 'EOF'
Output hello.gif
Set Width 800
Set Height 400
Type "echo 'Hello from VHS'"
Enter
Sleep 1s
EOF
# Record
vhs hello.tape
The GIF is your visual test. Commit it, compare it in CI, review it in PRs.
Stave uses programmatic asciicast generation for documentation recordings and Go-based golden file testing for output correctness. VHS is the natural next step for visual regression testing of the text-formatted output.