Bala Paranj

Posted on May 13

Your Go Golden Tests Don't Need to Regenerate Everything

#go #testing #devops #productivity

A practical pattern for targeted golden file regeneration in Go projects — from minutes to 0.27 seconds.

I have 5,810 golden files in my project. Every time I changed one test, I was regenerating all of them. It took minutes. Now it takes 0.27 seconds.

The fix was just organizing the regeneration path so I could aim it at one file instead of firing at everything.

The problem with "regenerate everything"

Golden tests are great. You capture known-good output, save it to a file, and compare against it on every run. When the output changes intentionally, you regenerate the golden file.

Most Go projects start with a simple approach: a Makefile target that regenerates all golden files at once.

make regenerate-goldens

This works when you have 20 golden files. With 5,810, it doesn't: you change one test, wait for the tool to process every fixture directory, and most of the time nothing else changed. You pay minutes of regeneration to update one file.

Two kinds of golden tests

Before fixing anything, I audited how golden files worked in the codebase. I found two completely different mechanisms hiding behind the same word.

In-process goldens — a test function renders output, compares it against a .golden or golden.json file in testdata/. The test itself can write the file if you ask it to. I had 3 of these.

E2e fixture goldens — an external tool runs the compiled binary against fixture directories and captures stdout into expected.* files. The test reads those files and compares. I had 5,807 of these.

These need different regeneration strategies. Trying to unify them under one mechanism would either duplicate the external tool inside the test (pointless) or force the external tool to understand in-process test output (brittle).

The in-process pattern: UPDATE_GOLDEN env var

For the 3 in-process golden tests, I added a small helper to the existing test utilities package.

package testutil

import (
    "bytes"
    "os"
    "path/filepath"
    "testing"

    "github.com/google/go-cmp/cmp"
)

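// UpdateGolden reports whether golden files should be rewritten.
// Any non-empty UPDATE_GOLDEN value enables it.
func UpdateGolden() bool {
    return os.Getenv("UPDATE_GOLDEN") != ""
}

// AssertGolden compares got against the golden file at path.
// In update mode it rewrites the file first, so the comparison below
// then passes and doubles as a read-back check.
func AssertGolden(t *testing.T, path string, got []byte) {
    t.Helper()

    if UpdateGolden() {
        writeIfChanged(t, path, got)
    }

    want, err := os.ReadFile(path)
    if err != nil {
        t.Fatalf("read golden file %s: %v\nRun with UPDATE_GOLDEN=1 to create it", path, err)
    }

    if diff := cmp.Diff(string(want), string(got)); diff != "" {
        t.Fatalf("golden mismatch %s (-want +got):\n%s\nRun with UPDATE_GOLDEN=1 to update", path, diff)
    }
}

// writeIfChanged writes data to path only when it differs from what is on disk.
func writeIfChanged(t *testing.T, path string, data []byte) {
    t.Helper()

    // Skip the write when the existing file already has identical bytes.
    old, err := os.ReadFile(path)
    if err == nil && bytes.Equal(old, data) {
        return
    }

    if err := os.MkdirAll(filepath.Dir(path), 0755); err != nil {
        t.Fatalf("create golden dir: %v", err)
    }
    if err := os.WriteFile(path, data, 0644); err != nil {
        t.Fatalf("write golden file %s: %v", path, err)
    }
    t.Logf("updated golden file: %s", path)
}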

Usage in a test:

func TestTextReporter_Golden(t *testing.T) {
    got := renderReport(buildFixture())
    testutil.AssertGolden(t, "testdata/reports/hipaa_golden.txt", []byte(got))
}

Regenerate just that one test:

UPDATE_GOLDEN=1 go test ./internal/profile/reporter -run TestTextReporter_Golden

0.27 seconds.

Why an env var instead of a flag

My first version used flag.Bool("update", ...). The problem: Go test flags are per-package. If you define -update in one package and run go test ./... -update, every other package fails because it doesn't recognize the flag.

An environment variable works across all packages without any registration.

# This works for any package, any test
UPDATE_GOLDEN=1 go test ./path/to/package -run TestWhatever
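The flag failure is easy to reproduce in isolation. Here is a minimal sketch that simulates two packages' test binaries with separate flag.FlagSet values instead of real go test runs. The function name parseTestArgs and the "pkg.test" label are made up for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseTestArgs simulates one package's test binary parsing its command line.
// definesUpdate stands in for "this package registered an -update flag".
func parseTestArgs(definesUpdate bool, args []string) error {
	fs := flag.NewFlagSet("pkg.test", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // suppress the usage text printed on parse errors
	if definesUpdate {
		fs.Bool("update", false, "rewrite golden files")
	}
	return fs.Parse(args)
}

func main() {
	args := []string{"-update"}
	fmt.Println("package with the flag:   ", parseTestArgs(true, args))
	fmt.Println("package without the flag:", parseTestArgs(false, args))
}
```

The first call returns nil. The second returns the error "flag provided but not defined: -update", which is what every package without the flag reports under go test ./... -update.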

Why writeIfChanged matters

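Without it, every UPDATE_GOLDEN=1 go test ./... run rewrites every golden file, even when the content didn't change. Git won't report byte-identical files as modified, but each rewrite bumps the file's modification time, which can retrigger file watchers and mtime-based build steps, and the "updated golden file" log line stops telling you which files really changed. writeIfChanged reads the file first, compares bytes, and skips the write if nothing changed. Five lines that keep regeneration runs quiet.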

The e2e pattern: wrap what already exists

For the 5,807 fixture goldens, I already had a regeneration tool (regengoldens) that accepted a -filter regex. The problem was discoverability. Nobody remembered the flag syntax.

I added a Makefile target:

.PHONY: golden-fixture
golden-fixture:
	@test -n "$(FILTER)" || (echo "Usage: make golden-fixture FILTER=<regex>" && exit 1)
	$(MAKE) regenerate-goldens ARGS='-filter $(FILTER)'

Now regenerating one fixture set:

make golden-fixture FILTER=hipaa

No syntax to remember. Tab-completable.

The full Makefile surface

# In-process goldens
.PHONY: golden-update-all
golden-update-all:
	UPDATE_GOLDEN=1 go test ./...

.PHONY: golden-update
golden-update:
	@test -n "$(PKG)" || (echo "Usage: make golden-update PKG=./path/to/pkg/..." && exit 1)
	UPDATE_GOLDEN=1 go test $(PKG)

.PHONY: golden-one
golden-one:
	@test -n "$(PKG)" || (echo "Usage: make golden-one PKG=./path/to/pkg/... RUN=TestName" && exit 1)
	@test -n "$(RUN)" || (echo "Usage: make golden-one PKG=./path/to/pkg/... RUN=TestName" && exit 1)
	UPDATE_GOLDEN=1 go test $(PKG) -run '$(RUN)'

# E2e fixture goldens
.PHONY: golden-fixture
golden-fixture:
	@test -n "$(FILTER)" || (echo "Usage: make golden-fixture FILTER=<regex>" && exit 1)
	$(MAKE) regenerate-goldens ARGS='-filter $(FILTER)'

Two mechanisms, one discoverable surface. grep golden Makefile tells you everything.

What I didn't do

I didn't unify the two mechanisms. The in-process tests and e2e fixture tests have different architectures. Forcing them into one pattern would mean either duplicating the external regeneration tool inside Go tests or making the tool understand in-process rendering. Both are worse than having two clear paths.

I didn't add subtests where they weren't needed. The original plan called for converting flat tests to subtests for -run targeting. In practice, my 3 in-process golden tests were either single-case or already subtested. Converting for the sake of a pattern would have been churn.

I didn't migrate one test that would have changed golden content. One golden file was a hand-ordered JSON map. json.MarshalIndent sorts keys alphabetically, so running AssertGolden with UPDATE_GOLDEN=1 would have silently reordered the file. The rule I set was: the migration changes the mechanism, not the content. If any golden file changes, something is wrong. I left that test alone and documented why.
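The key reordering is simple to demonstrate. This sketch round-trips a small hand-ordered JSON object through a Go map. The field names are invented for illustration and are not taken from the actual golden file.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip decodes a JSON object into a map and re-encodes it.
// encoding/json writes map keys in sorted order, so the input order is lost.
func roundTrip(in []byte) ([]byte, error) {
	var m map[string]any
	if err := json.Unmarshal(in, &m); err != nil {
		return nil, err
	}
	return json.MarshalIndent(m, "", "  ")
}

func main() {
	// Hand-ordered input: severity first, then id, then name.
	handOrdered := []byte(`{"severity": "high", "id": "rule-1", "name": "example"}`)
	out, err := roundTrip(handOrdered)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The output lists id, name, severity in that order. The data is the same, but a byte-level golden comparison would see a changed file.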

The verification that matters

After the migration, this is the check:

UPDATE_GOLDEN=1 go test ./...
git diff --name-only -- '*golden*'

The diff must be empty. If any golden file changed during migration, the helper introduced a difference — a trailing newline, an encoding change, a key reordering. That's a bug in the migration, not a legitimate update.

Results

Before: change one test, run make regenerate-goldens, wait minutes.

After: change one test, run UPDATE_GOLDEN=1 go test ./pkg/foo -run TestBar, wait 0.27 seconds.

The approach is boring. An env var, a 40-line helper, a few Makefile targets. Nothing novel. But the development loop went from "avoid touching golden tests" to "golden tests are free to change."

This speedup was applied to the Stave codebase. Stave is an offline configuration safety evaluator.

DEV Community