Bala Paranj

Posted on Apr 19

We Forgot defer — 6 Resource Leaks We Found During Refactoring

#go #programming #refactoring #cli

We had a progress spinner. It animated on stderr while the evaluation ran. When the evaluation panicked, the spinner kept spinning. The terminal was corrupted. We had to reset the terminal every time.

The fix was one line: defer stop().

We found six patterns like this during a refactoring phase — resource leaks that only appeared under error conditions, panics, or signals. The happy path worked. The unhappy path leaked.

1. The Spinner That Survives a Panic

Before: Cleanup after computation

func (r *runner) Run(ctx context.Context, cfg config) error {
    stop := r.runtime.BeginProgress("Computing observation delta")

    delta, err := r.computeDelta(ctx, cfg.ObservationsDir, cfg.Filter)
    stop()  // <-- Never runs if computeDelta panics

    if err != nil {
        return err
    }
    return writeOutput(r.Stdout, cfg.Format, delta)
}

BeginProgress starts a goroutine that writes ANSI escape codes to stderr every 100ms. stop() sends a signal to the goroutine to stop and write the final "Done" message. If computeDelta panics, stop() is never called. The goroutine keeps running. The terminal keeps receiving escape codes. The panic message is interleaved with spinner frames.

After: defer guarantees cleanup

func (r *runner) Run(ctx context.Context, cfg config) error {
    stop := r.runtime.BeginProgress("Computing observation delta")
    defer stop()

    delta, err := r.computeDelta(ctx, cfg.ObservationsDir, cfg.Filter)
    if err != nil {
        return err
    }
    return writeOutput(r.Stdout, cfg.Format, delta)
}

defer stop() runs after the function returns — whether by normal return, error return, or panic. The spinner is always cleaned up.

But there's a subtlety: the stop function itself runs in a goroutine. If the goroutine panics (say, writing to a closed stderr), the panic propagates to the caller. Fix the goroutine too:

func (r *Runtime) BeginProgress(label string) func() {
    errOut := r.stderr()
    stopCh := make(chan struct{})

    go func() {
        defer func() {
            if r := recover(); r != nil {
                fmt.Fprintf(errOut, "\rprogress render panic: %v\n", r)
            }
        }()
        ticker := time.NewTicker(100 * time.Millisecond)
        defer ticker.Stop()
        for {
            select {
            case <-stopCh:
                return
            case <-ticker.C:
                fmt.Fprintf(errOut, "\r%s Running: %s...", frames[i%len(frames)], label)
                i++
            }
        }
    }()

    var once sync.Once
    return func() {
        once.Do(func() {
            close(stopCh)
            // ... write final "Done" message
        })
    }
}

Three layers of protection:

defer stop() in the caller guarantees the stop signal is sent
defer recover() in the goroutine prevents render panics from crashing the process
sync.Once in the stop function prevents double-close panics if stop is called multiple times

2. The Signal Handler That Skips Cleanup

Before: os.Exit skips all defers

func (a *App) installInterruptHandler() {
    sigCh := make(chan os.Signal, 1)
    signal.Notify(sigCh, os.Interrupt)

    go func() {
        <-sigCh
        fmt.Fprintln(os.Stderr, "Interrupted")
        os.Exit(130)  // <-- All deferred cleanup skipped
    }()
}

When the user hits Ctrl+C, os.Exit(130) terminates the process immediately. No deferred functions run. Open file handles aren't closed. Temp files aren't removed. Log files aren't flushed. CPU profile output is lost.

After: Context cancellation allows defer chain to complete

func (a *App) installInterruptHandler() func() {
    sigCh := make(chan os.Signal, 1)
    done := make(chan struct{})
    signal.Notify(sigCh, os.Interrupt, syscall.SIGTERM)

    go func() {
        defer func() {
            if r := recover(); r != nil {
                fmt.Fprintf(os.Stderr, "signal handler panic: %v\n", r)
            }
        }()
        select {
        case <-sigCh:
            fmt.Fprintln(os.Stderr, "Interrupted")
            if a.cancel != nil {
                a.cancel()  // Cancel context — operations unwind naturally
            }
        case <-done:
            return
        }
    }()

    return func() {
        signal.Stop(sigCh)
        close(done)
    }
}

The main execution path sets up deferred cleanup:

func (a *App) Execute(args []string) {
    cleanup := a.installInterruptHandler()
    defer cleanup()
    defer a.recoverExecutePanic()

    a.executeRootCommand(args)
    // All defers run whether command succeeds, fails, or is interrupted
}

Now Ctrl+C cancels the context. Operations check ctx.Err() and return errors. The normal return path runs all defers. File handles close. Temp files are removed. The exit code is set from the error chain, not hardcoded.

3. The Temp File That Outlives the Process

Before: Error paths skip cleanup

func WriteFileAtomic(path string, data []byte, perm os.FileMode) error {
    dir := filepath.Dir(path)
    tmp, err := os.CreateTemp(dir, ".stave-*.tmp")
    if err != nil {
        return err
    }
    tmpPath := tmp.Name()

    if _, err := tmp.Write(data); err != nil {
        tmp.Close()
        os.Remove(tmpPath)  // Manual cleanup on write error
        return err
    }
    if err := tmp.Sync(); err != nil {
        tmp.Close()
        os.Remove(tmpPath)  // Manual cleanup on sync error
        return err
    }
    if err := tmp.Close(); err != nil {
        os.Remove(tmpPath)  // Manual cleanup on close error
        return err
    }
    return os.Rename(tmpPath, path)
}

Four error paths. Each one manually closes and removes the temp file. If you add a fifth step (like Chmod), you must remember to add cleanup. Miss one path and temp files accumulate.

After: defer handles all cleanup paths

func WriteFileAtomic(path string, data []byte, perm os.FileMode) error {
    dir := filepath.Dir(path)
    tmp, err := os.CreateTemp(dir, ".stave-*.tmp")
    if err != nil {
        return err
    }
    tmpPath := tmp.Name()

    // Cleanup: remove temp file on ANY failure. On success, rename
    // moves it to the final path and remove becomes a no-op.
    closed := false
    defer func() {
        if !closed {
            tmp.Close()
        }
        os.Remove(tmpPath)
    }()

    if _, err := tmp.Write(data); err != nil {
        return fmt.Errorf("write: %w", err)
    }
    if err := tmp.Sync(); err != nil {
        return fmt.Errorf("sync: %w", err)
    }
    if err := tmp.Close(); err != nil {
        return fmt.Errorf("close: %w", err)
    }
    closed = true

    return os.Rename(tmpPath, path)
}

The closed flag prevents double-closing: tmp.Close() is called explicitly for the success path (so we can check its error), and the deferred tmp.Close() only runs if the explicit close wasn't reached.

os.Remove(tmpPath) in the defer always runs. On the success path, Rename already moved the temp file to its final name — Remove tries to delete the old path, which no longer exists, and silently fails. On error paths, the temp file is cleaned up.

Adding a new step (Chmod, Chown, etc.) requires zero cleanup code — the defer handles it.

4. The Mutex That Never Unlocks

Before: Error returns skip unlock

func (cp *CountedProgress) Update(processed, total int) {
    cp.mu.Lock()

    if processed < 0 || total < 0 {
        cp.mu.Unlock()
        return  // Must remember to unlock before every return
    }

    cp.processed = processed
    cp.total = total
    cp.render()

    cp.mu.Unlock()
}

Two unlock calls. If you add a third return path (say, for a cancelled context), you must add a third unlock. Miss one and the mutex deadlocks.

After: defer unlock is unconditional

func (cp *CountedProgress) Update(processed, total int) {
    cp.mu.Lock()
    defer cp.mu.Unlock()

    if processed < 0 || total < 0 {
        return  // Unlock runs automatically via defer
    }

    cp.processed = processed
    cp.total = total
    cp.render()
    // Unlock runs automatically via defer
}

One line. Every return path — including future ones added during refactoring — automatically unlocks.

5. The sync.Once That Replaced a Race

Before: Bool flag with race condition

type Progress struct {
    stopped bool
    ch      chan struct{}
}

func (p *Progress) Stop() {
    if p.stopped {
        return
    }
    p.stopped = true
    close(p.ch)
}

Two goroutines call Stop() concurrently. Both read p.stopped as false. Both set it to true. Both call close(p.ch). Second close panics: "close of closed channel."

After: sync.Once eliminates the race

type Progress struct {
    once sync.Once
    ch   chan struct{}
}

func (p *Progress) Stop() {
    p.once.Do(func() {
        close(p.ch)
    })
}

sync.Once guarantees the closure runs exactly once, regardless of how many goroutines call Stop() concurrently. No race. No double-close. No panic.

Combined with defer:

done := rt.BeginProgress("Evaluating controls")
defer done()  // done() uses sync.Once internally — safe to call twice

If the function calls done() explicitly (e.g., to stop the spinner before printing results) and then returns (triggering the deferred done()), sync.Once ensures the channel is closed only once.

6. The Future-Proof Cleanup Contract

Before: No cleanup mechanism

deps, err := builder.Build(ctx, plan)
if err != nil {
    return err
}
// ... use deps ...
// What if deps acquires resources later? No cleanup contract exists.

Today ApplyDeps holds only values — no file handles, no connections. But tomorrow you might add a database connection, a log file, or a metric exporter. Without a cleanup contract, the caller doesn't know it needs to release resources.

After: Establish the contract now, implement later

type ApplyDeps struct {
    Runner *AuditWorkflow
    Config AssessmentConfig
}

func (d *ApplyDeps) Close() {
    // No-op today. Future resources cleaned up here.
}

The caller:

deps, err := builder.Build(ctx, plan)
if err != nil {
    return err
}
defer deps.Close()
// ... use deps ...

defer deps.Close() runs whether the function succeeds or fails. When you add a file handle to ApplyDeps next month, the cleanup is already wired — you just add d.handle.Close() to the Close() method. No caller changes needed.

The Anti-Pattern: defer in Loops

The deferInLoop gocritic check catches this:

// BAD: defer accumulates — all files stay open until loop ends
for _, path := range files {
    f, err := os.Open(path)
    if err != nil {
        return err
    }
    defer f.Close()  // <-- Runs after the LOOP, not after the iteration

    process(f)
}
// If files has 1000 entries, 1000 file handles are open simultaneously

Fix: extract to a function where defer scopes correctly:

// GOOD: defer scopes to the function, which returns per iteration
for _, path := range files {
    if err := processFile(path); err != nil {
        return err
    }
}

func processFile(path string) error {
    f, err := os.Open(path)
    if err != nil {
        return err
    }
    defer f.Close()  // Runs when processFile returns — per iteration
    return process(f)
}

Or use explicit close when the logic is simple:

for _, path := range files {
    f, err := os.Open(path)
    if err != nil {
        return err
    }
    err = process(f)
    f.Close()
    if err != nil {
        return err
    }
}

When We Added defer vs When We Should Have

Every defer we added during refactoring was a bug we shipped without. The happy path worked. The tests passed. But:

Resource	Happy Path	Panic/Error Path	Signal Path
Spinner stop	Worked	Leaked	Leaked
Signal cleanup	N/A	N/A	Skipped
Temp file remove	Worked	Leaked on some paths	Leaked
Mutex unlock	Worked	Deadlocked on new returns	N/A
Channel close	Worked	Panicked on double-close	Panicked

The pattern is consistent: we tested the success path. The failure path was invisible until a panic happened during development or a user hit Ctrl+C at the wrong moment.

The lesson: write defer on the line after resource acquisition. Not "later." Not "when we add error handling." Immediately. The cost is zero when everything works. The cost of forgetting is a corrupted terminal, a deadlocked process, or a temp directory full of orphaned files.

These 6 defer patterns were added during refactoring of Stave, a Go CLI for offline security evaluation. The deferInLoop gocritic check now prevents the most common defer mistake at CI time.

DEV Community