singleflight in Go: Collapsing Duplicate Work Under Load

Book: The Complete Guide to Go Programming
Also by me: Hexagonal Architecture in Go — the companion book in the Thinking in Go series
My project: Hermes IDE | GitHub — an IDE for developers who ship with Claude Code and other AI coding tools
Me: xgabriel.com | GitHub

A hot key expires in Redis. In the same millisecond, 5,000 in-flight
requests miss the cache and all decide to rebuild it. Every one of
them runs the same expensive query against the same row. The database
goes from bored to on fire, the query gets slower, more requests pile
up behind it, and the cache never gets a chance to refill.

That's a cache stampede. The load isn't higher than usual. The work is
just duplicated 5,000 times over when one call would have served
everyone. Go has a small package for exactly this shape of problem:
golang.org/x/sync/singleflight.

What singleflight does

singleflight.Group guarantees that for a given key, only one
execution of a function runs at a time. Concurrent callers with the
same key wait for that single execution and receive its result.

import (
    "context"

    "golang.org/x/sync/singleflight"
)

var group singleflight.Group

func GetUser(ctx context.Context, id string) (*User, error) {
    v, err, _ := group.Do(id, func() (any, error) {
        return loadUserFromDB(ctx, id)
    })
    if err != nil {
        return nil, err
    }
    return v.(*User), nil
}

Do takes a string key and a function. The first caller for a key
runs the function. Any other caller that arrives with the same key
while that function is still running blocks, then gets handed the same
(value, error) when it finishes. One database read serves the whole
crowd.

The third return value is a bool named shared. It tells you
whether this result was handed to more than one caller:

v, err, shared := group.Do(id, fn)
// shared == true means v went to several waiters at once.

Useful for a metric. If shared is true a lot, you know the
collapsing is doing real work.

Watching the collapse

Here's the behavior made visible. A hundred goroutines call the same
key inside a 50ms window. The underlying function counts how many
times it actually ran.

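package main

import (
    "fmt"
    "sync"
    "sync/atomic"
    "time"

    "golang.org/x/sync/singleflight"
)

func main() {
    var g singleflight.Group
    var calls int64

    // fetch stands in for the expensive load and counts its own executions.
    fetch := func() (any, error) {
        atomic.AddInt64(&calls, 1)
        time.Sleep(50 * time.Millisecond) // slow work
        return "payload", nil
    }

    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            g.Do("user:42", fetch) // every caller uses the same key
        }()
    }
    wg.Wait()

    fmt.Println(atomic.LoadInt64(&calls)) // 1
}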

One hundred goroutines, one execution. That is the whole pitch.

One thing to be clear about: singleflight is not a cache. Do deletes
the key from its internal map the moment the function returns. The next
call for the same key runs the function again. It only collapses calls
that overlap in time. So it sits in front of your cache-load path; it
doesn't replace the cache. The pattern is: check the cache, on a miss
call group.Do to load, then write the result back to the cache.

The shared-result caveat

The sharp edge is in the word shared. Every waiter gets the exact
same value. If that value is a pointer, a slice, or a map, all of them
now hold a reference to the same underlying data.

v, _, _ := group.Do(id, func() (any, error) {
    return loadUserFromDB(ctx, id) // returns *User
})
u := v.(*User)
u.LastSeen = time.Now() // every other caller sees this write

That mutation races with every other goroutine that received the same
*User. It's a classic data race, and it will not show up in a quick
test where one caller wins. It shows up under load, which is the only
time singleflight does anything at all.

Two ways out. Treat the returned value as immutable and never write
through the pointer. Or return a value the callers can own, and copy
before you hand it back:

func GetUser(ctx context.Context, id string) (User, error) {
    v, err, _ := group.Do(id, func() (any, error) {
        u, err := loadUserFromDB(ctx, id)
        return u, err // *User
    })
    if err != nil {
        return User{}, err
    }
    return *v.(*User), nil // copy out, caller owns it
}

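Returning a copy means a caller can mutate its User without touching
anyone else's. Note that *v.(*User) is a shallow copy: slice, map, and
pointer fields inside User still point at shared data and need their
own copy. Whichever rule you pick, write it down next to the code,
because the failure mode is silent.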
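
To make that ownership concrete, here is a small sketch. It assumes a
hypothetical User with a Roles slice (the article never shows User's
fields) and a hypothetical Clone method, and contrasts a plain
dereference with a copy that also duplicates the slice.

```go
package main

import "fmt"

// User is a hypothetical shape chosen for illustration.
type User struct {
	Name  string
	Roles []string
}

// Clone returns a copy that shares no mutable state with the original.
func (u *User) Clone() User {
	c := *u // copies Name and the slice header, not the backing array
	c.Roles = append([]string(nil), u.Roles...) // give the copy its own array
	return c
}

func main() {
	shared := &User{Name: "ada", Roles: []string{"admin"}}

	shallow := *shared
	shallow.Roles[0] = "guest" // writes through to shared.Roles

	deep := shared.Clone()
	deep.Roles[0] = "owner" // leaves shared.Roles alone

	fmt.Println(shared.Roles[0], deep.Roles[0]) // guest owner
}
```

If User holds only scalar fields and strings, the plain dereference is
already enough and Clone is unnecessary.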

DoChan and context

Do blocks. If a caller wants to give up when its request context is
cancelled, use DoChan, which returns a channel instead of blocking:

func GetUser(ctx context.Context, id string) (*User, error) {
    ch := group.DoChan(id, func() (any, error) {
        return loadUserFromDB(ctx, id)
    })
    select {
    case <-ctx.Done():
        return nil, ctx.Err()
    case res := <-ch:
        if res.Err != nil {
            return nil, res.Err
        }
        return res.Val.(*User), nil
    }
}

Now a caller whose context is cancelled returns right away instead of
waiting on the shared work.

There's a subtlety worth knowing. singleflight itself knows nothing
about contexts. The function is a closure, so it runs under the context
of whichever caller triggered the flight. Late arrivals wait on work
that belongs to the first caller. If that first caller's context has a
tight deadline and cancels, the shared load can fail for everyone who
joined it. When the loaded value is meant to be shared across requests,
detach the work from any single request's context. Derive the
function's context from context.WithoutCancel(ctx) (Go 1.21 and later)
or from a background context, and give it its own timeout, so one
impatient caller can't poison the result for the rest.

Forget: don't glue every caller to one failure

Because Do drops the key as soon as the function returns, a failure
never gets cached across sequential calls. The next request after a
failed one starts fresh. So most of the time you don't touch Forget.

Where it earns its place is the in-flight window. Picture a slow load
that takes three seconds and then fails. Every request that arrived
during those three seconds attached to that one call and shares its
error. A transient blip becomes a synchronized failure for a whole
batch of users.

Forget drops a key from the group so the next caller starts a new
execution instead of waiting on the current one:

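group.DoChan(id, func() (any, error) {
    // After a short window, let new callers start
    // their own flight instead of riding this one.
    t := time.AfterFunc(20*time.Millisecond, func() {
        group.Forget(id)
    })
    // Stop the timer if the load finishes first, so a late
    // Forget can't evict a newer flight for the same key.
    defer t.Stop()
    return loadUserFromDB(ctx, id)
})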

Without the Forget, every request that lands during a slow or failing
call rides that single call and inherits its fate. Forgetting the key
after a short window caps the blast radius: a request that shows up at
second two gets its own attempt rather than a doomed one. The
trade-off is that you allow more than one concurrent execution, so you
give back some of the deduplication. Tune the window to the shape of
your load. In a real stampede the duplicates arrive within a few
milliseconds of each other, so a short window still catches them.

Where this belongs

Keep singleflight at the boundary where duplicate work is expensive:
the read-through cache loader, the outbound call to a slow upstream,
the config refresh that a thousand goroutines want at once. It doesn't
belong in your domain logic, and it isn't a substitute for a real
cache with a TTL. It's the thing that stops a cache miss from turning
into a self-inflicted denial of service.

Reach for it when you can point at a specific key that many goroutines
rebuild at the same time. Skip it when calls don't overlap, because
then it adds a map lookup and buys you nothing.

The two pitfalls to remember

The value is shared across every caller, and the context comes from the
first caller. Guard against both and singleflight quietly removes an
entire class of load spike.

If you want the machinery underneath this — how goroutines park and
wake on the internal wait group, why the returned value is shared, and
how interface type assertions behave at the boundary — that's the
runtime and language depth in The Complete Guide to Go Programming.
And when you're deciding which layer singleflight lives in so it stays
out of your domain code, Hexagonal Architecture in Go is the book on
keeping that boundary honest.