David Horvat
Go Internals for Interviews: Concurrency

This is the first article in the Go Internals for Interviews series. The topics in this series are chosen based on real Go technical interviews. The goal is to go beyond surface-level answers and explain how things actually work, so you are prepared for the kinds of “why” and “what happens if” questions interviewers tend to ask.

You might consider yourself a confident developer who fully understands these concepts because you have used them successfully in real projects. I thought the same, until I went through several job interviews. That experience made me realize how many edge cases I had never considered, how many theoretical questions I only partially understood, and how often I had an answer in my head but struggled to explain it clearly to another person.

This article is intended for developers who already have a basic understanding of concurrency in Go and are familiar with using goroutines. It is not a guide on how to use goroutines, but rather a collection of additional theory and insights that go beyond day-to-day usage.

Another reason to read this article, besides interview preparation, is simply to deepen your understanding. You may encounter concepts here that you have not worked with before. Learning about them now can help you avoid subtle bugs, recognize problems before they become production incidents, or even fix mistakes that already exist in your codebase. And of course, there is always the pure satisfaction of learning more, which, at least for me, is reason enough.

I will provide as many code examples as possible. Most of them will be intentionally simple and will not model real-world scenarios, which might feel artificial to some readers. The goal of these examples is not to teach you how to build complete systems, but to help you clearly understand individual concepts without unnecessary distractions. I recommend pasting the code examples into the Go Playground to see what happens when each program is executed.

1) Mental model: what we’re solving

1.1 Parallelism vs Concurrency

To understand concurrency in Go, we first need to understand the difference between the concepts of concurrency and parallelism.

Concurrency refers to structuring a program in such a way that multiple tasks can make progress within the same time period, but not necessarily at the same time. Go enables this by using goroutines and channels.

Parallelism means executing multiple tasks at the same time. For this to be possible in any computing system, multiple CPU cores are required. Go achieves parallelism by scheduling goroutines across multiple CPU cores.

2) Goroutine vs OS thread

In the context of concurrency, a common question in technical interviews is what the difference is between a goroutine and an OS thread, which is why I decided to explain it in depth.

2.1. User space vs kernel space (just for nerds)

If we want to understand the difference between a goroutine and an OS thread at a deeper level, we first need to understand the difference between user space and kernel space. Going into a detailed explanation of what the kernel is would take us too far off track, and I will assume you already have a basic understanding of it. If you are reading this, chances are you are a bit of a nerd, and as such, you probably already know what the kernel is. If, however, you are reading out of curiosity and are not familiar with the concept, we can loosely define the kernel as the part of the operating system responsible for controlling access to the computer’s hardware and managing physical resources such as CPU, memory, and devices.

Modern operating systems divide execution into two broad domains: user space and kernel space.

User space is where normal application code runs. This includes your Go program, the Go runtime, and most libraries. Code running in user space cannot directly access hardware, manage memory mappings, or control CPU scheduling. These operations are restricted for safety and stability.

Kernel space is where the operating system itself runs. The kernel has full access to hardware and is responsible for managing CPU scheduling, memory, devices, filesystems, and system calls. When a program needs the kernel to do something privileged such as reading from disk, sending data over the network, or creating an OS thread, it must cross from user space into kernel space via a system call.

Crossing this boundary is relatively expensive. It involves changing CPU modes, validating permissions, and executing kernel scheduler and bookkeeping logic before returning back to user space.

2.2. OS thread

An OS thread is the smallest unit of execution that the operating system schedules onto CPU cores. Each OS thread has its own stack of fixed size and its own registers.

A register is a very small and very fast memory location inside the CPU itself.

The size of a thread’s stack is determined by the kernel. Moreover, the kernel is responsible for creating, terminating, and performing context switches between OS threads, and these operations are relatively expensive in terms of resources and performance.

To fully understand OS threads and, consequently, how they differ from goroutines, we need to understand why context switching between OS threads is considered expensive. Answering this question requires diving into some computer science fundamentals, but I will try to keep it as concise as possible.

First, it is important to understand how a CPU performs its work and what makes it so fast. A CPU relies on several layers of memory to execute instructions efficiently. The fastest layer is made up of registers, which we have already mentioned. Registers are small, extremely fast memory locations located directly inside the CPU. Above registers, the CPU uses cache memory, which is divided into multiple levels: L1, L2, and L3. These caches store recently accessed data and instructions to reduce the need to access main memory. At the bottom of the hierarchy is RAM, which is significantly slower compared to registers and cache from the CPU’s perspective.

Another key optimization CPUs rely on is branch prediction. The predictor is a hardware mechanism that attempts to predict the future execution path of a thread based on its past behavior. By speculating on which instructions will be needed next, the CPU can keep its execution pipelines full and avoid costly stalls.

Context switching between OS threads is expensive because the operating system must intervene. The kernel has to stop the currently running thread, save its execution state, choose another runnable thread, and restore that thread’s state before execution can continue. During this process, the CPU’s caches and prediction logic are still optimized for the previous thread, so the new thread initially runs with poor cache locality and frequent branch mispredictions until the CPU adapts. In addition, the switch requires entering and leaving the kernel space, which involves scheduler bookkeeping and privilege transitions that add overhead before any useful work resumes.

2.3. Goroutine

Goroutines are the smallest units of execution managed by the Go runtime. Each goroutine has its own stack of variable size.

When the runtime decides to stop executing one goroutine and run another, it performs a goroutine context switch. During this process, the Go scheduler pauses the currently running goroutine and saves its execution state. This state includes the goroutine’s stack pointer, program counter, a small set of registers, and scheduler metadata maintained by the Go runtime. The scheduler then restores the saved state of another goroutine and continues its execution on the same OS thread. From the programmer’s perspective, this looks like multiple goroutines making progress “at the same time,” even though only one of them may be running on a given thread at any instant.

Switching between goroutines is typically much cheaper than switching between OS threads for several reasons. First, goroutine switching usually stays entirely in user space. The Go runtime scheduler makes the decision and performs the switch without invoking the operating system’s scheduler, which avoids expensive user–kernel transitions. Second, the OS thread itself does not change. Because goroutines have their own stacks that are managed by the Go runtime, the scheduler does not need to switch to a different OS thread stack or modify the kernel’s view of the running thread. Finally, goroutine switching avoids many of the heavyweight steps involved in OS thread context switches, such as kernel scheduler bookkeeping, virtual memory context changes, and cache disruption associated with moving execution to a different thread. As a result, Go can efficiently support very large numbers of concurrent goroutines on a small number of OS threads.
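
To get a feel for how lightweight goroutines are, here is a minimal sketch (the count of 100,000 is an arbitrary illustration, and the exact numbers printed will vary per run):

package main

import (
    "fmt"
    "runtime"
    "sync"
)

func main() {
    const n = 100_000 // far more goroutines than any reasonable number of OS threads

    var wg sync.WaitGroup
    wg.Add(n)
    for i := 0; i < n; i++ {
        go func() {
            defer wg.Done()
        }()
    }

    // NumGoroutine reports how many goroutines currently exist.
    fmt.Println("goroutines (approx.):", runtime.NumGoroutine())
    wg.Wait()
    fmt.Println("all goroutines finished on", runtime.NumCPU(), "logical CPUs")
}

Starting the same number of OS threads would be far more expensive, since each thread needs a comparatively large fixed stack and a kernel-level context.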

2.4 GMP model

The GMP model is another topic that is frequently asked about in technical interviews.

To understand how goroutines achieve both high concurrency and efficient parallelism, we need to look at how the Go runtime scheduler works internally. This scheduler is responsible for deciding which goroutines run, when they run, and on which CPU cores.

The Go scheduler is built around the GMP model, which defines three core concepts and how they interact.

At a high level, the goal of the GMP model is to efficiently multiplex a large number of goroutines onto a smaller number of operating-system threads, while still taking advantage of all available CPU cores.

The components of the GMP model are:

  • G (Goroutine)

    A goroutine represents a unit of work: a function that can be scheduled and executed independently. Goroutines are lightweight and cheap to create, which is why Go programs often use thousands or even millions of them.

  • M (Machine)

    An M represents an OS thread. This is the entity that the operating system actually schedules onto CPU cores. An M executes Go code by running goroutines, but it can only do useful work when it has a P.

  • P (Processor)

    A P is a scheduler resource that represents the ability to execute Go code. You can think of it as a logical token or execution context required to run goroutines.

    The number of Ps is controlled by GOMAXPROCS and usually matches the number of logical CPU cores on the machine.

A crucial rule of the Go scheduler is that:

An OS thread (M) must hold a P in order to execute Go code.

This design allows the runtime to control how much Go code runs in parallel, independently of how many OS threads exist.

You may ask why P exists at all. At first glance, it might seem unnecessary: why not schedule goroutines directly onto OS threads? The reason is control.

By introducing P as a separate concept, the Go runtime can:

  • limit parallel execution to a desired level (GOMAXPROCS)
  • efficiently schedule goroutines without constantly involving the kernel
  • quickly move work between OS threads when blocking occurs

This separation is what allows Go to combine lightweight concurrency (many goroutines) with efficient parallelism (bounded by CPU cores).

The blocking syscall scenario (why GMP matters)

The importance of the GMP model becomes clear when a goroutine performs a blocking system call, such as reading from disk or waiting on network I/O.

Suppose a goroutine G is running on an OS thread M, and that thread holds a P. When G makes a blocking system call, the OS thread M becomes blocked inside the kernel and cannot execute any Go code.

If the scheduler did nothing, this would waste a CPU core.

Instead, the Go runtime:

  1. detaches the P from the blocked M
  2. assigns that P to another available or newly created OS thread
  3. continues executing other runnable goroutines

When the blocked system call eventually completes, the original M does not immediately resume running Go code. It must first reacquire a P.

This mechanism allows Go programs to keep making progress even when some threads are blocked on I/O, thereby avoiding wasted CPU cores.

Why you should care about the GMP model

You don’t need to know the GMP model to use goroutines, but understanding it explains many real-world behaviors, such as:

  • why blocking syscalls don’t freeze your entire program
  • why goroutines are cheap but OS threads are not
  • why GOMAXPROCS controls parallelism but not concurrency
  • how Go can run millions of goroutines on a small number of threads

In short, the GMP model is the foundation that allows Go to deliver simple concurrency semantics without sacrificing performance. It is also a very frequent question in technical interviews.

2.5 GOMAXPROCS

GOMAXPROCS is a setting in the Go runtime that controls how much Go code can run in parallel. More precisely, it determines the number of scheduler processors (the P in the GMP model) available to the runtime. Since an OS thread must hold a P in order to execute Go code, GOMAXPROCS effectively sets an upper bound on how many goroutines may execute simultaneously.

By default, GOMAXPROCS is set to the number of logical CPUs available on the machine. This choice reflects the common case: for CPU-bound workloads, allowing one unit of parallel execution per logical CPU usually provides the best balance between throughput and scheduling overhead. Increasing this value does not create more CPU cores, and decreasing it does not eliminate concurrency; it only changes how much parallel execution the runtime allows.

It is important to distinguish parallelism from concurrency when reasoning about GOMAXPROCS. Concurrency refers to the ability to structure a program so that multiple goroutines can make progress over time, while parallelism refers to multiple goroutines actually executing at the same time on different CPU cores. GOMAXPROCS limits parallelism, not concurrency. Even with GOMAXPROCS set to one, a program may have thousands of goroutines that are scheduled, blocked, resumed, and interleaved by the runtime.

When GOMAXPROCS is set to one, the runtime creates a single scheduler processor. Only one goroutine can execute Go code at any instant, even on a multi-core machine. There is no parallel execution of Go code, but concurrency remains fully intact. Goroutines are still preempted, they still block on channels and mutexes, and they are still interleaved over time. Because goroutines can be paused at arbitrary points, unsynchronized access to shared memory can still interleave in unsafe ways. For this reason, setting GOMAXPROCS to one does not eliminate data races, and proper synchronization is still required.

Blocking system calls also behave as expected when GOMAXPROCS is one. If a goroutine performs a blocking operation such as disk or network I/O, the underlying OS thread will block in the kernel. The Go runtime can detach the scheduler processor from that thread and attach it to another OS thread, allowing other goroutines to continue running. This is why a single blocking call does not freeze the entire program, even when parallelism is limited to one.

If GOMAXPROCS is set higher than the number of available logical CPUs, the runtime will allow more goroutines to be runnable in parallel than the hardware can actually execute at once. The operating system will then time-slice OS threads across the available cores. This does not increase true parallelism for CPU-bound workloads, since the hardware is still limited by the number of cores. Instead, it usually increases scheduling overhead and can reduce performance due to more frequent context switches and poorer cache locality. For this reason, setting GOMAXPROCS above the logical CPU count rarely helps and often hurts performance. There are niche scenarios where a slightly higher value can be beneficial, such as programs that spend significant time in blocking native code outside the Go runtime, for example through cgo. In these cases, additional scheduler processors may help keep Go code running while some OS threads are blocked. Such adjustments should be driven by measurement and profiling rather than by default assumptions.
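
As a quick illustration of these points, here is a small sketch that reads the default value and then limits parallelism to one while keeping concurrency intact (the output order will vary):

package main

import (
    "fmt"
    "runtime"
    "sync"
)

func main() {
    // Calling GOMAXPROCS with 0 only reads the current value without changing it.
    fmt.Println("default GOMAXPROCS:", runtime.GOMAXPROCS(0))
    fmt.Println("logical CPUs:", runtime.NumCPU())

    // Limit parallelism to a single P. Concurrency is unaffected:
    // the goroutines below still interleave, they just never run in parallel.
    runtime.GOMAXPROCS(1)

    var wg sync.WaitGroup
    for i := 1; i <= 3; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            for j := 0; j < 3; j++ {
                fmt.Printf("goroutine %d, step %d\n", id, j)
                runtime.Gosched() // yield so the interleaving is visible
            }
        }(i)
    }
    wg.Wait()
}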

3) Goroutine synchronization: `sync.WaitGroup` vs `errgroup.Group`

When working with goroutines, you have almost certainly needed to synchronize them at some point, and you have probably used a sync.WaitGroup, which is often the right tool for the job. However, many developers are either unaware of errgroup.Group or know about it but have never taken the time to explore how it works.

The difference between `sync.WaitGroup` and `errgroup.Group` is a common interview topic. Understanding this distinction is important not only for answering interview questions correctly, but also for choosing the right tool for the right situation in real-world code.

sync.WaitGroup and errgroup.Group solve the same basic problem: start several goroutines and then wait until they all finish. The difference is what they do beyond “wait”.

A sync.WaitGroup is the simplest tool. It only tracks completion. You use it when each goroutine can run independently, you don’t need to return errors, and you don’t need a coordinated cancellation mechanism. It’s also the lowest-overhead and most “direct” option: you increment a counter with Add, each goroutine calls Done, and the caller blocks on Wait. If you already have another way to collect results (for example, writing results into a channel, or updating protected shared state), a WaitGroup is often all you need.

Here is a simple example of using a sync.WaitGroup:

package main

import (
    "fmt"
    "sync"
    "time"
)

func main() {
    var wg sync.WaitGroup

    tasks := []int{1, 2, 3}

    wg.Add(len(tasks))

    for _, t := range tasks {
        go func(task int) {
            defer wg.Done()

            time.Sleep(200 * time.Millisecond)
            fmt.Println("finished task", task)
        }(t)
    }

    // wait until all goroutines call Done()
    wg.Wait()

    fmt.Println("all tasks finished")
}

An errgroup.Group (from golang.org/x/sync/errgroup) is essentially a WaitGroup with an opinionated pattern built in: each goroutine returns an error, and the group collects it. The most common form, errgroup.WithContext, also adds cancellation: when one goroutine returns an error, the context is canceled so the other goroutines can stop early. You use errgroup when failure in one goroutine should affect the others, or when you want a clean, standard way to propagate the first error back to the caller. This is very common in real programs: request fan-out to multiple services, parallel file/network operations, or multi-step pipelines where continuing after an error would waste work.

Here is a simple example of using an errgroup.Group:

package main

import (
    "context"
    "fmt"
    "time"

    "golang.org/x/sync/errgroup"
)

func main() {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    g, ctx := errgroup.WithContext(ctx)

    tasks := []int{1, 2, 3}

    for _, t := range tasks {
        task := t
        g.Go(func() error {
            select {
            case <-ctx.Done():
                // stop early if another goroutine failed
                return ctx.Err()
            default:
            }

            time.Sleep(200 * time.Millisecond)

            if task == 2 {
                return fmt.Errorf("task %d failed", task)
            }

            fmt.Println("finished task", task)
            return nil
        })
    }

    if err := g.Wait(); err != nil {
        fmt.Println("error:", err)
    }
}

Which one to pick comes down to semantics:

Use sync.WaitGroup when:

  • you only care that all goroutines finish (no error reporting needed)
  • goroutines should run to completion even if one fails, or failures are handled internally
  • you want minimal complexity and overhead
  • you’re already coordinating results through channels or shared state guarded by locks

Use errgroup.Group when:

  • goroutines should report errors back to the caller
  • you want to stop the whole operation if any goroutine fails
  • you want cancellation to be part of the structure (especially with WithContext)
  • you’re implementing “fan-out/fan-in” work where partial failure should abort the rest

One subtle but important point: WaitGroup doesn’t provide a built-in way to propagate errors or cancel sibling goroutines. You can build that yourself (error channels, atomic error storage, context cancellation), but errgroup packages that pattern into something standard and less error-prone. In other words, if you find yourself adding “WaitGroup + error channel + cancel context” repeatedly, that’s usually the moment to switch to errgroup.
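
For comparison, here is a rough sketch of what that hand-rolled combination tends to look like when built manually with a WaitGroup, a buffered error channel, and a cancelable context (the task values and the failure on task 2 are arbitrary illustrations); errgroup packages essentially this pattern:

package main

import (
    "context"
    "fmt"
    "sync"
)

func run(ctx context.Context, tasks []int) error {
    ctx, cancel := context.WithCancel(ctx)
    defer cancel()

    var wg sync.WaitGroup
    errCh := make(chan error, len(tasks)) // buffered so failing goroutines never block

    for _, t := range tasks {
        wg.Add(1)
        go func(task int) {
            defer wg.Done()

            select {
            case <-ctx.Done():
                return // another task already failed
            default:
            }

            if task == 2 {
                errCh <- fmt.Errorf("task %d failed", task)
                cancel() // tell the sibling goroutines to stop
                return
            }
            fmt.Println("finished task", task)
        }(t)
    }

    wg.Wait()
    close(errCh)
    return <-errCh // first error, or nil if no goroutine failed
}

func main() {
    fmt.Println("result:", run(context.Background(), []int{1, 2, 3}))
}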

4) Channels: communication and coordination

Channels are a key tool for communication between goroutines or between goroutines and the program that launches them. We distinguish between buffered channels and unbuffered channels.

4.1 Unbuffered vs buffered channel

An unbuffered channel synchronizes goroutines by requiring sending and receiving to happen at the same time. In other words, the sender and receiver must meet.

A buffered channel decouples sender and receiver by allowing a limited number of values to be temporarily stored in an internal queue.

This distinction is important because using the wrong type of channel can seriously degrade performance or leave the program blocked. For example, if we try to send data on an unbuffered channel that has no receiver, the sending goroutine will block forever.

With buffered channels, data is stored in a queue whose size is defined when the channel is created. Once this queue is full, sending new data blocks until the receiver consumes at least one value from the channel.

unbufferedChannel := make(chan int)

bufferedChannel := make(chan int, 10)
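
A small sketch of the difference in blocking behavior (the capacity of 2 is arbitrary): sends into a buffered channel succeed immediately until the buffer is full, while an unbuffered channel would block on the very first send without a receiver.

package main

import "fmt"

func main() {
    ch := make(chan int, 2) // buffered channel with capacity 2

    ch <- 1 // does not block: buffer has room
    ch <- 2 // does not block: buffer is now full
    fmt.Println("len:", len(ch), "cap:", cap(ch))

    // A third send here would block until a receiver drained a value:
    // ch <- 3

    fmt.Println(<-ch) // 1
    fmt.Println(<-ch) // 2
}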

4.2 Closing and receiving: how to do it safely

When receiving data from a channel, it is good practice to check whether the channel is closed. This allows us to determine whether the value we received is actual data or just the zero value of the type.

When reading from a channel, we can actually receive two values. The first is the data itself, and the second is a boolean that is true when the value was delivered by a send and false when the channel is closed and has no more values. If the channel is buffered, all previously stored values are received from the internal queue first. Only after the buffer is emptied do we receive the zero value along with a boolean value of false.

// ok == true: the value was actually sent on the channel
// ok == false: the channel is closed and drained; data is the zero value of the element type
data, ok := <-channel

If we attempt to send data on a closed channel, the program will panic, which is a situation we always want to avoid.

channel := make(chan int)

x := 10

close(channel)

// this will cause panic
channel <- x

When we use channels, it is also important to close them in order to signal that no more data will be sent and to allow proper resource cleanup.

When closing a channel, an important rule must be followed: only the sender is allowed to close the channel. A receiver can never reliably know whether more data will be sent on the channel, which is why it must not close it.

package main

import (
    "fmt"
    "time"
)

func sender(ch chan int) {
    for i := 1; i <= 3; i++ {
        ch <- i
        time.Sleep(500 * time.Millisecond)
    }
    close(ch) // ✅ sender closes the channel
}

func receiver(ch chan int) {
    for value := range ch { // keeps receiving until channel is closed
        fmt.Println("Received:", value)
    }
    fmt.Println("Channel closed, receiver done")
}

func main() {
    ch := make(chan int)

    go sender(ch)
    receiver(ch)
}

In situations where multiple senders are sending data on the same channel, closing the channel must be handled by a dedicated coordinator. Who plays the role of a coordinator is up to you, but it can be a simple goroutine that has full control over the channel’s lifecycle.

package main

import (
    "fmt"
    "sync"
    "time"
)

func producer(id int, out chan<- int, wg *sync.WaitGroup) {
    defer wg.Done()

    for i := 1; i <= 3; i++ {
        time.Sleep(200 * time.Millisecond)
        out <- id*10 + i // e.g. 11,12,13 then 21,22,23 ...
    }
    // ✅ producer does NOT close(out)
}

func main() {
    out := make(chan int)

    var wg sync.WaitGroup
    wg.Add(2)

    go producer(1, out, &wg)
    go producer(2, out, &wg)

    // ✅ dedicated coordinator closes the channel
    go func() {
        wg.Wait()   // wait until all producers finished sending
        close(out)  // safe: nobody will send after Wait() returns
    }()

    // receiver(s) just range until closed
    for v := range out {
        fmt.Println("received:", v)
    }

    fmt.Println("all done")
}


4.3 Select statement

When working with channels, we will often encounter the select statement. The select statement is similar to a switch statement, but instead of ordinary conditions, its cases represent sending data to a channel or receiving data from a channel.

If multiple cases are ready at the same time, select chooses one at random, which helps avoid starvation of any particular channel. Furthermore, once a select statement selects a case and begins executing its associated code, it will run to completion without being interrupted by activity on other channels in the same select.
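
A small sketch that makes the random choice visible (the counts are approximate and vary from run to run):

package main

import "fmt"

func main() {
    a := make(chan string, 1)
    b := make(chan string, 1)

    counts := map[string]int{}

    for i := 0; i < 1000; i++ {
        a <- "a"
        b <- "b"

        // Both cases are ready, so select picks one pseudo-randomly.
        select {
        case v := <-a:
            counts[v]++
        case v := <-b:
            counts[v]++
        }

        // Drain whichever channel was not chosen so the next iteration starts clean.
        select {
        case <-a:
        case <-b:
        }
    }

    fmt.Println(counts) // roughly even split between "a" and "b"
}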

If a select statement has no default case, it blocks execution until at least one of the defined cases becomes ready to run.

select {
case msg := <-ch:
    fmt.Println("received:", msg)
default:
    fmt.Println("no data available")
}

So far we have discussed what happens when we send to or receive from a closed channel, buffered or unbuffered, but we have not defined what happens when we send to or receive from a nil channel.

Sending to a nil channel blocks forever.

Receiving from a nil channel blocks forever.

Closing a nil channel panics.

In a select statement, a case that involves a nil channel is never selectable, because that send or receive can never proceed. So setting a channel variable to nil is a clean way to disable that branch without changing the select structure.

A receive from a closed channel is always ready. That means a select can repeatedly choose a receive case on a closed channel and keep returning the zero value immediately, which causes busy looping if you don’t handle the ok flag or disable the case.

Setting a finished channel to nil acts as a gate, and this gate pattern is exactly the standard fix:

package main

import (
    "fmt"
    "time"
)

func producer(name string, out chan<- int, delay time.Duration) {
    for i := 1; i <= 3; i++ {
        time.Sleep(delay)
        out <- i
    }
    close(out) // sender closes its own channel
    fmt.Println(name, "closed")
}

func main() {
    ch1 := make(chan int)
    ch2 := make(chan int)

    go producer("producer 1", ch1, 300*time.Millisecond)
    go producer("producer 2", ch2, 500*time.Millisecond)

    // Fan-in loop: keep going until both channels are nil
    for ch1 != nil || ch2 != nil {
        select {
        case v, ok := <-ch1:
            if !ok {
                ch1 = nil // disable this case
                continue
            }
            fmt.Println("from ch1:", v)

        case v, ok := <-ch2:
            if !ok {
                ch2 = nil // disable this case
                continue
            }
            fmt.Println("from ch2:", v)
        }
    }

    fmt.Println("all channels closed, done")
}

If you do not set the channel to nil after it closes, that receive case can win the select immediately over and over again, returning the zero value each time. That can starve other cases and peg a CPU core.

5) Getting stuck: deadlocks vs livelocks

5.1 Deadlocks

A deadlock occurs when two or more goroutines wait on each other indefinitely, and none of them can make progress. In this state, execution is completely blocked, no goroutine is running, and the program becomes permanently stuck. This typically happens due to circular waiting on locks or because of blocking channel operations where no sender or receiver is available.

Here is a simple example of a deadlock:

package main

import (
    "fmt"
    "sync"
)

func main() {
    ch1 := make(chan string)
    ch2 := make(chan string)

    var wg sync.WaitGroup
    wg.Add(2)

    // Goroutine A
    go func() {
        defer wg.Done()
        fmt.Println("A: waiting for message from B")
        msg := <-ch1 // blocks forever
        fmt.Println("A got:", msg)

        ch2 <- "reply from A"
    }()

    // Goroutine B
    go func() {
        defer wg.Done()
        fmt.Println("B: waiting for message from A")
        msg := <-ch2 // blocks forever
        fmt.Println("B got:", msg)

        ch1 <- "reply from B"
    }()

    wg.Wait() // main waits, but goroutines are deadlocked
    fmt.Println("main done")
}

This code produces the message “fatal error: all goroutines are asleep - deadlock!”

5.2. Deadlock prevention

Many deadlocks and goroutine leaks come from one core mistake: a goroutine enters a blocking operation without a guaranteed way to get unstuck. The following practices are not abstract rules; each one exists to prevent a very specific class of bugs.

Keep lock acquisition order consistent

If multiple locks exist, all goroutines must acquire them in the same order. Violating this rule easily creates circular waits.

// ❌ deadlock-prone
func f() {
    muA.Lock()
    muB.Lock()
    defer muB.Unlock()
    defer muA.Unlock()
}

func g() {
    muB.Lock()
    muA.Lock()
    defer muA.Unlock()
    defer muB.Unlock()
}

If f holds muA and g holds muB, both will wait forever.

// ✅ consistent order
func f() {
    muA.Lock()
    muB.Lock()
    defer muB.Unlock()
    defer muA.Unlock()
}

func g() {
    muA.Lock()
    muB.Lock()
    defer muB.Unlock()
    defer muA.Unlock()
}

The rule is simple: once an order exists, never violate it.
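
When the two locks are not known statically, for example when transferring between two accounts chosen at runtime, the usual way to keep the order consistent is to derive it from a stable key such as an ID. A sketch under that assumption (the Account type and Transfer function here are illustrative, not from any particular codebase):

package main

import (
    "fmt"
    "sync"
)

type Account struct {
    ID      int
    mu      sync.Mutex
    Balance int
}

// Transfer always locks the account with the smaller ID first, so two
// concurrent transfers between the same pair can never acquire the
// locks in opposite orders.
func Transfer(from, to *Account, amount int) {
    first, second := from, to
    if second.ID < first.ID {
        first, second = second, first
    }

    first.mu.Lock()
    defer first.mu.Unlock()
    second.mu.Lock()
    defer second.mu.Unlock()

    from.Balance -= amount
    to.Balance += amount
}

func main() {
    a := &Account{ID: 1, Balance: 100}
    b := &Account{ID: 2, Balance: 100}

    var wg sync.WaitGroup
    wg.Add(2)
    go func() { defer wg.Done(); Transfer(a, b, 10) }()
    go func() { defer wg.Done(); Transfer(b, a, 20) }()
    wg.Wait()

    fmt.Println(a.Balance, b.Balance) // 110 90
}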

Do not block while holding a lock

Locks should protect shared state, not surround slow or blocking operations. If you hold a lock while doing I/O, waiting on other goroutines, or calling unknown code, you’re effectively preventing all other goroutines from accessing that protected state until the blocking operation completes. Under load, this often turns into “everything is stuck waiting for one slow thing.”

A very common mistake is holding a mutex while performing a network call.

package main

import (
    "fmt"
    "sync"
    "time"
)

type Client struct {
    mu    sync.Mutex
    token string
}

func (c *Client) Fetch() {
    c.mu.Lock()
    defer c.mu.Unlock()

    // simulate slow network call
    time.Sleep(2 * time.Second)

    fmt.Println("using token:", c.token)
}

func main() {
    c := &Client{token: "secret"}

    // one goroutine does a slow fetch
    go c.Fetch()

    // measure how much time is wasted by blocking unnecessarily
    t1 := time.Now()

    // another goroutine wants to update the token
    time.Sleep(100 * time.Millisecond)
    go func() {
        c.mu.Lock()
        c.token = "new-token"
        c.mu.Unlock()
        fmt.Println("token updated")

        // prints out wasted time
        fmt.Println("time passed: ", time.Since(t1))
    }()

    time.Sleep(2 * time.Second)
}

This code “works,” but it serializes unrelated operations behind the lock. If one request stalls, every other goroutine that needs token (or any other protected state in this struct) is blocked behind it. In real systems, this is a classic source of stalls and cascading timeouts.

The fix is to keep the critical section tiny: copy the shared state you need, unlock, and only then do the slow work.

package main

import (
    "fmt"
    "sync"
    "time"
)

type Client struct {
    mu    sync.Mutex
    token string
}

func (c *Client) Fetch() {
    // lock only to copy shared state
    c.mu.Lock()
    token := c.token
    c.mu.Unlock()

    // simulate slow I/O without holding the lock
    time.Sleep(1 * time.Second)

    fmt.Println("using token:", token)
}

func main() {
    c := &Client{token: "secret"}

    go c.Fetch()

    // measure how much time is saved by not blocking unnecessarily
    t1 := time.Now()

    time.Sleep(100 * time.Millisecond)
    go func() {
        c.mu.Lock()
        c.token = "new-token"
        c.mu.Unlock()
        fmt.Println("token updated")
        // prints out the proof that time is saved
        fmt.Println("time passed: ", time.Since(t1))
    }()

    time.Sleep(2 * time.Second)
}


Now other goroutines can still read/update c.token while the request is in flight. The mutex protects only what it should: the shared state, not the slow operation.

The same rule applies to disk I/O, WaitGroup.Wait, channel sends/receives, and callbacks you don’t control. In all of these cases, the safe pattern is: lock, touch shared state, unlock — then block or do slow work.

Make channel protocols explicit

Channels are safe only when their communication protocol is clearly defined. For every channel, it must be unambiguous which goroutines are allowed to send, which goroutines are allowed to receive, and which goroutine is responsible for closing the channel. If these roles are not explicit, blocking behavior becomes unpredictable and deadlocks are easy to introduce.

A common failure mode is symmetric waiting, where both sides expect the other to initiate communication.

package main

import (
    "fmt"
)

func worker(ch chan int) {
    fmt.Println("worker: waiting for start signal")
    <-ch // waits for a value
    fmt.Println("worker: working")
}

func main() {
    ch := make(chan int)

    go worker(ch)

    fmt.Println("main: waiting for worker")
    <-ch // waits for a value

    fmt.Println("main: done")
}

Both goroutines are waiting to receive, and no goroutine ever sends. The channel itself is correct; the protocol is not. Neither side has been assigned responsibility for initiating communication and the program blocks forever.

The same code becomes correct once the protocol is made explicit.

package main

import (
    "fmt"
    "time"
)

func worker(ch chan int) {
    fmt.Println("worker: waiting for start signal")
    <-ch // waits for start signal
    fmt.Println("worker: working")
}

func main() {
    ch := make(chan int)

    go worker(ch)

    time.Sleep(100 * time.Millisecond) // just to make output order clearer

    fmt.Println("main: sending start signal")
    ch <- 1 // main initiates

    time.Sleep(100 * time.Millisecond)
}

Now the roles are clear: the main goroutine sends the start signal, and the worker receives it. There is no ambiguity about who acts first.

Closing rules are part of the protocol as well. Only the goroutine that owns sending on a channel may close it. Receivers must never close a channel, because they cannot know whether more values will be sent.

package main

import (
    "fmt"
)

func producer(out chan<- int) {
    for i := 0; i < 3; i++ {
        out <- i
    }
    close(out) // ✅ sender closes the channel
}

func consumer(in <-chan int) {
    for v := range in { // stops automatically when channel is closed
        fmt.Println("received:", v)
    }
    fmt.Println("consumer done")
}

func main() {
    ch := make(chan int)

    go producer(ch)
    consumer(ch)
}

Here the protocol is complete: the producer sends and closes, and the consumer only receives. Because ownership is clear, the blocking behavior is safe and predictable.

In practice, correct channel usage is less about the mechanics of send and receive and more about defining and respecting a simple protocol. If you cannot clearly answer who sends, who receives, and who closes a channel, the design is incomplete and likely incorrect.

Use close or context for broadcast signals

Sending a single value wakes only one receiver and can block. Closing a channel or canceling a context wakes everyone and never blocks.

// ❌ send-based signaling
done <- struct{}{} // blocks if no receiver

// ✅ close-based signaling
close(done) // never blocks, wakes all receivers

This is why context.Context works so well for cancellation: it is a broadcast mechanism.
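
Here is a minimal sketch of that broadcast behavior: one cancel call wakes every goroutine waiting on the same context.

package main

import (
    "context"
    "fmt"
    "sync"
    "time"
)

func main() {
    ctx, cancel := context.WithCancel(context.Background())

    var wg sync.WaitGroup
    for i := 1; i <= 3; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            <-ctx.Done() // every goroutine observes the same closed channel
            fmt.Println("worker", id, "stopping:", ctx.Err())
        }(i)
    }

    time.Sleep(100 * time.Millisecond) // let the workers start waiting
    cancel()                           // one call wakes all of them
    wg.Wait()
}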

Add cancellation for potentially infinite blocks

Any goroutine that blocks indefinitely must have a cancellation path.

// ❌ no escape hatch
job := <-jobs
process(job)

If no job ever arrives, the goroutine leaks.

// ✅ cancellable block
select {
case job := <-jobs:
    process(job)
case <-ctx.Done():
    return
}

This pattern guarantees that blocking can always end.

Avoid locking the same mutex twice

Go’s sync.Mutex is not re-entrant. A goroutine that already holds a mutex cannot lock it again. Attempting to do so causes a self-deadlock, because the goroutine blocks waiting for a lock that it itself is holding.

This often happens indirectly through nested function calls.

package main

import (
    "fmt"
    "sync"
)

var mu sync.Mutex

func doWork() {
    fmt.Println("doWork: trying to lock")
    mu.Lock() // ❌ second lock by same goroutine
    defer mu.Unlock()

    fmt.Println("doWork: work done")
}

func handle() {
    fmt.Println("handle: locking")
    mu.Lock()
    defer mu.Unlock()

    doWork() // ❌ deadlock here
    fmt.Println("handle: done")
}

func main() {
    handle()
}

At runtime, handle acquires the lock and then calls doWork. Inside doWork, the second mu.Lock() blocks forever, because the mutex is already held by the same goroutine and will never be released.

The solution is not to “unlock earlier” or to add flags, but to restructure the code so that locking happens at a single, well-defined level. A common pattern is to split functions into locked and unlocked variants.

package main

import (
    "fmt"
    "sync"
)

var mu sync.Mutex

// doWorkLocked assumes the caller already holds mu.
// It must NOT lock mu itself.
func doWorkLocked() {
    fmt.Println("doWorkLocked: work done")
}

func handle() {
    fmt.Println("handle: locking")
    mu.Lock()
    defer mu.Unlock()

    // ✅ call the locked helper to avoid re-locking the same mutex
    doWorkLocked()
    fmt.Println("handle: done")
}

func main() {
    handle()
}

With this structure, ownership of the lock is explicit. Functions that require the lock assume it is already held, and functions that acquire the lock do not call back into code that locks again. This avoids self-deadlocks and makes synchronization boundaries clear and easy to reason about.

Centralize ownership and synchronization

Deadlocks and subtle blocking bugs often appear when shared state has no clear owner. If many goroutines are allowed to read and modify the same data directly, synchronization logic becomes scattered across the codebase. This makes it hard to reason about lock ordering, increases the risk of circular waits, and makes changes dangerous because every access must be audited for correctness.

package main

import (
    "fmt"
    "sync"
)

type State struct {
    x int
}

func main() {
    var (
        mu    sync.Mutex
        state State
        wg    sync.WaitGroup
    )

    // multiple goroutines mutate shared state
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()

            mu.Lock()
            state.x++
            mu.Unlock()
        }()
    }

    wg.Wait()
    fmt.Println("final value:", state.x)
}

In this style, correctness depends on every goroutine remembering to take the same lock, in the same order, for every access. As the codebase grows, this quickly becomes fragile.

A safer and often simpler alternative is to give ownership of the state to a single goroutine and require all modifications to go through it.

package main

import (
    "fmt"
    "sync"
)

type update struct {
    delta int
}

func owner(updates <-chan update, done chan<- int) {
    x := 0
    for u := range updates {
        x += u.delta
    }
    done <- x // report final value
}

func main() {
    updates := make(chan update)
    done := make(chan int)

    go owner(updates, done)

    var wg sync.WaitGroup

    // multiple goroutines send update requests
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            updates <- update{delta: 1}
        }()
    }

    wg.Wait()
    close(updates) // no more updates

    final := <-done
    fmt.Println("final value:", final)
}

Here, only the owner goroutine is allowed to touch x. Other goroutines do not manipulate the state directly; instead, they send update requests through a channel. This eliminates the need for locks entirely and makes blocking behavior explicit: the only way to affect the state is by sending a message, which either succeeds or blocks in a well-defined place.

With clear ownership, synchronization becomes centralized, invariants are easier to maintain, and the risk of deadlocks caused by competing access paths is greatly reduced.

5.3 Livelocks

A livelock occurs when goroutines are actively running but continuously reacting to each other in a way that prevents any real progress. In this state, goroutines are not blocked and keep executing, yet no useful work is accomplished. This usually happens due to overly aggressive retry logic or “polite algorithms,” where goroutines repeatedly yield to one another.

Example:

package main

import (
    "fmt"
    "sync"
)

func worker(name string, ch chan struct{}, wg *sync.WaitGroup) {
    defer wg.Done()

    for {
        select {
        case <-ch:
            // be polite: give the turn back
            fmt.Println(name, "received signal, yielding back")
            ch <- struct{}{}
        default:
            // do nothing, keep checking
        }
    }
}

func main() {
    ch := make(chan struct{})
    var wg sync.WaitGroup

    wg.Add(2)

    go worker("A", ch, &wg)
    go worker("B", ch, &wg)

    // start the interaction
    ch <- struct{}{}

    wg.Wait()
}

5.4 Livelock prevention

One of the most common causes of livelocks in Go is the misuse of non-blocking select statements.

Consider the following example.

package main

import (
    "fmt"
    "time"
)

func spinner(jobs <-chan int) {
    for {
        select {
        case j := <-jobs:
            fmt.Println("got job:", j)
            return
        default:
            // ❌ nothing to do, but we keep looping
            // this goroutine stays runnable and burns CPU
        }
    }
}

func main() {
    jobs := make(chan int)

    go spinner(jobs)

    // job arrives later
    time.Sleep(500 * time.Millisecond)
    jobs <- 42

    time.Sleep(100 * time.Millisecond)
}

Because the select contains a default case, it never blocks. When no job is available, the goroutine immediately falls through the default branch and loops again. This goroutine remains runnable at all times, repeatedly checking the channel and consuming CPU cycles while doing no useful work. Under load, many such goroutines can saturate CPU cores and starve productive work.

The core problem here is not the channel, but the retry behavior: the goroutine retries immediately and indefinitely instead of waiting.

The simplest fix is to block until a real event occurs.

package main

import (
    "fmt"
    "time"
)

func worker(jobs <-chan int) {
    fmt.Println("worker: waiting for job")
    j := <-jobs // ✅ blocks until a real event happens
    fmt.Println("worker: got job:", j)
}

func main() {
    jobs := make(chan int)

    go worker(jobs)

    // simulate delay before work is available
    time.Sleep(500 * time.Millisecond)

    fmt.Println("main: sending job")
    jobs <- 42

    time.Sleep(100 * time.Millisecond)
}

By removing the busy loop and allowing the goroutine to block on the receive, the scheduler can park the goroutine efficiently. While blocked, it consumes no CPU and resumes only when there is actual work to do. This eliminates the livelock entirely.

In more realistic cases, a goroutine must wait for multiple events, such as incoming work or a shutdown signal. The correct solution is not to add a default, but to design the select so that it blocks when nothing is ready.

package main

import (
    "fmt"
    "time"
)

func worker(jobs <-chan int, done <-chan struct{}) {
    for {
        select {
        case j := <-jobs:
            fmt.Println("got job:", j)
        case <-done:
            fmt.Println("stopping")
            return
        }
    }
}

func main() {
    jobs := make(chan int)
    done := make(chan struct{})

    go worker(jobs, done)

    // send some work
    jobs <- 1
    jobs <- 2

    // signal cancellation
    close(done)

    time.Sleep(100 * time.Millisecond)
}

Here, the absence of a default case is crucial. When neither jobs nor done is ready, the select blocks. The goroutine is parked by the scheduler and wakes only when meaningful progress can be made. This avoids both CPU spinning and livelock behavior.

The general techniques for preventing livelocks all follow the same principle: ensure that progress is guaranteed rather than endlessly retried.

Retry loops must slow down and eventually stop

This is about the most common livelock ingredient: a goroutine keeps retrying an operation even though nothing changes between attempts. If retries happen immediately, they create constant contention and keep the goroutine runnable. If retries never stop, the system can thrash forever under load.

package main

import (
    "fmt"
    "math/rand"
    "time"
)

func tryOperation() bool {
    // pretend the operation is failing most of the time
    return rand.Intn(10) == 0
}

func main() {
    rand.Seed(time.Now().UnixNano())

    deadline := time.Now().Add(700 * time.Millisecond) // progress guarantee
    backoff := 20 * time.Millisecond                    // bounded backoff base

    for attempt := 1; ; attempt++ {
        if tryOperation() {
            fmt.Println("success on attempt", attempt)
            return
        }

        if time.Now().After(deadline) {
            fmt.Println("fallback: giving up after", attempt, "attempts")
            return
        }

        // bounded backoff + jitter
        jitter := time.Duration(rand.Intn(20)) * time.Millisecond
        sleep := backoff + jitter
        fmt.Println("attempt", attempt, "failed; sleeping", sleep)

        time.Sleep(sleep)

        if backoff < 200*time.Millisecond {
            backoff *= 2
        }
    }
}

Three things make this livelock-resistant. The sleep introduces breathing room so the goroutine isn’t constantly runnable, the jitter prevents multiple goroutines from waking up and colliding in lockstep, and the deadline guarantees the loop won’t retry forever (it either succeeds or exits into a fallback path).

Blocking beats spin-and-check

This is about the “goroutine stays runnable without a reason to run” pattern. The easiest way to create a livelock is to poll repeatedly: check if work exists, and if not, loop again immediately. In Go, the classic foot-gun is select with a default, because that makes the loop non-blocking.

This example runs two goroutines. spinner uses select { default: } and spins until a job arrives. blocker simply blocks on a receive. Both get a job after a delay, but the spinner burns CPU in the meantime.

package main

import (
    "fmt"
    "time"
)

func spinner(jobs <-chan int) {
    spins := 0
    for {
        select {
        case j := <-jobs:
            fmt.Println("spinner got job:", j, "spins:", spins)
            return
        default:
            spins++ // ❌ stays runnable and burns CPU
        }
    }
}

func blocker(jobs <-chan int) {
    j := <-jobs // ✅ blocks efficiently
    fmt.Println("blocker got job:", j)
}

func main() {
    jobs1 := make(chan int)
    jobs2 := make(chan int)

    go spinner(jobs1)
    go blocker(jobs2)

    time.Sleep(200 * time.Millisecond)
    jobs1 <- 1
    jobs2 <- 2

    time.Sleep(50 * time.Millisecond)
}

The fix is not “make the loop smarter,” it’s “don’t loop when there is nothing to do.” The blocking receive parks the goroutine inside the scheduler until a real event happens. The spinner keeps the goroutine runnable and wastes cycles. Removing the default (or using a direct receive) is what eliminates the livelock-style behavior.

Break symmetry when goroutines compete

This is about repeated collisions. Sometimes goroutines are actively competing for a resource, but they do so in a perfectly symmetric way. If both sides retry with the same timing, they can repeatedly collide and make no progress even though they are “busy.”

We already mentioned jitter and how it can break symmetry, but in this case we will use another tool at our disposal: a tie-breaker. A tie-breaker is any rule that decides who proceeds when two or more contenders are otherwise in an identical situation.

package main

import (
    "fmt"
    "time"
)

func worker(name string, want, otherWant chan struct{}, priority bool) {
    for {
        want <- struct{}{}

        select {
        case <-otherWant:
            if priority {
                // I win ties
                fmt.Println(name, "enters critical section")
                return
            }
            fmt.Println(name, "backs off")
            time.Sleep(10 * time.Millisecond)
        default:
            fmt.Println(name, "enters critical section")
            return
        }
    }
}

func main() {
    a := make(chan struct{}, 1)
    b := make(chan struct{}, 1)

    go worker("A", a, b, true)  // A has priority
    go worker("B", b, a, false)

    time.Sleep(200 * time.Millisecond)
}

What “fixes” the livelock here is the deterministic rule that breaks symmetry. Without some kind of ordering or priority, both goroutines can keep attempting in the same rhythm and colliding repeatedly. A tie-breaker ensures at least one goroutine is allowed to proceed instead of both continually “trying at the same time.”

Centralize arbitration when contention grows

This is about what to do when contention becomes too complex for “a few locks and retries” to stay understandable. When many goroutines can touch the same state, the system can end up with scattered locking, retry loops, and hard-to-debug interactions. A coordinator is a design move: it creates one obvious place where the decision happens.

package main

import (
    "fmt"
    "sync"
)

type update struct {
    delta int
}

func owner(updates <-chan update, done chan<- int) {
    x := 0
    for u := range updates {
        x += u.delta
    }
    done <- x
}

func main() {
    updates := make(chan update)
    done := make(chan int)

    go owner(updates, done)

    var wg sync.WaitGroup
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            updates <- update{delta: 1}
        }()
    }

    wg.Wait()
    close(updates)

    fmt.Println("final x:", <-done)
}

The “fix” is that contention is no longer distributed across many goroutines. Only the owner goroutine mutates x, so there are no competing access paths, no lock ordering issues, and far fewer opportunities for livelock-style thrashing. The only blocking point is explicit: sending to updates (which is exactly where you want backpressure and coordination to happen).

The key lesson is that livelocks are rarely caused by “too much concurrency,” but by goroutines that stay runnable without a reason to run. Blocking when there is no work is not a weakness; it is what allows the scheduler to make the entire system efficient and fair.

6) Data races and correctness models

A data race occurs when two or more goroutines access the same memory location concurrently without synchronization, and at least one of them performs a write. This can lead to incorrect results and subtle bugs in the program.

What makes data races particularly dangerous is the fact that they often do not appear consistently. They may occur only occasionally, under higher load, or completely at random. Because of this unpredictability, data races are extremely difficult to reproduce and debug, making them one of the most problematic bugs in concurrent programs.

Simple example:

package main

import (
    "fmt"
    "time"
)

func main() {
    counter := 0

    go func() {
        for i := 0; i < 1000; i++ {
            counter++ // ❌ write
        }
    }()

    go func() {
        for i := 0; i < 1000; i++ {
            counter++ // ❌ write
        }
    }()

    time.Sleep(1 * time.Second)
    fmt.Println(counter)
}

In this case, we cannot guarantee what will be printed.

Here is an example closer to the real world:

package main

import (
    "fmt"
    "strings"
    "sync"
)

func main() {
    logs := []string{
        "INFO user logged in",
        "ERROR database timeout",
        "WARN disk almost full",
        "INFO request completed",
        "ERROR failed to write file",
    }

    counts := map[string]int{} // ❌ shared mutable state

    var wg sync.WaitGroup

    for _, line := range logs {
        wg.Add(1)
        go func(l string) {
            defer wg.Done()

            level := strings.Split(l, " ")[0]
            counts[level]++ // ❌ concurrent map write
        }(line)
    }

    wg.Wait()
    fmt.Println("Log counts:", counts)
}

To avoid these scenarios, we use channels or mutexes. Go also ships a race detector: building, running, or testing with the -race flag reports many data races at runtime, which makes them far easier to find.

6.1 Mutexes: protecting shared state

A mutex guarantees exclusive access: only one goroutine at a time can hold the lock and touch the state it protects. It also establishes a happens-before relationship, so writes made while holding the lock are visible to the next goroutine that acquires it.

The following example shows how a mutex fixes the previous data race:

package main

import (
    "fmt"
    "strings"
    "sync"
)

func main() {
    logs := []string{
        "INFO user logged in",
        "ERROR database timeout",
        "WARN disk almost full",
        "INFO request completed",
        "ERROR failed to write file",
    }

    counts := map[string]int{}
    var mu sync.Mutex
    var wg sync.WaitGroup

    for _, line := range logs {
        wg.Add(1)
        go func(l string) {
            defer wg.Done()

            level := strings.Split(l, " ")[0]

            mu.Lock()
            counts[level]++
            mu.Unlock()
        }(line)
    }

    wg.Wait()
    fmt.Println("Log counts:", counts)
}

6.2 Channels: ownership and happens-before via send/receive

With channels, a send happens before the corresponding receive, which guarantees memory visibility between goroutines.

Here is a fix for the same data race using channels:

package main

import (
    "fmt"
    "strings"
    "sync"
)

func main() {
    logs := []string{
        "INFO user logged in",
        "ERROR database timeout",
        "WARN disk almost full",
        "INFO request completed",
        "ERROR failed to write file",
    }

    levels := make(chan string)
    var wg sync.WaitGroup

    // workers
    for _, line := range logs {
        wg.Add(1)
        go func(l string) {
            defer wg.Done()
            levels <- strings.Split(l, " ")[0]
        }(line)
    }

    // coordinator closes channel
    go func() {
        wg.Wait()
        close(levels)
    }()

    // main goroutine owns the map
    counts := map[string]int{}
    for level := range levels {
        counts[level]++
    }

    fmt.Println("Log counts:", counts)
}

6.3 When a mutex is more suitable than channels, and vice versa

Mutexes are a better fit when goroutines need coordinated access to shared in-memory state. The following cases highlight situations where using channels would add complexity or overhead without improving correctness.

Protecting shared state (in-memory data structures)

When multiple goroutines need to read and update the same data structure in memory, a mutex is often the clearest and most efficient solution. The shared state remains local, and the synchronization logic stays close to the data it protects.

package main

import (
    "fmt"
    "sync"
    "time"
)

type Cache struct {
    mu   sync.RWMutex
    data map[string]int
}

func NewCache() *Cache {
    return &Cache{
        data: make(map[string]int),
    }
}

func (c *Cache) Get(key string) int {
    c.mu.RLock()
    defer c.mu.RUnlock()
    return c.data[key]
}

func (c *Cache) Increment(key string) {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.data[key]++
}

func main() {
    cache := NewCache()
    var wg sync.WaitGroup

    // writer goroutine
    wg.Add(1)
    go func() {
        defer wg.Done()
        for i := 0; i < 5; i++ {
            cache.Increment("hits")
            fmt.Println("writer incremented")
            time.Sleep(100 * time.Millisecond)
        }
    }()

    // multiple readers
    for r := 0; r < 3; r++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            for i := 0; i < 5; i++ {
                v := cache.Get("hits")
                fmt.Printf("reader %d read value %d\n", id, v)
                time.Sleep(50 * time.Millisecond)
            }
        }(r)
    }

    wg.Wait()
    fmt.Println("final value:", cache.Get("hits"))
}

Here, the mutex directly protects the data map. Each operation clearly defines when shared state is accessed and ensures exclusive access when needed. Using channels instead would require routing every read and write through a single goroutine acting as a “map owner,” turning simple map access into a request–response protocol. That approach would be harder to read, harder to maintain, and usually slower for this kind of workload.
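
For comparison, here is a minimal sketch of what the channel-based alternative would look like: a single "owner" goroutine serializes all access to the map through request channels. The getReq type and the channel names are illustrative, not part of the original example.

package main

import "fmt"

// getReq is a hypothetical read request: the caller provides a key and a
// channel on which the owner goroutine replies with the current count.
type getReq struct {
    key   string
    reply chan int
}

// mapOwner is the only goroutine that ever touches the map.
func mapOwner(incs <-chan string, gets <-chan getReq, done <-chan struct{}) {
    data := make(map[string]int)
    for {
        select {
        case key := <-incs:
            data[key]++
        case req := <-gets:
            req.reply <- data[req.key]
        case <-done:
            return
        }
    }
}

func main() {
    incs := make(chan string)
    gets := make(chan getReq)
    done := make(chan struct{})

    go mapOwner(incs, gets, done)

    incs <- "hits"
    incs <- "hits"

    reply := make(chan int)
    gets <- getReq{key: "hits", reply: reply}
    fmt.Println("hits:", <-reply)

    close(done)
}

Even this simplified version needs a request type, a reply channel for every read, and extra plumbing for shutdown, which is exactly the overhead the mutex version avoids.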

Many readers, few writers (sync.RWMutex)

When a data structure is read frequently but written infrequently, sync.RWMutex allows multiple readers to proceed concurrently while still ensuring exclusive access for writers.

type Store struct {
    mu sync.RWMutex
    m  map[string]int
}

func (s *Store) Get(k string) int {
    s.mu.RLock()
    defer s.mu.RUnlock()
    return s.m[k]
}

func (s *Store) Set(k string, v int) {
    s.mu.Lock()
    s.m[k] = v
    s.mu.Unlock()
}

This pattern works well because readers do not block each other, and writes are rare enough that the exclusive lock does not become a bottleneck. Achieving the same behavior with channels would require serializing all access through one goroutine, eliminating parallel reads entirely and reducing throughput for read-heavy workloads.

Atomic multi-step updates and invariants

Some operations require multiple reads and writes to shared state that must appear atomic as a group. These operations often enforce invariants that must never be observed in a partially updated state.

package main

import (
    "fmt"
    "sync"
)

type Account struct {
    mu      sync.Mutex
    balance int
    limit   int
}

func (a *Account) Withdraw(x int) bool {
    a.mu.Lock()
    defer a.mu.Unlock()

    if a.balance-x < -a.limit {
        return false
    }

    a.balance -= x
    return true
}

func main() {
    a := &Account{
        balance: 100,
        limit:   50,
    }

    var wg sync.WaitGroup

    for i := 0; i < 3; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            ok := a.Withdraw(60)
            fmt.Println("goroutine", id, "withdraw ok:", ok)
        }(i)
    }

    wg.Wait()
    fmt.Println("final balance:", a.balance)
}

In this example, the check and the update must be performed together. A mutex makes this guarantee explicit: while the lock is held, no other goroutine can observe or modify the account’s state. Implementing this with channels would require sending requests to a dedicated goroutine and encoding the invariant logic into message handling, which obscures the intent and complicates the code without adding safety.

Performance-sensitive hot paths with low contention

In hot paths where operations are frequent and contention is low, mutexes are often the most efficient option. They introduce minimal overhead and avoid extra allocations or scheduling.

type Metrics struct {
    mu    sync.Mutex
    hits  int
}

func (m *Metrics) Hit() {
    m.mu.Lock()
    m.hits++
    m.mu.Unlock()
}

This code performs a tiny critical section with a single increment. Using a channel here would require sending a message, possibly blocking, and waking another goroutine to process it. That adds unnecessary latency and overhead compared to a simple lock around a few instructions.

6.4 When channels are more suitable than mutexes

Mutexes are great for protecting shared memory, but channels shine when the problem is fundamentally about communication: moving data between independent stages, distributing work, or signaling events. In these designs, channels make the flow of the program explicit. Instead of multiple goroutines reaching into shared state, goroutines communicate by sending values, which often leads to clearer lifecycles and fewer accidental races.

Pipelines: staged processing

A pipeline is a series of stages where each stage performs one transformation and passes the result downstream. Channels are a natural fit here because they let each stage run independently while staying loosely coupled to the next stage.

In the example below, generate produces numbers and closes its output channel when finished. square consumes numbers from its input channel, transforms them, and closes its own output channel when the input is closed. This structure is idiomatic because it makes shutdown automatic: closing the upstream channel causes downstream stages to finish naturally.

func generate(nums []int) <-chan int {
    out := make(chan int)
    go func() {
        for _, n := range nums {
            out <- n
        }
        close(out)
    }()
    return out
}

func square(in <-chan int) <-chan int {
    out := make(chan int)
    go func() {
        for n := range in {
            out <- n * n
        }
        close(out)
    }()
    return out
}

func main() {
    nums := generate([]int{1, 2, 3, 4})
    squares := square(nums)

    for v := range squares {
        fmt.Println(v)
    }
}

Two details are worth noticing. First, each stage owns and closes its own output channel, which cleanly communicates “no more values are coming.” Second, no shared state exists between the stages, so there is no need for locks and no risk of concurrent mutation bugs. Each goroutine has a single responsibility: receive, transform, send.

This pattern scales naturally to more stages (filtering, batching, encoding, writing, etc.) and is one of the clearest cases where channels are simpler than mutexes.
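
To make that concrete, here is a minimal extra stage that keeps only even squares; the filterEven name is mine, not part of the original example. It plugs in between square and the final loop without changing either of them:

func filterEven(in <-chan int) <-chan int {
    out := make(chan int)
    go func() {
        for n := range in {
            if n%2 == 0 {
                out <- n
            }
        }
        close(out) // same rule: this stage owns and closes its output
    }()
    return out
}

With this stage, the main loop simply ranges over filterEven(square(generate(...))) and still terminates automatically when the upstream channels close.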

Work queues: worker pools and backpressure

Channels are also an excellent fit for work queues, where one part of the program produces tasks and multiple goroutines consume them. A buffered channel can act as a bounded queue: it naturally applies backpressure because sending blocks when the buffer is full. This is a major advantage over ad-hoc shared slices guarded by mutexes, where you must implement queueing, waking, and capacity limits yourself.

We’ll cover worker pools and backpressure in detail later (section 8), but the key idea is that channels can represent the queue directly, and their blocking behavior becomes a built-in flow-control mechanism.

Signaling: stop/done events and timeouts with select

Channels are not only for data; they are also one of Go’s cleanest signaling mechanisms. A closed channel can broadcast “we’re done” to any number of listeners, and a select statement makes it straightforward to combine “do work” with “stop when signaled” or “stop after a timeout.”

This becomes especially valuable for cancellation and shutdown: instead of relying on shared boolean flags guarded by locks, goroutines can wait on a done channel (or ctx.Done()), ensuring that blocking operations always have a way to unblock.

We’ll cover these signaling patterns in more detail later, but the takeaway is simple: when the goal is coordination through events rather than protecting a shared memory structure, channels tend to be clearer and more idiomatic than mutexes.
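
As a small preview, here is a minimal runnable sketch of that shape: a goroutine combines receiving work, reacting to a stop signal, and giving up after a timeout in a single select. The names and durations are illustrative.

package main

import (
    "fmt"
    "time"
)

// consume handles "do work", "stop when signaled" and "stop after a timeout"
// in one select.
func consume(values <-chan int, done <-chan struct{}) {
    for {
        select {
        case v, ok := <-values:
            if !ok {
                return // channel closed: no more data
            }
            fmt.Println("got:", v)
        case <-done:
            fmt.Println("stopped by signal")
            return
        case <-time.After(500 * time.Millisecond):
            fmt.Println("stopped by timeout")
            return
        }
    }
}

func main() {
    values := make(chan int)
    done := make(chan struct{})

    go consume(values, done)

    values <- 1
    values <- 2
    time.Sleep(time.Second) // no more values: consume exits via the timeout branch
}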

7) Goroutine leaks (lifetime bugs)

When we talk about potential bugs caused by improper use of goroutines, we are not only risking data inconsistency or program panics, but also careless memory usage and exhaustion of system resources. This situation is known as a goroutine leak.

A goroutine leak occurs when a goroutine remains blocked or continues running indefinitely because it has no way to complete its work or be cancelled, even though its result is no longer needed.

7.1 Common leak patterns

Most goroutine leaks come from the same root cause: a goroutine enters a blocking operation without a guaranteed way to unblock.

Blocked on channel send

A send blocks until some goroutine receives from the channel (for unbuffered channels), or until there is space in the buffer (for buffered channels). If no receiver ever arrives, the sender is stuck forever.

ch <- value // no receiver, ever

This often happens when the receiver exits early due to an error or timeout, but the sender continues and tries to report a result.

Blocked on channel receive

A receive blocks until some goroutine sends, or until the channel is closed. If nobody will ever send and the channel is never closed, the receiver will wait forever.

<-ch // channel never closed, no sender

This commonly occurs in worker goroutines that are expected to wait for jobs, but the producer forgets to send or to close the channel when shutting down.

Ranging over a channel that is never closed

A for range loop exits only when the channel is closed. If the channel stays open forever, the goroutine ranging over it cannot terminate.

for v := range ch {
    ...
}

In practice, this pattern becomes a leak when the sender disappears or returns early without closing the channel, leaving the receiver blocked on the next iteration.

Missing cancellation: goroutine waiting on work

In this example, the worker is designed to wait for a job, but it has no cancellation path. If a job never arrives, the goroutine stays alive forever.

package main

import (
    "fmt"
    "time"
)

func worker(jobs <-chan int) {
    fmt.Println("worker started")
    job := <-jobs // ❌ waits forever if no job arrives
    fmt.Println("processing job:", job) // this is never printed
}

func main() {
    jobs := make(chan int)

    go worker(jobs)

    // main forgets to send a job or close the channel
    time.Sleep(2 * time.Second)

    fmt.Println("main exiting")
}

Caller returns early

This is one of the most common real-world leak patterns: the caller decides it no longer needs the result, but the worker still tries to send it.

package main

import (
    "fmt"
    "time"
)

func fetchData(result chan<- string) {
    time.Sleep(1 * time.Second)
    result <- "data from worker" // ❌ blocks forever if nobody receives
}

func main() {
    result := make(chan string)

    go fetchData(result)

    // caller decides to return early
    fmt.Println("error occurred, returning early")
    return

    // never receives from result
    // <-result
}

The goroutine running fetchData becomes stuck on the send and cannot exit. This is why cancellation and bounded lifetimes matter: a goroutine should not outlive the operation that created it.

Cancellation exists, but is not checked at the blocking point

This example contains a done channel, but the goroutine does not observe it while blocked on jobs. If no jobs arrive, it never reaches the select that checks done.

package main

import (
    "fmt"
    "time"
)

func worker(jobs <-chan int, done <-chan struct{}) {
    for {
        // ❌ BUG: blocks here forever if no jobs come in
        job := <-jobs
        fmt.Println("got job:", job)

        // "done" is never checked while blocked on jobs
        select {
        case <-done:
            fmt.Println("worker stopping")
            return
        default:
        }
    }
}

func main() {
    jobs := make(chan int)
    done := make(chan struct{})

    go worker(jobs, done)

    time.Sleep(300 * time.Millisecond)
    close(done) // we signal stop...

    // ...but worker is still blocked on <-jobs and never exits.
    fmt.Println("stop signaled, but worker is still stuck waiting on jobs")

    time.Sleep(1 * time.Second)
    fmt.Println("main exits (worker leaked while program was running)")
}

The presence of a cancellation signal is not enough. Cancellation must be wired into the same select that performs the blocking operation; otherwise the goroutine can still leak.

7.2 Goroutine leak prevention techniques

The guiding rule is simple:

Every goroutine must have a guaranteed exit path. If a goroutine can block, it must also be able to unblock and exit.

There are a few idiomatic ways to achieve this, depending on what the goroutine is doing.

Stop by channel close (most common, very idiomatic)

When a goroutine ranges over a channel, closing that channel gives it a natural termination condition.

package main

import (
    "fmt"
    "time"
)

func worker(jobs <-chan int) {
    for job := range jobs {
        fmt.Println("processing", job)
    }
    fmt.Println("worker exited")
}

func main() {
    jobs := make(chan int)

    go worker(jobs)

    jobs <- 1
    jobs <- 2
    close(jobs) // ✅ guaranteed exit

    time.Sleep(100 * time.Millisecond)
}

This works because the worker is waiting on the channel itself, and channel close is a built-in “no more work” signal.

Stop by done channel (explicit cancellation)

A done channel is useful when the goroutine should stop even if no work arrives, or when work does not naturally end.

package main

import (
    "fmt"
    "time"
)

func worker(done <-chan struct{}) {
    for {
        select {
        case <-done:
            fmt.Println("worker stopped")
            return
        default:
            fmt.Println("working...")
            time.Sleep(200 * time.Millisecond)
        }
    }
}

func main() {
    done := make(chan struct{})

    go worker(done)

    time.Sleep(600 * time.Millisecond)
    close(done) // ✅ broadcast stop

    time.Sleep(200 * time.Millisecond)
}

A closed done channel is a broadcast signal: all listeners are woken up and can exit.

Stop by context.Context (best for real apps)

In real services, cancellation is usually tied to request lifetime. context.Context is the standard mechanism for that.

package main

import (
    "context"
    "fmt"
    "time"
)

func worker(ctx context.Context) {
    for {
        select {
        case <-ctx.Done():
            fmt.Println("worker stopped:", ctx.Err())
            return
        default:
            fmt.Println("working...")
            time.Sleep(200 * time.Millisecond)
        }
    }
}

func main() {
    ctx, cancel := context.WithCancel(context.Background())

    go worker(ctx)

    time.Sleep(600 * time.Millisecond)
    cancel() // ✅ guaranteed exit

    time.Sleep(200 * time.Millisecond)
}

The key is not just passing a context, but ensuring that blocking points observe ctx.Done() (directly or via context-aware APIs).
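
Here is a minimal sketch of what observing ctx.Done() at the blocking point looks like for a worker that waits on a jobs channel; the structure mirrors the select-on-work-plus-cancellation pattern shown later in this section, with the context taking the role of the cancellation signal.

package main

import (
    "context"
    "fmt"
    "time"
)

// worker observes ctx.Done() in the same select where it blocks on jobs,
// so it can exit even if no job ever arrives.
func worker(ctx context.Context, jobs <-chan int) {
    for {
        select {
        case job := <-jobs:
            fmt.Println("job:", job)
        case <-ctx.Done():
            fmt.Println("worker stopped:", ctx.Err())
            return
        }
    }
}

func main() {
    ctx, cancel := context.WithCancel(context.Background())
    jobs := make(chan int)

    go worker(ctx, jobs)

    time.Sleep(300 * time.Millisecond)
    cancel() // ✅ worker exits even though no job was ever sent

    time.Sleep(100 * time.Millisecond)
}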

Stop after sending a result (finite goroutine)

Some goroutines don’t need explicit cancellation because they are naturally finite: they do one job, send the result, and exit.

package main

import (
    "fmt"
)

func compute(result chan<- int) {
    result <- 42
    // goroutine exits naturally
}

func main() {
    result := make(chan int)

    go compute(result)

    fmt.Println(<-result)
}

This pattern is safe only if the receiver is guaranteed to receive (or the channel is buffered sufficiently), otherwise it turns into the “caller returns early” leak.
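
A minimal sketch of the buffered variant: because the result channel has capacity 1, the send in compute always completes, so the goroutine can exit even if the caller returns early and never receives.

package main

import "fmt"

func compute(result chan<- int) {
    result <- 42 // never blocks: the buffer has room for the single result
}

func main() {
    result := make(chan int, 1) // ✅ capacity 1: the sender cannot get stuck

    go compute(result)

    // Even if main returned early here, compute would still finish and exit.
    fmt.Println(<-result)
}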

Stop via select on work + cancellation (most important pattern)

When a goroutine may block waiting for work, cancellation must be part of the same select. This is the core pattern that prevents most leaks in worker-style code.

func worker(jobs <-chan int, done <-chan struct{}) {
    for {
        select {
        case job := <-jobs:
            fmt.Println("job:", job)
        case <-done:
            fmt.Println("worker stopped")
            return
        }
    }
}

Now, if no job arrives, the goroutine can still exit when done closes.

Single sender → sender closes the channel

To prevent receivers from blocking forever, channels must be closed correctly. As we said already, the rule is: only the sender (or a coordinator that owns the sending lifecycle) should close a channel.

package main

import (
    "fmt"
    "time"
)

func producer(out chan<- int) {
    defer close(out) // ✅ sender closes when finished

    for i := 1; i <= 5; i++ {
        out <- i
        time.Sleep(100 * time.Millisecond)
    }
}

func main() {
    ch := make(chan int)

    go producer(ch)

    // receiver just ranges until channel is closed
    for v := range ch {
        fmt.Println("received:", v)
    }

    fmt.Println("done")
}

This makes the receiver safe to range until completion.

Multiple senders → coordinator closes the channel

When multiple producers send on the same channel, none of them can safely close it, because they cannot know whether another producer is still sending. A coordinator that waits for all producers can safely close the channel.

package main

import (
    "fmt"
    "sync"
    "time"
)

func producer(id int, out chan<- int, wg *sync.WaitGroup) {
    defer wg.Done()

    for i := 1; i <= 3; i++ {
        out <- id*10 + i
        time.Sleep(100 * time.Millisecond)
    }
    // ✅ producer does NOT close(out)
}

func main() {
    out := make(chan int)

    var wg sync.WaitGroup
    wg.Add(2)

    go producer(1, out, &wg)
    go producer(2, out, &wg)

    // ✅ coordinator owns channel lifecycle
    go func() {
        wg.Wait()  // wait for all producers to finish sending
        close(out) // safe: no more sends can happen after Wait returns
    }()

    // receiver ranges until closed
    for v := range out {
        fmt.Println("received:", v)
    }

    fmt.Println("done")
}

Avoid “fire-and-forget”: tie goroutine to request lifetime

A goroutine should not exist independently of the operation that created it. If the caller returns early, the goroutine must be canceled, or it may leak while doing work that nobody cares about.

func handleRequest(ctx context.Context) error {
    done := make(chan struct{})

    go func() {
        defer close(done)
        backgroundWork(ctx) // should respect ctx too
    }()

    select {
    case <-done:
        return nil
    case <-ctx.Done():
        return ctx.Err()
    }
}

This pattern makes the lifetime explicit: the request either completes the work or cancels it, but the goroutine is never left running without supervision.

8) Bounded concurrency and backpressure

Not all goroutine-related issues are leaks. Some goroutines terminate correctly, but creating an uncontrolled number of them can still exhaust system resources.

This typically happens during traffic spikes when a burst of work causes thousands of goroutines to be spawned simultaneously. While each goroutine may eventually exit, the cumulative resource usage can overwhelm the scheduler, memory, or file descriptors.

In these situations, the problem is unbounded concurrency rather than a goroutine leak. The correct solution is to apply backpressure by limiting how much work can run concurrently.

8.1 Semaphore pattern (simple, great for “do N things concurrently”)

The semaphore pattern is the simplest way to cap concurrency when you have a fixed number of independent tasks and want at most N of them running at the same time.

package main

import (
    "fmt"
    "sync"
    "time"
)

func doWork(id int) {
    time.Sleep(200 * time.Millisecond) // pretend this is I/O or CPU work
    fmt.Println("done", id)
}

func main() {
    const (
        totalTasks   = 50
        maxInFlight  = 5 // ✅ concurrency limit (backpressure)
    )

    sem := make(chan struct{}, maxInFlight)
    var wg sync.WaitGroup

    for i := 0; i < totalTasks; i++ {
        wg.Add(1)

        sem <- struct{}{} // ✅ acquire a slot (blocks when full)

        go func(id int) {
            defer wg.Done()
            defer func() { <-sem }() // ✅ release slot

            doWork(id)
        }(i)
    }

    wg.Wait()
    fmt.Println("all tasks finished")
}

Here, maxInFlight defines the concurrency limit. The buffered channel acts as a semaphore. Before starting a goroutine, the code acquires a slot by sending into the channel. If the channel buffer is full, this send blocks. That blocking is the backpressure: task creation slows down automatically instead of spawning unlimited goroutines. When the goroutine finishes, it releases its slot. The important property of this pattern is that concurrency is bounded at the point of goroutine creation. No matter how many tasks exist, only maxInFlight goroutines can be running the critical work at once.

This pattern is ideal when:

  • you have a known, finite batch of tasks
  • each task is independent
  • you want a simple concurrency cap with minimal structure

It is less suitable when work arrives continuously over time, because task submission and execution are tightly coupled.

8.2 Worker pool (best when work arrives continuously)

A worker pool decouples work production from work execution. Instead of spawning a goroutine per task, a fixed number of long-lived worker goroutines pull tasks from a queue.

package main

import (
    "fmt"
    "sync"
    "time"
)

type Job struct {
    ID int
}

func worker(id int, jobs <-chan Job, wg *sync.WaitGroup) {
    defer wg.Done()

    for job := range jobs {
        time.Sleep(200 * time.Millisecond) // pretend work
        fmt.Printf("worker %d processed job %d\n", id, job.ID)
    }
}

func main() {
    const (
        numWorkers = 4
        queueSize  = 8 // ✅ bounded queue (backpressure)
        totalJobs  = 30
    )

    jobs := make(chan Job, queueSize)

    var wg sync.WaitGroup
    wg.Add(numWorkers)

    for w := 0; w < numWorkers; w++ {
        go worker(w, jobs, &wg)
    }

    // producer
    for i := 0; i < totalJobs; i++ {
        jobs <- Job{ID: i} // ✅ blocks if queue is full => backpressure
    }

    close(jobs) // ✅ sender closes when no more jobs
    wg.Wait()

    fmt.Println("all jobs processed")
}

Here, numWorkers bounds concurrency, and queueSize bounds how much work can be buffered. Each worker runs in a loop, receiving jobs from the channel. If all workers are busy and the queue is full, the producer blocks. This prevents unbounded growth in memory and goroutines and forces the producer to slow down when the system is saturated.

The worker pool pattern is ideal when:

  • work arrives continuously or unpredictably
  • tasks are homogeneous
  • you want stable resource usage over time
  • the system should absorb bursts gracefully rather than collapse

Unlike the semaphore pattern, workers are long-lived and reused, which often improves cache locality and reduces scheduling overhead.

8.3 Semaphore vs worker pool: how to choose

Both patterns enforce backpressure, but they do so at different points.

The semaphore pattern limits how many goroutines may exist concurrently. It is simple and works well for one-off batches.

The worker pool limits how many tasks may be processed concurrently and how many may be queued. It introduces more structure, but scales better for long-running systems.

In both cases, the important idea is the same: blocking is intentional. Blocking is not a failure; it is how the system protects itself. By forcing producers to wait, the program stays within the limits of the machine instead of exhausting it.
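
If blocking the producer is not acceptable (for example, a request handler that must answer quickly), a common alternative is to shed load instead: attempt the send with select and a default case, and reject the task when the queue is full. A minimal sketch, assuming a buffered jobs channel like the one in the worker pool example; the trySubmit name is illustrative.

// trySubmit attempts to enqueue a job without blocking.
// It returns false when the queue is full, letting the caller return an
// error (or retry later) instead of waiting.
func trySubmit(jobs chan<- Job, job Job) bool {
    select {
    case jobs <- job:
        return true
    default:
        return false // queue full: shed load instead of blocking
    }
}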

9) sync package toolbox (advanced primitives)

The sync package provides low-level concurrency primitives that solve specific coordination problems. These tools are powerful, but also easy to misuse if applied outside their intended scope. This section focuses on what each primitive is actually for, and where its boundaries are.

9.1 sync/atomic

To understand sync/atomic and its significance, we first need to understand what the word “atomic” means: an atomic operation cannot be interrupted midway.

Atomic operations are low-level CPU instructions that operate on a single memory location in a way that is indivisible, race-free, and provides memory ordering guarantees.

This code:

counter++

is not atomic. Under the hood it’s closer to:

tmp := counter   // 1) read from memory
tmp = tmp + 1    // 2) add 1 (CPU register)
counter = tmp    // 3) write back to memory

Two goroutines can both read the same old value and then overwrite each other's write, producing a lost update.

That’s where the sync/atomic package comes in:

atomic.AddInt64(&counter, 1)

Here, the increment of the counter variable is performed atomically: no other goroutine can observe or modify it mid-increment.
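
For comparison with the counter data race shown earlier, here is a minimal runnable version of the same two-goroutine counter using sync/atomic; the final value is always 2000.

package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

func main() {
    var counter int64
    var wg sync.WaitGroup

    for g := 0; g < 2; g++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for i := 0; i < 1000; i++ {
                atomic.AddInt64(&counter, 1) // ✅ atomic read-modify-write
            }
        }()
    }

    wg.Wait()
    fmt.Println(atomic.LoadInt64(&counter)) // always prints 2000
}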

Modern CPUs reorder instructions, cache values, and delay writes. Atomics prevent unsafe reordering around the atomic instruction and enforce well-defined visibility rules.

One may ask: “What is the difference between using a mutex and an atomic operation?”

Mutexes protect invariants. Atomics protect single memory operations.

That’s the core difference.

Mutex example (clear invariant protection):

mu.Lock()
if count < 5 {
    count++
}
mu.Unlock()

An atomic check-then-act is unsafe, because another goroutine can change the value between the load and the add:

if atomic.LoadInt64(&count) < 5 {
    atomic.AddInt64(&count, 1)
}

If you must preserve an invariant with atomics, you need a compare-and-swap (CAS) loop:

for {
    v := atomic.LoadInt64(&count)
    if v >= 5 {
        return false
    }
    if atomic.CompareAndSwapInt64(&count, v, v+1) {
        return true
    }
}

Atomics are suitable for simple counters, flags, and publishing immutable data — but not for multi-step state invariants.
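
As an illustration of publishing immutable data, here is a minimal sketch using atomic.Pointer (available since Go 1.19) to swap in read-only configuration snapshots; readers always observe either the old or the new snapshot, never a half-written one. The Config type here is illustrative.

package main

import (
    "fmt"
    "sync/atomic"
)

// Config is an immutable snapshot: it is never mutated after creation,
// only replaced as a whole.
type Config struct {
    Timeout int
}

var current atomic.Pointer[Config]

func main() {
    current.Store(&Config{Timeout: 5})
    fmt.Println("timeout:", current.Load().Timeout)

    // Publishing a new snapshot is a single atomic pointer swap.
    current.Store(&Config{Timeout: 10})
    fmt.Println("timeout:", current.Load().Timeout)
}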

9.2 sync.Once

sync.Once is a synchronization primitive used for one-time initialization. It guarantees that the function passed to Do runs at most once across the entire process, even if many goroutines call it concurrently. The first goroutine that reaches Do runs the function; all other callers block until that function returns. After it finishes, future calls to Do return immediately.

A typical use case is lazy initialization:

var once sync.Once
var config Config

func loadConfig() {
    config = readConfigFromDisk()
}

func getConfig() Config {
    once.Do(loadConfig)
    return config
}

Pitfall: recursive or circular use (self-deadlock)

sync.Once is not re-entrant. If the function currently running inside once.Do(...) tries to call Do again on the same Once, it will deadlock. The second Do call waits for the first one to finish, but the first one can’t finish because it is waiting on itself (directly or indirectly).

Direct recursion example:

package main

import (
    "fmt"
    "sync"
)

var once sync.Once

func initA() {
    fmt.Println("initA started")

    // Deadlock: calling once.Do on the same Once while initA is running
    once.Do(initA)

    fmt.Println("initA finished")
}

func main() {
    once.Do(initA)
}

The same thing can happen indirectly through circular initialization dependencies:

package main

import (
    "fmt"
    "sync"
)

var onceA sync.Once
var onceB sync.Once

func InitA() {
    onceA.Do(func() {
        fmt.Println("InitA: start")
        InitB()
        fmt.Println("InitA: done")
    })
}

func InitB() {
    onceB.Do(func() {
        fmt.Println("InitB: start")
        InitA()
        fmt.Println("InitB: done")
    })
}

func main() {
    InitA()
}

The rule of thumb: initialization dependencies must form an acyclic graph. sync.Once does not detect cycles for you—it will just block forever if you create one.

Pitfall: panics still mark the Once as “done”

Another surprising behavior: if the function passed to Do panics, the Once is still considered completed. Even if you recover the panic, later calls to Do will not retry the initialization.

package main

import (
    "fmt"
    "sync"
)

var once sync.Once
var ready bool

func initSystem() {
    fmt.Println("initializing system")
    ready = true

    // Something goes wrong
    panic("unexpected failure")
}

func main() {
    func() {
        defer func() {
            if r := recover(); r != nil {
                fmt.Println("recovered:", r)
            }
        }()
        once.Do(initSystem)
    }()

    fmt.Println("ready after panic:", ready)

    // This will NOT run initSystem again
    once.Do(initSystem)

    fmt.Println("program continues")
}

Even if the panic is recovered, future calls to Do will not retry initialization. This can leave the program in a partially initialized state.

A safer pattern is to record errors explicitly instead of panicking:

package main

import (
    "errors"
    "fmt"
    "sync"
)

var once sync.Once
var initErr error
var config string

func initConfig() {
    // Do not panic. Record the error instead.
    initErr = errors.New("failed to load config")
}

func GetConfig() (string, error) {
    once.Do(initConfig)
    if initErr != nil {
        return "", initErr
    }
    return config, nil
}

func main() {
    cfg, err := GetConfig()
    fmt.Println("config:", cfg, "error:", err)
}

If retries are required, sync.Once is the wrong tool.
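
If initialization can fail and should be retried on a later call, a small mutex-guarded helper is a better fit. A minimal sketch, in which loadConfig, initMu and initialized are illustrative names and the first attempt fails on purpose:

package main

import (
    "errors"
    "fmt"
    "sync"
)

var (
    initMu      sync.Mutex
    initialized bool
    config      string
    attempts    int
)

// loadConfig is a stand-in that fails on the first attempt.
func loadConfig() (string, error) {
    attempts++
    if attempts == 1 {
        return "", errors.New("transient failure")
    }
    return "loaded config", nil
}

// GetConfig retries initialization on every call until it succeeds,
// which sync.Once by design cannot do.
func GetConfig() (string, error) {
    initMu.Lock()
    defer initMu.Unlock()

    if initialized {
        return config, nil
    }

    cfg, err := loadConfig()
    if err != nil {
        return "", err // not marked initialized: the next call retries
    }

    config = cfg
    initialized = true
    return config, nil
}

func main() {
    fmt.Println(GetConfig()) // first call fails
    fmt.Println(GetConfig()) // second call succeeds
}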

9.3 sync.Cond

sync.Cond exists for state-based waiting: goroutines sleep until shared state may have changed. It does not store data, does not store conditions, and does not replace a mutex. It must be used with a mutex.

Correct pattern:

mu := sync.Mutex{}
cond := sync.NewCond(&mu)

ready := false

// waiter (runs in one goroutine)
mu.Lock()
for !ready {
    cond.Wait()
}
mu.Unlock()

// signaler (runs in another goroutine)
mu.Lock()
ready = true
mu.Unlock()
cond.Signal()

Wait atomically unlocks the mutex and parks the goroutine. When woken, it re-locks the mutex before returning. The condition must be checked in a loop to handle spurious wakeups and racing waiters.

Example: a simple queue that blocks when empty.

type Queue struct {
    mu       sync.Mutex
    notEmpty *sync.Cond
    items    []int
}

func NewQueue() *Queue {
    q := &Queue{}
    q.notEmpty = sync.NewCond(&q.mu)
    return q
}

func (q *Queue) Push(v int) {
    q.mu.Lock()
    q.items = append(q.items, v)
    q.mu.Unlock()
    q.notEmpty.Signal()
}

func (q *Queue) Pop() int {
    q.mu.Lock()
    for len(q.items) == 0 {
        q.notEmpty.Wait()
    }
    v := q.items[0]
    q.items = q.items[1:]
    q.mu.Unlock()
    return v
}

Here, goroutines wait on state (“queue is non-empty”), not on events. This is why sync.Cond exists.

Channels are event-based. sync.Cond is state-based.
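
A small usage sketch for the queue above, assuming the Queue type from the previous example plus the fmt and time imports: the consumer calls Pop before anything is available and simply sleeps until a producer pushes a value.

func main() {
    q := NewQueue()

    // producer: pushes a value after a short delay
    go func() {
        time.Sleep(200 * time.Millisecond)
        q.Push(42) // wakes up the blocked Pop
    }()

    fmt.Println("waiting for an item...")
    fmt.Println("got:", q.Pop()) // blocks until Push signals notEmpty
}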

9.4 sync.Pool

sync.Pool is a concurrency-safe pool for reusing temporary objects to reduce allocation and garbage collection pressure. Items in the pool may be dropped at any time by the garbage collector. Because of this, a pool must be treated as a performance hint, not as a cache or ownership mechanism.

A classic use case is reusing buffers:

var bufPool = sync.Pool{
    New: func() any {
        return make([]byte, 0, 1024)
    },
}

func handle(data []byte) {
    buf := bufPool.Get().([]byte)
    buf = buf[:0]

    buf = append(buf, data...)
    process(buf)

    bufPool.Put(buf)
}

This works because:

  • the buffer is temporary
  • it has no meaning outside the function
  • correctness does not depend on reuse

Another valid use case is reusing short-lived structs:

type Parser struct {
    tmp []byte
}

var parserPool = sync.Pool{
    New: func() any {
        return &Parser{tmp: make([]byte, 0, 256)}
    },
}

func parse(data []byte) {
    p := parserPool.Get().(*Parser)
    p.tmp = p.tmp[:0]

    p.tmp = append(p.tmp, data...)

    parserPool.Put(p)
}

sync.Pool is not suitable for managing long-lived resources such as database connections, file descriptors, or goroutines. The runtime may discard pooled items at any GC cycle, making lifecycle management impossible.

10) Debugging and observability: pprof

pprof is Go’s built-in profiling and observability tool. It allows you to inspect what a running program is doing at runtime: CPU usage, memory allocations, heap growth, blocking behavior, mutex contention, and—critically for concurrency bugs—the set of goroutines that currently exist and what they are blocked on.

When debugging goroutine leaks, the goroutine profile is the most valuable signal. It shows stack traces of live goroutines, making it possible to see where goroutines are stuck and whether their number keeps growing over time.

10.1 Enabling pprof in a long-running program

In server-style applications, the most common way to enable pprof is to expose it over an HTTP endpoint. Importing net/http/pprof registers profiling handlers on the default HTTP mux.

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof"
)

func main() {
    // Your normal application setup would go here.

    go func() {
        log.Println("pprof listening on http://localhost:6060")
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    select {} // keep the process running (example)
}

In real services, this endpoint is usually bound to localhost, a private network, or protected behind authentication, since it exposes internal details of the program.

10.2 Inspecting goroutines directly

The fastest way to inspect goroutines is to fetch a human-readable goroutine dump:

curl http://localhost:6060/debug/pprof/goroutine?debug=2

This prints full stack traces of goroutines currently alive in the process. It is often enough to immediately spot leaks: you may see hundreds or thousands of goroutines blocked in the same function, waiting on the same channel, select, or WaitGroup.Wait.

10.3 Using the pprof UI for deeper analysis

For more systematic analysis, you can download a goroutine profile and explore it using the pprof web UI:

curl -o goroutine.pb.gz http://localhost:6060/debug/pprof/goroutine
go tool pprof -http=:0 goroutine.pb.gz

This opens an interactive interface where you can:

  • group goroutines by stack trace
  • see which call paths account for the most goroutines
  • drill into specific functions that appear frequently in blocked stacks

This is especially useful when leaks are subtle or spread across multiple code paths.

10.4 A practical workflow for finding goroutine leaks

A goroutine leak usually shows up as growth over time, not as a single snapshot. A practical approach looks like this:

  1. Periodically log the number of goroutines:

    log.Println("goroutines:", runtime.NumGoroutine())
    

    If this number steadily increases and never returns to a baseline, you likely have a leak.

  2. Capture two goroutine dumps at different times under similar load:

    curl http://localhost:6060/debug/pprof/goroutine?debug=2 > g1.txt
    # wait or apply load
    curl http://localhost:6060/debug/pprof/goroutine?debug=2 > g2.txt
    
  3. Compare the dumps and look for repeated stack traces that grow in number.

In practice, goroutine leaks tend to follow a small number of recurring patterns. Leaked goroutines are often found blocked on channel receives or sends with no matching counterpart, stuck in select statements waiting on channels that never become ready, or waiting in WaitGroup.Wait without all corresponding Done calls ever occurring. Other common cases include goroutines blocked on I/O operations that lack cancellation and goroutines waiting on ctx.Done() when the associated context is never canceled. If many goroutines share the same blocked stack trace and that count keeps increasing, you’ve likely found the leak site.

10.5 Interpreting results correctly

Not every large number of goroutines indicates a leak. Busy servers can legitimately have many goroutines handling active requests. The key signal is unbounded growth combined with identical blocked stacks.

In practice, most goroutine leaks trace back to missing cancellation, missing channel closure, or a goroutine waiting on work that will never arrive. pprof doesn’t fix these bugs, but it makes them visible—which is often the hardest part.
