DEV Community

James Lee


Go Compiler & defer: Bootstrap, Three defer Implementations, panic/recover & Closures

Go's compiler is written entirely in Go — a self-hosting compiler that handles everything from frontend parsing to backend code generation. In this article we'll trace the program lifecycle from the very first instruction, dig into how defer is implemented in three different ways, and cover the panic/recover model, build tags, and the infamous closure-in-loop trap.


1. The Go Compiler Pipeline

Source (.go)
    ↓
Lexer / Parser       → AST
    ↓
Type Checker         → typed AST
    ↓
IR passes            → inlining, escape analysis
    ↓
SSA                  → optimizations
    ↓
Code Generation      → machine code
    ↓
Linker               → binary

The entire pipeline — from frontend to backend — is implemented in Go itself. No C, no LLVM (by default).
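The front half of that pipeline can be tried out with the standard library. The sketch below uses `go/parser` and `go/types`, which are the public library packages and not the compiler's internal front end, so treat it as an illustration of the lexer/parser and type-checker stages only. The `frontEnd` helper and the sample source are made up for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// src is a tiny import-free program used as input for the sketch.
const src = `package demo

func add(a, b int) int { return a + b }
`

// frontEnd parses and type-checks source, then returns the names of the
// functions it declares. It mirrors the "parse, then type-check" stages.
func frontEnd(source string) (funcs []string, err error) {
	fset := token.NewFileSet()

	// Lexer + parser: source text to AST.
	file, err := parser.ParseFile(fset, "demo.go", source, 0)
	if err != nil {
		return nil, err
	}

	// Type checker: rejects programs that parse but are ill-typed.
	conf := types.Config{}
	if _, err := conf.Check("demo", fset, []*ast.File{file}, nil); err != nil {
		return nil, err
	}

	// Walk the AST and collect function declarations.
	ast.Inspect(file, func(n ast.Node) bool {
		if fd, ok := n.(*ast.FuncDecl); ok {
			funcs = append(funcs, fd.Name.Name)
		}
		return true
	})
	return funcs, nil
}

func main() {
	funcs, err := frontEnd(src)
	fmt.Println(funcs, err)
}
```

Passing a program that parses but does not type-check (for example a function declared to return `int` that returns a string) makes `frontEnd` return the type checker's error instead of a function list.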

Performance Tooling

Before optimizing, always measure:

| Tool | What it shows |
|---|---|
| go tool pprof | CPU, memory, and goroutine profiles (µs-level code hotspots) |
| go tool trace | Runtime events: goroutine scheduling, GC, netpoller (ns-level) |
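As a rough sketch of where the pprof input comes from, the program below captures a CPU profile with `runtime/pprof`. The workload `busyWork` and the helper `captureCPUProfile` are hypothetical names for this example; the profile is written to an in-memory buffer only to keep the sketch self-contained.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// busyWork is a hypothetical CPU-bound workload used only for this sketch.
func busyWork(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// captureCPUProfile profiles fn and returns the size in bytes of the
// encoded profile.
func captureCPUProfile(fn func()) (int, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return 0, err
	}
	fn()
	pprof.StopCPUProfile() // flushes the encoded profile into buf
	return buf.Len(), nil
}

func main() {
	n, err := captureCPUProfile(func() { busyWork(50_000_000) })
	fmt.Println("profile bytes:", n, "err:", err)
}
```

In a real workflow the profile would go to a file (for example via `os.Create`) so that `go tool pprof` can open it afterwards.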

Optimization priority:

Storage layer (ms gains) > Business logic (µs gains) > Low-level code (ns gains)

Optimization workflow:

  1. Load test with realistic traffic
  2. pprof → identify CPU/memory hotspots
  3. Fix: async, cache, algorithm change
  4. benchmark → verify local improvement
  5. Repeat load test → check p95 latency
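Step 4 of the workflow above can be sketched with `testing.Benchmark`, which runs a benchmark from an ordinary program. The two functions being compared (`joinNaive`, `joinBuilder`) are stand-ins invented for this example, not code from the article; the point is only to show a before/after comparison with allocation counts.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// joinNaive is the hypothetical "before" version: repeated concatenation.
func joinNaive(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// joinBuilder is the hypothetical "after" version using strings.Builder.
func joinBuilder(parts []string) string {
	var b strings.Builder
	for _, p := range parts {
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	parts := make([]string, 1000)
	for i := range parts {
		parts[i] = "x"
	}

	candidates := []struct {
		name string
		fn   func([]string) string
	}{
		{"naive", joinNaive},
		{"builder", joinBuilder},
	}
	for _, c := range candidates {
		res := testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				c.fn(parts)
			}
		})
		// Prints ns/op plus bytes and allocations per operation.
		fmt.Println(c.name, res.String(), res.MemString())
	}
}
```

A local win like this still needs the repeat load test from step 5, since a faster function does not always move p95 latency.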

2. Program Bootstrap: From rt0_amd64 to main.main

Your Go program does not start at main.main. It doesn't even start at runtime.main. The real entry point is deep in the runtime assembly.

_rt0_amd64_<os>            ← actual binary entry point (e.g. _rt0_amd64_linux)
    ↓
_rt0_amd64                 ← hands argc/argv to rt0_go
    ↓
runtime.rt0_go             ← set up g0/m0 and TLS; calls args and osinit
    ↓                         (CPU core count, physical page size)
runtime.schedinit()        ← initialize: scheduler, stack allocator,
    ↓                         memory allocator, GC
runtime.newproc(main)      ← create main goroutine, push to P's local run queue (LRQ)
    ↓
runtime.mstart()           ← start M0, enter scheduling loop (never returns)
    ↓
runtime.main()             ← main goroutine starts here

runtime.main() execution order:

// Simplified from runtime/proc.go; names and details vary between Go versions.
func main() {
    // 1. Set max stack size: 1 GB on 64-bit, 250 MB on 32-bit
    maxstacksize = 1000000000   // 250000000 on 32-bit

    // 2. Start sysmon background thread (GC, preemption, netpoll)
    systemstack(func() { newm(sysmon, nil) })

    // 3. Initialize runtime packages
    runtime_init()

    // 4. Enable GC background workers
    gcenable()

    // 5. Run init() for package main and every package it imports
    main_init()

    // 6. Run user's main.main()
    main_main()

    // 7. Exit
    exit(0)
}

Key difference: Non-main goroutines return to goexit when done. The main goroutine calls exit(0) — terminating the entire process immediately.
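The ordering in steps 5 and 6 is observable from user code. In this small sketch (the `record` helper and `order` slice are made up for the example), package-level variable initializers run first, then `init()`, then `main.main`.

```go
package main

import "fmt"

// order collects the sequence in which initialization steps run.
var order []string

// pkgVar forces a call to record during package-level initialization.
var pkgVar = record("package-level var")

// record appends a step name to order and returns it.
func record(step string) string {
	order = append(order, step)
	return step
}

func init() {
	record("init()")
}

func main() {
	record("main.main")
	fmt.Println(order)
}
```

Running it prints the three steps in that order, which matches `main_init()` being called before `main_main()` in the runtime.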


3. defer: Three Implementation Strategies

defer looks like a simple "run this at function exit" mechanism. The reality is more nuanced — Go uses three different implementations depending on the context, each with different performance characteristics.

Why not just insert a function call at return?

Because defer can appear inside conditionals and loops, the compiler cannot always statically determine how many defers exist or which ones will execute. This makes a purely compile-time solution insufficient.


Strategy 1: Heap Allocation (General Case)

Each defer creates a _defer struct allocated on the heap, chained into the goroutine's defer linked list.

// Compiler transforms:
defer foo()
// Into:
deferproc(foo)   // allocate _defer on heap, push to G._defer list
...
deferreturn()    // at function exit: walk list, execute defers in LIFO order

// Abridged from older runtime sources; the exact fields vary between Go versions.
type _defer struct {
    siz       int32      // size of arguments + return values
    sp        uintptr    // stack pointer at defer site
    pc        uintptr    // caller's program counter
    fn        *funcval   // the deferred function
    link      *_defer    // next defer in chain (linked list)
}

type g struct {
    _defer *_defer       // head of this goroutine's defer list
}

Cost: Heap allocation per defer call. Slowest strategy.
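A defer inside a loop is the typical case where the number of deferred calls is only known at run time, so a list of records is needed. The sketch below (the function name `deferOrder` is invented for the example) shows the visible effect of that list: calls run in LIFO order.

```go
package main

import "fmt"

// deferOrder registers n deferred calls inside a loop and reports the
// order in which they ran. The named result lets the deferred closures
// append after the return statement has executed.
func deferOrder(n int) (order []int) {
	for i := 0; i < n; i++ {
		defer func(v int) { order = append(order, v) }(i)
	}
	return nil
}

func main() {
	fmt.Println(deferOrder(3)) // [2 1 0]
}
```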


Strategy 2: Stack Allocation (Go ≥ 1.13)

When a defer statement executes at most once per function call (that is, it is not inside a loop), the compiler can place its _defer struct in the function's stack frame instead of on the heap.

// Conceptually, the compiler reserves a _defer record in the function's stack frame
var d _defer                // stack-allocated, no heap involved
// ... initialize fields ...
deferprocStack(&d)          // link d into G._defer
// ...
deferreturn()               // at function exit, same as the heap strategy

Cost: No heap allocation. Significantly faster than heap strategy.


Strategy 3: Open-Coded (Go ≥ 1.14, Most Common)

When all conditions are met, the compiler inlines the deferred calls directly at each return site, using a bitmask (deferBits) to track which defers should fire.

Conditions for open-coded defer:

  • Compiler optimizations not disabled (-gcflags "-N" not set)
  • ≤ 8 defers in the function
  • num_defers × num_returns ≤ 15
  • No defer inside a loop

// Source:
defer f1(a1)
if cond {
    defer f2(a2)
}

// Compiler generates:
deferBits = 0b00000000
deferBits |= 1 << 0     // f1 is always deferred → bit 0 set
_f1, _a1 = f1, a1
if cond {
    deferBits |= 1 << 1 // f2 conditionally deferred → bit 1 set
    _f2, _a2 = f2, a2
}

// At every return site (reverse order):
if deferBits & (1<<1) != 0 {
    deferBits &^= (1 << 1)
    _f2(_a2)
}
if deferBits & (1<<0) != 0 {
    deferBits &^= (1 << 0)
    _f1(_a1)
}

Cost: Near-zero — just a few bit operations and direct calls. No allocation.

Not truly zero-cost: Arguments are evaluated and copied to the stack at the defer site. Conditional defers still need the deferBits check at runtime.
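To make the bitmask idea concrete, here is a hand-written Go imitation of the generated code next to the same logic written with real defer statements. This is only a model of the transformation described above, not actual compiler output; `withDefer` and `openCodedSketch` are names chosen for the example.

```go
package main

import "fmt"

// withDefer uses real defer statements; the second one is conditional.
func withDefer(cond bool) (calls []string) {
	defer func() { calls = append(calls, "f1") }()
	if cond {
		defer func() { calls = append(calls, "f2") }()
	}
	return nil
}

// openCodedSketch imitates by hand what an open-coded defer expands to:
// one bit per defer statement, checked in reverse order at the return site.
func openCodedSketch(cond bool) (calls []string) {
	var deferBits uint8

	f1 := func() { calls = append(calls, "f1") }
	deferBits |= 1 << 0 // f1 is always deferred

	var f2 func()
	if cond {
		f2 = func() { calls = append(calls, "f2") }
		deferBits |= 1 << 1 // f2 is deferred only on this path
	}

	// Return site: run the active defers, last registered first.
	if deferBits&(1<<1) != 0 {
		deferBits &^= 1 << 1
		f2()
	}
	if deferBits&(1<<0) != 0 {
		deferBits &^= 1 << 0
		f1()
	}
	return calls
}

func main() {
	fmt.Println(withDefer(true), openCodedSketch(true))   // [f2 f1] [f2 f1]
	fmt.Println(withDefer(false), openCodedSketch(false)) // [f1] [f1]
}
```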


defer Gotchas

Gotcha 1: Arguments are evaluated immediately

// Prints a near-zero duration (nanoseconds, not ~1s): time.Since() is
// evaluated when the defer statement is registered
func main() {
    startedAt := time.Now()
    defer fmt.Println(time.Since(startedAt))
    time.Sleep(time.Second)
}

// Prints roughly 1s: time.Since() is evaluated when the closure runs
func main() {
    startedAt := time.Now()
    defer func() { fmt.Println(time.Since(startedAt)) }()
    time.Sleep(time.Second)
}
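The same rule can be shown without timing noise. In this deterministic sketch (the function `evalOrder` is made up for the example), one deferred call receives `x` as an argument and the other reads `x` through a closure.

```go
package main

import "fmt"

// evalOrder shows the difference between passing a value as a defer
// argument (evaluated at the defer statement) and reading it inside a
// closure (evaluated when the deferred function runs).
func evalOrder() (direct, viaClosure int) {
	x := 1
	defer func(v int) { direct = v }(x) // x is evaluated now: v == 1
	defer func() { viaClosure = x }()   // x is read at function exit
	x = 2
	return 0, 0
}

func main() {
	fmt.Println(evalOrder()) // 1 2
}
```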

Gotcha 2: Not all builtins can be deferred directly

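// ❌ Cannot defer directly (their results may not be discarded):
defer append(sl, 1)
defer cap(sl)
defer len(sl)

// ✅ Wrap in a closure:
defer func() { _ = append(sl, 1) }()

// ✅ Can defer directly:
defer close(ch)
defer delete(m, key)
defer recover()   // compiles, but does not stop a panic: recover must be
                  // called by the deferred function, not be the deferred call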
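One caveat about `defer recover()`: the language spec says recover returns nil unless it is called directly by a deferred function, so deferring the builtin itself compiles but lets the panic continue. The sketch below checks this; `directRecover`, `wrappedRecover`, and `panics` are helper names invented for the example.

```go
package main

import "fmt"

// directRecover defers the builtin itself. recover is then the deferred
// call, not something called by a deferred function, so the panic escapes.
func directRecover() {
	defer recover()
	panic("boom")
}

// wrappedRecover calls recover from inside a deferred function, which is
// the form that actually stops the panic.
func wrappedRecover() {
	defer func() { recover() }()
	panic("boom")
}

// panics reports whether a panic escaped from f.
func panics(f func()) (escaped bool) {
	defer func() {
		if recover() != nil {
			escaped = true
		}
	}()
	f()
	return false
}

func main() {
	fmt.Println("defer recover() lets the panic escape:", panics(directRecover))
	fmt.Println("wrapped recover lets the panic escape:", panics(wrappedRecover))
}
```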

Gotcha 3: Use anonymous functions to scope locks precisely

func someFunc() {
    // ... lots of code ...
    func() {
        mu.Lock()
        defer mu.Unlock()
        // critical section — lock released at end of anonymous func,
        // not at end of someFunc
    }()
    // ... more code runs without holding the lock ...
}
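A runnable variant of the pattern, assuming Go 1.18 or later for `sync.Mutex.TryLock`. The package-level `mu` and `counter` and the function `increment` are illustrative names; `TryLock` is used here only to check that the lock really is free once the anonymous function has returned.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	mu      sync.Mutex
	counter int
)

// increment updates counter inside a narrowly scoped critical section and
// reports whether the mutex was free again right after that section.
func increment() (lockFreeAfterBlock bool) {
	func() {
		mu.Lock()
		defer mu.Unlock() // runs when the anonymous function returns
		counter++
	}()

	// Still inside increment, but the lock has already been released.
	if mu.TryLock() {
		mu.Unlock()
		return true
	}
	return false
}

func main() {
	fmt.Println(increment(), counter)
}
```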

4. panic / recover Internals

Mental Model

| Go | Java equivalent |
|---|---|
| panic | RuntimeException + Error |
| recover | catch (but only inside defer) |

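Key rule: recover only works when called directly by a deferred function. It catches panics raised through runtime.gopanic() (the implementation behind panic()), but not failures reported via runtime.throw() or runtime.fatal(), such as concurrent map writes, which terminate the process unconditionally.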

Data Structures

type _panic struct {
    arg       interface{}  // value passed to panic()
    link      *_panic      // previous panic in chain
    recovered bool         // has this panic been recovered?
    aborted   bool         // has this panic been aborted?
}

type g struct {
    _panic *_panic   // head of panic chain (innermost first)
    _defer *_defer   // head of defer chain (innermost first)
}

gopanic Execution Flow

panic(val) called
    ↓
gopanic():
    allocate _panic on stack
    prepend to g._panic list
        ↓
    loop: walk g._defer list
        ↓
        execute each defer function
            ↓
            deferred function calls recover()?
                ├── YES: p.recovered = true
                │         mcall(recovery) → resume the goroutine in the frame that
                │         registered the defer, which then returns to its caller  ✅
                │         gopanic exits here
                └── NO:  continue to next defer
        ↓
    no more defers, p.recovered still false
        ↓
    preprintpanics() → convert panic values to printable strings
    fatalpanic()     → print panic message and goroutine traces, exit with status 2  💥

// Minimal recover pattern:
func safeCall() {
    defer func() {
        if r := recover(); r != nil {
            fmt.Println("recovered:", r)
        }
    }()
    panic("something went wrong")
}

Panic is not for normal error handling. Use error returns for expected failures. Reserve panic for truly unrecoverable states (programmer errors, invariant violations).
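Where a panic does have to be contained, for example at an API boundary, a common approach is to convert it into an ordinary error with a deferred closure and a named result. The sketch below uses a hypothetical `safeDiv` function to show the shape.

```go
package main

import "fmt"

// safeDiv divides a by b. A runtime panic (integer division by zero) is
// recovered and reported through the named error result instead.
func safeDiv(a, b int) (q int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return a / b, nil
}

func main() {
	fmt.Println(safeDiv(6, 3))
	fmt.Println(safeDiv(1, 0))
}
```

The named result `err` is what makes this work: the deferred closure runs after the panic unwinds to `safeDiv` and can still assign the value the caller receives.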


5. Build Tags

Build tags control which files are included in compilation — at the file level, not the code block level.

// dev.go
//go:build dev

package main

// configArr is assumed to be declared in an untagged file of package main.
func init() {
    configArr = append(configArr, "mysql dev")
}

// prod.go
//go:build prod

package main

func init() {
    configArr = append(configArr, "mysql prod")
}

go build -tags dev    # includes dev.go, excludes prod.go
go build -tags prod   # includes prod.go, excludes dev.go
go build              # no tags: both files are excluded

Common use cases: environment-specific config, OS-specific implementations, feature flags, test fixtures.
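With the dev/prod layout above, a build without any tag includes neither file. One way to handle that is a fallback file guarded by a negated constraint. The file below is a hypothetical `default.go`; `configArr` is declared in it only to keep the sketch self-contained, whereas in a real package it would live in an untagged file shared by all variants.

```go
//go:build !dev && !prod

package main

import "fmt"

// configArr is declared here only so this sketch compiles on its own.
var configArr []string

func init() {
	// Registered only when neither -tags dev nor -tags prod is given.
	configArr = append(configArr, "mysql default")
}

func main() {
	fmt.Println(configArr)
}
```

Constraints accept `&&`, `||`, `!`, and parentheses, so the same mechanism covers OS and architecture combinations as well.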


6. Closures & the Goroutine Loop Trap

A closure captures variables by reference, not by value. This creates a classic bug when launching goroutines inside a loop.

The Trap

// ❌ Before Go 1.22: every closure shares one i, so the goroutines race
// with the loop and typically all process the LAST node
for i := range nodes {
    go func() {
        node := nodes[i]  // i is shared: by the time the goroutine runs,
        process(node)     // the loop has usually advanced i
    }()
}

The Fix

// ✅ Capture a local copy of i at each iteration
for i := range nodes {
    index := i            // new variable per iteration
    go func() {
        node := nodes[index]  // each goroutine has its own index
        process(node)
    }()
}

// ✅ Or pass as argument (cleaner):
for i := range nodes {
    go func(idx int) {
        node := nodes[idx]
        process(node)
    }(i)
}

Go 1.22 changed loop variable semantics: each iteration now gets its own variable, which makes the first pattern safe. The new behavior applies only to modules whose go.mod declares go 1.22 or later, so capture explicitly in code that must also build under older settings.
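A complete version of the argument-passing fix, which behaves the same under both the old and the new loop semantics. The function `processAll` and the use of `strings.ToUpper` as the per-node work are placeholders for this sketch.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// processAll handles every node in its own goroutine. The index is passed
// as an argument, so each goroutine works on its own copy.
func processAll(nodes []string) []string {
	out := make([]string, len(nodes))
	var wg sync.WaitGroup
	for i := range nodes {
		wg.Add(1)
		go func(idx int) {
			defer wg.Done()
			// Each goroutine writes a distinct element, so no lock is needed.
			out[idx] = strings.ToUpper(nodes[idx])
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(processAll([]string{"a", "b", "c"})) // [A B C]
}
```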


7. Summary

| Topic | Key Takeaway |
|---|---|
| Bootstrap | Entry: rt0_amd64 → schedinit → newproc(main) → mstart |
| defer (heap) | _defer on heap, linked list on G; slowest, most general |
| defer (stack) | _defer in stack frame; faster, no heap alloc |
| defer (open-coded) | Inlined at return sites with deferBits bitmask; near-zero cost |
| defer args | Evaluated at defer registration, not at execution |
| panic/recover | gopanic walks defer chain; recover sets p.recovered; mcall(recovery) resumes the goroutine in the deferring frame |
| Build tags | File-level inclusion/exclusion at compile time |
| Closure trap | Before Go 1.22, loop variables are shared across iterations; capture a local copy for goroutines |

This concludes the Go Runtime Internals series. You now have a complete picture of how Go manages memory, I/O, system calls, scheduling, and language-level features like defer and panic — all the way down to the assembly level.


