James Lee

Goroutine Scheduling: GMP Model, Schedule Loop, Preemption & Stack Management

The Go scheduler is one of the most sophisticated pieces of the runtime. It multiplexes thousands of goroutines onto a handful of OS threads, and because switching between goroutines happens in user space, a switch costs far less than an OS thread context switch. In this article we'll trace the full lifecycle of a goroutine: creation, the scheduling loop, preemption, and cleanup.


1. System Call vs Function Call: The Fundamental Difference

Before diving into the scheduler, it's worth understanding why goroutine scheduling requires assembly — and why it's fundamentally different from a regular function call.

| | Function Call | System Call |
|---|---|---|
| Instruction | CALL | SYSCALL / INT 0x80 |
| Address space | User space only | Crosses into kernel space |
| How to invoke | By function name/address | By integer syscall number |
| Registers saved | PC + SP only | All CPU registers (full context) |
| Virtual memory | Not touched | Not touched (same process) |
| Process switch | No | No (same process, intra-process switch) |

On AMD64 Linux, the syscall convention is:

  • rax → syscall number
  • rdi, rsi, rdx, r10, r8, r9 → first 6 arguments

Why assembly for goroutine switching? Because switching execution flows requires precise control over the PC (program counter) and SP (stack pointer) registers, which ordinary Go code cannot express. That is why the scheduler's gogo, mcall, and goexit are all written in assembly.


2. The GMP Model

Go's scheduler is built on three core abstractions:

┌─────────────────────────────────────────────────────────┐
│                    GMP Model                            │
│                                                         │
│   G (Goroutine)   — the unit of work                    │
│   M (Machine)     — OS thread, the actual executor      │
│   P (Processor)   — scheduling context, owns LRQ        │
│                                                         │
│   P0          P1          P2                            │
│  [LRQ]       [LRQ]       [LRQ]    ← local run queues   │
│    │           │           │                            │
│    M0          M1          M2      ← OS threads         │
│    │           │           │                            │
│    G           G           G       ← running goroutines │
│                                                         │
│  [Global Run Queue]  ← shared, requires lock            │
│  [netpoller]         ← waiting on I/O                   │
└─────────────────────────────────────────────────────────┘

Key design: mcache lives on P, not M. When M+G enter a blocking syscall and P detaches, the new M that picks up P inherits its mcache — ensuring lock-free allocation continues.


3. Initialization: From g0 to main

Before any user code runs, the runtime bootstraps itself:

Step 1: Initialize g0
─────────────────────────────────────────────
g0 is the "runtime goroutine" — provides a stack
for the runtime itself to execute on.
m0 ←→ g0 are bound together.
get_tls() on main thread → returns g0 → g0.m → m0

Step 2: Initialize P and allp
─────────────────────────────────────────────
schedinit() → mcommoninit() → procresize()
Creates GOMAXPROCS P structs, stored in allp[]

Step 3: Create main goroutine (newg)
─────────────────────────────────────────────
runtime.newproc() creates newg with 2KB stack
newg.stack.hi / newg.stack.lo mark stack bounds
Stack is primed: goexit's address pushed as fake return
newg.sched.pc = runtime.main
newg state → _Grunnable → placed in P's LRQ

Step 4: gogo — switch from g0 to newg
─────────────────────────────────────────────
Save g0's SP to g0.sched.sp
schedule() → finds newg
gogo() → switches stack from g0 to newg
         → JMP to newg.sched.pc (runtime.main)

runtime.main startup sequence:

runtime.main()
    ├── start sysmon thread
    ├── init runtime packages
    ├── init all imported packages
    ├── call main.main()
    └── exit() syscall → process ends

Note: Non-main goroutines return to goexit after finishing. The main goroutine terminates the entire process when it returns — this is the key difference.
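
Because returning from main ends the process no matter what other goroutines are doing, programs usually wait for their workers explicitly. A minimal sketch, with `runWorkers` as a name invented for this example:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runWorkers (hypothetical helper) starts n goroutines and blocks until
// every one of them has returned.
func runWorkers(n int) int64 {
	var finished int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&finished, 1)
		}()
	}
	wg.Wait() // without this, main could return and end the process first
	return atomic.LoadInt64(&finished)
}

func main() {
	fmt.Println("workers finished:", runWorkers(8)) // workers finished: 8
}
```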


4. The Scheduling Loop

Every M runs an infinite scheduling loop. In pseudocode:

// Each OS thread runs this loop
func scheduleLoop() {
    for {
        g := findRunnableGoroutine()  // find next G
        run(g)                         // execute G
        saveState(g)                   // save registers
    }
}

The actual schedule() function searches for a runnable G in priority order:

func schedule() {
    // 1. Every 61 ticks: check global run queue (fairness guarantee)
    if schedtick % 61 == 0 && sched.runqsize > 0 {
        gp = globrunqget(p, 1)
    }
    // 2. Check P's local run queue (LRQ)
    if gp == nil {
        gp, inheritTime = runqget(p)
    }
    // 3. findrunnable(): steal from other Ps, check netpoller, block if nothing
    if gp == nil {
        gp, inheritTime = findrunnable()  // blocks until a G is found
    }
    // 4. Execute the G
    execute(gp, inheritTime)
}

Fairness mechanism: Every 61 scheduling ticks, the scheduler checks the global queue before the local one. Without this, goroutines in the global queue could starve whenever a P's local queue never runs empty.
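
The effect of the 61-tick rule can be modelled in a few lines. This is a toy, single-P simulation and not runtime code: `toySched`, `next`, and `ticksUntilGlobal` are all hypothetical names, and the tick counter here starts at 1 for simplicity.

```go
package main

import "fmt"

// toySched is a hypothetical single-P model of the queue order in schedule().
type toySched struct {
	tick   int
	local  []string
	global []string
}

// next returns the next goroutine name to run, or "" if both queues are empty.
func (s *toySched) next() string {
	s.tick++
	// Every 61st tick the global queue is consulted first.
	if s.tick%61 == 0 && len(s.global) > 0 {
		g := s.global[0]
		s.global = s.global[1:]
		return g
	}
	if len(s.local) > 0 {
		g := s.local[0]
		s.local = s.local[1:]
		return g
	}
	if len(s.global) > 0 {
		g := s.global[0]
		s.global = s.global[1:]
		return g
	}
	return ""
}

// ticksUntilGlobal keeps the local queue non-empty and reports the tick at
// which the single goroutine waiting in the global queue finally runs.
func ticksUntilGlobal() int {
	s := &toySched{global: []string{"global-g"}}
	for {
		s.local = append(s.local, "local-g") // the local queue never drains
		if s.next() == "global-g" {
			return s.tick
		}
	}
}

func main() {
	fmt.Println("global goroutine ran at tick", ticksUntilGlobal()) // tick 61
}
```

If the `tick%61` branch is removed from this toy, `ticksUntilGlobal` never returns, which is the starvation the rule prevents.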

execute() → gogo() → user goroutine

func execute(gp *g, inheritTime bool) {
    casgstatus(gp, _Grunnable, _Grunning)  // G state: Runnable → Running
    gp.preempt = false
    _g_.m.curg = gp   // M → G (forward link)
    gp.m = _g_.m      // G → M (back link)
    gogo(&gp.sched)   // assembly: switch stack + JMP to gp's PC
}

The Full Scheduling Cycle

schedule()
    → execute()
        → gogo()          ← assembly: switch to G's stack
            → user func() ← G runs
            → goexit()    ← fake return (injected by newproc)
                → goexit1()
                    → mcall(goexit0)
                        → goexit0()
                            → casgstatus(_Grunning → _Gdead)
                            → dropg()        ← unbind M and G
                            → gfput(p, gp)   ← recycle G into gfree list
                            → schedule()     ← start next cycle  ♻️

This is the complete loop: schedule → execute → gogo → user code → goexit → schedule.
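
User code can end a goroutine early through runtime.Goexit, which runs the deferred calls and then terminates the goroutine without returning to its caller. As far as I can tell from the runtime source, it ends up on the same goexit1 path shown above. The sketch below only demonstrates the visible behaviour; `runAndExit` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"runtime"
)

// runAndExit (hypothetical helper) records what a goroutine does before and
// during runtime.Goexit.
func runAndExit() []string {
	var steps []string
	done := make(chan struct{})
	go func() {
		defer close(done) // runs last: signals that the goroutine is finished
		defer func() { steps = append(steps, "deferred") }()
		steps = append(steps, "start")
		runtime.Goexit() // deferred calls run, then the goroutine ends
		steps = append(steps, "never reached")
	}()
	<-done // the close happens after all writes to steps
	return steps
}

func main() {
	fmt.Println(runAndExit()) // [start deferred]
}
```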


5. Goroutine Preemption

Go uses a hybrid cooperative + preemptive scheduling model.

Cooperative Preemption (Go ≤ 1.13)

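  • sysmon detects a G that has been running for more than 10ms and sets g.preempt = true
  • In the prologue of most functions, the compiler inserts a stack-bound check; when the check fails it calls runtime.morestack_noctxt, and that path is also where the preempt flag is examined
  • If preempt == true, G detaches from M and is pushed to the global queue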

The loophole:

package main

import "fmt"

func main() {
    go fmt.Println("hi")
    for {}  // no function calls → preempt flag never checked → main never yields
}
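
Under a purely cooperative model, one way to close this loophole is to yield by hand. The sketch below uses runtime.Gosched for that; `spinUntilSet` is a hypothetical helper, and pinning GOMAXPROCS to 1 is an assumption made only so that the two goroutines must share a single P.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// spinUntilSet (hypothetical helper) busy-waits for another goroutine to set
// a flag, yielding the processor on every iteration.
func spinUntilSet() (yields int, set bool) {
	prev := runtime.GOMAXPROCS(1) // one P: the goroutines must take turns
	defer runtime.GOMAXPROCS(prev)

	var flag int32
	go func() { atomic.StoreInt32(&flag, 1) }()

	for atomic.LoadInt32(&flag) == 0 {
		runtime.Gosched() // explicit yield: lets the other goroutine run
		yields++
	}
	return yields, true
}

func main() {
	y, ok := spinUntilSet()
	fmt.Println("flag set:", ok, "after", y, "yields")
}
```

The loop terminates because each Gosched call hands the P back to the scheduler, which is exactly the step the empty `for {}` loop never takes.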

Signal-Based Async Preemption (Go ≥ 1.14)

Go 1.14 introduced true async preemption using OS signals:

Step 1: M registers sighandler for SIGURG at startup

Step 2: sysmon detects G running > 10ms
        → calls preemptone()
        → sends SIGURG to the M running that G

Step 3: M's sighandler fires
        → receives sigPreempt
        → calls doSigPreempt()
        → injects a CALL to runtime.asyncPreempt
          into G's execution context (modifies PC)

Step 4: G executes asyncPreempt (without knowing it)
        → mcall → switches to g0 stack
        → gopreempt_m() → schedule()

Step 5: G is preempted and re-queued
        When rescheduled, G resumes from original PC  ✅

Why SIGURG? It's rarely used by application code, making it a safe signal to hijack for the runtime.
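
A rough way to see async preemption at work is to run a call-free loop on a single P and check that another goroutine still gets scheduled. This is a sketch under assumptions: Go 1.14 or later, a platform with signal-based preemption, and no GODEBUG setting that disables it. `mainStillRuns` is a hypothetical helper, and on an older toolchain it would hang.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// mainStillRuns (hypothetical helper) starts a goroutine that spins without
// any function call on a single P, then reports whether the caller was
// scheduled again.
func mainStillRuns() bool {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	done := make(chan struct{})
	go func() {
		defer close(done)
		for atomic.LoadInt32(&stop) == 0 {
			// tight loop: no call, so no cooperative preemption point
		}
	}()

	time.Sleep(50 * time.Millisecond) // hands the only P to the spinner
	atomic.StoreInt32(&stop, 1)       // reached only if the spinner was preempted
	<-done
	return true
}

func main() {
	fmt.Println("caller was rescheduled:", mainStillRuns())
}
```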

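This mechanism also helps the GC: when it needs to stop the world, the runtime can use the same signal to interrupt goroutines stuck in tight loops, which could previously delay the pause indefinitely because they never reached a cooperative preemption point.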


6. gopark: Voluntarily Suspending a Goroutine

When a goroutine needs to wait (channel, mutex, I/O), it calls gopark:

func gopark(unlockf func(*g, unsafe.Pointer) bool, ...) {
    mp := acquirem()
    gp := mp.curg
    mp.waitunlockf = unlockf
    gp.waitreason = reason
    releasem(mp)
    mcall(park_m)   // switch to g0, call park_m
}

func park_m(gp *g) {
    casgstatus(gp, _Grunning, _Gwaiting)  // G: Running → Waiting
    dropg()                                // unbind M and G
    if fn := _g_.m.waitunlockf; fn != nil {
        ok := fn(gp, _g_.m.waitlock)
        if !ok {
            // condition already met, re-run immediately
            casgstatus(gp, _Gwaiting, _Grunnable)
            execute(gp, true)
        }
    }
    schedule()  // find next G to run
}
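
The wait reason that gopark records is visible from user code in a stack dump. A minimal sketch, with `sawParkedReceiver` as a hypothetical helper; it polls because the new goroutine needs a moment to reach the receive and park.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"time"
)

// sawParkedReceiver (hypothetical helper) starts a goroutine that blocks on a
// channel receive and reports whether a stack dump lists a goroutine in the
// "chan receive" state.
func sawParkedReceiver() bool {
	ch := make(chan int)
	go func() { <-ch }()
	defer close(ch) // wake the receiver on the way out

	buf := make([]byte, 1<<20)
	for i := 0; i < 200; i++ {
		n := runtime.Stack(buf, true) // true: include all goroutines
		if strings.Contains(string(buf[:n]), "[chan receive") {
			return true
		}
		time.Sleep(time.Millisecond)
	}
	return false
}

func main() {
	fmt.Println("found parked receiver:", sawParkedReceiver())
}
```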

G state transitions (a blocking system call moves G to _Gsyscall instead, not shown):

_Grunnable ──execute()──► _Grunning
    ▲                          │
    │                       gopark()
    │                          ▼
goready()              _Gwaiting
    ▲                          │
    └──── event fires ─────────┘

7. Goroutine Stacks

Unlike OS threads (which have fixed-size stacks, typically 1–8MB), goroutine stacks are dynamic and managed by the runtime.

type stack struct {
    lo uintptr   // lowest address of the stack
    hi uintptr   // highest address; the stack grows down from hi toward lo
}
  • Initial size: 2KB (allocated from stackpool or mcache)
  • Growth: When a function call would overflow the stack, the compiler-inserted morestack check triggers a stack copy to a larger allocation
  • Small stacks (< threshold): Allocated from stackpool / mcache.stackcache
  • Large stacks: Allocated from stackLarge or directly from mheap

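Goroutine stacks vs OS thread stacks: An OS thread's stack is a region of the process address space set up when the thread is created (the kernel tracks each thread in a task_struct). A goroutine's stack is memory the Go runtime allocates and manages itself, referenced by the g.stack field.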
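
Stack growth is easy to trigger from user code: a recursion 100,000 frames deep needs far more than the initial 2KB, yet it runs without any stack-size configuration. `sumTo` and `deepSum` below are hypothetical helpers for this sketch.

```go
package main

import "fmt"

// sumTo recurses n levels deep. The addition happens after the recursive
// call returns, so every level keeps a frame on the goroutine's stack.
func sumTo(n int64) int64 {
	if n == 0 {
		return 0
	}
	return n + sumTo(n-1)
}

// deepSum (hypothetical helper) runs the recursion on a fresh goroutine,
// whose stack starts small and is grown by the runtime as needed.
func deepSum(n int64) int64 {
	result := make(chan int64)
	go func() { result <- sumTo(n) }()
	return <-result
}

func main() {
	fmt.Println(deepSum(100000)) // 5000050000
}
```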


8. Goroutine Leaks

A goroutine leak occurs when a goroutine starts but never exits — typically because it's blocked forever on a channel, mutex, or I/O that never resolves.

Why leaks are serious:

  • Each goroutine holds at least 2KB of stack memory
  • Heap objects referenced by the goroutine cannot be GC'd
  • Memory usage grows continuously → eventual OOM crash

Common leak patterns:

// Leak: channel send with no receiver
func leak() {
    ch := make(chan int)
    go func() {
        ch <- 1  // blocks forever if no one reads
    }()
    // ch is never read
}

// Leak: infinite loop with no exit condition
func leak2(done <-chan struct{}) {
    go func() {
        for {
            // forgot to check <-done
        }
    }()
}

Detection: Use go tool pprof with the goroutine profile to see every live goroutine and the stack where it is currently blocked.


9. Summary: The Complete Scheduling Picture

go func() { ... }
      ↓
runtime.newproc() → allocate G, push to P's LRQ
      ↓
M's schedule loop picks up G
      ↓
execute() → gogo() → G runs on M
      ↓
Preemption? ──► SIGURG → asyncPreempt → schedule()
Blocking?   ──► gopark() → _Gwaiting → schedule()
Syscall?    ──► entersyscall → P detaches → exitsyscall
Done?       ──► goexit() → goexit0() → gfput() → schedule()
| Concept | Role |
|---|---|
| schedule() | Find next runnable G (LRQ → global queue → steal → netpoller) |
| execute() | Set G state to Running, bind M↔G, call gogo |
| gogo() | Assembly: switch stack + JMP to G's PC |
| goexit() | Cleanup after G finishes: reset, recycle, re-schedule |
| gopark() | Voluntarily suspend G: Running → Waiting → schedule() |
| sysmon | Background monitor: preempt long Gs, reclaim stuck Ps |
| Signal preemption | SIGURG → inject asyncPreempt → force schedule() |

Next in this series: Go Compiler & defer: Bootstrap Compiler, pprof+trace Toolchain & defer Internals (Part 6)


Follow the series for more deep dives into Go's runtime internals.
