The Complex and Beautiful Truth About Go's Concurrency Model
The Core Revelation: Goroutines Are Virtual, But Their Behavior Is Real
The most mind-bending insight: A goroutine has no physical form in your operating system. It is not in your process table. It is not a real OS resource.
A goroutine is a virtual thread.
Let me illustrate with a powerful analogy:
The Facebook Profile Analogy
Consider your social media profile. The profile itself is virtual—no flesh and blood. It's a logical construct. But when your virtual profile sends a message, a real person reads it. The behavior is real. The effect is real. The impact on the real world is undeniable.
Goroutines work the same way.
Virtual Profile → Sends Message → Real Person Reads It
Virtual Thread → Executes Code → Real Effect on State
A goroutine is logical. It exists as a concept within the Go Runtime. But when it executes, when it modifies variables, when it prints to stdout, when it sends over a network—those effects are profoundly real, executed by real OS threads on real CPU cores.
This distinction is not mere philosophy. It's the foundation for understanding everything that follows.
The Code Simulation: What Actually Happens
Let's trace through a simple example:
```go
package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Println("main function started")
	go fmt.Println("hello this is Islam Saiful-5")
	goRoutine()
	time.Sleep(5 * time.Second)
	fmt.Println("main function ended")
}

func goRoutine() {
	go fmt.Println("hello this is saiful")
	fmt.Println("hello world")
	go fmt.Println("hello this is saiful2")
	go fmt.Println("hello this is saiful3")
	go fmt.Println("hello this is saiful4")
	fmt.Println("bye world")
}
```
Without the go Keyword (Sequential Execution)
If we remove all go keywords:
Output:

```
main function started
hello this is Islam Saiful-5
hello this is saiful
hello world
hello this is saiful2
hello this is saiful3
hello this is saiful4
bye world
main function ended
```
The execution is linear, predictable, deterministic. One thing after another.
With the go Keyword (Concurrent Execution)
With the go keywords, multiple things happen "simultaneously":
Output (Run 1):

```
main function started
hello world
bye world
hello this is saiful
hello this is saiful2
hello this is saiful3
hello this is saiful4
hello this is Islam Saiful-5
main function ended
```
Output (Run 2):

```
main function started
hello this is Islam Saiful-5
hello world
hello this is saiful
bye world
hello this is saiful4
hello this is saiful3
hello this is saiful2
main function ended
```
Output (Run 3):

```
main function started
hello world
hello this is saiful3
hello this is saiful
bye world
hello this is saiful4
hello this is Islam Saiful-5
hello this is saiful2
main function ended
```
Notice: The order is different every time. The goroutines are executing concurrently, and their relative ordering is non-deterministic.
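When you do need a deterministic order among goroutines, you coordinate them explicitly rather than hoping the scheduler cooperates. The sketch below chains goroutines through channels so each one waits for its predecessor; the helper name `printInOrder` is illustrative, not part of the original program.

```go
package main

import "fmt"

// printInOrder launches one goroutine per message but forces a
// deterministic sequence: each goroutine blocks until the previous
// one has delivered its message, then signals the next.
func printInOrder(messages []string) []string {
	out := make([]string, 0, len(messages))
	done := make(chan string)
	prev := make(chan struct{})
	close(prev) // the first goroutine may start immediately
	for _, msg := range messages {
		next := make(chan struct{})
		go func(msg string, wait <-chan struct{}, signal chan<- struct{}) {
			<-wait        // block until the previous goroutine finished
			done <- msg   // hand the message to the collector
			close(signal) // let the next goroutine proceed
		}(msg, prev, next)
		prev = next
	}
	for range messages {
		out = append(out, <-done)
	}
	return out
}

func main() {
	fmt.Println(printInOrder([]string{"first", "second", "third"}))
}
```

Non-determinism is the default; determinism is something you build with channels.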
The Critical Question: Why Do We Need time.Sleep?
```go
time.Sleep(5 * time.Second) // Why is this line essential?
```
This is the barrier protecting you from a hard truth: The main goroutine is a tyrant. When it finishes, it terminates the entire process—no exceptions.
If main returns before other goroutines complete, they are instantly killed. The OS doesn't care about them. The Go Runtime doesn't get a say. The process exits. Period.
The time.Sleep is a crude but effective way to keep the main goroutine alive long enough for others to finish. Without it:
```go
func main() {
	go fmt.Println("Will this print?")
	// Nope. Main returns, process dies.
}
```
The answer is no. You never see that output.
This is why understanding the main goroutine's dominance is crucial.
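In real code you rarely reach for `time.Sleep`, because guessing a duration is fragile: too short and goroutines are killed, too long and the program stalls. The idiomatic tool is `sync.WaitGroup`, which makes the main goroutine block exactly until every worker has finished. A minimal sketch (the helper name `run` is my own):

```go
package main

import (
	"fmt"
	"sync"
)

// run launches n goroutines and blocks until all have finished,
// without guessing a sleep duration.
func run(n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	count := 0
	for i := 0; i < n; i++ {
		wg.Add(1) // register one more goroutine before launching it
		go func() {
			defer wg.Done() // signal completion even if the body panics
			mu.Lock()
			count++ // shared variable, so it must be mutex-protected
			mu.Unlock()
		}()
	}
	wg.Wait() // main goroutine parks here until every Done has run
	return count
}

func main() {
	fmt.Println("all goroutines finished:", run(5))
}
```

`wg.Wait()` replaces the arbitrary sleep: main stays alive precisely as long as its workers need, and not a microsecond longer.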
The Birth of a Process: Disk → Binary → RAM → Execution
Step 1: Compilation (go build main.go)
When you compile your Go code:
```bash
go build main.go
```
You create a binary executable file. This file is structured:
Binary File
├── Code Segment
│ ├── Machine instructions (functions)
│ └── Constants (read-only)
├── Data Segment
│ └── Global variables (initialized)
└── BSS Segment
└── Uninitialized globals
This binary sits on your hard disk, inert and lifeless. It's potential energy.
Step 2: Execution (./main)
When you run the binary:
```bash
./main
```
The OS loader springs into action:
- Loads the binary into RAM from the hard disk
- Allocates memory for the process
- Creates a process structure (virtual computer)
- Creates the main thread (first execution context)
- Jumps to the entry point (typically, the Go Runtime initialization)
Now the binary transforms:
┌──────────────────────────┐
│ Hard Disk │ (Inert binary file)
└─────────┬────────────────┘
│ OS Loader
↓
┌──────────────────────────┐
│ RAM (Memory Layout) │
├──────────────────────────┤
│ Code Segment │ ← Machine instructions
├──────────────────────────┤
│ Data Segment │ ← Global variables
├──────────────────────────┤
│ Stack │ ← Function calls, local vars
├──────────────────────────┤
│ Heap │ ← Dynamic memory
└──────────────────────────┘
↓
┌──────────────────────────┐
│ CPU Execution │
│ (Fetches, Decodes, │
│ Executes instructions) │
└──────────────────────────┘
This is where your program comes alive.
Enter the Go Runtime: The Mini-Operating System
This is the game-changer. The Go Runtime is a mini operating system running inside your Go process.
Think about it: The OS is a program that manages hardware, schedules threads, allocates memory. The Go Runtime does the same thing, but at a higher level, with different resources (goroutines instead of threads, logical processors instead of physical cores).
Timeline of Execution
1. OS loads binary into RAM
2. Process created with main thread
3. Main thread starts at the entry point
4. Go Runtime INITIALIZES (before your code runs!)
5. Go Runtime sets up:
- 8MB main stack
- Goroutine Scheduler
- Heap Allocator
- Garbage Collector
- Logical Processors
6. THEN your main() function executes
7. When main() returns, Go Runtime shuts down
8. Process terminates
Key insight: Your code doesn't have exclusive control. The Go Runtime is always present, always managing, always orchestrating.
The Four Core Components
1. Goroutine Scheduler
The traffic controller of your program. It:
- Tracks all goroutines
- Decides which goroutine runs when
- Manages the G-M-P model (Goroutine-Machine-Processor)
- Works like the OS kernel scheduler, but in user-space
2. Heap Allocator
The memory banker. It:
- Allocates memory for goroutine stacks
- Manages `make()` and `new()` allocations
- Tracks where every byte lives
- Works alongside the Garbage Collector
3. Garbage Collector
The janitor of memory. It:
- Identifies unreachable memory
- Reclaims it automatically
- Runs concurrently with your code
- Uses mark-and-sweep algorithms
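You can watch the Garbage Collector work through the standard `runtime` package. This sketch (the helper name `gcDelta` is mine) allocates garbage, forces a collection with `runtime.GC()`, and reads the cycle count from `runtime.MemStats`:

```go
package main

import (
	"fmt"
	"runtime"
)

// gcDelta allocates garbage, forces a collection, and returns how many
// GC cycles ran in between, read from runtime.MemStats.
func gcDelta() uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)

	// Allocate memory that becomes unreachable on every iteration:
	// exactly the kind of garbage the collector reclaims.
	for i := 0; i < 1000; i++ {
		_ = make([]byte, 1<<16)
	}

	runtime.GC() // force a mark-and-sweep cycle so the effect is visible
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	fmt.Println("GC cycles triggered:", gcDelta())
}
```

In normal programs you never call `runtime.GC()` yourself; the collector runs concurrently and automatically. The explicit call here only makes the janitor's work observable.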
4. Logical Processors (P)
Virtual CPUs. They:
- Default to one per CPU core (controlled by GOMAXPROCS), so a 4-core CPU gets 4 Logical Processors
- Each has a run queue of goroutines
- Each is paired with an OS thread (M)
CPU has 4 cores
↓
Go Runtime creates 4 Logical Processors (P)
↓
OS creates 4 OS Threads (M)
↓
Each M executes goroutines from its P's queue
The Complete Layer Hierarchy: From CPU to Your Code
Understanding concurrent execution requires understanding all layers:
┌────────────────────────────────────────────┐
│ Your Go Code │
│ func main() { go doWork() } │
└────────────────────────────────────────────┘
↓
┌────────────────────────────────────────────┐
│ Goroutines (G) │
│ - Virtual threads │
│ - 2KB initial stack │
│ - Auto-growing stacks (heap) │
│ - Thousands can exist │
└────────────────────────────────────────────┘
↓
┌────────────────────────────────────────────┐
│ Logical Processors (P) │
│ - Virtual CPUs │
│ - Count = runtime.NumCPU() │
│ - Each has a run queue of Gs │
│ - Owned by Go Runtime Scheduler │
└────────────────────────────────────────────┘
↓
┌────────────────────────────────────────────┐
│ OS Threads (M) │
│ - Real OS threads │
│ - 8MB stack each (kernel memory) │
│ - ~1 per P │
│ - Owned by OS Kernel │
└────────────────────────────────────────────┘
↓
┌────────────────────────────────────────────┐
│ Go Runtime Scheduler │
│ - Maps G → P → M │
│ - User-space scheduling │
│ - Work-stealing algorithm │
└────────────────────────────────────────────┘
↓
┌────────────────────────────────────────────┐
│ OS Kernel Scheduler │
│ - Schedules OS threads (M) │
│ - Kernel-space scheduling │
│ - Preemptive scheduling │
└────────────────────────────────────────────┘
↓
┌────────────────────────────────────────────┐
│ CPU Cores │
│ - Physical execution │
│ - Execute machine instructions │
│ - Control Unit, Program Counter, │
│ Registers │
└────────────────────────────────────────────┘
This is the symphony. Each layer abstracts the one below, providing a simplified interface. Your code sees only goroutines. The Go Runtime handles the rest.
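The G layer of this hierarchy is directly observable: `runtime.NumGoroutine()` counts the goroutines the Go Runtime is currently tracking, goroutines the OS never sees individually. A small sketch (the helper name `parkedDelta` is mine):

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parkedDelta launches n goroutines that block on a channel, then
// reports how much runtime.NumGoroutine() grew. Each G exists only
// inside the Go Runtime; the OS kernel only sees the few M threads.
func parkedDelta(n int) int {
	before := runtime.NumGoroutine()
	block := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-block // park: the scheduler moves this G off its M
		}()
	}
	delta := runtime.NumGoroutine() - before
	close(block) // release all parked goroutines at once
	wg.Wait()
	return delta
}

func main() {
	fmt.Println("extra goroutines while parked:", parkedDelta(10))
}
```

Ten thousand parked goroutines would cost the OS nothing extra in threads; the runtime simply keeps ten thousand more G structures and 2KB stacks.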
The 2KB Secret: Why Goroutines Are Lightweight
This is where goroutines become magical.
OS Thread Stack: Fixed 8MB
When the OS creates a thread, it reserves a fixed-size stack up front—commonly 8MB on Linux. Whether the thread uses 1KB or 7.9MB, the OS has set aside 8MB.
Implication: You can create only thousands of threads. Beyond that, you run out of memory.
1,000,000 threads × 8 MB = 8,000,000 MB = 8 TB
No modern system has 8TB of memory for thread stacks.
Goroutine Stack: 2KB Initial, Dynamic Growth
A goroutine starts with 2KB—that's a 4000:1 ratio.
But here's the magic: It's not fixed. When a goroutine needs more stack (due to nested function calls), the Go Runtime reallocates:
2KB stack is full
↓
Go Runtime detects overflow
↓
Allocates new 4KB stack in heap
↓
Copies all data from old to new
↓
Deletes old stack
↓
Continues execution seamlessly
This is transparent to your code. You never see it. It just works.
Implication: You can create millions of goroutines.
1,000,000 goroutines × 2 KB = 2,000,000 KB = 2 GB
Most modern systems have 2GB of RAM available.
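Because each goroutine starts at roughly 2KB, launching 100,000 of them is routine. This sketch (the helper name `spawn` is mine) does exactly that and waits for all of them; the equivalent with OS threads would need on the order of 800GB of stack reservations:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// spawn launches n goroutines and waits for all of them. With ~2KB
// initial stacks this is cheap even for n = 100,000.
func spawn(n int) int64 {
	var wg sync.WaitGroup
	var counter int64
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&counter, 1) // lock-free shared counter
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println("goroutines completed:", spawn(100000))
}
```

On a typical machine this finishes in well under a second. Try the same with 100,000 OS threads and the process dies long before it gets there.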
The Memory Efficiency Advantage
| Aspect | OS Thread | Goroutine |
|---|---|---|
| Stack Size | 8 MB (fixed) | 2 KB (initial) |
| Stack Location | Kernel memory | Heap memory |
| Growth | None | Dynamic |
| Max Stack | Fixed | Up to 1 GB |
| Creation Overhead | High (syscall) | Low (runtime call) |
| Thousands Possible? | Yes (~4,000 max) | Yes |
| Millions Possible? | No | Yes |
This memory efficiency is why Go can handle massive concurrency. This is why you can build a server handling 1 million concurrent connections. This is why goroutines exist.
The Scheduling Model: G-M-P (from BGCE Archive)
The Go Runtime's scheduler implements the G-M-P model:
- G = Goroutine (user-created concurrent units)
- M = Machine (OS thread)
- P = Processor (logical CPU)
How the Scheduler Works
You write: go printHello(1)
↓
Go Runtime creates Goroutine G1
↓
Scheduler adds G1 to a Processor's run queue
↓
When a Machine (OS thread) is free on that Processor
↓
M picks G1 from P's queue
↓
M executes G1 on the CPU
↓
When G1 blocks or finishes
↓
M picks next G from queue
↓
Repeat
Visual Example
Imagine a CPU with 4 cores:
Go Runtime Scheduler
│
┌────┼────┬────┐
↓ ↓ ↓ ↓
P1 P2 P3 P4 (4 Logical Processors)
│ │ │ │
M1 M2 M3 M4 (4 OS Threads)
│ │ │ │
G1,G5,G9 G2,G6,G10 G3,G7,G11 G4,G8,G12
│ │ │ │
(4 Gs per queue, 12 total goroutines)
│ │ │ │
↓ ↓ ↓ ↓
Core1 Core2 Core3 Core4 (Physical CPU Cores)
The scheduler's job: Keep those 4 cores busy by swapping goroutines in and out.
Scheduling Example: 100,000 Goroutines
100,000 goroutines on 4 cores:
- P1, P2, P3, P4 each have a queue of ~25,000 Gs
- M1, M2, M3, M4 rapidly swap goroutines
- If G1 blocks on I/O, M1 picks G2 from queue
- Context switching happens in microseconds
- From CPU's perspective, all 4 cores are always busy
- From your perspective, 100,000 things happen "simultaneously"
This is the magic. Only 4 goroutines execute in parallel at any instant; the other 99,996 are interleaved so rapidly that all 100,000 appear to run simultaneously. That is concurrency.
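You can strip parallelism out entirely and watch pure concurrency at work. This sketch (the helper name `interleave` is mine) pins the scheduler to a single P, so the two workers can never run at the same time, yet both still finish because the scheduler interleaves them:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// interleave pins the scheduler to one logical processor, so at most
// one goroutine runs at any instant, then counts the steps both
// workers still complete: concurrency without parallelism.
func interleave() int {
	old := runtime.GOMAXPROCS(1) // one P: no parallel execution possible
	defer runtime.GOMAXPROCS(old)

	var mu sync.Mutex
	steps := 0
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 3; j++ {
				mu.Lock()
				steps++
				mu.Unlock()
				runtime.Gosched() // yield the P so the other worker can run
			}
		}()
	}
	wg.Wait()
	return steps
}

func main() {
	fmt.Println("total steps completed:", interleave())
}
```

`runtime.Gosched()` is rarely needed in real code (the runtime preempts goroutines on its own); here it just makes the handoff between the two workers explicit.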
📚 Stack & Heap: Where Goroutines Live
Main Goroutine vs Other Goroutines
Main Goroutine:
- Executes main() function
- Stack in kernel memory (8MB)
- Special status: only one per process
- When it exits, process terminates
Other Goroutines:
- Created with `go func()`
- Stack in heap memory (2KB initial)
- Completely interchangeable
- Process continues even if they exit
Memory Layout During Execution
Process Memory
├─ Kernel Stack (8MB for main goroutine)
│ ├─ main()
│ ├─ printHello()
│ ├─ fmt.Println()
│ └─ ... (other function frames)
│
└─ Heap
├─ Goroutine 1 Stack (2KB → 4KB → 8KB)
│ ├─ printHello(1)
│ ├─ fmt.Println()
│ └─ ...
│
├─ Goroutine 2 Stack (2KB)
│ ├─ printHello(2)
│ └─ ...
│
├─ Goroutine 3 Stack (2KB → grows)
│ └─ ...
│
└─ ... (more goroutines)
Each goroutine is independent. Their stacks are separate, managed individually. When a goroutine needs more stack, the Go Runtime handles it—allocating new space, copying data, updating pointers.
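The stack-growth mechanism described above can be exercised directly. This sketch (the helper name `deep` is mine) recurses far past what a 2KB stack could hold; the runtime grows the stack transparently, exactly as the grow-copy-delete sequence describes:

```go
package main

import "fmt"

// deep recurses n times, each frame holding a 256-byte local buffer.
// The Go Runtime grows this goroutine's stack transparently as frames
// pile up; the same depth on a fixed 2KB stack would overflow almost
// immediately.
func deep(n int) int {
	var buf [256]byte // 256 bytes of locals per stack frame
	buf[0] = byte(n)
	if n == 0 {
		return int(buf[0])
	}
	return deep(n-1) + 1
}

func main() {
	// ~10,000 frames x 256+ bytes is several MB of stack, grown on
	// demand from the initial 2KB, with no hint of it in the code.
	fmt.Println("recursion depth reached:", deep(10000))
}
```

Nothing in the function acknowledges the reallocation; the grow-and-copy happens entirely inside the runtime.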
Summarized
- Goroutines are virtual threads
  - Logical, not physical
  - Managed by the Go Runtime, not the OS
  - Their behavior is real; their existence is virtual
- Go Runtime is a mini operating system
  - Initializes before your code runs
  - Manages the scheduler, allocator, and garbage collector
  - Orchestrates everything transparently
- Memory efficiency is the secret
  - 2KB goroutine vs 8MB OS thread (4000:1 ratio)
  - Dynamic stack growth in heap memory
  - Millions possible, not thousands
- Scheduling is sophisticated
  - G-M-P model: Goroutines → Processors → Machines → CPU
  - Work-stealing algorithm for load balancing
  - Non-deterministic by design, not by accident
- The main goroutine is your control point
  - First goroutine to run
  - The process persists only while it's alive
  - Control its lifetime via blocking mechanisms
- Non-determinism is a feature
  - Surfaces unsafe ordering assumptions early
  - Pushes you toward channels over shared memory
  - Lets programs scale with confidence
- Layers of abstraction protect you
  - You write code; you don't manage threads
  - The Go Runtime handles scheduling
  - The OS kernel handles thread execution
  - The CPU handles the actual computation