
Athreya aka Maneshwar
Understanding Goroutines, Concurrency, and Parallelism in Go

Hello, I'm Maneshwar. I'm currently building FreeDevTools online, *one place for all dev tools, cheat codes, and TLDRs*: a free, open-source hub where developers can quickly find and use tools without the hassle of searching all over the internet.

When you first dive into Go, one of the most exciting things you’ll encounter is goroutines, a simple yet incredibly powerful way to handle concurrency and parallelism.

But what exactly are goroutines? How do they work? And what’s the real difference between concurrency and parallelism when your machine has multiple CPU cores?

Let’s break it all down with code and deep insight.

What Are Goroutines?

A goroutine is a lightweight, independently executing function managed by the Go runtime.
You start one with the go keyword:

```go
go myFunction()
```

That line tells Go:

“Run myFunction() in the background — don’t block the current function.”

Unlike OS threads, goroutines are extremely cheap — you can spawn thousands or even millions of them.
Each starts with a small stack (around 2 KB) that grows dynamically, unlike fixed-size thread stacks.

How Goroutines Work Internally

Go uses an M:N scheduler, meaning:

  • M = number of OS threads
  • N = number of goroutines

The runtime maps many goroutines (N) onto a smaller pool of OS threads (M) and distributes them across CPU cores.

It handles context switching, scheduling, and synchronization behind the scenes; there's no need to deal with pthread_create() or thread pools manually.

Concurrency vs Parallelism

These two are often confused, but they’re different ideas.

| Term | Meaning | Analogy |
| --- | --- | --- |
| Concurrency | Multiple tasks in progress (not necessarily simultaneous) | Cooking dinner while checking your phone — switching rapidly |
| Parallelism | Multiple tasks literally running at the same time | You and a friend cooking different dishes on separate stoves |

Goroutines give you concurrency by default.
You get parallelism only when multiple cores execute goroutines at the same time.

Example 1: Concurrency (Single Core)

Here’s what concurrency looks like when Go is limited to a single CPU core:

```go
package main

import (
    "fmt"
    "runtime"
    "time"
)

func worker(id int) {
    for i := 1; i <= 3; i++ {
        fmt.Printf("Worker %d running iteration %d\n", id, i)
        time.Sleep(400 * time.Millisecond)
    }
}

func main() {
    runtime.GOMAXPROCS(1) // use only 1 logical processor, so goroutines can't run in parallel
    fmt.Println("Using GOMAXPROCS:", runtime.GOMAXPROCS(0))
    fmt.Println("---- Concurrency Demo ----")

    go worker(1)
    go worker(2)
    go worker(3)

    time.Sleep(3 * time.Second) // crude wait so main doesn't exit before the workers finish
}
```


Even though all three goroutines are active, only one CPU core executes them at any given time.
The scheduler rapidly switches between them, creating the illusion of simultaneous work.

That’s concurrency — overlapping in time, not execution.

Example 2: Parallelism (Multi-Core)

Now let’s see true parallelism by letting Go use all available CPU cores:

```go
package main

import (
    "fmt"
    "runtime"
    "sync"
    "time"
)

func heavyWork(id int, wg *sync.WaitGroup) {
    defer wg.Done()
    start := time.Now()
    sum := 0
    for i := 0; i < 5e7; i++ {
        sum += i
    }
    fmt.Printf("Worker %d done in %v (sum=%d)\n", id, time.Since(start), sum)
}

func main() {
    cores := runtime.NumCPU()
    runtime.GOMAXPROCS(cores)

    fmt.Printf("Detected %d cores\n", cores)
    fmt.Println("---- Parallelism Demo ----")

    var wg sync.WaitGroup
    for i := 1; i <= cores; i++ {
        wg.Add(1)
        go heavyWork(i, &wg)
    }

    wg.Wait()
    fmt.Println("All workers done.")
}
```


Each goroutine performs CPU-heavy work.
The Go scheduler spreads them across all CPU cores, running truly in parallel.

On an 8-core CPU, the total runtime will be roughly the time of one heavyWork() run — not eight times longer.
That’s real parallelism.

How Go Uses Your Cores

You can inspect and control how many cores Go uses:

```go
fmt.Println(runtime.NumCPU())      // total logical CPUs
fmt.Println(runtime.GOMAXPROCS(0)) // CPUs Go is currently using
```

Since Go 1.5, Go automatically uses all available logical CPUs by default.
You can override this using:

```go
runtime.GOMAXPROCS(n)
```

For example, setting GOMAXPROCS(2) on an 8-core system means Go will schedule goroutines across only 2 cores.

CPU-bound vs I/O-bound Workloads

When do extra cores actually help? Depends on your workload:

| Type | Description | Examples | Benefit of Multi-Core |
| --- | --- | --- | --- |
| CPU-bound | Heavy computation | math, compression, image processing | ✅ Huge speedup |
| I/O-bound | Waiting for network/disk I/O | API calls, DB queries, file I/O | ⚙️ Limited (mostly concurrency) |
| Mixed | Compute + I/O | API + post-processing | ⚡ Good |

So, more cores help only when your goroutines are actually doing CPU work, not just waiting.

Waiting for Goroutines to Finish

By default, the program exits as soon as main() returns, killing any goroutines that are still running.

Use sync.WaitGroup to wait until all complete:

```go
var wg sync.WaitGroup

for i := 1; i <= 5; i++ {
    wg.Add(1)
    go func(id int) {
        defer wg.Done()
        fmt.Println("Worker", id, "done")
    }(i)
}

wg.Wait() // blocks until all goroutines complete
```

How the Go Scheduler Works (Deep Dive)

The Go runtime scheduler revolves around three components:

| Term | Meaning |
| --- | --- |
| G | Goroutine (your lightweight task) |
| M | Machine (an OS thread) |
| P | Processor (a logical context that schedules Gs onto Ms) |

There are always GOMAXPROCS number of Ps.
Each P attaches to one M at a time and feeds it Gs from its own local run queue.
If one P runs out of work, it “steals” goroutines from another — this is work stealing, which keeps all CPUs busy efficiently.

That’s how Go can handle millions of goroutines on a few OS threads.

Conclusion

Goroutines are Go’s superpower: they make concurrent and parallel programming both simple and efficient.

  • Use goroutines for concurrency (many tasks “in progress”)
  • Use GOMAXPROCS(runtime.NumCPU()) for parallelism (all cores active)
  • Combine both for massive scalability

Even with millions of goroutines, Go stays efficient thanks to its work-stealing scheduler, dynamic stack sizing, and M:N threading model.

So the next time your 8-core CPU is chilling, let Go do the heavy lifting — effortlessly.

FreeDevTools

I’ve been building FreeDevTools: a collection of UI/UX-focused tools crafted to simplify workflows, save time, and reduce the friction of searching for tools and materials.

Feedback and contributors are welcome!

It’s online, open-source, and ready for anyone to use.

👉 Check it out: FreeDevTools
⭐ Star it on GitHub: freedevtools
