Serif COLAKEL
Go Memory Profiling & Performance Debugging — Real-World Guide to pprof

Memory issues in Go aren’t always dramatic.
Sometimes they’re silent: your service boots at 200MB, runs fine, then slowly grows to 2GB+ over the day.

No panics.
No spikes in CPU.
Just… memory creep.

If you’ve ever faced this, welcome — this guide is the practical, real-world walkthrough I wish I had years ago.

In this article, we’ll explore:

  • how to correctly use pprof (CPU, heap, block, mutex)
  • how to read flamegraphs
  • how to diagnose true memory leaks
  • how to analyze production memory spikes
  • how concurrency patterns impact memory
  • real-world case studies

Let’s dive in. 🚀


🌡️ 1. pprof Basics (CPU, Memory, Block, Mutex)

Go ships with profiling tools that many languages envy. With almost no setup, you can inspect:

  • CPU profiling → where time is spent
  • Memory (heap) → what allocates / what retains
  • Block profiling → goroutines blocked on channels/locks
  • Mutex profiling → lock contention

Enabling it is simple:

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

// somewhere in your startup code:
go func() {
    log.Println(http.ListenAndServe("0.0.0.0:6060", nil))
}()

Then:

curl http://localhost:6060/debug/pprof/heap > heap.out
curl "http://localhost:6060/debug/pprof/profile?seconds=30" > cpu.out

Or open the UI:

go tool pprof -http=:9999 heap.out
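Note that block and mutex profiles stay empty until you opt in with a sampling rate. A minimal sketch, with rates that are illustrative rather than prescriptive:

import "runtime"

func init() {
    // Record every blocking event (pass 1 to include them all).
    runtime.SetBlockProfileRate(1)
    // Sample roughly 1 in 100 mutex contention events.
    runtime.SetMutexProfileFraction(100)
}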

🧠 2. alloc_space vs inuse_space (Critical!)

When reading heap profiles, you’ll see:

alloc_space

Total memory allocated over time (cumulative).
→ Great for spotting allocation-heavy functions.

inuse_space

Memory currently retained and in use.
→ This is where memory leaks appear.

Common mistake:
People confuse “allocations” with “leaks”.

A leak shows up in inuse_space, not alloc_space.
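You can ask pprof for either view explicitly when opening a heap profile:

go tool pprof -sample_index=inuse_space heap.out
go tool pprof -sample_index=alloc_space heap.out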


🏥 3. Case Study: “Why is my Go service using 2GB RAM?”

A very real production issue:

  • service normally uses ~200MB
  • memory grows slowly over hours
  • GC runs frequently
  • CPU normal
  • RAM never drops

Step 1 — Capture the heap

curl http://localhost:6060/debug/pprof/heap > heap.out

Step 2 — Inspect

go tool pprof -top heap.out

Look for:

  • large inuse_space
  • suspicious data structures
  • packages that shouldn’t hold memory

Step 3 — Visualize (the real magic)

go tool pprof -http=:9999 heap.out

Typical root causes:

  • unbounded channels
  • slices that grow but never shrink
  • caches with no eviction
  • goroutines stuck holding references
  • large buffers reused incorrectly

Every real memory issue I’ve diagnosed involved the flamegraph.
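As a concrete illustration of the "caches with no eviction" case above (a hypothetical pattern, not code from this incident): a map that is only ever written to keeps every value reachable, so it keeps growing in inuse_space.

// package-level cache with no eviction: every entry stays reachable,
// so the GC can never reclaim it
var cache = map[string][]byte{}

func handleRequest(key string, payload []byte) {
    cache[key] = payload // grows forever under unique keys
}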


🔥 4. Reading Flamegraphs (Fast, Practical Guide)

When you open the pprof UI, the flamegraph is your best friend.

How to read it:

  • wide boxes = lots of memory attributed to that call path
  • leaf boxes (the ends of the stacks) = where the allocation actually happens
  • colors group frames by package, not by cost
  • tall stacks = deep call chains

Look for:

  • a wide block at the very bottom
  • repeated patterns
  • unexpected external packages
  • functions returning large objects

If something looks “too wide,” it probably is.


🟢 5. Safe Production Profiling (Do’s & Don’ts)

Yes — you can run pprof in production.

✔ Safe:

  • heap profiles
  • short CPU profiles (5–15s)
  • mutex/block profiles
  • profiling internal-only endpoints

✘ Dangerous:

  • long CPU profiles on high-traffic systems
  • exposing /debug/pprof publicly
  • profiling during incident response without sampling

Recommended setup:

import (
    "net/http"
    _ "net/http/pprof"
)

func init() {
    go func() {
        // localhost only: never expose /debug/pprof on a public interface
        http.ListenAndServe("127.0.0.1:6060", nil)
    }()
}

Keep it on localhost or behind internal ingress only.
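If your public server also uses http.DefaultServeMux, the blank import above registers /debug/pprof there as well. One way around that, sketched with handlers from net/http/pprof (startPprof is just an illustrative name), is a dedicated localhost-only mux:

import (
    "net/http"
    "net/http/pprof"
)

func startPprof() {
    mux := http.NewServeMux()
    mux.HandleFunc("/debug/pprof/", pprof.Index)          // index page + named profiles (heap, goroutine, ...)
    mux.HandleFunc("/debug/pprof/profile", pprof.Profile) // CPU profile
    mux.HandleFunc("/debug/pprof/trace", pprof.Trace)     // execution trace
    go http.ListenAndServe("127.0.0.1:6060", mux)
}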


🧪 6. Production Memory Profiling Checklist

Before concluding “it’s a leak,” verify:

✔ Compare multiple heap snapshots
✔ Look at inuse_space, not alloc_space
✔ Check GC frequency (GODEBUG=gctrace=1)
✔ Capture goroutine dump (debug/pprof/goroutine)
✔ Inspect channel sizes
✔ Confirm no unbounded caches
✔ Look for large slices/maps retaining data

Memory issues rarely come from one single spot — they’re usually behavioral patterns.
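Two of those checks are one-liners (./your-service stands in for your binary):

GODEBUG=gctrace=1 ./your-service                                              # one summary line per GC cycle on stderr
curl "http://localhost:6060/debug/pprof/goroutine?debug=2" > goroutines.txt   # full goroutine stack dump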


⚡ 7. Memory Spikes During Traffic Surges

Traffic surges often cause:

  • temporary buffer spikes
  • batching behavior
  • backpressure in channels
  • worker pools expanding
  • slow consumers holding memory longer
  • GC pauses misaligned with load

To debug:

curl http://localhost:6060/debug/pprof/heap > spike.out    # captured during the surge
curl http://localhost:6060/debug/pprof/heap > normal.out   # captured at normal load

Then compare:

go tool pprof -diff_base normal.out spike.out

This shows what changed during the spike.
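The same diff can be explored in the web UI:

go tool pprof -http=:9999 -diff_base normal.out spike.out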


🔄 8. How Concurrency Affects Memory

Channels

Buffered channels that fill up → the backlog itself holds memory.
Slow consumers → blocked senders pile up, each goroutine retaining its payload.
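A sketch of the bounded alternative (Job, jobs, and the 1024 cap are illustrative): the buffer size caps the backlog, and a non-blocking send lets the producer shed load instead of growing memory.

type Job struct{ Payload []byte }

// The buffer size is the maximum backlog you are willing to hold in memory.
var jobs = make(chan Job, 1024)

func enqueue(j Job) bool {
    select {
    case jobs <- j:
        return true // enqueued
    default:
        // Queue full: drop, retry later, or return an error to the caller
        // instead of letting memory grow without bound.
        return false
    }
}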

Buffers

Large byte slices retained in memory.
Temporary buffers escaping to the heap accidentally.

sync.Pool

Useful, but not a fix-all:

  • no guarantee memory is freed immediately
  • may keep pools warm and increase RSS
  • bad reuse of objects → stale references
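A minimal sketch of the reuse pattern done safely (bufPool and process are illustrative names): resetting the buffer on Get avoids the stale-reference problem from the last bullet.

import (
    "bytes"
    "sync"
)

var bufPool = sync.Pool{
    New: func() any { return new(bytes.Buffer) },
}

func process(data []byte) {
    buf := bufPool.Get().(*bytes.Buffer)
    buf.Reset()            // drop whatever the previous user left behind
    defer bufPool.Put(buf) // return it when done, and keep no references afterwards
    buf.Write(data)
    // ... use buf ...
}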

Workers

Worker pools that scale with load can mask where memory growth is coming from.

If concurrency increases under load → memory increases too; a fixed-size pool keeps that bounded (see the sketch below).
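A sketch of a fixed-size pool, reusing the Job type from the channel example above (startWorkers and its parameters are illustrative): capping concurrency caps the number of in-flight payloads, which caps their memory.

func startWorkers(jobs <-chan Job, workers int, handle func(Job)) {
    // A fixed number of workers means at most `workers` jobs are in flight at once.
    for i := 0; i < workers; i++ {
        go func() {
            for j := range jobs {
                handle(j)
            }
        }()
    }
}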


🎯 Final Thoughts

Memory profiling is one of the most powerful debugging skills in Go — and one of the most underrated.

If you master:

  • heap profiles
  • flamegraphs
  • GC tracing
  • concurrency behavior

…you’ll be able to diagnose 90% of real-world performance issues in modern Go microservices.

Happy coding! 🚀
