Hey there, Go developer! If you’ve been writing Go for a year or two, you’re probably comfy with goroutines and channels. They’re lightweight, slick, and make concurrency feel like a breeze. But here’s the catch: when your program shuts down, do those goroutines exit cleanly—or linger like uninvited guests, hogging memory and ports?
Picture this: you deploy a web service, send a SIGTERM to restart it, and… nothing. Memory’s climbing, the port’s locked, and rogue goroutines are to blame. I’ve been there, debugging a production memory leak caused by sloppy shutdowns, and it’s not fun. Poor goroutine management can lead to leaks, dangling file handles, or corrupted data, turning your reliable app into a mess.
In this guide, we’re diving into graceful shutdowns: making sure your goroutines finish their work and release resources before the curtain falls. We’ll go from basics to production-ready patterns, with code, pitfalls, and lessons from my decade in Go. Whether you’re squashing bugs or leveling up your concurrency game, you’ll leave with tools to make your goroutines bow out gracefully. Let’s dive in!
What’s a Graceful Shutdown, Anyway?
A graceful shutdown means your program stops cleanly: all goroutines wrap up, resources get freed, and no tasks are left half-baked. Think of it as giving your workers a polite “shift’s over” instead of yanking the plug.
Why It Matters
Goroutines don’t clean up after themselves—they run until their function ends or the program dies. Without proper shutdowns, you risk:
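- Memory Leaks: Each goroutine starts with a stack of roughly 2KB that grows on demand, and a stuck goroutine also keeps everything it references alive. Leak one per request and the total can climb into GBs.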
- Resource Hogs: Open files or sockets pile up, crashing with “too many open files.”
- Data Chaos: Half-finished tasks can corrupt your DB or drop messages.
In dev, this hides. In production, it bites. Graceful shutdowns deliver reliability, easier debugging, and smooth restarts—crucial for microservices or servers.
Real Talk
A web server getting SIGTERM should finish its requests, not ghost users. A scheduler shouldn’t ditch a task mid-run. It’s about control, and Go’s got the tools to make it happen.
Core Tools for Goroutine Shutdowns
Go hands you a killer toolkit: context.Context, sync.WaitGroup, and channels. Let’s see them in action with three practical patterns.
Pattern 1: Channel Notification
The simplest trick: use a channel to say “stop.”
package main

import (
	"fmt"
	"time"
)

func worker(exitChan chan struct{}) {
	for {
		select {
		case <-exitChan:
			fmt.Println("Worker shutting down...")
			return
		default:
			fmt.Println("Worker running...")
			time.Sleep(time.Second)
		}
	}
}

func main() {
	exitChan := make(chan struct{})
	go worker(exitChan)
	time.Sleep(3 * time.Second)
	close(exitChan)
	time.Sleep(time.Second)
	fmt.Println("Main exiting...")
}
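How It Works: The worker listens for exitChan to close, then exits. Clean and easy.
When to Use: Single, lightweight tasks like logging loops.
Watch Out: It’s basic. There is no timeout and no reason attached to the stop, the worker only notices the close between sleeps, and main just sleeps and hopes the worker has finished.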
Pattern 2: Context with Timeout
Need timeouts or cancellations? context.Context is your friend.
package main

import (
	"context"
	"fmt"
	"time"
)

func worker(ctx context.Context) {
	for {
		select {
		case <-ctx.Done():
			fmt.Println("Worker stopped:", ctx.Err())
			return
		default:
			fmt.Println("Worker running...")
			time.Sleep(time.Second)
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()
	go worker(ctx)
	time.Sleep(5 * time.Second)
	fmt.Println("Main exiting...")
}
How It Works: ctx.Done() is closed on timeout or cancel, and ctx.Err() explains why (context.DeadlineExceeded or context.Canceled).
When to Use: Time-sensitive stuff like HTTP requests.
Watch Out: Slightly more setup, but worth it.
Pattern 3: WaitGroup + Signal
Got multiple goroutines? sync.WaitGroup ensures they all finish.
package main

import (
	"fmt"
	"sync"
	"time"
)

func worker(id int, wg *sync.WaitGroup, exitChan chan struct{}) {
	defer wg.Done()
	for {
		select {
		case <-exitChan:
			fmt.Printf("Worker %d shutting down...\n", id)
			return
		default:
			fmt.Printf("Worker %d running...\n", id)
			time.Sleep(time.Second)
		}
	}
}

func main() {
	var wg sync.WaitGroup
	exitChan := make(chan struct{})
	for i := 1; i <= 3; i++ {
		wg.Add(1)
		go worker(i, &wg, exitChan)
	}
	time.Sleep(3 * time.Second)
	close(exitChan)
	wg.Wait()
	fmt.Println("All workers done, exiting...")
}
How It Works: wg.Wait() blocks until every goroutine calls wg.Done().
When to Use: Batch jobs like parallel uploads.
Watch Out: Don’t forget to call wg.Add() before launching!
Real-World Shutdowns: Code That Works
Let’s apply these patterns to common scenarios, with lessons from the trenches.
Scenario 1: HTTP Server
Goal: Handle SIGTERM and finish requests.
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{
		Addr: ":8080",
		Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			time.Sleep(2 * time.Second) // Simulate work
			fmt.Fprintf(w, "Hello, World!")
		}),
	}
	go func() {
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("Server error: %v", err)
		}
	}()
	log.Println("Server on :8080")
	sigChan := make(chan os.Signal, 1)
	signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
	<-sigChan
	log.Println("Shutting down...")
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Printf("Shutdown failed: %v", err)
	} else {
		log.Println("Server stopped")
	}
}
Lesson: Set a timeout (5s works for most). Log everything—saved me hours once.
Pitfall: Forgot to close a custom listener once. Port stayed locked. Oof.
Scenario 2: Scheduled Tasks
Goal: Finish the current task on stop.
package main

import (
	"context"
	"fmt"
	"time"
)

func taskScheduler(ctx context.Context) {
	ticker := time.NewTicker(2 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			fmt.Println("Scheduler stopped:", ctx.Err())
			return
		case t := <-ticker.C:
			fmt.Printf("Task at %v\n", t)
			time.Sleep(1 * time.Second)
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	go taskScheduler(ctx)
	time.Sleep(5 * time.Second)
	cancel()
	time.Sleep(1 * time.Second)
	fmt.Println("Main exiting...")
}
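Lesson: Call ticker.Stop() to release the ticker. Before Go 1.23 an unstopped ticker was never garbage-collected; from Go 1.23 on the collector can reclaim it once it is unreferenced, but stopping it explicitly still makes the intent clear. Also decide up front: on cancel, stop now or finish the current task?
Pitfall: Missed a ticker.Stop() once, and the goroutine leaked until I checked runtime.NumGoroutine().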
Level Up: Advanced Tricks
Production demands more. Here’s how to dodge leaks and boost performance.
Hunt Goroutine Leaks
Leaks are sneaky. I once had a queue consumer spawn thousands.
package main

import (
	"fmt"
	"runtime"
	"time"
)

func leakyWorker(ch chan struct{}) {
	<-ch // Never closes!
	fmt.Println("Exiting...")
}

func main() {
	ch := make(chan struct{})
	go leakyWorker(ch)
	time.Sleep(2 * time.Second)
	fmt.Printf("Goroutines: %d\n", runtime.NumGoroutine())
}
Fixes: Close channels, add timeouts, log runtime.NumGoroutine() at exit.
Master Timeouts
Too short? Tasks die. Too long? Restarts lag.
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(load)*time.Second)
Tip: Base it on P95 request times. Test it.
Log Like a Pro
defer func() {
	if ctx.Err() != nil {
		log.Printf("Worker stopped: %v", ctx.Err())
	}
}()
Tip: Add context—vague logs once hid a DB timeout from me.
Debugging Shutdowns Like a Pro
When your app won’t quit cleanly, rogue goroutines are often to blame. Here’s my checklist:
Count Goroutines
fmt.Printf("Goroutines running: %d\n", runtime.NumGoroutine())
Tip: Log at shutdown. If it’s not near 1, you’ve got leaks.
Profile with pprof
import (
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof handlers on the default mux
)

func main() {
	go http.ListenAndServe("localhost:6060", nil)
	// Your code
}
Run go tool pprof http://localhost:6060/debug/pprof/goroutine to spot stragglers.
Trace Execution
buf := make([]byte, 1<<16)
n := runtime.Stack(buf, true) // true: include every goroutine, not just the current one
fmt.Printf("Stack trace:\n%s", buf[:n])
Lesson: Caught a WebSocket deadlock with this.
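If a fixed-size buffer feels awkward, the runtime/pprof package can write the same kind of dump to any io.Writer. The sketch below is an alternative to the snippet above, not a replacement for it; goroutineDump is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
	"strings"
)

// goroutineDump (hypothetical helper) returns the stacks of all current
// goroutines as text.
func goroutineDump() (string, error) {
	var sb strings.Builder
	// debug=2 prints stacks in the same form Go uses for an unrecovered panic.
	if err := pprof.Lookup("goroutine").WriteTo(&sb, 2); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	dump, err := goroutineDump()
	if err != nil {
		fmt.Fprintln(os.Stderr, "dump failed:", err)
		return
	}
	fmt.Fprint(os.Stderr, dump)
}
```

Calling this from the shutdown path when the timeout expires shows exactly where each remaining goroutine is blocked.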
Watch Your Step: Common Pitfalls
Here are the nastiest traps I’ve hit—and how to dodge them.
Unclosed Channels
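func worker(ch chan struct{}) {
	<-ch // Hangs forever if nothing sends on or closes ch!
}
Fix: Have the goroutine that owns the channel (the sending side) close it, for example with defer close(ch). The receiving worker should not be the one to close it.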
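A minimal sketch of that ownership rule: the goroutine that sends on the channel is the one that closes it, and the receiver simply ranges until the close. The produce function is a hypothetical example, not code from the scenarios above.

```go
package main

import "fmt"

// produce sends n values and then closes the channel. The sender owns the
// channel, so the sender is the one that closes it.
func produce(n int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch)
		for i := 0; i < n; i++ {
			ch <- i
		}
	}()
	return ch
}

func main() {
	sum := 0
	// range ends when produce closes the channel, so the loop cannot hang.
	for v := range produce(5) {
		sum += v
	}
	fmt.Println("sum:", sum) // prints "sum: 10"
}
```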
Context Overload
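ctx1, cancel1 := context.WithTimeout(context.Background(), 5*time.Second)
ctx2, cancel2 := context.WithCancel(ctx1)
// Too messy: two contexts and two cancel funcs to track in one scope!
Fix: One context per scope. If you do derive a child, defer its cancel as well; cancelling the parent also cancels every child.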
Forgetting WaitGroup Counters
go worker(&wg) // Forgot wg.Add(1)!
Fix: Pair every go with a wg.Add(1), called before the go statement rather than inside the goroutine.
Silent Failures
srv.Shutdown(ctx) // Error ignored!
Fix: Check if err != nil.
Testing Your Shutdowns: Don’t Trust, Verify
Untested shutdowns are a gamble. Here’s how to test them.
Simulate Signals
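func TestServerShutdown(t *testing.T) {
	srv := startServer() // assumed helper that returns a running *http.Server
	time.Sleep(100 * time.Millisecond)

	sigChan := make(chan os.Signal, 1)
	signal.Notify(sigChan, syscall.SIGTERM)
	defer signal.Stop(sigChan)

	// Deliver a real SIGTERM to this process (Unix only).
	if err := syscall.Kill(os.Getpid(), syscall.SIGTERM); err != nil {
		t.Fatalf("Kill failed: %v", err)
	}
	select {
	case <-sigChan:
	case <-time.After(time.Second):
		t.Fatal("Signal not received")
	}

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		t.Errorf("Shutdown failed: %v", err)
	}
}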
Mock Workers
func TestWorkerShutdown(t *testing.T) {
	var wg sync.WaitGroup
	exitChan := make(chan struct{})
	doneChan := make(chan struct{}, 1)
	wg.Add(1)
	// worker is assumed to signal doneChan just before it returns.
	go worker(&wg, exitChan, doneChan)
	close(exitChan)
	// Check the timeout first: a bare wg.Wait() would hang the test on a stuck worker.
	select {
	case <-doneChan:
	case <-time.After(500 * time.Millisecond):
		t.Fatal("Worker didn’t shut down")
	}
	wg.Wait()
}
Tips: Use -race, log with t.Log, test in CI.
Wrapping Up
Graceful shutdowns aren’t just tech—they’re a mindset. You’ve got:
- Why: No leaks, stable apps.
- How: Channels, context, WaitGroup—mix and match.
- Where: Servers, schedulers—plan early.
Start small, test it, monitor with pprof, and log everything. I’ve cut restart times to milliseconds with these tricks. Your turn! Got a shutdown bug or test trick? Share it below; I’d love to hear your stories!
Top comments (3)
Thanks, Jones!
I'm currently working on a REST project.
I also encountered various implementations for starting and stopping an HTTP server.
My issue wasn't related to leaks, but rather with logging:
log.Println("Server on :8080")
It's impossible to completely avoid "false positives" in the logs with this approach.
I did it like this:
Since Go 1.23, a Ticker no longer needs to be stopped:
pkg.go.dev/time#NewTicker
You really need to include signal.NotifyContext as it avoids mucking around with channels. Just exit your workers when the context is done.