Go Bulkhead Pattern — Complete Reference
Table of Contents
- What is the Bulkhead Pattern
- Three Mechanisms
- HTTP Transport Parameters — Deep Dive
- Little's Law — Sizing Semaphores Correctly
- Full Production Implementation
- Wiring It Up
- Correctly Sized Real-World Configs
- Mental Model Summary
1. What is the Bulkhead Pattern
Named after watertight compartments in ship hulls. If one compartment floods, the others stay dry.
In software: isolate resources per dependency so a failure or slowdown in one cannot exhaust shared resources and take down unrelated parts of the system.
Without bulkheads:
Auth service slows down
→ goroutines pile up waiting
→ shared thread/connection pool exhausted
→ Payment service also fails (never touched auth)
→ Entire service down
With bulkheads:
Auth service slows down
→ only auth's semaphore/pool fills up
→ auth calls get fast 503s
→ Payment service unaffected
→ Ship stays afloat
2. Three Mechanisms
Mechanism 1 — Semaphore (Concurrency Bulkhead)
Caps how many concurrent in-flight calls exist to a dependency at any moment.
var sem = semaphore.NewWeighted(20) // max 20 concurrent calls

func call(ctx context.Context) error {
    // Acquire waits for a slot until acquireCtx expires (a 50ms queue timeout here).
    // For instant rejection when full (a strict bulkhead), use sem.TryAcquire(1) instead.
    acquireCtx, cancel := context.WithTimeout(ctx, 50*time.Millisecond)
    defer cancel()
    if err := sem.Acquire(acquireCtx, 1); err != nil {
        return fmt.Errorf("bulkhead full: %w", err) // caller maps this to a fast 503
    }
    defer sem.Release(1)
    // make the actual call
    return nil
}
TryAcquire vs Acquire:
| Method | Behaviour | Use when |
|---|---|---|
| TryAcquire(1) | Instant reject if full | Strict load shedding |
| Acquire(ctx, 1) | Wait up to ctx deadline | Brief queuing acceptable |
Each dependency gets its own independent semaphore — they do not share state.
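To make the TryAcquire row concrete, here is a minimal sketch of instant rejection. It swaps golang.org/x/sync/semaphore for a buffered channel so that it runs with the standard library alone. The names slotPool, tryAcquire, release and errFull are hypothetical and not part of any library.

```go
package main

import (
	"errors"
	"fmt"
)

// errFull is a hypothetical sentinel for "no slot available right now".
var errFull = errors.New("bulkhead full")

// slotPool is a stdlib-only stand-in for a weighted semaphore:
// each value sitting in the channel represents one in-flight call.
type slotPool chan struct{}

func newSlotPool(n int) slotPool { return make(slotPool, n) }

// tryAcquire never blocks: it either takes a slot or reports errFull.
func (p slotPool) tryAcquire() error {
	select {
	case p <- struct{}{}:
		return nil
	default:
		return errFull
	}
}

// release frees one slot previously taken with tryAcquire.
func (p slotPool) release() { <-p }

func main() {
	pool := newSlotPool(2)
	fmt.Println(pool.tryAcquire()) // <nil>
	fmt.Println(pool.tryAcquire()) // <nil>
	fmt.Println(pool.tryAcquire()) // bulkhead full
	pool.release()
	fmt.Println(pool.tryAcquire()) // <nil>
}
```

The third call fails immediately instead of queuing, which is the strict load-shedding behaviour from the table.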
Mechanism 2 — Connection Pool (TCP Bulkhead)
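Caps how many TCP connections exist to each host. Without this, every dependency shares http.DefaultTransport, which sets no per-host connection limit (MaxConnsPerHost is 0, meaning unlimited), so one slow host can open an unbounded number of connections and consume file descriptors that other dependencies need.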
transport := &http.Transport{
MaxConnsPerHost: 20, // hard ceiling on total connections
MaxIdleConnsPerHost: 20, // warm connections kept alive
MaxIdleConns: 20, // global idle pool
}
Mechanism 3 — Timeouts (Blast Radius Bulkhead)
Even if concurrency is high, tight timeouts ensure slots are released quickly when a dependency misbehaves. This is your blast radius control knob.
client := &http.Client{
Transport: transport,
Timeout: 600 * time.Millisecond, // absolute end-to-end deadline
}
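Mechanisms 2 and 3 only isolate anything if each dependency gets its own Transport. A small sketch of that idea follows; newIsolatedClient and sharesPool are hypothetical helper names, and the limits passed in main are illustrative values.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newIsolatedClient builds a client with its own Transport, so its connection
// pool and timeout are never shared with another dependency.
func newIsolatedClient(maxConns, timeoutMs int) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			MaxConnsPerHost:     maxConns,
			MaxIdleConnsPerHost: maxConns,
			MaxIdleConns:        maxConns,
		},
		Timeout: time.Duration(timeoutMs) * time.Millisecond,
	}
}

// sharesPool reports whether two clients draw from the same connection pool.
func sharesPool(a, b *http.Client) bool { return a.Transport == b.Transport }

func main() {
	auth := newIsolatedClient(20, 600)
	payment := newIsolatedClient(50, 800)
	fmt.Println(sharesPool(auth, payment)) // false

	// Two zero-value clients both have a nil Transport and fall back to http.DefaultTransport.
	fmt.Println(sharesPool(&http.Client{}, &http.Client{})) // true
}
```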
3. HTTP Transport Parameters — Deep Dive
transport := &http.Transport{
MaxIdleConnsPerHost: N,
MaxConnsPerHost: N,
MaxIdleConns: N,
DialContext: ...,
TLSHandshakeTimeout: ...,
ResponseHeaderTimeout: ...,
IdleConnTimeout: ...,
}
client := &http.Client{
Transport: transport,
Timeout: ...,
}
MaxConnsPerHost — Hard Ceiling
What: Maximum total TCP connections (active + idle) to a single host.
New requests BLOCK when this is hit (until a connection is freed or context times out).
Effect: This IS the TCP-level bulkhead. Beyond this number, no new connections are opened.
How to set it: Match your semaphore's MaxConcurrent. There is no point allowing 50 concurrent calls if you only have 10 TCP connections — requests would block waiting for a free connection anyway.
MaxConnsPerHost: int(cfg.MaxConcurrent), // keep these equal
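The blocking behaviour can be observed with a local test server. The sketch below is an illustration only: peakServerConcurrency is a hypothetical helper, the 30ms handler delay is arbitrary, and it assumes plain HTTP/1.1 as served by httptest.NewServer.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// peakServerConcurrency fires `requests` concurrent GETs through a client whose
// Transport allows at most maxConns connections per host, and returns the
// highest number of requests the server ever handled at the same time.
func peakServerConcurrency(maxConns, requests int) int {
	var mu sync.Mutex
	inFlight, peak := 0, 0
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		inFlight++
		if inFlight > peak {
			peak = inFlight
		}
		mu.Unlock()
		time.Sleep(30 * time.Millisecond) // simulated work
		mu.Lock()
		inFlight--
		mu.Unlock()
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	client := &http.Client{
		Transport: &http.Transport{MaxConnsPerHost: maxConns, MaxIdleConnsPerHost: maxConns},
		Timeout:   5 * time.Second,
	}
	defer client.CloseIdleConnections()

	var wg sync.WaitGroup
	for i := 0; i < requests; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := client.Get(srv.URL)
			if err != nil {
				return
			}
			io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
			resp.Body.Close()
		}()
	}
	wg.Wait()

	mu.Lock()
	defer mu.Unlock()
	return peak
}

func main() {
	// Six callers, two connections: the server should never see more than two at once.
	fmt.Println("peak server-side concurrency:", peakServerConcurrency(2, 6))
}
```

The extra four callers wait inside the Transport for a free connection, which is why the semaphore limit and MaxConnsPerHost are best kept equal.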
MaxIdleConnsPerHost — Warm Connection Cache
What: How many idle (reusable) connections to keep open to a single host after a request completes.
Effect: Avoids TCP+TLS handshake overhead on the next request. Higher = faster reuse, more FDs held open.
How to set it:
- Set equal to or slightly below MaxConnsPerHost
- For internal services (fast, high RPS): match MaxConnsPerHost exactly, since you want all connections warm
- For external APIs (slow, low RPS): can be lower since connections are held for longer and fewer are needed warm simultaneously
MaxIdleConnsPerHost = MaxConnsPerHost // internal, high-throughput
MaxIdleConnsPerHost = MaxConnsPerHost / 2 // external, slow/bursty
MaxIdleConns — Global Idle Pool
What: Total idle connections across ALL hosts combined.
Effect: If this is too low, connections for one host get evicted to make room for another,
causing unexpected TCP reconnects even though per-host limits haven't been hit.
How to set it: Sum of all your per-host idle limits across all bulkheads in the process.
// If you have 3 bulkheads: auth(20) + payment(10) + inventory(15)
MaxIdleConns: 20 + 10 + 15, // = 45, avoids cross-host eviction
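Common mistake: Copying http.DefaultTransport's MaxIdleConns of 100 while raising MaxIdleConnsPerHost to 100 on a transport that serves several hosts. The global cap then silently limits what you think are independent pools. (On a zero-value http.Transport, a MaxIdleConns of 0 means no limit, and a MaxIdleConnsPerHost of 0 means the default of 2.)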
ResponseHeaderTimeout — First Byte Timeout
What: Time allowed between fully writing the request (including its body, if any) and receiving the response headers.
Effect: The most important timeout for catching slow/hung servers.
Does NOT include time to read the response body.
How to set it: Slightly above your p99 latency for this dependency. This is the "server is stuck" detector.
External API p99 = 500ms → ResponseHeaderTimeout = 600ms
Internal svc p99 = 5ms → ResponseHeaderTimeout = 20ms
Database p99 = 20ms → ResponseHeaderTimeout = 50ms
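A quick way to see this timeout in action is a local test server that stalls before answering. This is a sketch under assumptions: headerTimeoutFires is a hypothetical helper and the delays are arbitrary test values.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// headerTimeoutFires starts a local test server that waits serverDelayMs before
// responding, calls it with the given ResponseHeaderTimeout, and reports
// whether the call failed.
func headerTimeoutFires(serverDelayMs, timeoutMs int) bool {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(time.Duration(serverDelayMs) * time.Millisecond) // simulated slow server
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	client := &http.Client{Transport: &http.Transport{
		ResponseHeaderTimeout: time.Duration(timeoutMs) * time.Millisecond,
	}}
	defer client.CloseIdleConnections()

	resp, err := client.Get(srv.URL)
	if err != nil {
		return true // the transport gave up waiting for headers
	}
	resp.Body.Close()
	return false
}

func main() {
	fmt.Println(headerTimeoutFires(200, 50)) // true: server slower than the timeout
	fmt.Println(headerTimeoutFires(0, 500))  // false: headers arrive in time
}
```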
http.Client.Timeout — Absolute End-to-End Deadline
What: Time from request initiation to reading the last byte of response body.
Includes: DNS + TCP dial + TLS + write request + wait for headers + read body.
Effect: The outer hard deadline. Cancels the entire request if exceeded.
How to set it: ResponseHeaderTimeout + estimated body read time. Always set this — without it you can leak goroutines indefinitely.
ResponseHeaderTimeout = 600ms (detect slow server)
body read estimate = 200ms (for your expected payload size)
Client.Timeout = 800ms (slightly above their sum)
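Unlike ResponseHeaderTimeout, Client.Timeout keeps running while the body is read. The following sketch shows a response whose headers arrive instantly but whose body stalls; slowBodyOutcome is a hypothetical helper and the delays are arbitrary.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// slowBodyOutcome calls a local test server that sends headers at once but
// stalls before the body. It reports whether the headers arrived without error
// and whether reading the body then failed under the given Client.Timeout.
func slowBodyOutcome(bodyDelayMs, clientTimeoutMs int) (headersOK, bodyFailed bool) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		if f, ok := w.(http.Flusher); ok {
			f.Flush() // push the headers out before stalling
		}
		select {
		case <-time.After(time.Duration(bodyDelayMs) * time.Millisecond):
			fmt.Fprint(w, "payload")
		case <-r.Context().Done(): // client gave up
		}
	}))
	defer srv.Close()

	client := &http.Client{
		Transport: &http.Transport{},
		Timeout:   time.Duration(clientTimeoutMs) * time.Millisecond,
	}
	defer client.CloseIdleConnections()

	resp, err := client.Get(srv.URL)
	if err != nil {
		return false, false
	}
	defer resp.Body.Close()
	_, err = io.ReadAll(resp.Body)
	return true, err != nil
}

func main() {
	fmt.Println(slowBodyOutcome(300, 100)) // true true: headers fine, body read cut off
	fmt.Println(slowBodyOutcome(0, 2000))  // true false: everything within the deadline
}
```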
DialContext.Timeout — TCP Handshake Limit
What: Time allowed to establish the TCP connection itself.
Effect: Catches unreachable hosts fast. Completely independent of request latency.
How to set it: 1–3 seconds universally. TCP handshake should be near-instant on healthy networks; longer means routing problem, not slowness.
DialContext: (&net.Dialer{
Timeout: 2 * time.Second, // TCP dial hard limit
KeepAlive: 30 * time.Second, // TCP keepalive probes
}).DialContext,
TLSHandshakeTimeout — TLS Negotiation Limit
What: Time allowed for TLS handshake after TCP is established.
Effect: Catches TLS issues (expired certs, overloaded TLS terminator) separately from request latency.
How to set it: 3–5 seconds. A TLS handshake takes one or two round trips plus crypto; 5s is generous but safe.
TLSHandshakeTimeout: 5 * time.Second,
IdleConnTimeout — Stale Connection Eviction
What: How long an idle connection is kept in the pool before being closed.
Effect: Prevents reusing TCP connections that the server or an intermediate load balancer has already closed.
How to set it: 60–90 seconds is a common range (http.DefaultTransport uses 90s). Server and load balancer idle timeouts vary, so check the one in front of your dependency and set this below it to prevent "connection reset" errors.
IdleConnTimeout: 90 * time.Second, // safe default
QueueTimeout — Semaphore Wait Budget
What: How long a request waits for a semaphore slot before being rejected.
Effect: Controls the queue depth in time. Even if a slot frees in 10ms, you might not want to wait.
How to set it: Based on your own SLA minus downstream latency.
Your SLA: 500ms total
Downstream p99: 200ms
Budget for queuing: 500 - 200 - 50(overhead) = 250ms
QueueTimeout: ~100ms (conservative, leaves room for retries)
For slow external APIs: lower QueueTimeout (fail fast, don't queue).
For fast internal services: even lower (if pool is full, something is wrong).
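The budget arithmetic and the bounded wait can be sketched together. The code below uses a buffered channel and a timer instead of semaphore.Acquire with a context, purely to stay within the standard library; queueBudgetMs and acquireWithin are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// queueBudgetMs applies the subtraction from the text: SLA minus downstream p99
// minus overhead. It returns 0 when nothing is left, meaning "do not queue".
func queueBudgetMs(slaMs, p99Ms, overheadMs int) int {
	if b := slaMs - p99Ms - overheadMs; b > 0 {
		return b
	}
	return 0
}

// acquireWithin waits up to waitMs for a free slot in a channel-based
// semaphore and reports whether a slot was obtained.
func acquireWithin(slots chan struct{}, waitMs int) bool {
	timer := time.NewTimer(time.Duration(waitMs) * time.Millisecond)
	defer timer.Stop()
	select {
	case slots <- struct{}{}:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	fmt.Println(queueBudgetMs(500, 200, 50)) // 250

	slots := make(chan struct{}, 1)       // a bulkhead with a single slot
	fmt.Println(acquireWithin(slots, 20)) // true: slot was free
	fmt.Println(acquireWithin(slots, 20)) // false: still held after 20ms
}
```

The computed budget is an upper bound; as the example above suggests, picking a smaller value leaves room for retries.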
All Parameters at a Glance
| Parameter | Controls | Rule of Thumb |
|---|---|---|
| MaxConnsPerHost | TCP connection ceiling | = MaxConcurrent |
| MaxIdleConnsPerHost | Warm connection cache | = MaxConnsPerHost (internal), ÷2 (external) |
| MaxIdleConns | Global idle pool | = sum of all per-host idle limits |
| ResponseHeaderTimeout | "Server is slow" detector | p99 latency × 1.2 |
| Client.Timeout | Absolute end-to-end limit | ResponseHeaderTimeout + body read estimate |
| DialContext.Timeout | TCP dial limit | 1–3s universally |
| TLSHandshakeTimeout | TLS negotiation limit | 3–5s universally |
| IdleConnTimeout | Stale connection eviction | 60–90s (below server's idle timeout) |
| QueueTimeout | Semaphore wait budget | Your SLA − p99 − overhead |
4. Little's Law — Sizing Semaphores Correctly
L = λ × W
L = concurrent requests in-flight (what your semaphore controls)
λ = throughput (requests per second you want to sustain)
W = time each request spends in flight, in seconds (strictly the average; the sizing below plugs in p99 as a conservative stand-in)
Key Insight
Slower dependency = MORE semaphore slots needed to sustain the same RPS.
Target: 100 rps
Auth svc p99=100ms → L = 100 × 0.1 = 10 concurrent (fast, slots free quickly)
Payment p99=500ms → L = 100 × 0.5 = 50 concurrent (slow, slots held longer)
This is counterintuitive but correct: a slow service holds each semaphore slot for longer, so you need more slots to keep throughput flowing.
What Happens When Semaphore Is Too Small
Payment svc p99=500ms, target 100rps, but MaxConcurrent=10:
Max throughput = 10 / 0.5 = 20 rps ← you're throttling yourself to 20%
What Happens When Semaphore Is Too Large
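A larger limit lets more goroutines pile up on a misbehaving dep, but only until the timeout fires: with limit N and timeout T, at most N goroutines are stuck, each for at most T.
That is why blast radius is controlled primarily by TIMEOUT, not by a low concurrency limit.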
The Correct Formula
min_concurrent = target_rps × p99_latency_seconds
semaphore_limit = min_concurrent × 1.3 // +30% headroom for bursts
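The formula is easy to turn into a helper. The sketch below uses integer math and rounds up, which is an assumption on my part (the text does not state a rounding rule); semaphoreLimit and maxThroughputRPS are hypothetical names.

```go
package main

import "fmt"

// semaphoreLimit returns ceil(targetRPS × p99 × (1 + headroom)).
// p99 is given in milliseconds and headroom in percent so that the
// arithmetic stays in integers and the rounding is exact.
func semaphoreLimit(targetRPS, p99Ms, headroomPct int) int {
	num := targetRPS * p99Ms * (100 + headroomPct)
	den := 1000 * 100
	return (num + den - 1) / den // ceiling division
}

// maxThroughputRPS is Little's Law solved for λ: slots divided by latency.
func maxThroughputRPS(slots, p99Ms int) int {
	return slots * 1000 / p99Ms
}

func main() {
	fmt.Println(semaphoreLimit(30, 500, 30))  // 20  (15 × 1.3 = 19.5, rounded up)
	fmt.Println(semaphoreLimit(100, 100, 30)) // 13  (10 × 1.3)
	fmt.Println(maxThroughputRPS(10, 500))    // 20  (the "too small" case above)
}
```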
NumCPU() is Wrong for IO-Bound Bulkheads
runtime.NumCPU() is the correct pool size only for CPU-bound work (hashing, compression, image processing) where goroutines burn CPU the entire time.
For IO-bound calls (HTTP, DB, Kafka): goroutines spend ~99% of their time parked waiting on the network. The Go scheduler puts them to sleep and runs other goroutines on the same OS thread. You can sustain thousands of concurrent IO goroutines on a 4-core machine.
IO call timeline:
[send ~50µs] [====== blocked waiting ~200ms ======] [read ~50µs]
↑
goroutine is PARKED here, OS thread runs other work
→ semaphore limit is a downstream capacity question, not a CPU question
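A rough way to convince yourself of this is to park far more goroutines than you have cores. In the sketch below time.Sleep stands in for a blocking network call, which is a simplification; runParked is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runParked starts n goroutines that each block for waitMs, standing in for a
// network call, and returns the total elapsed wall-clock time in milliseconds.
func runParked(n, waitMs int) int64 {
	start := time.Now()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			time.Sleep(time.Duration(waitMs) * time.Millisecond) // parked, not burning CPU
		}()
	}
	wg.Wait()
	return time.Since(start).Milliseconds()
}

func main() {
	elapsed := runParked(2000, 50)
	// Run back to back, 2000 waits of 50ms would take 100s; parked goroutines overlap.
	fmt.Printf("2000 waits of 50ms finished in %dms\n", elapsed)
}
```

On typical hardware the total stays close to a single wait, regardless of core count, because the waiting goroutines consume no CPU.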
5. Full Production Implementation
package bulkhead
import (
"context"
"fmt"
"net"
"net/http"
"sync/atomic"
"time"
"golang.org/x/sync/semaphore"
)
// Config defines the bulkhead parameters for one downstream dependency.
type Config struct {
Name string
MaxConcurrent int64 // semaphore size — derived from Little's Law: target_rps × p99_latency
MaxConnections int // TCP connection ceiling — keep equal to MaxConcurrent
RequestTimeout time.Duration // absolute end-to-end HTTP deadline (blast radius knob)
QueueTimeout time.Duration // how long to wait for a semaphore slot before rejecting
}
// Metrics holds live counters. All fields use atomic ops — safe to read from any goroutine.
type Metrics struct {
Accepted atomic.Int64 // requests that acquired a semaphore slot
Rejected atomic.Int64 // requests rejected because bulkhead was full
Errors atomic.Int64 // requests that acquired a slot but the HTTP call failed
}
// Bulkhead combines a per-dependency semaphore and an isolated HTTP client.
// Each instance is independent — a saturated bulkhead has zero effect on others.
type Bulkhead struct {
cfg Config
sem *semaphore.Weighted // Gate 1: concurrency cap
client *http.Client // Gate 2: connection pool cap + timeouts
Metrics Metrics
}
// New creates a Bulkhead. Call once per downstream dependency at startup.
func New(cfg Config) *Bulkhead {
transport := &http.Transport{
// --- Gate 2: TCP-level bulkhead ---
MaxConnsPerHost: cfg.MaxConnections, // hard ceiling: new reqs block when hit
MaxIdleConnsPerHost: cfg.MaxConnections, // warm connections: avoids TCP+TLS overhead
MaxIdleConns: cfg.MaxConnections, // global pool: set >= sum of all per-host
// --- TCP dial ---
DialContext: (&net.Dialer{
Timeout: 2 * time.Second, // TCP handshake hard limit (unreachable host detection)
KeepAlive: 30 * time.Second, // TCP keepalive probe interval
}).DialContext,
// --- TLS ---
TLSHandshakeTimeout: 5 * time.Second, // TLS negotiation limit
// --- Request lifecycle ---
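// ResponseHeaderTimeout: time between fully writing the request and receiving the response headers.
// This is the "server is hung" detector. Config has a single RequestTimeout (expected to be p99 × 1.2),
// so it is reused here; add a separate, shorter field if headers and body need different budgets.
ResponseHeaderTimeout: cfg.RequestTimeout,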
// Evict idle connections before the server closes them (~90s on most servers).
IdleConnTimeout: 90 * time.Second,
}
return &Bulkhead{
cfg: cfg,
sem: semaphore.NewWeighted(cfg.MaxConcurrent),
client: &http.Client{
Transport: transport,
// Absolute deadline covering DNS+TCP+TLS+write+headers+body.
// Always set — without this, goroutines can leak indefinitely.
Timeout: cfg.RequestTimeout,
},
}
}
// Do executes an HTTP request through the bulkhead.
// Returns an error if no semaphore slot frees up within QueueTimeout.
// The slot is released when Do returns, so the caller reads the response body outside the semaphore.
func (b *Bulkhead) Do(ctx context.Context, req *http.Request) (*http.Response, error) {
// --- Gate 1: Semaphore ---
// Use a child context with QueueTimeout so we don't wait forever for a slot.
acquireCtx, cancel := context.WithTimeout(ctx, b.cfg.QueueTimeout)
defer cancel()
if err := b.sem.Acquire(acquireCtx, 1); err != nil {
b.Metrics.Rejected.Add(1)
return nil, fmt.Errorf("bulkhead [%s] rejected (semaphore full after %s): %w",
b.cfg.Name, b.cfg.QueueTimeout, err)
}
defer b.sem.Release(1) // always release, even on HTTP error
b.Metrics.Accepted.Add(1)
// --- Gate 2: HTTP client (connection pool + timeouts) ---
resp, err := b.client.Do(req.WithContext(ctx))
if err != nil {
b.Metrics.Errors.Add(1)
return nil, fmt.Errorf("bulkhead [%s] call failed: %w", b.cfg.Name, err)
}
return resp, nil
}
// Stats returns a human-readable snapshot of metrics.
func (b *Bulkhead) Stats() string {
return fmt.Sprintf("[%s] accepted=%d rejected=%d errors=%d",
b.cfg.Name,
b.Metrics.Accepted.Load(),
b.Metrics.Rejected.Load(),
b.Metrics.Errors.Load(),
)
}
Why a Separate Semaphore per Dependency is Correct
authBulkhead.sem ──► controls only auth calls
paymentBulkhead.sem ──► controls only payment calls
Auth saturated:
authBulkhead.sem full → auth calls rejected
paymentBulkhead.sem unaffected → payment calls proceed normally
A shared semaphore across all dependencies defeats the purpose entirely.
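The Do method above wraps whatever Acquire returns, which is a context error, so a caller cannot easily tell "bulkhead full" from "downstream timed out". One possible refinement, sketched here with hypothetical names (ErrBulkheadFull, rejected, statusFor) that the Bulkhead type in this article does not define, is a sentinel error that handlers can map to a status code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
)

// ErrBulkheadFull is a hypothetical sentinel for a semaphore rejection.
var ErrBulkheadFull = errors.New("bulkhead full")

// rejected shows how a rejection could be wrapped so errors.Is still finds the sentinel.
func rejected(name string) error {
	return fmt.Errorf("bulkhead [%s] rejected: %w", name, ErrBulkheadFull)
}

// statusFor picks the HTTP status a handler might return for a bulkhead error.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrBulkheadFull):
		return http.StatusServiceUnavailable // 503: shed load, caller may retry later
	case errors.Is(err, context.DeadlineExceeded):
		return http.StatusGatewayTimeout // 504: downstream was too slow
	default:
		return http.StatusBadGateway // 502: downstream call failed
	}
}

func main() {
	fmt.Println(statusFor(nil))                            // 200
	fmt.Println(statusFor(rejected("auth")))               // 503
	fmt.Println(statusFor(context.DeadlineExceeded))       // 504
	fmt.Println(statusFor(errors.New("connection reset"))) // 502
}
```

The status choices are one reasonable convention, not a requirement of the pattern.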
6. Wiring It Up
package main
import (
"context"
"fmt"
"io"
"net/http"
"time"
"yourmodule/bulkhead"
)
// Declare one bulkhead per downstream dependency at package level.
// These are long-lived, safe to use concurrently.
var (
authBulkhead = bulkhead.New(authConfig())
paymentBulkhead = bulkhead.New(paymentConfig())
)
func HandleOrder(ctx context.Context) error {
    // Auth call: isolated behind its own semaphore + connection pool
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://auth.internal/verify", nil)
    if err != nil {
        return fmt.Errorf("build auth request: %w", err)
    }
    resp, err := authBulkhead.Do(ctx, req)
    if err != nil {
        // Bulkhead full or auth down: fail this request fast.
        // Payment bulkhead is completely unaffected.
        return fmt.Errorf("auth unavailable: %w", err)
    }
    // Drain and close right away so the connection returns to the pool before the next call.
    io.Copy(io.Discard, resp.Body)
    resp.Body.Close()
    // Payment call: completely isolated from auth
    req2, err := http.NewRequestWithContext(ctx, http.MethodPost, "https://payments.internal/charge", nil)
    if err != nil {
        return fmt.Errorf("build payment request: %w", err)
    }
    resp2, err := paymentBulkhead.Do(ctx, req2)
    if err != nil {
        return fmt.Errorf("payment unavailable: %w", err)
    }
    defer resp2.Body.Close()
    io.Copy(io.Discard, resp2.Body) // always drain body to return connection to pool
    return nil
}
Important: Always io.Copy(io.Discard, resp.Body) before closing. If you close without draining, Go cannot reuse the TCP connection: it gets thrown away, causing unnecessary TCP handshakes on the next request.
7. Correctly Sized Real-World Configs
Sizing Formula
min_concurrent = target_rps × p99_latency_seconds
semaphore_limit = min_concurrent × 1.3 // +30% burst headroom
MaxConnections = semaphore_limit // keep aligned
RequestTimeout = p99_latency × 1.2 // slightly above p99, blast radius knob
QueueTimeout = your_SLA - p99 - overhead_budget
External HTTP API (slow, p99=500ms, target 30 rps)
// Little's Law: 30 × 0.5 = 15, +30% = 20
// Blast radius: controlled by tight RequestTimeout (600ms), NOT by low concurrency
// Slower dep → more slots needed to sustain throughput
func externalAPIConfig() bulkhead.Config {
return bulkhead.Config{
Name: "external-auth-api",
MaxConcurrent: 20, // L = 30rps × 0.5s = 15, +30% = 20
MaxConnections: 20,
RequestTimeout: 600 * time.Millisecond, // p99(500ms) × 1.2 — blast radius knob
QueueTimeout: 100 * time.Millisecond, // don't queue long for slow dep
}
}
Internal Microservice (fast, p99=5ms, target 500 rps)
// Little's Law: 500 × 0.005 = 2.5 → floor at ~10 (connection overhead)
// Faster dep → naturally fewer slots needed, slots free in milliseconds
func internalServiceConfig() bulkhead.Config {
return bulkhead.Config{
Name: "inventory-internal",
MaxConcurrent: 10, // L = 500rps × 0.005s = 2.5, floor at 10
MaxConnections: 10,
RequestTimeout: 20 * time.Millisecond, // tight: it's internal, same DC
QueueTimeout: 5 * time.Millisecond, // fail fast: if pool busy, something is wrong
}
}
PostgreSQL (p99=20ms, target 200 rps, 5 app instances)
// Little's Law: 200 × 0.02 = 4 per instance
// Hard constraint: Postgres max_connections=100, shared across 5 instances → 20 per instance
// Use: max(Little's Law result × 1.3, safety floor) but never exceed hard cap
func postgresConfig() bulkhead.Config {
return bulkhead.Config{
Name: "postgres",
MaxConcurrent: 10, // L=4, ×1.3=5.2, safety floor=10, hard cap=20 ✓
MaxConnections: 10,
RequestTimeout: 100 * time.Millisecond, // DB should be fast; slow DB = problem
QueueTimeout: 20 * time.Millisecond,
}
}
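The config above expresses the Postgres limits in the bulkhead's HTTP-oriented terms. With database/sql, the connection pool itself can act as the connection bulkhead. A rough sketch follows; perInstanceCap, poolSize and applyPool are hypothetical helpers, and the 5-minute idle time is an illustrative assumption, not a recommendation from the text.

```go
package main

import (
	"database/sql"
	"fmt"
	"time"
)

// perInstanceCap splits the server's connection budget evenly across app instances.
func perInstanceCap(maxConnections, instances int) int {
	if instances <= 0 {
		return 0
	}
	return maxConnections / instances
}

// poolSize applies a safety floor to the Little's Law result but never exceeds the hard cap.
func poolSize(littlesLaw, floor, hardCap int) int {
	n := littlesLaw
	if n < floor {
		n = floor
	}
	if n > hardCap {
		n = hardCap
	}
	return n
}

// applyPool configures an existing *sql.DB; opening it (driver, DSN) is left to the caller.
func applyPool(db *sql.DB, size int) {
	db.SetMaxOpenConns(size)               // hard ceiling, the analogue of MaxConnsPerHost
	db.SetMaxIdleConns(size)               // keep connections warm
	db.SetConnMaxIdleTime(5 * time.Minute) // illustrative value
}

func main() {
	hardCap := perInstanceCap(100, 5)
	fmt.Println(hardCap)                  // 20
	fmt.Println(poolSize(6, 10, hardCap)) // 10 (5.2 rounded up to 6, then floored at 10)
}
```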
Kafka Producer (p99=2ms, target 2000 rps)
// Little's Law: 2000 × 0.002 = 4 concurrent
// Kafka batches internally — keep TCP connections low, semaphore has headroom for bursts
func kafkaConfig() bulkhead.Config {
return bulkhead.Config{
Name: "kafka-producer",
MaxConcurrent: 10, // L=4, headroom for burst
MaxConnections: 3, // Kafka multiplexes internally; few TCP conns enough
RequestTimeout: 50 * time.Millisecond,
QueueTimeout: 5 * time.Millisecond,
}
}
Summary Table
| Dependency | p99 | Target RPS | L=λ×W | +30% | MaxConcurrent | RequestTimeout | QueueTimeout |
|---|---|---|---|---|---|---|---|
| External API | 500ms | 30 | 15 | 20 | 20 | 600ms | 100ms |
| Internal svc | 5ms | 500 | 2.5 | 4* | 10* | 20ms | 5ms |
| Postgres | 20ms | 200 | 4 | 6** | 10** | 100ms | 20ms |
| Kafka | 2ms | 2000 | 4 | 6 | 10 | 50ms | 5ms |
(*) floor applied: connection establishment overhead makes sub-10 impractical
(**) floor applied, hard cap from Postgres max_connections / num_instances
8. Mental Model Summary
Two Laws, Two Knobs
Little's Law → sets MaxConcurrent
slow dep = MORE slots (slots held longer)
fast dep = FEWER slots (slots released quickly)
Blast Radius Control → sets RequestTimeout + QueueTimeout
NOT concurrency — tight timeouts release slots fast
even if MaxConcurrent is high, a 600ms timeout limits damage
The Failure Isolation Guarantee
Normal:
User → Auth ✓
User → Payment ✓
User → S3 ✓
Auth saturated:
User → Auth ✗ (semaphore full → fast 503)
User → Payment ✓ (own semaphore, unaffected)
User → S3 ✓ (own semaphore, unaffected)
What Each Layer Protects Against
Layer | Protects against
────────────────────────┼──────────────────────────────────────────
Semaphore | Goroutine pile-up from slow/hung deps
Connection pool | TCP connection exhaustion from one host
ResponseHeaderTimeout | Server that accepts connection but never responds
Client.Timeout | Goroutine leak from infinite response body reads
QueueTimeout | Cascading slowdowns from queue buildup
Bulkhead vs Circuit Breaker
These are complementary:
| Pattern | Question it answers | Action |
|---|---|---|
| Bulkhead | "How much capacity do I allocate?" | Limits concurrent slots per dep |
| Circuit Breaker | "Should I even try this call?" | Opens when error rate exceeds threshold |
Use both together: bulkhead limits blast radius, circuit breaker stops calling a dep that's known to be down.
Reference covers: golang.org/x/sync/semaphore, net/http.Transport, Little's Law (L=λW), Go scheduler IO parking, per-dependency isolation.
Note
So the minimal complete config for a bulkhead transport that serves a single host (not shared among multiple deps) is:
transport := &http.Transport{
MaxConnsPerHost: 20, // bulkhead ceiling
MaxIdleConnsPerHost: 20, // full connection reuse
IdleConnTimeout: 90 * time.Second, // evict before server does
DialContext: (&net.Dialer{
Timeout: 2 * time.Second,
KeepAlive: 30 * time.Second,
}).DialContext,
TLSHandshakeTimeout: 5 * time.Second,
ResponseHeaderTimeout: cfg.RequestTimeout,
}
Everything else (such as MaxIdleConns, the global idle cap without the PerHost suffix) is either irrelevant for single-host transports or has safe defaults for this use case.