This is Part 2 of my series building Loom.
Missed Part 1? Read it here
Today: Reflection cache, stampede protection, and the deadlock that kept me up until 11 PM.
The problem
When 50 goroutines all need the same method descriptor at the same time, my naive code made ALL 50 hit the backend:
func (c *ReflectionCache) GetMethod(method string) (*MethodDescriptor, error) {
	return c.fetchFromBackend(method) // 💥 50x RPC calls
}
Result: 50 identical calls. 50x load. 50x latency. Not good.
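To reproduce the stampede without a real backend, here is a minimal sketch that counts how many lookups reach a simulated backend. The `naiveCache` type, the `countNaiveCalls` helper, the stub `MethodDescriptor`, and the method name are assumptions made for illustration; they are not Loom's actual code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// MethodDescriptor is a stand-in for the real descriptor type.
type MethodDescriptor struct{ Name string }

// naiveCache mirrors the uncached GetMethod: every call goes to the backend.
type naiveCache struct {
	backendCalls int64 // number of simulated RPCs, updated atomically
}

// fetchFromBackend simulates the RPC and records that it happened.
func (c *naiveCache) fetchFromBackend(method string) (*MethodDescriptor, error) {
	atomic.AddInt64(&c.backendCalls, 1)
	return &MethodDescriptor{Name: method}, nil
}

func (c *naiveCache) GetMethod(method string) (*MethodDescriptor, error) {
	return c.fetchFromBackend(method)
}

// countNaiveCalls fires n concurrent lookups for one method and reports
// how many reached the simulated backend.
func countNaiveCalls(n int) int64 {
	c := &naiveCache{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = c.GetMethod("/pkg.Service/Method")
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&c.backendCalls)
}

func main() {
	fmt.Println("backend calls:", countNaiveCalls(50)) // prints "backend calls: 50"
}
```

Every goroutine pays for its own round trip, so the count always equals the number of callers.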
The fix: singleflight
Go has singleflight in golang.org/x/sync: it ensures only one goroutine fetches, and the rest wait for that result.
Final code:
import (
	"sync"

	"golang.org/x/sync/singleflight"
)

type ReflectionCache struct {
	cache map[string]*MethodDescriptor
	mu    sync.RWMutex
	group singleflight.Group
}

func (c *ReflectionCache) GetMethod(method string) (*MethodDescriptor, error) {
	// Fast path: already cached?
	c.mu.RLock()
	if desc, ok := c.cache[method]; ok {
		c.mu.RUnlock()
		return desc, nil
	}
	c.mu.RUnlock()
	// Slow path: single fetch, everyone waits
	result, err, _ := c.group.Do(method, func() (interface{}, error) {
		desc, err := c.fetchFromBackend(method)
		if err != nil {
			return nil, err
		}
		c.mu.Lock()
		c.cache[method] = desc
		c.mu.Unlock()
		return desc, nil
	})
	if err != nil {
		// result is a nil interface on failure; asserting on it would panic
		return nil, err
	}
	return result.(*MethodDescriptor), nil
}
What changed: 1 backend call instead of 50. All 50 goroutines get the result in ~50ms instead of 2500ms.
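To see why only one call reaches the backend, here is a standard-library-only sketch of the coalescing idea. It is not the singleflight implementation and not Loom's code: the `coalescingCache` type, the `call` struct, and `countCoalescedCalls` are hypothetical names, and for simplicity it guards both the cache and the in-flight map with a single mutex instead of an RWMutex plus a `singleflight.Group`.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// call tracks one in-flight fetch that other goroutines can wait on.
type call struct {
	done chan struct{} // closed when the fetch finishes
	val  string
	err  error
}

// coalescingCache is a simplified, hypothetical stand-in for ReflectionCache.
type coalescingCache struct {
	mu       sync.Mutex
	values   map[string]string
	inflight map[string]*call
	fetch    func(key string) (string, error)
}

func (c *coalescingCache) Get(key string) (string, error) {
	c.mu.Lock()
	if v, ok := c.values[key]; ok { // already cached
		c.mu.Unlock()
		return v, nil
	}
	if cl, ok := c.inflight[key]; ok { // someone else is fetching: wait
		c.mu.Unlock()
		<-cl.done
		return cl.val, cl.err
	}
	cl := &call{done: make(chan struct{})}
	c.inflight[key] = cl // this goroutine becomes the single fetcher
	c.mu.Unlock()

	cl.val, cl.err = c.fetch(key) // slow call runs without the lock

	c.mu.Lock()
	if cl.err == nil {
		c.values[key] = cl.val
	}
	delete(c.inflight, key)
	c.mu.Unlock()
	close(cl.done) // wake every waiter
	return cl.val, cl.err
}

// countCoalescedCalls fires n concurrent lookups for one key and reports
// how many reached the simulated backend.
func countCoalescedCalls(n int) int64 {
	var backendCalls int64
	c := &coalescingCache{
		values:   make(map[string]string),
		inflight: make(map[string]*call),
		fetch: func(key string) (string, error) {
			atomic.AddInt64(&backendCalls, 1)
			return "descriptor for " + key, nil
		},
	}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = c.Get("/pkg.Service/Method")
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&backendCalls)
}

func main() {
	fmt.Println("backend calls:", countCoalescedCalls(50)) // prints "backend calls: 1"
}
```

Because the cache check and the in-flight check happen under the same lock, a late caller always sees either the pending fetch or the cached value. The sketch ignores edge cases such as a fetch function that panics (waiters would block forever), which is one more reason to reach for the real package in production.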
The embarrassing deadlock
I tried building this myself first. Here's the bug that took 3 hours:
// ⚠️ DEADLOCK: Don't do this
func (c *ReflectionCache) GetMethod(method string) (*MethodDescriptor, error) {
	c.mu.Lock()
	defer c.mu.Unlock() // ← this still runs on every return path
	// ... check cache ...
	c.mu.Unlock() // Manual unlock
	desc, _ := c.fetchFromBackend(method)
	c.mu.Lock() // Re-lock so the deferred Unlock has something to release
	// Any return that skips the re-lock (an early return on error, say)
	// makes the deferred Unlock hit an unlocked mutex, which is a fatal error.
	return desc, nil
}
Lesson: Don't mix defer and manual lock/unlock. And just use singleflight.
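One way to follow that lesson is to give each lock its own tiny function, so `defer` always pairs with exactly one `Lock` and the slow call runs with no lock held. This is a sketch under assumptions: the `lookup` and `store` helpers, `newReflectionCache`, and the stubbed `fetchFromBackend` are hypothetical, and the version below has no stampede protection (that is still singleflight's job).

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// MethodDescriptor is a stand-in for the real descriptor type.
type MethodDescriptor struct{ Name string }

type ReflectionCache struct {
	mu    sync.RWMutex
	cache map[string]*MethodDescriptor
}

func newReflectionCache() *ReflectionCache {
	return &ReflectionCache{cache: make(map[string]*MethodDescriptor)}
}

// lookup holds the read lock for exactly one map read.
func (c *ReflectionCache) lookup(method string) (*MethodDescriptor, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	desc, ok := c.cache[method]
	return desc, ok
}

// store holds the write lock for exactly one map write.
func (c *ReflectionCache) store(method string, desc *MethodDescriptor) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cache[method] = desc
}

// fetchFromBackend is a hypothetical stub for the real RPC.
func (c *ReflectionCache) fetchFromBackend(method string) (*MethodDescriptor, error) {
	if method == "" {
		return nil, errors.New("empty method name")
	}
	return &MethodDescriptor{Name: method}, nil
}

func (c *ReflectionCache) GetMethod(method string) (*MethodDescriptor, error) {
	if desc, ok := c.lookup(method); ok {
		return desc, nil
	}
	desc, err := c.fetchFromBackend(method) // no lock held during the slow call
	if err != nil {
		return nil, err // safe early return: nothing is locked here
	}
	c.store(method, desc)
	return desc, nil
}

func main() {
	c := newReflectionCache()
	_, err := c.GetMethod("")
	fmt.Println("error path:", err)
	desc, _ := c.GetMethod("/pkg.Service/Method")
	fmt.Println("fetched:", desc.Name)
}
```

Every return path in `GetMethod` is safe because the function itself never holds a lock; the helpers acquire and release within their own scope.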
Performance
Approach     | Backend calls (100 reqs) | Total time
No cache     | 100                      | 5000ms
Mutex only   | 1                        | 5000ms
Singleflight | 1                        | ~52ms
About 99% less total time: 5000ms down to ~52ms.
Key takeaways
- Cache stampedes are real: they'll crush your backend
- singleflight is your friend: don't roll your own
- Test with -race: it catches data races around the cache map (it does not detect deadlocks, so test those paths separately)
- Read locks (RLock) for cache hits: saves contention
Try Loom yourself
joshuabvarghese/Loom
gRPC L7 Debugging Proxy
Loom
A gRPC debugging proxy. Point it at your backend, point your client at Loom, and watch every call decoded in a browser tab.
Your gRPC Client → Loom (:9999) → Your Backend (:50051)
                        ↓
                  Web Inspector
              http://localhost:9998
Why
gRPC traffic is binary. Wireshark can't read it. grpcurl is great for one-off calls but you can't watch a flow. I kept running it over and over trying to understand what was happening between services.
Loom sits transparently between your client and backend. It uses Server Reflection to decode every frame on the fly (no .proto files required) and streams the results into a browser UI. You see the JSON payloads, the status codes, how long each call took, and a ready-to-copy grpcurl command to replay any of them.
What it does
- Intercepts all four gRPC stream types: unary, server-streaming, client-streaming, bidi
- Auto-decodes using Server Reflection (no proto…