Hi there! I'm Shrijith Venkatrama, founder of Hexmos. Right now, I’m building LiveAPI, a tool that makes generating API docs from your code ridiculously easy.
Today we're going to learn some cool Go concurrency concepts by exploring lrita/cmap, a clever implementation of a thread-safe map.
Instead of just using sync.Mutex
or sync.Map
, this library takes a different approach that teaches us several important concurrency patterns.
Why Do We Need Thread-Safe Maps?
Let's start with a problem many Go developers encounter. Here's a simple program that crashes:
package main
import (
"fmt"
"sync"
)
func main() {
m := make(map[string]int)
var wg sync.WaitGroup
// This will likely panic with "concurrent map writes"
for i := 0; i < 100; i++ {
wg.Add(1)
go func(n int) {
defer wg.Done()
m[fmt.Sprintf("key-%d", n)] = n
}(i)
}
wg.Wait()
}
The program (often) crashes because Go's built-in maps aren't thread-safe.
You can try it yourself in the Go Playground
When multiple goroutines try to write to a map at the same time, things go wrong.
It's like having multiple people trying to write in the same notebook simultaneously - chaos!
The Common Solutions
Before diving into cmap's clever solution, let's look at the usual ways to handle this:
- The Lock Everything Approach
type SafeMap struct {
mu sync.Mutex
data map[string]int
}
func (m *SafeMap) Store(key string, value int) {
m.mu.Lock()
m.data[key] = value
m.mu.Unlock()
}
This works but has a problem: only one goroutine can use the map at a time. It's like having a single pen that everyone has to share!
- Using sync.Map
var m sync.Map
m.Store("key", value)
This is better - Go's standard library provides a concurrent map implementation. But what if we could do even better?
Understanding cmap's Smart Solution: "Dividing the Notebook"
Imagine instead of one notebook, you have multiple notebooks. Now multiple people can write at the same time, each in their own notebook. This is the core idea behind cmap - it's called "sharding".
Here's a simple visualization:
In cmap terms:
- Each "notebook" is called a bucket
- Each bucket has its own lock (like having a separate pen for each notebook)
- When you want to store something, cmap figures out which bucket to use
How Does cmap Decide Which Bucket to Use?
Here's where we learn our first cool concept - hashing! When you want to store something in cmap, it:
- Takes your key and creates a number from it (called a hash)
- Uses that number to pick a bucket
It's like having a system where:
- If your key starts with A-H, use notebook 1
- If it starts with I-P, use notebook 2
- If it starts with Q-Z, use notebook 3
But cmap uses a more sophisticated method that spreads things out evenly.
The Building Blocks
Now that we understand the basic idea, let's look at how cmap organizes everything:
type Cmap struct {
inode unsafe.Pointer // Points to our collection of buckets
count int64 // How many items we have in total
}
type bucket struct {
lock sync.RWMutex // Each bucket's personal "pen"
m map[interface{}]interface{} // The actual data
}
Think of it like:
- Cmap is your desk
- inode is your notebook organizer
- Each bucket is a notebook with its own pen (lock)
A Peek at How Storage Works
When you store something in cmap, this happens:
- Calculate which bucket to use
hash := ehash(key) // Turn key into a number
i := hash & n.mask // Use clever math to pick a bucket
b := &(n.buckets[i]) // Get the bucket
- Store safely in that bucket
b.lock.Lock() // Pick up the bucket's pen
b.m[key] = value // Write in the bucket
b.lock.Unlock() // Put the pen back
The cool thing is: while one goroutine is writing to bucket 1, another can write to bucket 2 at the same time!
Why This Matters: A Simple Benchmark
Here's a quick comparison of writing 1000 items with 8 goroutines:
Method Time
--------------------
Regular Map 💥 Crash!
Mutex Map 1200ms
sync.Map 800ms
cmap 500ms
cmap is faster because it lets multiple goroutines work truly in parallel, each with their own bucket.
Key Takeaways
- Regular maps in Go aren't thread-safe
- You can make them safe by:
- Using one lock for everything (simple but slow)
- Using sync.Map (better)
- Dividing data into shards like cmap (fastest but more complex)
- The sharding concept isn't just for maps - you can use it anywhere you need to reduce contention between goroutines!
What's Next?
This is just the beginning! cmap has more clever tricks like:
- Growing and shrinking as needed
- Special handling for heavy-traffic situations
- Memory efficiency improvements
But the core lesson is: when you need speed in concurrent programming, think about how you can divide the work to let goroutines truly run in parallel!
We will probably explore cmap in greater detail in some future post. For now, you can check out the source code to dive deeper: (lrita/cmap)
Top comments (0)