Note: This article focuses on the old Go map implementation (map_noswiss). In later parts of this series, we will also cover the Swiss map (map_swiss).
Welcome back to our deep dive into Go maps! 👋 In the first part of this series, we covered the basics of what Go maps are, their properties, and how to use them. Now, it's time to go deeper and uncover the magic that makes maps so efficient. 🗺️✨
In this article, we'll explore the inner workings of Go maps to understand:
- Why map operations are usually O(1) and what can affect this performance. ⏱️
- How Go handles hash collisions with "overflow" buckets. 📦➡️📦
- The clever way maps grow without bringing your application to a halt. 🌱
- The role of a unique "seed" in the hashing process. 🎲
Let's get started!
The O(1) Magic: It's All About Hashing
You've probably heard that map lookups, insertions, and deletions are O(1) operations, meaning they take constant time on average, regardless of the number of elements in the map. But how is this possible? The secret lies in hashing. 🔑➡️🔢
When you add a key-value pair to a map, Go uses a hash function to convert the key into a number. This number, or "hash," is then used to determine which "bucket" the key-value pair should be stored in. A bucket is essentially a small, fixed-size array that can hold a few key-value pairs.
Because the hash function is designed to be very fast and to distribute keys evenly across the available buckets, Go can quickly locate the correct bucket for a given key without having to check every element in the map. This is what gives maps their O(1) performance. 💪
However, I said usually O(1). What happens when two different keys generate the same hash? This is called a hash collision, and it can slightly degrade performance. We'll explore this more in the section on overflows. 💥
The Unique Seed: Why the Same Key Can Live in Different Places
Here's a fun fact that might surprise you: if you create two different maps and insert the same key into both, that key might end up in different buckets in each map. 🤯
Why? Because every new map in Go is created with a unique random seed. This seed is used in the hashing algorithm, which means that the hash generated for a key is unique to that specific map instance.
Let's look at an example:
```go
package main

import "fmt"

func main() {
	map1 := make(map[string]int)
	map2 := make(map[string]int)

	map1["a"] = 1
	map2["a"] = 1

	// The internal location of "a" in map1 and map2
	// will likely be different!
	fmt.Println(map1["a"], map2["a"]) // same values, different internal layout
}
```
This randomization is a security feature that helps prevent a type of attack called "hash-flooding," where an attacker could craft keys that all hash to the same bucket, effectively turning the map's O(1) operations into O(n) and slowing down your application. 🛡️
When Buckets Get Full: The Role of Overflows
Each bucket in a Go map can hold up to 8 key-value pairs. But what happens when more than 8 keys hash to the same bucket? Go doesn't just give up. Instead, it creates an overflow bucket and links it to the original bucket. 🔗
*Figure: a simplified visualization of a hash table bucket with an overflow bucket attached; keys spill into the overflow when the primary bucket is full.*
If you have many keys that hash to the same bucket, you can end up with a chain of overflow buckets. When looking up a key in this scenario, Go first checks the main bucket and then has to traverse the chain of overflow buckets. This is one of the reasons why map operations are not always O(1). In the worst-case scenario (many collisions), a lookup could degrade to O(n) performance, where 'n' is the number of items in the bucket and its overflows. 🐢
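The chain traversal described above can be sketched with a toy bucket type. Treat this as an illustration, not the real layout: the runtime's `bmap` also stores a "tophash" byte per slot for fast filtering and packs keys and values into separate arrays.

```go
package main

import "fmt"

// bucket is a simplified model of a map bucket: up to 8 entries,
// plus a pointer to an overflow bucket for colliding keys.
type bucket struct {
	keys     [8]string
	values   [8]int
	count    int
	overflow *bucket // next bucket in the chain, if this one filled up
}

// lookup walks the bucket and its overflow chain, just as the
// runtime must when many keys collide into the same bucket.
func (b *bucket) lookup(key string) (int, bool) {
	for cur := b; cur != nil; cur = cur.overflow {
		for i := 0; i < cur.count; i++ {
			if cur.keys[i] == key {
				return cur.values[i], true
			}
		}
	}
	return 0, false
}

func main() {
	b := &bucket{overflow: &bucket{}}
	for i := 0; i < 8; i++ {
		b.keys[i], b.values[i] = fmt.Sprintf("k%d", i), i
		b.count++
	}
	// The 9th colliding key spills into the overflow bucket.
	b.overflow.keys[0], b.overflow.values[0], b.overflow.count = "k8", 8, 1

	v, ok := b.lookup("k8")
	fmt.Println(v, ok) // found only by traversing the overflow chain
}
```

The longer the chain, the more slots a lookup must scan, which is exactly where the O(1) guarantee erodes.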
The Growing Map: An Incremental Approach
As you add more and more elements to a map, it will eventually need to grow to accommodate the new data and maintain its performance. A map in Go will trigger a growth cycle in one of two situations:
- The load factor is too high: if the average number of elements per bucket exceeds a threshold (currently 6.5 in the runtime), the map will grow. 📈
- Too many overflow buckets: if the map has accumulated too many overflow buckets, the keys are poorly distributed across the main buckets, and a resize is needed to spread them out more evenly. 🧹
But how does this growth happen? If you have a map with thousands or even millions of key-value pairs, resizing it all at once could cause a noticeable pause in your application. To avoid this, Go employs a clever strategy: incremental growth. 🚀
When a map needs to grow, Go allocates a new, larger array of buckets (usually double the size of the old one). However, it doesn't copy all the data over at once. Instead, the copying happens gradually. Each time you write to or delete from the map, Go moves a few buckets from the old array to the new one. ➡️➡️➡️
This incremental approach spreads the work of resizing over time, ensuring that no single map operation takes too long. It's a fantastic example of how Go is designed for building responsive, low-latency applications. 💖
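The pacing of incremental growth can be modeled with a toy structure. This sketch only captures the "a little evacuation on every write" idea: the real runtime evacuates the bucket being written plus at most one more per operation, and splits each old bucket's keys across two new buckets by rehashing, details omitted here.

```go
package main

import "fmt"

type kv struct {
	key string
	val int
}

// toyMap models only the pacing of incremental growth: while
// oldBuckets is non-nil, each write also evacuates one old bucket.
type toyMap struct {
	buckets    [][]kv
	oldBuckets [][]kv // non-nil while a grow is in progress
	nevacuate  int    // index of the next old bucket to move
}

func (m *toyMap) startGrow() {
	m.oldBuckets = m.buckets
	m.buckets = make([][]kv, len(m.oldBuckets)*2) // double the buckets
	m.nevacuate = 0
}

// growWork moves a single old bucket; it is called from every write,
// so the cost of resizing is spread across many operations.
func (m *toyMap) growWork() {
	if m.oldBuckets == nil {
		return
	}
	i := m.nevacuate
	m.buckets[i] = append(m.buckets[i], m.oldBuckets[i]...)
	m.nevacuate++
	if m.nevacuate == len(m.oldBuckets) {
		m.oldBuckets = nil // evacuation finished
	}
}

func (m *toyMap) set(key string, val int) {
	m.growWork() // a little evacuation on every write
	i := 0       // bucket choice elided; the real code hashes the key
	m.buckets[i] = append(m.buckets[i], kv{key, val})
}

func main() {
	m := &toyMap{buckets: make([][]kv, 2)}
	m.buckets[0] = []kv{{"a", 1}}
	m.buckets[1] = []kv{{"b", 2}}
	m.startGrow()
	fmt.Println("growing:", m.oldBuckets != nil) // growing: true
	m.set("c", 3)                                // evacuates old bucket 0
	m.set("d", 4)                                // evacuates old bucket 1
	fmt.Println("growing:", m.oldBuckets != nil) // growing: false
}
```

No single `set` call pays for the whole resize, which is the property that keeps tail latency low.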
🗝️ About the hint in make(map[K]V, hint)
When you write something like:
```go
x := make(map[int]int, 13)
```
the second argument (13 in this case) is not the length, size, or capacity of the map. It is just a hint to the runtime about how many elements you expect to insert.
Go maps use buckets internally, and the runtime uses the hint to decide how many buckets to preallocate. Each bucket can hold up to 8 key/value pairs. Go tries to keep the average load factor around 6.5 elements per bucket before growing.
For example:
- With a hint of 13, the runtime starts with 2 buckets (capacity for 16 slots). 13 elements across 2 buckets is 6.5 per bucket, right at the threshold.
- With a hint of 14, the runtime starts with 4 buckets (capacity for 32 slots). 14 elements across 2 buckets would be 7 per bucket, which exceeds the load-factor limit, so the bucket count doubles.

This is why the hint value causes jumps in allocation instead of scaling smoothly.
Take a look at this table:

| Hint Range | Bucket Count | Capacity |
|---|---|---|
| 0 – 8 | 1 | 8 |
| 9 – 13 | 2 | 16 |
| 14 – 26 | 4 | 32 |
| 27 – 52 | 8 | 64 |
| 53 – 104 | 16 | 128 |
| 105 – 208 | 32 | 256 |
| 209 – 416 | 64 | 512 |
| 417 – 832 | 128 | 1024 |
| 833 – 1664 | 256 | 2048 |
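The jumps in the table can be reproduced with a few lines that mirror the 6.5 (that is, 13/2) load-factor rule. This is a reconstruction of the sizing check in `runtime/map.go` for illustration, not the runtime's actual code, which uses bit-shift arithmetic over the bucket-count exponent.

```go
package main

import "fmt"

// bucketsForHint mirrors the runtime's sizing rule: keep doubling the
// bucket count while the hint would exceed 6.5 elements per bucket,
// unless the hint already fits in a single 8-slot bucket.
func bucketsForHint(hint int) int {
	buckets := 1
	for hint > 8 && hint*2 > 13*buckets { // i.e. hint/buckets > 6.5
		buckets *= 2
	}
	return buckets
}

func main() {
	for _, hint := range []int{8, 13, 14, 26, 27, 104, 105} {
		b := bucketsForHint(hint)
		fmt.Printf("hint=%3d -> %3d buckets (%4d slots)\n", hint, b, b*8)
	}
}
```

Running this reproduces the boundaries in the table above: 13 stays at 2 buckets, 14 jumps to 4, 104 stays at 16, 105 jumps to 32.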
What's Next?
We've covered a lot of ground in this article, from hashing and overflows to the incremental growth of maps. I hope you now have a better appreciation for the sophisticated engineering that makes Go maps so powerful and efficient. 🧠💡
But we're not done yet! In the next part of this series, we'll get our hands dirty and dive into the Go source code to see exactly how all of these concepts are implemented. It's going to be a fun and enlightening journey, so stay tuned! 💻🔍