DEV Community

Cover image for 🗺️ Go Maps Deep Dive (no_swiss) — Part 1: The Secrets Behind O(1) Performance, Overflows, and Growth
arshia_rgh
arshia_rgh

Posted on • Edited on

🗺️ Go Maps Deep Dive (no_swiss) — Part 1: The Secrets Behind O(1) Performance, Overflows, and Growth

Note: This article focuses on the old Go map implementation (map_noswiss). In later parts of this series, we will also cover the Swiss map (map_swiss).

Welcome back to our deep dive into Go maps! 👋 In the first part of this series, we covered the basics of what Go maps are, their properties, and how to use them. Now, it's time to go deeper and uncover the magic that makes maps so efficient. 🗺️✨

In this article, we'll explore the inner workings of Go maps to understand:

  • Why map operations are usually O(1) and what can affect this performance. ⏱️
  • How Go handles hash collisions with "overflow" buckets. 📦➡️📦
  • The clever way maps grow without bringing your application to a halt. 🌱
  • The role of a unique "seed" in the hashing process. 🎲

Let's get started!

The O(1) Magic: It's All About Hashing

You've probably heard that map lookups, insertions, and deletions are O(1) operations, meaning they take constant time on average, regardless of the number of elements in the map. But how is this possible? The secret lies in hashing. 🔑➡️🔢

When you add a key-value pair to a map, Go uses a hash function to convert the key into a number. This number, or "hash," is then used to determine which "bucket" the key-value pair should be stored in. A bucket is essentially a small, fixed-size array that can hold a few key-value pairs.

Because the hash function is designed to be very fast and to distribute keys evenly across the available buckets, Go can quickly locate the correct bucket for a given key without having to check every element in the map. This is what gives maps their O(1) performance. 💪

However, I said usually O(1). What happens when two different keys generate the same hash? This is called a hash collision, and it can slightly degrade performance. We'll explore this more in the section on overflows. 💥

The Unique Seed: Why the Same Key Can Live in Different Places

Here's a fun fact that might surprise you: if you create two different maps and insert the same key into both, that key might end up in different buckets in each map. 🤯

Why? Because every new map in Go is created with a unique random seed. This seed is used in the hashing algorithm, which means that the hash generated for a key is unique to that specific map instance.

Let's look at an example:

package main

import "fmt"

func main() {
    map1 := make(map[string]int)
    map2 := make(map[string]int)

    map1["a"] = 1
    map2["a"] = 1

    // The internal location of "a" in map1 and map2
    // will likely be different!
}
Enter fullscreen mode Exit fullscreen mode

This randomization is a security feature that helps prevent a type of attack called "hash-flooding," where an attacker could craft keys that all hash to the same bucket, effectively turning the map's O(1) operations into O(n) and slowing down your application. 🛡️

When Buckets Get Full: The Role of Overflows

Each bucket in a Go map can hold up to 8 key-value pairs. But what happens when more than 8 keys hash to the same bucket? Go doesn't just give up. Instead, it creates an overflow bucket and links it to the original bucket. 🔗

A simplified visualization of a hash table bucket with an overflow bucket attached, showing keys spilling into the overflow when the primary bucket is full.

If you have many keys that hash to the same bucket, you can end up with a chain of overflow buckets. When looking up a key in this scenario, Go first checks the main bucket and then has to traverse the chain of overflow buckets. This is one of the reasons why map operations are not always O(1). In the worst-case scenario (many collisions), a lookup could degrade to O(n) performance, where 'n' is the number of items in the bucket and its overflows. 🐢

The Growing Map: An Incremental Approach

As you add more and more elements to a map, it will eventually need to grow to accommodate the new data and maintain its performance. A map in Go will trigger a growth cycle in one of two situations:

The load factor is too high:

If the average number of elements per bucket exceeds a certain threshold (currently around 6.5), the map will grow. 📈

Too many overflow buckets:

If the map has too many overflow buckets, it's a sign that the keys are not well-distributed, and a resize is needed to spread them out more evenly. 🧹

But how does this growth happen? If you have a map with thousands or even millions of key-value pairs, resizing it all at once could cause a noticeable pause in your application. To avoid this, Go employs a clever strategy: incremental growth. 🚀

When a map needs to grow, Go allocates a new, larger array of buckets (usually double the size of the old one). However, it doesn't copy all the data over at once. Instead, the copying happens gradually. Each time you write to or delete from the map, Go moves a few buckets from the old array to the new one. ➡️➡️➡️

This incremental approach spreads the work of resizing over time, ensuring that no single map operation takes too long. It's a fantastic example of how Go is designed for building responsive, low-latency applications. 💖

🗝️About the hint in make(map[K]V, hint)

When you write something like:

x := make(map[int]int, 13)
Enter fullscreen mode Exit fullscreen mode

the second argument (13 in this case) is not the length, size, or capacity of the map. It is just a hint to the runtime about how many elements you expect to insert.

Go maps use buckets internally, and the runtime uses the hint to decide how many buckets to preallocate. Each bucket can hold up to 8 key/value pairs. Go tries to keep the average load factor around 6.5 elements per bucket before growing.

For example,

  • With a hint of 13, the runtime will start with 2 buckets (capacity for 16 slots). Since 13 / 2 = 6.5, this is right at the threshold.

  • If the hint were 14, the runtime would start with 4 buckets (capacity for 32 slots). That’s because 14 / 2 = 7 would exceed the load factor limit, so it doubles the bucket count.

This is why the hint value causes jumps in allocation instead of scaling smoothly.

take a look at this table:

Hint Range Bucket Count Capacity
0 – 8 1 8
9 – 13 2 16
14 – 26 4 32
27 – 52 8 64
53 – 104 16 128
105 – 208 32 256
209 – 416 64 512
417 – 832 128 1024
833 – 1664 256 2048

What's Next?

We've covered a lot of ground in this article, from hashing and overflows to the incremental growth of maps. I hope you now have a better appreciation for the sophisticated engineering that makes Go maps so powerful and efficient. 🧠💡

But we're not done yet! In the next part of this series, we'll get our hands dirty and dive into the Go source code to see exactly how all of these concepts are implemented. It's going to be a fun and enlightening journey, so stay tuned! 💻🔍

Top comments (7)

Collapse
 
alireza_ahmadi_55a36a3c3b profile image
Alireza Ahmadi

This is a fantastic deep dive into Go maps! The explanation of how hashing works behind the scenes, the management of overflow buckets, and the gradual growth strategy of maps is incredibly insightful. I especially appreciate the point on using random seeds in the hashing algorithm for security purposes to prevent hash-flooding attacks. For Go developers and anyone interested in performance optimization, this article is a must-read. Looking forward to part 2!

Collapse
 
arshiargh profile image
arshia_rgh

Thank you so much for your thoughtful comment, Alireza! I really appreciate it, and I’m very happy to hear the article was useful for you. I truly enjoy working with Go and its internals, and I hope you enjoy exploring it as much as I do. Your encouragement means a lot—I’ll do my best to make part 2 just as insightful!

Collapse
 
ryansgi profile image
Ryan B

Love these kinds of deep dives. Keep them coming, more folks should know the inner workings of Go!

Collapse
 
arshiargh profile image
arshia_rgh

Thank you so much for your comment!

Golang internals are my favorite as well, and I really enjoy exploring them. Stay tuned for the next part on map internals—we’ll dive deep into the source code behind everything I covered in the first two parts.

Collapse
 
andreis profile image
Andrei Simionescu

The language is called Go, not Golang, which the AI slop article got right, but its author apparently doesn't.

Collapse
 
smurf_j_c2342046c5106ca profile image
smurf j

I think i have found the right place for me in the Dev.to

Collapse
 
arshiargh profile image
arshia_rgh

I'm happy to hear that:)