Read this to understand how sync.Map
works internally: https://sreramk.medium.com/go-inside-sync-map-how-does-sync-map-work-internally-97e87b8e6bf
To anyone new to concurrency, it might not seem obvious that just the methods Load
, Store
and Delete
cannot be used to perform the basic read, write and delete operations commonly done using the regular non-thread-safe map
from within a single goroutine. Theoretically, without LoadOrStore
, it is impossible to have multiple goroutines simultaneously create a record associated with a key precisely once without the use of additional locks. The same way, without LoadAndDelete
it is impossible for a goroutine to delete and retrieve the most recently stored (or modified) record without using additional locks to synchronize Load
and Delete
.
Let’s try to understand why.
Using sync.Map to store integers that can be incremented or decremented concurrently:
In the following example, the structure Counter
is supposed to hold integer values associated with a string. These values are update-able, and lets you add arbitrary values to them. Even when these values are updated concurrently, the final value stored by any key will always be accurate. This can only be achieved with LoadAndDelete
and LoadOrStore
, or we may have to explicitly synchronize calls to Load
and Store
with locks:
package countableimport (
"sync"
"sync/atomic"
)
// Counter Stores counts associated with a key.
type Counter struct {
m sync.Map
}
// Get Retrieves the count without modifying it
func (c *Counter) Get(key string) (int64, bool) {
count, ok := c.m.Load(key)
if ok {
return atomic.LoadInt64(count.(*int64)), true
}
return 0, false
}
// Add Adds value to the stored underlying value if it exists.
// If it does not exist, the value is assigned to the key.
func (c *Counter) Add(key string, value int64) int64 {
count, loaded := c.m.LoadOrStore(key, &value)
if loaded {
return atomic.AddInt64(count.(*int64), value)
}
return *count.(*int64)
}
// DeleteAndGetLastValue Deletes the value associated with the key and retrieves it.
func (c *Counter) DeleteAndGetLastValue(key string) (int64, bool) {
lastValue, loaded := c.m.LoadAndDelete(key)
if loaded {
return *lastValue.(*int64), loaded
}
return 0, false
}
The Add
operation:
In the method Add
, the LoadOrStore
operation guarantees that a pointer of type *int64
will be written precisely once to the store if it does not exist regardless of the number of goroutines trying to perform the add operation with the same key. Let’s see a pseudo code that uses Load
and Store
instead of LoadOrStore
to do the same, and demonstrate why this cannot be done without risking race conditions (unless we use locks).
Function AddUnsafe(key, value):
1. addressOfPreviousValue := Load(key)
2. check condition: addressOfPreviousValue is nil (i.e., does not exists)
3. If the condition at (2) was true: Store(key, addressOf(value))
4. If the condition at (2) was false: AtomicAdd(addressOfPreviousValue, value)
An example of a race condition that results from calling AddUnsafe
from multiple goroutines simultaneously:
// Go 1 - represents the first goroutine
// Go 2 - represents the second goroutine1. Go 1: addressOfPreviousValue := Load(key)
2. Go 2: addressOfPreviousValue := Load(key)
3. Go 1: check condition: addressOfPreviousValue is nil
4. Go 2: check condition: addressOfPreviousValue is nil
5. Go 2: The condition (2) was true: Store(key, addressOf(value2))
6. Go 1: The condition (2) was true: Store(key, addressOf(value1))
In 5 and 6 the record was updated twice, first by goroutine 2 and then by goroutine 1. With update 6, the update at 5 will be completely lost, and the final count will always be incorrect as it would have disregarded goroutine 2’s update. This can only be fixed by introducing a lock but doing so defeats the purpose of using sync.Map
in the first place.
By using LoadOrStore
as in the below method, we can ensure that regardless of the number of goroutines simultaneously updating a record, the state stored by Counter
will always be consistent.
// Add Adds value to the stored underlying value if it exists.
// If it does not exist, the value is assigned to the key.
func (c *Counter) Add(key string, value int64) int64 {
count, loaded := c.m.LoadOrStore(key, &value)
if loaded {
return atomic.AddInt64(count.(*int64), value)
}
return *count.(*int64)
}
LoadOrStore
ensures that a record associated with a key is created for the first time precisely once, regardless of how many goroutines simultaneously tries to create a record. If Add
wasn’t an update operation that depends on the previous state, we might be better off with just using Store
.
If there was no record associated with the given key, a pointer to the new record is stored. If the previous value was value
, then the next value held by the key will be value + 1
. If we were to just use Load
, there would always be a time-gap between calling Load
and computing the next state before storing it. Within this time-gap, multiple goroutines could have updated the storage multiple times all of which will be lost when the update completes (following the initial load operation). Therefore, whenever we may have to use a state transition operation that has its next state dependent on its previous state, the record stored in sync.Map
should use an internal structure that uses locks to synchronize updates. If locks aren't used to synchronize access to the store, it must be made immutable.
sync.Map
Guarantees the reads and writes to be thread-safe, but it is the user’s responsibility to make the underlying stored structure thread safe. In the above example, we had used atomic.AddInt64
, along with atomic.LoadInt64
to ensure the access to the stored pointer is atomic.
The DeleteAndGetLastValue
operation:
A similar line of reasoning used for LoadOrStore
can be applied to understand why LoadAndDelete
will be useful. If a method attempts to delete a record and retrieve its most recent value, it would be theoretically impossible without LoadAndDelete
unless we use locks to synchronize Load
followed by Delete
.
The following method guarantees that the returned value (after deleting the record) was the most recent one, before it was deleted.
// DeleteAndGetLastValue Deletes the value associated with the key and retrieves it.
func (c *Counter) DeleteAndGetLastValue(key string) (int64, bool) {
lastValue, loaded := c.m.LoadAndDelete(key)
if loaded {
return *lastValue.(*int64), loaded
}
return 0, false
}
If we have to use Load
and Delete
to accomplish the above, there is always a possibility that a different goroutine might read a value at some time, and delete it at a later time, in-between which any number of goroutines could have updated the value. Therefore, when we return the loaded value after the call to Delete
we cannot be certain that there weren’t any updates prior to its call so the returned value could be outdated.
Effectively designing with async.Map
:
As mentioned before, async.Map
guarantees that the records can be read from and written to atomically. But it cannot control the objects stored within it, the same way it can control the access to them.
Therefore, in cases where the stored data is mutable, we may have to use locks to synchronize the operations performed by the structure. Or, if possible, the storage can be immutable, in which case no reference from outside will ever be able to modify its value.
Real world example use-cases for LoadAndDelete
and LoadOrStore
:
A comment on the original issue on github proposing to add LoadAndDelete
provides a really good example which is discussed below:
The discussion can be found here : https://github.com/golang/go/issues/33762
The issue was a proposal to add LoadAndDelete
, and therefore serves as a really convincing example for its necessity.
The issue’s author had a situation where he had to use a cache to store database records, to prevent calling an expensive operation to insert or remove the data from the database. For this, he could easily implement the insertion part, with just the use of LoadOrStore
as given below:
func (self *Cache) Add(key interface{}, value interface{}) {
_, loaded := self.syncMap.LoadOrStore(key, value)
if !loaded {
self.database.Insert(key, value) // expensive operation
}
}
When the above is being called by multiple goroutines, the write to the database happens precisely once as LoadOrStore
guarantees to add data precisely once. Therefore, the expensive call to insert the data into the database is executed only once. This is actually a very specific (and possibly rare) situation where all the goroutines are trying to write the exact same data. It is usually common to have each goroutine write different data to the same store.
Before Go had introduced the LoadAndDelete
method, the author didn’t have a definite way to implement something similar for his Remove(key)
method. And that lead him to propose LoadAndDelete
which makes the following possible without additional locks:
func (self *Cache) Remove(key interface{}) {
_, removed := self.syncMap.LoadAndDelete(key) // proposed function
if removed {
self.database.Delete(key) // expensive operation
}
}
Calling both Add
and Remove
concurrently might lead to race conditions. As an example,
// Go 1 - run Add(...)
// Go 2 - run Remove(...)
Go 1: _, loaded := self.syncMap.LoadOrStore(key, value)
Go 2: _, removed := self.syncMap.LoadAndDelete(key)
Go 2: if removed { // false
Go 1: if !loaded { // true (loaded -> false)
Go 1: self.database.Insert(key, value)
// In this case, the data-base will have an inserted record without
// the cache having the same record in its local store
Therefore, the methods Add
and Remove
are not supposed to be run concurrently. The issue’s author specifically mentioned that the Add
operations will first be run concurrently by multiple goroutines, followed by which the Remove
operations are run by the goroutines long after the calls to the Add
operations were completed.
This specific case clearly exemplifies a likely real-world scenario.
Code and other resources
The code for this article can be found here: https://github.com/sreramk/forblogarticles/tree/main/article_1
These test cases check for possible race conditions in the code: https://github.com/sreramk/forblogarticles/blob/main/article_1/countable_test.go
Dear reader,
I am available for remote contract jobs related to Go or any other technology. I write highly scalable microservices, and I can also help you with implementing reliable AI/ML solutions. Please email me here sreramk1@gmail.com if you are interested.
Also, please feel free to email me for any reason, other than business as well!
Sincerely,
Sreram K
mirror: https://sreramk.medium.com/go-why-does-sync-map-97342f12b3fa
Top comments (0)