DEV Community

Sreram K
Sreram K

Posted on • Updated on

Go: sync.Map's LoadAndDelete and LoadOrStore. Why are they needed?

Read this to understand how sync.Map works internally: https://sreramk.medium.com/go-inside-sync-map-how-does-sync-map-work-internally-97e87b8e6bf

To anyone new to concurrency, it might not seem obvious that just the methods Load , Store and Delete cannot be used to perform the basic read, write and delete operations commonly done using the regular non-thread-safe map from within a single goroutine. Theoretically, without LoadOrStore, it is impossible to have multiple goroutines simultaneously create a record associated with a key precisely once without the use of additional locks. The same way, without LoadAndDelete it is impossible for a goroutine to delete and retrieve the most recently stored (or modified) record without using additional locks to synchronize Load and Delete.

Let’s try to understand why.

Using sync.Map to store integers that can be incremented or decremented concurrently:

In the following example, the structure Counter is supposed to hold integer values associated with a string. These values are update-able, and lets you add arbitrary values to them. Even when these values are updated concurrently, the final value stored by any key will always be accurate. This can only be achieved with LoadAndDelete and LoadOrStore, or we may have to explicitly synchronize calls to Load and Store with locks:

package countableimport (
    "sync"
    "sync/atomic"
)

// Counter Stores counts associated with a key.
type Counter struct {
    m sync.Map
}

// Get Retrieves the count without modifying it
func (c *Counter) Get(key string) (int64, bool) {
    count, ok := c.m.Load(key)
    if ok {
        return atomic.LoadInt64(count.(*int64)), true
    }
    return 0, false
}

// Add Adds value to the stored underlying value if it exists. 
// If it does not exist, the value is assigned to the key.
func (c *Counter) Add(key string, value int64) int64 {

    count, loaded := c.m.LoadOrStore(key, &value)
    if loaded {
        return atomic.AddInt64(count.(*int64), value)
    }
    return *count.(*int64)
}

// DeleteAndGetLastValue Deletes the value associated with the key and retrieves it.
func (c *Counter) DeleteAndGetLastValue(key string) (int64, bool) {

    lastValue, loaded := c.m.LoadAndDelete(key)
    if loaded {
        return *lastValue.(*int64), loaded
    }
    return 0, false
}

Enter fullscreen mode Exit fullscreen mode

The Add operation:

In the method Add, the LoadOrStore operation guarantees that a pointer of type *int64 will be written precisely once to the store if it does not exist regardless of the number of goroutines trying to perform the add operation with the same key. Let’s see a pseudo code that uses Load and Store instead of LoadOrStore to do the same, and demonstrate why this cannot be done without risking race conditions (unless we use locks).

Function AddUnsafe(key, value):
    1. addressOfPreviousValue := Load(key)
    2. check condition: addressOfPreviousValue is nil (i.e., does not exists)
    3. If the condition at (2) was true: Store(key, addressOf(value))
    4. If the condition at (2) was false: AtomicAdd(addressOfPreviousValue, value)
Enter fullscreen mode Exit fullscreen mode

An example of a race condition that results from calling AddUnsafe from multiple goroutines simultaneously:

// Go 1 - represents the first goroutine
// Go 2 - represents the second goroutine1. Go 1: addressOfPreviousValue := Load(key)
2. Go 2: addressOfPreviousValue := Load(key)
3. Go 1: check condition: addressOfPreviousValue is nil
4. Go 2: check condition: addressOfPreviousValue is nil
5. Go 2: The condition (2) was true: Store(key, addressOf(value2))
6. Go 1: The condition (2) was true: Store(key, addressOf(value1))
Enter fullscreen mode Exit fullscreen mode

In 5 and 6 the record was updated twice, first by goroutine 2 and then by goroutine 1. With update 6, the update at 5 will be completely lost, and the final count will always be incorrect as it would have disregarded goroutine 2’s update. This can only be fixed by introducing a lock but doing so defeats the purpose of using sync.Map in the first place.

By using LoadOrStore as in the below method, we can ensure that regardless of the number of goroutines simultaneously updating a record, the state stored by Counter will always be consistent.

// Add Adds value to the stored underlying value if it exists. 
// If it does not exist, the value is assigned to the key.
func (c *Counter) Add(key string, value int64) int64 {

    count, loaded := c.m.LoadOrStore(key, &value)
    if loaded {
        return atomic.AddInt64(count.(*int64), value)
    }
    return *count.(*int64)
}
Enter fullscreen mode Exit fullscreen mode

LoadOrStore ensures that a record associated with a key is created for the first time precisely once, regardless of how many goroutines simultaneously tries to create a record. If Add wasn’t an update operation that depends on the previous state, we might be better off with just using Store.

If there was no record associated with the given key, a pointer to the new record is stored. If the previous value was value, then the next value held by the key will be value + 1 . If we were to just use Load , there would always be a time-gap between calling Load and computing the next state before storing it. Within this time-gap, multiple goroutines could have updated the storage multiple times all of which will be lost when the update completes (following the initial load operation). Therefore, whenever we may have to use a state transition operation that has its next state dependent on its previous state, the record stored in sync.Map should use an internal structure that uses locks to synchronize updates. If locks aren't used to synchronize access to the store, it must be made immutable.

sync.Map Guarantees the reads and writes to be thread-safe, but it is the user’s responsibility to make the underlying stored structure thread safe. In the above example, we had used atomic.AddInt64 , along with atomic.LoadInt64 to ensure the access to the stored pointer is atomic.

The DeleteAndGetLastValue operation:

A similar line of reasoning used for LoadOrStore can be applied to understand why LoadAndDelete will be useful. If a method attempts to delete a record and retrieve its most recent value, it would be theoretically impossible without LoadAndDelete unless we use locks to synchronize Load followed by Delete.

The following method guarantees that the returned value (after deleting the record) was the most recent one, before it was deleted.

// DeleteAndGetLastValue Deletes the value associated with the key and retrieves it.
func (c *Counter) DeleteAndGetLastValue(key string) (int64, bool) {

    lastValue, loaded := c.m.LoadAndDelete(key)
    if loaded {
        return *lastValue.(*int64), loaded
    }
    return 0, false
}
Enter fullscreen mode Exit fullscreen mode

If we have to use Load and Delete to accomplish the above, there is always a possibility that a different goroutine might read a value at some time, and delete it at a later time, in-between which any number of goroutines could have updated the value. Therefore, when we return the loaded value after the call to Delete we cannot be certain that there weren’t any updates prior to its call so the returned value could be outdated.

Effectively designing with async.Map:

As mentioned before, async.Map guarantees that the records can be read from and written to atomically. But it cannot control the objects stored within it, the same way it can control the access to them.

Therefore, in cases where the stored data is mutable, we may have to use locks to synchronize the operations performed by the structure. Or, if possible, the storage can be immutable, in which case no reference from outside will ever be able to modify its value.

Real world example use-cases for LoadAndDelete and LoadOrStore:

A comment on the original issue on github proposing to add LoadAndDelete provides a really good example which is discussed below:

The discussion can be found here : https://github.com/golang/go/issues/33762

The issue was a proposal to add LoadAndDelete, and therefore serves as a really convincing example for its necessity.

The issue’s author had a situation where he had to use a cache to store database records, to prevent calling an expensive operation to insert or remove the data from the database. For this, he could easily implement the insertion part, with just the use of LoadOrStore as given below:

func (self *Cache) Add(key interface{}, value interface{}) {
  _, loaded := self.syncMap.LoadOrStore(key, value)
  if !loaded {
    self.database.Insert(key, value) // expensive operation
  }
}
Enter fullscreen mode Exit fullscreen mode

When the above is being called by multiple goroutines, the write to the database happens precisely once as LoadOrStore guarantees to add data precisely once. Therefore, the expensive call to insert the data into the database is executed only once. This is actually a very specific (and possibly rare) situation where all the goroutines are trying to write the exact same data. It is usually common to have each goroutine write different data to the same store.

Before Go had introduced the LoadAndDelete method, the author didn’t have a definite way to implement something similar for his Remove(key) method. And that lead him to propose LoadAndDelete which makes the following possible without additional locks:

func (self *Cache) Remove(key interface{}) {
  _, removed := self.syncMap.LoadAndDelete(key) // proposed function
  if removed {
    self.database.Delete(key) // expensive operation
  }
}
Enter fullscreen mode Exit fullscreen mode

Calling both Add and Remove concurrently might lead to race conditions. As an example,

// Go 1 - run Add(...)
// Go 2 - run Remove(...)
Go 1:  _, loaded := self.syncMap.LoadOrStore(key, value)
Go 2: _, removed := self.syncMap.LoadAndDelete(key)
Go 2: if removed { // false
Go 1: if !loaded { // true (loaded -> false)
Go 1: self.database.Insert(key, value)
// In this case, the data-base will have an inserted record without 
// the cache having the same record in its local store
Enter fullscreen mode Exit fullscreen mode

Therefore, the methods Add and Remove are not supposed to be run concurrently. The issue’s author specifically mentioned that the Add operations will first be run concurrently by multiple goroutines, followed by which the Remove operations are run by the goroutines long after the calls to the Add operations were completed.

This specific case clearly exemplifies a likely real-world scenario.

Code and other resources

The code for this article can be found here: https://github.com/sreramk/forblogarticles/tree/main/article_1

These test cases check for possible race conditions in the code: https://github.com/sreramk/forblogarticles/blob/main/article_1/countable_test.go


Dear reader,

I am available for remote contract jobs related to Go or any other technology. I write highly scalable microservices, and I can also help you with implementing reliable AI/ML solutions. Please email me here sreramk1@gmail.com if you are interested.

Also, please feel free to email me for any reason, other than business as well!

Sincerely,
Sreram K

mirror: https://sreramk.medium.com/go-why-does-sync-map-97342f12b3fa


See also: Go: sync.RWMutex internals and usage explained

Discussion (0)