Just because you're sharing data structures between threads doesn't mean you have to use locks.
Over the years I've come across a few approaches that help coordinate work between threads without having to make anyone wait. Here are a couple of them.
WARNING: I stopped using these patterns because they are easy to misunderstand and misuse, which can lead to random, hard-to-diagnose bugs. If you use these patterns, you're not helping the next developer understand the code better or avoid mistakes. You have been warned.
Reference to an immutable object
We know that in a multithreaded environment, modifying a shared object without locks is a bad idea. But what about replacing it?
For example, consider this code. I'll use C#, but the same applies in Java, C++, or another traditional language with threads.
public class FunWithThreads {
    private Thing thing;

    public void InThreadOne() {
        this.thing = new Thing();
    }

    public Result InThreadTwo() {
        var localThing = this.thing;
        var result1 = SomeCalculation1(localThing);
        var result2 = SomeCalculation2(localThing);
        var finalResult = DoMoreWork(localThing, result1, result2);
        return finalResult;
    }
}
Suppose lots of threads are concurrently calling InThreadOne(), while lots of other threads are concurrently calling InThreadTwo(). Will we ever get a wrong result, a race condition, or a conflict?
Not really. InThreadTwo() always refers to a consistent version of this.thing because it places the reference to it into a local variable, and from then on uses only the local reference. If the calculations take too long, then maybe we get a result that's a bit stale, but that result will always be consistent.
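Since the same idea carries over to Java, here's a minimal Java sketch of the pattern. The PriceCache class and its field names are hypothetical, invented for illustration; note that in Java the shared field should be declared volatile so that a reader thread is guaranteed to see the newly published reference (and everything the writer did before publishing it).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache illustrating "replace, don't modify".
class PriceCache {
    // volatile: readers reliably observe the reference swap.
    private volatile Map<String, Double> prices = Collections.emptyMap();

    // Writer thread: build a brand-new immutable map,
    // then publish it with a single atomic reference assignment.
    void refresh(Map<String, Double> latest) {
        prices = Collections.unmodifiableMap(new HashMap<>(latest));
    }

    // Reader thread: grab the reference once into a local,
    // then use only the local so all reads see one consistent version.
    double spread(String a, String b) {
        Map<String, Double> local = prices;
        return local.getOrDefault(a, 0.0) - local.getOrDefault(b, 0.0);
    }
}
```

As in the C# version, a reader that races with refresh() may see the previous map, but never a half-updated one.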
Where it's useful: I found this approach to be useful whenever this.thing is a large, pre-computed data structure that gets infrequently updated but frequently read. Mostly I've used it to cache some data for internal capital markets applications to make the application feel snappier to the user.
The pitfalls: It's really important to make sure that this.thing is referenced only once at the beginning of the method, or else we're back to race conditions. It's also really important that this.thing is never modified after it's assigned. That means lots of descriptive comments, very careful code reviews, and making sure that every developer who ever looks at this code knows all the gotchas. In a larger codebase, someone (maybe even you!) might still find a way to make a mistake somewhere.
Complex atomic operations
The above code works because in C# (and in most other languages) assignment of a reference is an atomic operation, i.e. while it's executing, another thread can't sneak in mid-way and do something else.
Modern processors have a few other interesting atomic operations, like compare-and-swap, implemented in C# as Interlocked.CompareExchange() and in Java as AtomicInteger.compareAndSet(). The C# version takes three arguments and does the following as an atomic operation:
- Compares the first and the third arguments, then
- If they're equal, sets the first argument to be equal to the second argument, or
- If they're not equal, does nothing.
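The Java equivalent is worth a quick demo, since its argument order differs from C#: compareAndSet(expected, update) takes only two arguments, compares the current value against expected, and sets it to update only if they match, returning whether the swap happened. The demo class name below is mine, not from any library.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CompareAndSetDemo {
    public static void main(String[] args) {
        AtomicInteger flag = new AtomicInteger(0);

        // Current value equals the expected 0, so it becomes 1
        // and the call returns true.
        System.out.println(flag.compareAndSet(0, 1)); // prints "true"

        // Current value is now 1, not the expected 0,
        // so nothing changes and the call returns false.
        System.out.println(flag.compareAndSet(0, 1)); // prints "false"

        System.out.println(flag.get()); // prints "1"
    }
}
```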
One case where this can be used is when trying to limit requests for an expensive operation on a limited resource. For example, this code will drop any requests if the expensive operation is already running:
// Let's pretend this class is a singleton.
public class ExpensiveOperationLimiter {
    private int isProcessing = 0;

    public void RequestExpensiveOperation() {
        // If isProcessing == 1, this does nothing and returns 1.
        // If isProcessing == 0, this sets isProcessing to 1 and returns 0.
        var wasProcessing = Interlocked.CompareExchange(ref isProcessing, 1, 0);
        if (wasProcessing == 0) {
            try {
                // Only one invocation will be running at a time.
                DoExpensiveOperation();
            } finally {
                // Clear the flag only on the thread that set it.
                isProcessing = 0;
            }
        }
    }

    private void DoExpensiveOperation() {
        // ...
    }
}
If you look at the above code carefully, you'll see that you can't replace Interlocked.CompareExchange() with a plain read followed by a write and still have thread safety: there will always be an opportunity for another thread to sneak in between the instructions.
With a bit of work, this code can also be expanded to queue up the next request.
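One way that expansion could look, sketched in Java under my own assumptions (the class name and three-state encoding are hypothetical): the flag holds 0 for idle, 1 for running, and 2 for running with one request queued. There remain tiny windows where a request arriving at exactly the wrong moment is still dropped, which matches the original drop-if-busy spirit.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: 0 = idle, 1 = running, 2 = running + one queued request.
class QueueingLimiter {
    private final AtomicInteger state = new AtomicInteger(0);
    private int runCount = 0; // for illustration; touched only by the flag holder

    void requestExpensiveOperation() {
        if (state.compareAndSet(0, 1)) {
            // We won the flag: run, and keep running while requests get queued.
            while (true) {
                doExpensiveOperation();
                if (state.compareAndSet(1, 0)) {
                    break; // nothing was queued while we ran: back to idle
                }
                // state was 2: consume the queued request and run once more.
                state.set(1);
            }
        } else {
            // Someone else is running: remember at most one extra request.
            state.compareAndSet(1, 2);
        }
    }

    private void doExpensiveOperation() {
        runCount++;
    }

    int runCount() {
        return runCount;
    }
}
```

As with the C# limiter, only compare-and-set calls move the flag between states, so at most one thread is ever inside doExpensiveOperation().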
Where it's useful: I've used this kind of code in situations where events can randomly trigger an application server's cache to refresh from a database or from another system.
The pitfalls: Generally, I'd consider this to be a micro-optimization and not worth the effort. Many developers understand or have heard of locks, and have an easier time following them than any code that involves compare-and-set. As before, by using this, you're risking someone (including you!) eventually making a mistake.
The best way to avoid locks
Don't write multithreaded code!
A shameless plug
Check out Maesure, a time tracker that asks you every 10 minutes what you're doing.
CAS warrants mentioning the ABA problem. Personally I'm grateful for platforms with LL/SC.