Understanding Concurrency

When working with multi-threaded programs, concurrency issues can lead to undefined behaviour and unpredictable results. If you've ever encountered strange behaviour in your programs, e.g. a shared counter not reflecting the expected value, you might want to stick around to the end of this post! Here, I'll dive into one of the common pitfalls of concurrency.

What's the Problem?

Concurrency arises when multiple threads execute simultaneously and interact with shared data. A core issue in multi-threaded programming is how threads access and modify shared variables. Without proper synchronization, those accesses can interleave in unexpected ways and produce race conditions.

Consider this simple Rust program:

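use std::thread;

// A mutable global shared by every thread, with no synchronization at all.
static mut COUNTER: i32 = 0;
static LOOPS: i32 = 1000;

fn worker() {
    for _ in 0..LOOPS {
        // Unsynchronized read-modify-write: this is where the race happens.
        unsafe {
            COUNTER += 1;
        }
    }
}

fn main() {
    // Spawn two threads that both run `worker`.
    let handles: Vec<_> = (0..2).map(|_| {
        thread::spawn(|| {
            worker();
        })
    }).collect();

    // Wait for both threads to finish.
    for handle in handles {
        handle.join().unwrap();
    }

    // Copy the value out first so no reference to the `static mut` is created.
    let final_value = unsafe { COUNTER };
    println!("Final value: {}", final_value);
}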

What is going on there?

In this code, two threads are created, each of which increments a shared counter variable a specified number of times. We expect that if LOOPS is set to 1,000, the final value of COUNTER should be 2,000. However, if you run this program you may observe results like 1282 or 1473.

Why do we get these results?

The issue lies in how the counter is updated. Each increment consists of three steps:

1. Load the value of COUNTER from memory.
2. Increment the value.
3. Store the updated value back to memory.

These steps are not atomic. An operation is said to be atomic when it completes in a single, indivisible step relative to other threads, meaning no other thread can observe the operation in an incomplete state. When two threads try to update COUNTER simultaneously, they might both read the same initial value, increment it, and then write back the same result, so one of the two increments is lost!
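To make the lost update concrete, here is a minimal single-threaded sketch that replays one unlucky interleaving by hand. The `load` and `store` helpers and the "thread A" / "thread B" labels are illustrative names only, not part of the program above. No real threads are involved, so the outcome is deterministic.

```rust
/// Step 1: read the current value from "memory".
fn load(memory: &i32) -> i32 {
    *memory
}

/// Step 3: write a value back to "memory".
fn store(memory: &mut i32, value: i32) {
    *memory = value;
}

/// Replays one unlucky interleaving of two increments, step by step,
/// and returns the resulting counter value.
fn lost_update(start: i32) -> i32 {
    let mut counter = start;

    let a = load(&counter); // "thread A" loads the value
    let b = load(&counter); // "thread B" loads the same value

    store(&mut counter, a + 1); // A increments and stores
    store(&mut counter, b + 1); // B increments and stores, overwriting A's write

    counter
}

fn main() {
    // Two increments ran, but the counter only moved by one.
    println!("after two increments: {}", lost_update(0));
}
```

Starting from 0, both "threads" compute 1 and both store 1, so the second increment simply overwrites the first.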

Try compiling and running the code yourself, and give LOOPS some larger values. You'll notice that the final value of COUNTER is often less than expected and varies between runs. This is a direct consequence of a race condition: threads attempting to use and mutate a common resource at the same time. Keep in mind that an unsynchronized data race like this is undefined behaviour in Rust, so an optimized build may behave differently again.
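If you would like to watch updates get lost without relying on `static mut`, the sketch below reproduces the same load, increment, store sequence with an `AtomicI32`, using separate `load` and `store` calls. This is a swapped-in technique rather than the program above: each individual load and store is atomic, so there is no undefined behaviour, but the three-step sequence as a whole is still not atomic, so increments can still be lost. The function name `racy_count` and its parameters are made up for illustration.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns `threads` workers that each perform `loops` increments as a
/// separate load and store, then returns the final counter value.
fn racy_count(threads: i32, loops: i32) -> i32 {
    let counter = Arc::new(AtomicI32::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..loops {
                    let value = counter.load(Ordering::Relaxed); // 1. load
                    let value = value + 1; // 2. increment
                    counter.store(value, Ordering::Relaxed); // 3. store
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    counter.load(Ordering::Relaxed)
}

fn main() {
    let loops = 1_000_000;
    let actual = racy_count(2, loops);
    println!("expected {}, got {}", 2 * loops, actual);
}
```

Replacing the three steps with a single `counter.fetch_add(1, Ordering::Relaxed)` turns the increment into one atomic operation, at which point the lost updates should disappear.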

What now?

Thankfully, our ancient computer sorcerers already solved this for us! Synchronization mechanisms exist for exactly this purpose, and one of the most common is the mutex.

A mutex ensures that only one thread can enter the critical section (i.e. the section prone to race conditions) at a time. So now we can take our worker function and make its increment effectively atomic:

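use std::sync::{Arc, Mutex};
use std::thread;

static LOOPS: i32 = 1000;

fn worker(counter: Arc<Mutex<i32>>) {
    for _ in 0..LOOPS {
        // Acquire the lock; it is released when `num` goes out of scope
        // at the end of each iteration.
        let mut num = counter.lock().unwrap();
        *num += 1;
    }
}

fn main() {
    // The counter lives behind a Mutex, shared between threads via Arc.
    let counter = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..2).map(|_| {
        // Each thread gets its own handle to the same counter.
        let counter = Arc::clone(&counter);
        thread::spawn(move || {
            worker(counter);
        })
    }).collect();

    // Wait for both threads to finish.
    for handle in handles {
        handle.join().unwrap();
    }

    println!("Final value is: {}", *counter.lock().unwrap());
}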

Try running it this time. I bet a dollar that the output will be 2,000. Now for the explanation: by wrapping the counter in Arc and Mutex, we ensure that our shared counter is properly handled.

Arc, or Atomically Reference Counted, enables multiple threads to share ownership of the same value (Mutex<i32> in this case). It keeps the data alive as long as there are references to it, and cleans it up once the last reference is gone!
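As a rough illustration of that reference counting, the sketch below records `Arc::strong_count` at three points: before cloning, after cloning, and after the thread holding the clone has finished. The function name `owner_counts` is hypothetical and only exists for this example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Returns the number of owners (before clone, after clone, after the
/// thread that received the clone has been joined).
fn owner_counts() -> (usize, usize, usize) {
    let counter = Arc::new(Mutex::new(0));
    let before = Arc::strong_count(&counter);

    // Cloning an Arc creates a new handle to the same data.
    let for_thread = Arc::clone(&counter);
    let after_clone = Arc::strong_count(&counter);

    // The clone is moved into the thread and dropped when the thread ends.
    thread::spawn(move || {
        *for_thread.lock().unwrap() += 1;
    })
    .join()
    .unwrap();
    let after_join = Arc::strong_count(&counter);

    (before, after_clone, after_join)
}

fn main() {
    let (before, after_clone, after_join) = owner_counts();
    println!("owners: {} -> {} -> {}", before, after_clone, after_join);
}
```

The count goes from 1 to 2 and back to 1; once the last handle is dropped, the Mutex and the integer inside it are freed.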

So, this code works because Arc ensures safe shared ownership and Mutex prevents simultaneous alterations!
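One detail worth seeing in isolation is when the lock is actually released: `lock()` returns a guard, and the mutex stays locked until that guard goes out of scope. The sketch below, with the made-up function name `lock_availability`, probes the mutex with the non-blocking `try_lock` while a guard is alive and again after it has been dropped.

```rust
use std::sync::Mutex;

/// Returns whether the lock could be taken (a) while a guard is alive
/// and (b) after that guard has gone out of scope.
fn lock_availability() -> (bool, bool) {
    let counter = Mutex::new(0);

    let while_held = {
        let mut guard = counter.lock().unwrap(); // lock acquired
        *guard += 1;
        // try_lock does not block: it returns Err while the lock is held.
        let available = counter.try_lock().is_ok();
        available
    }; // guard dropped here, lock released

    let after_release = counter.try_lock().is_ok();
    (while_held, after_release)
}

fn main() {
    let (while_held, after_release) = lock_availability();
    println!("available while held: {}", while_held);
    println!("available after release: {}", after_release);
}
```

This is why the worker function above never calls an explicit unlock: the guard `num` is dropped at the end of every loop iteration, which releases the lock for the other thread.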

While Mutex prevents race conditions by ensuring mutual exclusion, it comes at the cost of performance overhead due to the locking and unlocking process. It's important to use it judiciously to avoid bottlenecks, or even deadlocks if multiple mutexes are involved.

To see the impact of these synchronization mechanisms, you can run the code with larger LOOPS values and use Rust’s cargo bench to benchmark the performance difference.
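Short of setting up a proper benchmark, a quick way to get a feel for the cost is to time the mutex version with `std::time::Instant`. The sketch below is only a rough wall-clock measurement, not a substitute for a real benchmarking tool, and the function name `timed_mutex_count` and the chosen loop counts are assumptions for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Runs the mutex-protected counter with `threads` workers and returns
/// the final value together with the elapsed wall-clock time.
fn timed_mutex_count(threads: i32, loops: i32) -> (i32, Duration) {
    let counter = Arc::new(Mutex::new(0));
    let start = Instant::now();

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..loops {
                    // Lock, increment, and release on every iteration.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    (total, start.elapsed())
}

fn main() {
    for loops in [1_000, 100_000, 1_000_000] {
        let (total, elapsed) = timed_mutex_count(2, loops);
        println!("LOOPS = {}: total = {}, took {:?}", loops, total, elapsed);
    }
}
```

The totals should always be exact, while the timings will vary by machine and build profile, so compare runs made with the same settings (for example, always with `--release`).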

This exploration was largely inspired by the fantastic book Operating Systems: Three Easy Pieces (OSTEP), which breaks down complex topics like concurrency and synchronization in an approachable way. I suggest checking it out!

Stay safe!