absterdabster

Posted on May 11

What the futex? A linux concurrency fundamental

#programming #cpp #systems #linux

WTF?

Exactly.

What is a futex?

It is a tool that let's your program sleep when there is no work and wake up when work is ready.

Ooooh. That sounds pretty efficient!

(If you read this, comment "WTF" so we can confuse everyone else lol.)

What is a synchronization primitive

For you nerds out there, this may sound somewhat like another tool called conditional variables. Fun fact, in linux, they use futexes internally!

For all the noobs out there, these futexes are what let you synchronize between two threads in a program.

A thread is an execution unit. They are often used to:

Split up work by running in parallel
Do background work in parallel

Why would I use a futex?

Let's say I have a thread that is listening to a user's input for a background task to process.

We don't want this listening thread to stall, so we can pass this work to another background thread and wake it up.

Then the background thread can complete the task and go back to sleep.

Great!

How do I use a futex?

Under the hood a futex is just an integer. This is all the state that is needed for waking up a thread or putting one to sleep.

#include <linux/futex.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <cstddef>
#include <atomic>

std::atomic_int futex_var = 0;

This integer is what is used to check if the thread is ready to wake up.

FUTEX_WAIT → Sleep if value matches expected
FUTEX_WAKE → Wake one or more waiting threads

The sleep/wait function:

void futex_wait(std::atomic_int *addr, int expected) {
    syscall(SYS_futex, addr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

The wake/notify function:

void futex_wake(std::atomic_int *addr, int threads=1) {
    syscall(SYS_futex, addr, FUTEX_WAKE, threads, NULL, NULL, 0);
}

Here threads lets you control how many threads to wake up from the waiting queue.

This syscall call is a generic call to make your kernel do protected work like interacting with hardware devices (like memory or network).

A quick crash course to syscalls

Skip to back to futexes if you already know this... nerd.

A syscall is invoked with syscall(). Standard library calls like read(), open(), and write() used for interacting with file descriptors (files, sockets, pipes) are wrappers for syscall.

The signature for calling it is like this where ... is the arguments taken for that specific syscall defined by the number:

long syscall(long number, ...);

On error, -1 will be returned (and ERRNO is set).

If I want to write "Hello" to your computer terminal via stdout (standard output), you could do something like this.

#include <unistd.h>
#include <sys/syscall.h>

int main() {
    syscall(SYS_write, 1, "Hello\n", 6);
    return 0;
}

Example of using futexes

We are going to try to make locks with these futexes.

A lock just protects a shared resource so that threads take turns accessing it.

We want to avoid corruption if 2 threads overwrite each other's changes in the shared resource.

Here it is. Let's go over it.

struct Lock{
    // 0 - unlocked, 1 - locked
    std::atomic_int state{0};

    void lock(){
        int unlocked = 0;
        while(!state.compare_exchange_strong(unlocked, 1)){
            unlocked = 0;
            futex_wait(&state, 1);
        }
    }

    void unlock(){
        state.store(0);
        futex_wake(&state);
    }
};

Locking

Let's start with the logic in lock:

int unlocked = 0;
while(!state.compare_exchange_strong(unlocked, 1)){
     unlocked = 0;
     futex_wait(&state, 1);
}

Here we are checking if state is 0 (unlocked), and if it is then we perform a successful swap and state will then store 1. This is what we call a CAS (Compare And Swap). This would return true and the lock function will return without entering the loop.

BUTTT, if the state was 1, the compare would fail. We enter the loop. We reset unlocked to 0 (because it gets updated with the last value of state). Then we sleep if the value does in fact match 1 still with futex_wait.

Here we can see how futex_wait comes in handy for a simple lock.

If the thread is sleeping, it must wait for futex_wake in unlock to wake up once again from the top of the waiting queue where it can contend for the lock again.

Unlocking

state.store(0);
futex_wake(&state);

Now let's look at the unlocking logic. We simply set the state to be unlocked, and wake up a thread from the waiting queue from futex_wait to recontend for the lock.

Conditional Variable

For fun, I will implement a conditional variable too with futexes.

A conditional variable is a tool that makes threads sleep when a function/condition returns false. And allows for condition rechecking with notify_one (1 thread) or notify_all (all threads) by waking them up.

This is useful: If a thread is preparing a task and then updates a condition once it is finished, it can notify the background thread to finish the task.

For the noobs who want to skip, go to the conclusion. And for the nerds:

struct CondVar{

    // each seq represents the notify epoch
    // this prevents lost wake up (if a notify was called between !predicate and futex_wait)
    // it prevents the lost wakeup because notify causes seq to increment, so we won't be sleeping anymore
    std::atomic_int seq{0};

    void wait(std::function<bool(void)>& predicate){
        // if !predicate -> futex_wait
        int cur = seq.load();
        while(!predicate()){
            futex_wait(&seq, cur);
            cur = seq.load();
        }
    }

    void notify_one(){
        seq.fetch_add(1);
        futex_wake(&seq);
    }

    void notify_all(){
        seq.fetch_add(1);
        futex_wake(&seq, INT_MAX);
    }
};

There is a common problem that conditional variables try to avoid called lost wakeup.

Lost wakeup is where you check the predicate condition, and suddenly another thread calls notify before you reach futex_wait where you go to sleep and the condition could've been updated to return true.

To avoid this, we use a seq to atomically increase on notifies. This way, we only sleep if the seq is the same AND predicate is false. seq being the same means that no new notifies were seen.

Adding a twist: a timeout

Let's say we don't want to sleep until the condition is true. We want to wake up if it takes too long. We can set a timeout by introducing a wait_for function.

Here is an implmentation:

    void wait_for(std::function<bool(void)>& predicate, std::chrono::nanoseconds ns){
        auto end = std::chrono::high_resolution_clock::now() + ns;
        int cur = seq.load();
        while(!predicate()){
            auto now = std::chrono::high_resolution_clock::now();
            if(now >= end){
                return;
            }
            futex_wait_for(&seq, cur, end-now);
            cur = seq.load();
        }
    }

Here we now pass in a timeout. The structure is mainly the same except now, we are dealing with times. We have to add a condition:

if(now >= end){
    return;
}

This let's us resume a timeout if we get woken up with a notify early.

We also have this interesting line:

futex_wait_for(&seq, cur, end-now);

We are using a new function. This futex_wait_for is the same syscall except that it also takes a timeout parameter too like so:

int futex_wait_for(std::atomic_int* addr, int expected, std::chrono::nanoseconds ns){
    timespec ts;
    ts.tv_sec  = ns.count() / 1'000'000'000;
    ts.tv_nsec = ns.count() % 1'000'000'000;
    return syscall(SYS_futex, reinterpret_cast<int*>(addr), FUTEX_WAIT, expected, &ts, nullptr, 0);
}