Multi-Threading in Python: Deep-dive

#beginners #webdev #python

One universal truth about CPython: one thread runs Python code, while the other N threads sleep or wait for I/O.

The above holds true thanks to the Global Interpreter Lock (GIL). I won't dig deep into the GIL here, as that would take us off track from the current discussion.
So if anybody tells you that two or more threads can execute Python code simultaneously in your CPython program, it's time to call their bluff. That doesn't mean Python isn't a multi-threaded programming language; it's just that only one thread can execute Python code at a time. Then comes the logical question: what's the use of such multi-threading? We'll get to the answer in a bit.

Python supports 2 kinds of Multi-Tasking:

Co-operative
Preemptive

Co-operative Multi-Tasking in Python:

When a Python thread is about to do something that might take a while, such as an I/O or network operation, it voluntarily drops the GIL. In other words, the thread knows that it won't be running any Python code for a while and co-operates with other threads by voluntarily giving up the GIL.

Further explained through the following example:

import socket
import threading

messages = []

def socket_connect():
    # This is an example of cooperative multi-tasking
    s = socket.socket()
    messages.append('connecting')
    s.connect(('python.org', 80))
    # Python thread drops the GIL while doing the socket connect operation.
    messages.append('connected')

threads = []

for i in range(3):
    t = threading.Thread(target=socket_connect)
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print('\n'.join(messages))

Output:

connecting
connecting
connecting
connected
connected
connected

So why do we get 3 'connecting' strings followed by 3 'connected' strings instead of 3 pairs of 'connecting connected'?

The answer lies in the cooperative multi-tasking performed by the Python threads.

Let's dig a bit deeper into what happens when we write the following line of code:

s.connect(('python.org', 80))

Internally, it calls the internal_connect function, which is written in C (https://github.com/python/cpython/blob/main/Modules/socketmodule.c) and contains the following:

Py_BEGIN_ALLOW_THREADS
res = connect(s->sock_fd, addr, addrlen);
Py_END_ALLOW_THREADS

The above follows a common pattern for releasing and re-acquiring the GIL:

Save the thread state in a local variable.
Release the global interpreter lock.
Do some blocking I/O operation.
Reacquire the global interpreter lock.
Restore the thread state from the local variable.

Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS are two macros used to simplify the above steps.
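The release and re-acquire pattern can be imitated in pure Python to get a feel for it. The following is only a toy sketch: toy_gil is an ordinary threading.Lock standing in for the GIL, allow_threads is a hypothetical helper (neither exists in CPython), time.sleep stands in for the blocking connect call, and the thread-state save and restore steps are not modelled at all.

```python
import threading
import time
from contextlib import contextmanager

# Hypothetical stand-in for the GIL; the real one lives inside the interpreter.
toy_gil = threading.Lock()
events = []

@contextmanager
def allow_threads():
    """Toy analogue of Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS."""
    toy_gil.release()          # let other workers take the lock
    try:
        yield                  # the blocking operation runs here
    finally:
        toy_gil.acquire()      # take the lock back before continuing

def worker(name):
    with toy_gil:                      # only one worker holds the lock at a time
        events.append(f'{name} start')
        with allow_threads():
            time.sleep(0.2)            # stands in for a blocking connect()
        events.append(f'{name} end')

workers = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print('\n'.join(events))
```

Because each worker gives up the lock while it sleeps, the other workers usually get to record their own start before any of them records an end, which mirrors the socket example.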

So we can see that the GIL is released just before connect is called, and it is then acquired by another thread, which in turn releases it at the same point in the code. By the time the 3rd thread has released the GIL, the 1st thread might have successfully established its socket connection. If so, the 1st thread is now ready to re-acquire the GIL and continue executing Python code.

To see what difference it makes when there is no network operation, let's run the following code:

import threading

messages = []

def calculate_squares():
    messages.append('calculating')
    for i in range(5):
        i*i
    messages.append('calculated')

threads = []

for i in range(3):
    t = threading.Thread(target=calculate_squares)
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print('\n'.join(messages))

Output:

calculating
calculated
calculating
calculated
calculating
calculated

As expected, since no network operation is involved, the GIL isn't released co-operatively, and each thread's tiny loop finishes before the next thread gets a chance to run.

Pre-emptive Multi-Tasking in Python:

When a thread has held the GIL for a certain amount of time (5 ms by default) and another thread is waiting for it, the interpreter steps in to force the current thread to drop the GIL and pass it on to another thread for execution. Here, the verb preempt means to halt the execution of a task with a view to resuming it later.

The same can be understood better in the following example:

import threading

messages = []

def run(thread_name):
    for i in range(50):
        for j in range(50):
            pass
        messages.append(f'Thread: {thread_name}')

threads = []

for i in range(2):
    t = threading.Thread(target=run, args=(i,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print('\n'.join(messages))

The output of the above code will mostly be 50 occurrences of 'Thread: 0' followed by 50 occurrences of 'Thread: 1'. However, if you change both ranges inside the run function to 5000, you'll notice runs of 'Thread: 0' interleaved with runs of 'Thread: 1'. In short, the GIL is being switched pre-emptively at a default interval.
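Scrolling through thousands of lines to spot the interleaving gets tedious, so one option is to count how often adjacent entries differ. The sketch below uses a hypothetical helper, count_switches (my own name, not part of any library), and runs it on two small hand-written lists rather than on real thread output.

```python
def count_switches(entries):
    """Return how many adjacent pairs of entries differ."""
    return sum(1 for prev, cur in zip(entries, entries[1:]) if prev != cur)

# Hand-written samples: one thread finishing before the other starts,
# versus the two threads taking turns.
back_to_back = ['Thread: 0'] * 3 + ['Thread: 1'] * 3
interleaved = ['Thread: 0', 'Thread: 1', 'Thread: 0',
               'Thread: 0', 'Thread: 1', 'Thread: 1']

print(count_switches(back_to_back))  # 1
print(count_switches(interleaved))   # 3
```

Passing the messages list from the example to this helper should give a small number when the threads run back to back and a larger one when the GIL is being handed over mid-run; the exact count varies from run to run.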

The following code can tell us what the default interval for switching the GIL between threads is:

import sys

print(sys.getswitchinterval())

In Python 3.10 the default is 5 ms, printed as 0.005 because the value is in seconds (https://github.com/python/cpython/blob/main/Python/ceval_gil.h).
However, it can be changed as follows:

import sys

sys.setswitchinterval(0.0001)

The above sets the switch interval to 0.1 ms.
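Since the switch interval is a process-wide setting, it is probably wise to remember the old value and put it back once an experiment is over. A minimal sketch of that, where the variable names are my own:

```python
import sys

original = sys.getswitchinterval()   # in seconds; 0.005 unless changed earlier

sys.setswitchinterval(0.0001)        # ask for a switch roughly every 0.1 ms
shortened = sys.getswitchinterval()

sys.setswitchinterval(original)      # put the previous value back
restored = sys.getswitchinterval()

print(original, shortened, restored)
```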

Let's say that there are 2 threads: Thread A and Thread B. If Thread A holds a lock and gets preempted, then Thread B might run instead of Thread A.
However, if Thread B is waiting for the lock that Thread A is holding, then Thread B is not waiting for the GIL. In that case, Thread A reacquires the GIL immediately after dropping it and continues.
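The scenario can be sketched with an ordinary threading.Lock. In the code below, the names thread_a, thread_b and a_has_lock are made up for illustration, and the loop size is just an assumption chosen so that Thread A runs for much longer than one switch interval.

```python
import threading

lock = threading.Lock()
a_has_lock = threading.Event()
events = []

def thread_a():
    with lock:
        a_has_lock.set()              # tell Thread B the lock is now taken
        total = 0
        for i in range(2_000_000):    # CPU-bound work, far longer than 5 ms
            total += i
        events.append('A done')

def thread_b():
    a_has_lock.wait()                 # make sure Thread A grabbed the lock first
    with lock:                        # blocks here until Thread A releases it
        events.append('B got lock')

b = threading.Thread(target=thread_b)
a = threading.Thread(target=thread_a)
b.start()
a.start()
a.join()
b.join()

print(events)  # ['A done', 'B got lock']
```

However many times Thread A is asked to drop the GIL during its loop, Thread B cannot record anything until Thread A leaves the with block, so the order of the two entries is always the same.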

Purpose:

The purpose of co-operative multi-tasking in Python is to finish tasks faster when they're waiting for I/O.
The purpose of pre-emptive multi-tasking isn't to go faster but rather to simulate parallelism. So instead of one thread running to the end, many threads appear to run in parallel.
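The first point can be checked with a rough timing comparison. In this sketch, fake_io is a hypothetical function and time.sleep stands in for a blocking I/O call (sleep releases the GIL while it waits); the exact timings will differ from machine to machine.

```python
import threading
import time

def fake_io():
    time.sleep(0.2)   # stands in for blocking I/O; the GIL is released here

# Run the three waits one after another.
start = time.perf_counter()
for _ in range(3):
    fake_io()
sequential = time.perf_counter() - start

# Run the same three waits in separate threads.
start = time.perf_counter()
threads = [threading.Thread(target=fake_io) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f'sequential: {sequential:.2f}s, threaded: {threaded:.2f}s')
```

The sequential run has to sit through all three waits back to back, while the threaded run overlaps them, so it should finish in roughly the time of a single wait.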

Inspired by: https://opensource.com/article/17/4/grok-gil#:~:text=By%20default%20the%20check%20interval,of%20bytecodes%2C%20but%2015%20milliseconds.