In this article, I will walk through a practical example of how multithreading works in Python and talk about threads, synchronization primitives, and why they are needed.
Initially, I planned for this to be a simple and short note, but while preparing and testing the code I stumbled upon an interesting, undefined behaviour related to the internals of CPython, so don't close the tab even if you are sure you know everything about threads in Python :)
Show me your code
Let's imagine that we need a counter in our program. Nothing complicated, it would seem:
class Counter:
    def __init__(self):
        self.val = 0

    def change(self):
        self.val += 1
We plan to change the counter from independent threads, each thread changes the counter value X times.
import threading


def work(counter, operationsCount):
    for _ in range(operationsCount):
        counter.change()


def run_threads(counter, threadsCount, operationsPerThreadCount):
    threads = []
    for _ in range(threadsCount):
        t = threading.Thread(
            target=work,
            args=(counter, operationsPerThreadCount),
        )
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
if __name__ == "__main__":
    threadsCount = 10
    operationsPerThreadCount = 1000000
    expectedCounterValue = threadsCount * operationsPerThreadCount

    counters = [Counter()]
    for counter in counters:
        run_threads(counter, threadsCount, operationsPerThreadCount)
        print(f"{counter.__class__.__name__}: expected val: {expectedCounterValue}, actual val: {counter.val}")
Question: what counter value will the program display?
Answer: The result of the program depends on the version of Python on which the script was run.
When I launched this program for the first time, I was slightly taken aback by the results: I was 100% sure that I would see the opposite result in the console. The result of running the script on Python 3.11.5:
Counter: expected val: 10000000, actual val: 10000000
CPython managed to make the inherently unsafe increment operation atomic by default.
How did it do that? Let's figure it out.
Testing on other versions of Python
Before diving into the details of the standard library implementation and the runtime internals, I decided to check the program's behavior on other versions of the language. The pyenv utility helped me a lot with this.
A script that automates our test on different versions of Python:
#!/bin/bash
# assumes pyenv's shell integration is initialised (eval "$(pyenv init -)")
array=(3.7 3.8 3.9 3.10 3.11)
for version in ${array[*]}
do
    pyenv shell $version
    python3 --version
    python3 main.py
    echo ''
done
Results:
Python 3.7.17
Counter: expected val: 10000000, actual val: 4198551
Python 3.8.18
Counter: expected val: 10000000, actual val: 4999351
Python 3.9.18
Counter: expected val: 10000000, actual val: 3551269
Python 3.10.13
Counter: expected val: 10000000, actual val: 10000000
Python 3.11.5
Counter: expected val: 10000000, actual val: 10000000
Why does the counter have the expected value on some versions of Python, but not on others? It's all because of a race condition.
Race condition using the example of the increment operation
Why is there a race condition with our counter? The thing is that the increment operation consists of several steps:
- read value (currVal = self.val)
- increase (newVal = currVal + 1)
- write new value (self.val = newVal)
A context switch between threads can occur after step 1 or step 2, which means that by the time a thread executes step 3 it may be working with stale data.
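You can see this split for yourself with the standard dis module. Here is a small sketch (the exact opcode names differ between versions, but the read/modify/write structure is always visible):

import dis


class Counter:
    def __init__(self):
        self.val = 0

    def change(self):
        self.val += 1


# Prints the bytecode of change(). Depending on the Python version you will
# see something like LOAD_ATTR (read), an add instruction such as INPLACE_ADD
# or BINARY_OP (increase), and STORE_ATTR (write) as separate instructions.
dis.dis(Counter.change)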
Do we need mutexes in Python?
Can we conclude that Python 3.10 got rid of the race condition and that we no longer need synchronization primitives? Not quite :)
After doing a little research, I found this commit and a tweet from a Python core developer.
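As a side note, if you want to play with thread switching yourself, CPython exposes the GIL switch interval through the sys module. A small sketch (the tiny interval is just an aggressive example, not a recommendation):

import sys

# By default a thread holding the GIL is asked to release it roughly
# every 5 milliseconds.
print(sys.getswitchinterval())  # usually 0.005

# Forcing much more frequent switches makes races easier to observe in code
# where the interpreter can actually switch mid-operation.
sys.setswitchinterval(1e-6)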
Let's go deeper
Let's consider an alternative implementation of the counter, which differs from the usual one in one line:
class CounterWithConversion:
    def __init__(self):
        self.val = 0

    def change(self):
        self.val += int(1)  # magic here
And let's run the tests:
Python 3.7.17
CounterWithConversion: expected val: 10000000, actual val: 1960102
Python 3.8.18
CounterWithConversion: expected val: 10000000, actual val: 2860607
Python 3.9.18
CounterWithConversion: expected val: 10000000, actual val: 2558964
Python 3.10.13
CounterWithConversion: expected val: 10000000, actual val: 3387681
Python 3.11.5
CounterWithConversion: expected val: 10000000, actual val: 2310891
We see that such code breaks thread safety even on the latest versions of Python.
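To get a feel for why, we can disassemble this version too. The int(1) call shows up as extra bytecode between reading self.val and writing it back, and that call appears to be exactly the kind of point where the interpreter is still willing to switch threads. A sketch for comparison:

import dis


class CounterWithConversion:
    def __init__(self):
        self.val = 0

    def change(self):
        self.val += int(1)  # magic here


# Compared to the plain `self.val += 1`, the bytecode now contains extra
# instructions for looking up and calling int() (e.g. LOAD_GLOBAL and
# CALL_FUNCTION / CALL) between the read of self.val and the write back.
dis.dis(CounterWithConversion.change)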
Synchronization is needed anyway
We tried different implementations and different versions of Python, and each combination had its own problems. So, to be safe, we need to add explicit synchronization and get rid of the data race:
class ThreadSafeCounter:
    def __init__(self):
        self.val = 0
        self.lock = threading.Lock()

    def change(self):
        with self.lock:
            self.val += 1
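The with statement here is just a convenient way to acquire the lock and guarantee that it is released. If you prefer to see it spelled out, the same change() can be written explicitly (a sketch of the equivalent code):

import threading


class ThreadSafeCounter:
    def __init__(self):
        self.val = 0
        self.lock = threading.Lock()

    def change(self):
        # Equivalent to `with self.lock:` above. acquire() blocks until the
        # lock is free, and finally guarantees the release even if the
        # increment raises.
        self.lock.acquire()
        try:
            self.val += 1
        finally:
            self.lock.release()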
Results - No surprises this time :)
Python 3.7.17
ThreadSafeCounter: expected val: 1000000, actual val: 1000000
Python 3.8.18
ThreadSafeCounter: expected val: 1000000, actual val: 1000000
Python 3.9.18
ThreadSafeCounter: expected val: 1000000, actual val: 1000000
Python 3.10.13
ThreadSafeCounter: expected val: 1000000, actual val: 1000000
Python 3.11.5
ThreadSafeCounter: expected val: 1000000, actual val: 1000000
Conclusion
In this article, I tried to use a simple example to show how threads work, what a race condition is, and how synchronization helps to avoid it, and I also talked about an interesting bug/feature that I discovered while writing the article.
If you want to experiment on your own, I've published all the code from the article on GitHub.
Thank you for reading to the end, I hope you found it interesting!