Multithreading often confuses beginners, both because of the new concepts it introduces and because threaded functions and applications are harder to write.
Some people argue that multithreading is bad, while others consider it good.
However, multithreading was introduced to let parts of a program run asynchronously, especially long-running tasks that involve input and output.
Knowing what multithreading is, when to use it, and where it applies helps you write code with speed and scalability in mind.
By the end of this tutorial, you will have built a function that runs asynchronously. You will also improve the project from the last tutorial and compare the speed of the two versions.
Thread vs. Threading
Threads are one of the fundamental building blocks of modern computer programs. They are like small, lightweight processes that exist within a larger program and can execute tasks independently. Threads share the same memory space of the program, which means they can access and modify the same data. This allows them to communicate and cooperate easily.
A thread is a subset of a process: it is the smallest unit of execution.
Threading, on the other hand, refers to using threads within a single process.
In simpler terms, threading allows a program to multitask. It assigns different tasks to different workers (threads) from the same group (process), enabling them to work together on those tasks.
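To make the "workers sharing one memory space" idea concrete, here is a minimal sketch. The names results and record_square are made up for illustration; the point is only that every thread writes into the same dictionary.

```python
import threading

# Hypothetical shared data: every thread in the process sees the same dict.
results = {}

def record_square(number):
    # Each worker writes to its own key, so no two workers touch the same entry.
    results[number] = number * number

# Three workers (threads) from the same group (process).
workers = [threading.Thread(target=record_square, args=(n,)) for n in range(1, 4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()  # wait until every worker has finished

print(sorted(results.items()))  # [(1, 1), (2, 4), (3, 9)]
```

Because the workers share memory, nothing has to be copied or sent between them; the main thread simply reads the dictionary once they are done.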
Multithreading vs Concurrency
Multithreading, as the name implies, means running multiple threads within a single process. Threads can be visualized as workers, each responsible for executing specific tasks. Tasks are split into subtasks, and each available thread picks up one of these subtasks. Threads share the same memory space, which makes communication between them efficient.
It works like division of labor: the threads work together to achieve a goal within the program. If one thread is busy when the user issues an instruction, another thread can carry it out. Likewise, if instructions are waiting in a queue, a free thread takes the next one and executes it. Now let's shift our focus to the Global Interpreter Lock (GIL), which plays a vital role in multithreading.
Global Interpreter Lock (GIL)
To keep the interpreter's internal state safe while several threads are running, CPython (the standard Python implementation) uses a GIL. It prevents multiple threads from executing Python code at the same time.
The GIL prevents you from achieving parallelism with multithreading. It allows only one thread to execute at a time, so the program context-switches between threads. It was introduced to ensure thread safety inside the interpreter.
Multithreading uses multiple threads, but the threads do not execute at the same time.
But what is the point if they can't? Isn't that the reason multithreading was introduced? Everything so far has said that each thread executes its own set of instructions, so what is the point now?
This is where concurrency comes into play, and where context switching and multitasking can be fully understood. Let's discuss concurrency.
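One way to see the effect of the GIL for yourself is to time a CPU-bound function run twice in a row and then in two threads. This is only a rough sketch: count_down and the workload size N are made up for illustration, and the exact numbers depend on your machine and Python build.

```python
import threading
import time

def count_down(n):
    # Pure-Python loop: CPU-bound, it never waits on input or output.
    while n > 0:
        n -= 1

N = 2_000_000  # hypothetical workload size, chosen only to keep the demo short

# Run the work twice, one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same work in two threads.
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f'Sequential: {sequential:.2f}s, threaded: {threaded:.2f}s')
```

On a standard CPython build you should generally expect the two timings to be close, because only one thread runs Python code at a time. Builds of Python without the GIL may behave differently.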
What is concurrency?
Concurrency happens when our program gives the illusion of executing multiple tasks at the same time.
In concurrency, tasks are broken down into subtasks.
For example, imagine an I/O-bound task. While the program is waiting for the user to click a button or submit a form, it does not sit idle; it goes ahead and executes other tasks. The program is switching between threads, which gives the illusion that they run simultaneously.
Picture one thread in your mind, then picture several. If it still isn't clear, reread the section, draw diagrams, or check the diagram below.
Concurrency can be observed when we open multiple tabs in a browser or use a graphical user interface for several things at once. In reality, it is context switching.
In concurrency, several threads can rely on a single process.
Some tutorials use CPU, core, and process interchangeably. Don't be confused. A CPU houses one or more cores, the cores are the physical processing units, and a process is like a container for one or more threads.
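If the difference between a process and its threads still feels abstract, a small sketch like the one below may help. The names report and thread_info are made up for illustration. It records the process ID and the thread ID seen by each thread.

```python
import os
import threading

barrier = threading.Barrier(3)  # keeps all three threads alive at the same moment
thread_info = []

def report():
    barrier.wait()  # wait here until all three threads have started
    # os.getpid() identifies the process; threading.get_ident() identifies the thread.
    thread_info.append((os.getpid(), threading.get_ident()))

workers = [threading.Thread(target=report) for _ in range(3)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

# os.cpu_count() reports logical CPUs, which can be more than the physical cores.
print('Logical CPUs:', os.cpu_count())
for pid, ident in thread_info:
    print(f'process {pid}, thread {ident}')
```

Every line printed should show the same process ID with a different thread ID: one container, several workers inside it.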
I/O bound tasks
We mentioned I/O-bound tasks when discussing concurrency.
What are I/O-bound tasks?
I/O-bound tasks spend most of their time waiting for input/output operations to complete.
Concurrency (e.g., multithreading) is often used to improve efficiency by allowing the CPU to switch to other tasks during I/O waits. Examples of I/O-bound tasks are reading data from a file, making network requests, and interacting with a database.
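Here is a rough sketch of what "switching to other tasks during I/O waits" buys you. No real network or disk is involved: fake_download is a made-up function, and time.sleep() stands in for the waiting.

```python
import threading
import time

def fake_download(name, delay, log):
    # time.sleep() stands in for waiting on a network or disk (hypothetical task).
    time.sleep(delay)
    log.append(name)

log = []
start = time.perf_counter()

# Three "downloads" that each wait 0.3 seconds.
tasks = [
    threading.Thread(target=fake_download, args=(f'file-{i}', 0.3, log))
    for i in range(3)
]
for task in tasks:
    task.start()
for task in tasks:
    task.join()

elapsed = time.perf_counter() - start
print(f'{len(log)} downloads finished in {elapsed:.2f}s')
```

Run one after another, the three waits would add up to about 0.9 seconds. With threads the waits overlap, so the total should be much closer to a single wait.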
Use cases of Multithreading
- It is used in GUI applications.
- It is used in game development.
- It is used in web servers and real-time systems.
Implementation of Threading
Prerequisites
I assume you are a beginner and have read the previous article that introduced you to asynchronous applications.
- Basic knowledge of Python including knowing how to use functions and modules.
- Python installed.
- A code editor.
- Willingness to learn and explore.
In Python, there are two common ways to use threads:
- Using the ‘threading’ module
- Using ‘ThreadPoolExecutor’ from the ‘concurrent.futures’ module
Using the ‘threading’ module
Remember that in the previous article we wrote a function that prints numbers 1-10 sequentially. In this article we will improve on it by using threading.
Below is the result from the last article. You should have something that is fairly similar.
import time

def print_numbers_sequentially(numbers):
    start = time.perf_counter()
    for number in range(1, numbers):
        if number in [8, 12]:
            print(f'Sleeping {number} second(s)...')
            time.sleep(number)  # simulate a slow, blocking task
            print(f'Done sleeping...{number}')
        else:
            print(number)
    end = time.perf_counter()
    elapsed_time = end - start
    return elapsed_time

elapsed_time = print_numbers_sequentially(31)
print(f'Elapsed time: {elapsed_time} seconds')
This was my result:
‘Elapsed time: 20.0994832 seconds’
Now, let's improve on it.
import time
import threading
start = time.perf_counter()
Step 1.
Import the necessary modules: The code starts by importing the necessary modules. The threading module provides tools for working with threads.
Step 2.
The time.perf_counter() function reads a high-resolution clock, and the value is stored in the start variable so the elapsed time can be calculated later.
def print_numbers_async(seconds):
    if seconds in [8, 12]:
        print(f'Sleeping {seconds} second(s)...')
        time.sleep(seconds)
        print(f'Done sleeping...{seconds}')
    else:
        print(seconds)
Step 3.
The print_numbers_async function: It takes a single argument, seconds. If the value of seconds is 8 or 12, the function prints a message, sleeps for the specified number of seconds, and then prints another message indicating that it’s done sleeping. Otherwise, it simply prints the value of seconds.
threads = []
for num in range(1, 31):
    t = threading.Thread(target=print_numbers_async, args=[num])
    # args is used if the fn has an argument
    t.start()
    threads.append(t)
Step 4.
Thread creation: A loop is used to create and start multiple threads. The loop iterates over the numbers 1 to 30. For each number, a new thread t is created using threading.Thread(). The target parameter specifies the function that the thread will execute, which is print_numbers_async in this case. The args parameter provides the argument to pass to the function, in this case the current value of num.
Step 5.
Thread starting: After creating the thread, the start() method is called on it. This initiates the execution of the print_numbers_async function in a separate thread. The thread object is then added to the threads list.
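The target and args parameters described above are easier to remember once you have seen them next to kwargs. The sketch below uses a made-up greet function; args takes positional arguments and kwargs takes keyword arguments.

```python
import threading

messages = []

def greet(name, punctuation='.'):
    # Hypothetical target function with one positional and one keyword argument.
    messages.append(f'Hello, {name}{punctuation}')

# args must be a sequence, so a single argument needs the trailing comma: ('thread',)
t = threading.Thread(target=greet, args=('thread',), kwargs={'punctuation': '!'})
t.start()
t.join()

print(messages)  # ['Hello, thread!']
```

A list such as args=[num] works too, as in the tutorial code; a tuple is just the more common spelling.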
for thread in threads:
    thread.join()
Step 6.
Thread joining: Another loop iterates over the list of threads created earlier. For each thread in the list, the join() method is called. This tells the main program to wait until the thread has finished its execution before proceeding, which ensures that all threads have completed their work before the program continues.
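If you want to watch join() do its job, a small sketch like this can help. The short_task function is made up for illustration; is_alive() tells you whether a thread is still running.

```python
import threading
import time

def short_task():
    time.sleep(0.2)  # pretend to do some work

t = threading.Thread(target=short_task)
t.start()

alive_before = t.is_alive()  # the thread is still sleeping at this point
t.join()                     # block until short_task has returned
alive_after = t.is_alive()   # the thread has finished

print(f'Before join: {alive_before}, after join: {alive_after}')
```

Without the join() call, the main program would carry on immediately and could measure the elapsed time before the threads had finished.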
end = time.perf_counter()
elapsed_time = end - start
print(f'Finished in {elapsed_time} second(s)')
Step 7.
After all the threads have finished their work, the time.perf_counter() function is called again and its value is stored in the end variable.
Step 8.
Printing elapsed time: The difference between end and start represents the elapsed time of the entire program’s execution.
And now we are done. Easy, right? It may have taken some work to absorb the whole concept, but don’t worry: reread it and keep practicing, and in no time you will get used to it.
The final code should now look as follows:
import time
import threading

start = time.perf_counter()

def print_numbers_async(seconds):
    if seconds in [8, 12]:
        print(f'Sleeping {seconds} second(s)...')
        time.sleep(seconds)
        print(f'Done sleeping...{seconds}')
    else:
        print(seconds)

threads = []
for num in range(1, 31):
    t = threading.Thread(target=print_numbers_async, args=[num])  # args is used if the fn has an argument
    t.start()
    threads.append(t)

for thread in threads:
    thread.join()  # waiting for the thread to finish

end = time.perf_counter()
elapsed_time = end - start
print(f'Finished in {elapsed_time} second(s)')
This was my result:
‘Finished in 12.0270528 second(s)’
Using ‘ThreadPoolExecutor’ from the ‘concurrent.futures’ module
Step 1.
Import modules: The code starts by importing the concurrent.futures module, which provides a high-level interface for asynchronously executing functions in a thread or process pool.
Step 2.
‘ThreadPoolExecutor’: A ThreadPoolExecutor is created using the with statement. The executor manages a pool of worker threads. Check Resources below to learn more about it.
Step 3.
Execution using executor.map(): The executor.map() function is used to schedule the print_numbers_async function to be executed with each value in secs, a range from 1 to 30. This means that the function will be executed concurrently for each value in the range.
Learn more about maps.
import concurrent.futures
import time

start = time.perf_counter()

def print_numbers_async(seconds):
    if seconds == 8 or seconds == 12:
        print(f'Sleeping {seconds} second(s)...')
        time.sleep(seconds)
        print(f'Done sleeping...{seconds}')
    else:
        print(seconds)

# Leaving the with block waits for all submitted tasks to finish.
with concurrent.futures.ThreadPoolExecutor() as executor:
    secs = range(1, 31)
    results = executor.map(print_numbers_async, secs)

end = time.perf_counter()
elapsed_time = end - start
print(f'Finished in {round(elapsed_time, 2)} second(s)')
This was my result. You should see something similar.
‘Finished in 12.06 second(s)’
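The function above only prints, so the value returned by executor.map() goes unused. As a small sketch of what you can do with it, the made-up slow_square function below returns a value, and the results come back in the same order as the inputs even though the later inputs finish first.

```python
import concurrent.futures
import time

def slow_square(number):
    # Hypothetical task: larger inputs sleep less, so they finish sooner.
    time.sleep(0.05 * (4 - number))
    return number * number

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # map() yields results in input order, not in completion order.
    squares = list(executor.map(slow_square, [1, 2, 3]))

print(squares)  # [1, 4, 9]
```

This is one reason the executor is often more convenient than managing Thread objects by hand: you get return values back without needing a shared list or dictionary.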
And there it is: we have implemented threading. But wait, this doesn’t seem like something you would do in real life. Check out this GitHub repo to learn more and practice using it in a real-life scenario.
Conclusion
No invention suits everyone. Everyone has preferences, so you need to learn and experiment to find out what works best for you when writing asynchronous applications.
In programming, multithreading offers a powerful tool for running tasks concurrently. While it might seem complex at first, mastering the concepts of threads and concurrency can greatly enhance your ability to create efficient and responsive applications. By understanding when and how to utilize multithreading, you can optimize your code's performance and provide a better user experience. As you delve into asynchronous programming, remember that practice and exploration are key to harnessing the true potential of multithreading.