Praise Idowu
How to build Asynchronous applications in Python: Exploring Multiprocessing

Multithreading and multiprocessing are both used to build asynchronous applications, but many developers are unsure how they differ or when to choose one over the other. This article explores multiprocessing. Picking up where the last article left off, we will write a function that runs across multiple processes, measure its speed, and see when multiprocessing is the right choice. Let's begin!

Prerequisites

  • You have read the previous article on Multithreading.
  • You have Python installed.
  • You have a code editor ready.

Before you get started, check how many logical CPUs (cores) your machine has. Open your code editor and run the following program.

import multiprocessing
print("Number of CPUs:", multiprocessing.cpu_count())

Thread vs. Process

A thread is the smallest unit of execution. It is not a subprocess: it always runs inside a process. Every process has a main thread and can start additional threads.

An image showing the main thread and several threads

A process is an instance of a computer program being executed. It has its own memory space and is independent of other processes.

An image showing a main process and several processes

To understand this, imagine a process as a container. The threads inside it share the resources that the container (the program) has. So, a process is the entire running program, while a thread is one path of execution within that program.
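The container idea can be checked with a small experiment. This is a minimal sketch, not part of the original series: the list `items` and the helper names `add_item`, `run_in_thread`, and `run_in_process` are made up for illustration. A thread changes a list that the main program can see, while a separate process changes only its own copy.

```python
import multiprocessing
import threading

# Hypothetical shared state, used only for this illustration.
items = []


def add_item(value):
    """Append a value to the module-level list."""
    items.append(value)


def run_in_thread(value):
    """Run add_item in a second thread of the current process."""
    worker = threading.Thread(target=add_item, args=(value,))
    worker.start()
    worker.join()


def run_in_process(value):
    """Run add_item in a separate process, which has its own memory space."""
    worker = multiprocessing.Process(target=add_item, args=(value,))
    worker.start()
    worker.join()


if __name__ == '__main__':
    run_in_thread('from thread')
    print('After thread:', items)   # the thread's change is visible here

    run_in_process('from process')
    print('After process:', items)  # the child's change stayed in the child
```

Both print statements show a list containing only 'from thread'. The child process appended to its own copy of `items`, so the parent never sees 'from process'.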

Multiprocessing

Multiprocessing, as the name implies, means running multiple processes. In simpler terms, it means running several independent processes, each with its own Python interpreter and memory space, which the operating system can schedule on different CPU cores.
For example, right now, you might be reading this article in your browser and playing music on Spotify. These are unrelated processes and are independent of each other. When you run multiple apps, think of them as multiple processes.

Multiprocessing allows parallel execution of tasks by creating separate processes, each with its own Python interpreter. This approach is useful for CPU-bound tasks that can benefit from utilizing multiple CPU cores.

In summary, multiprocessing speeds up work that can be split into independent tasks. It is a way to achieve parallel computing by using more than one CPU core at a time.

To better understand Multiprocessing, let’s delve into the concept of parallelism.

Parallelism

Parallelism means doing things at the same time. Tasks execute simultaneously, each on its own CPU core, instead of taking turns on one core. The program hands tasks to worker processes, and the operating system runs those processes on the available cores. In Python, multiprocessing is how a program achieves parallelism.

An image visualizing parallelism

To deepen our understanding of parallelism, let’s delve into CPU-intensive tasks and how to identify them in a program.

CPU-bound Task

These are tasks whose run time is limited by how fast the CPU can compute, rather than by waiting on input/output. They involve heavy computations and complex calculations. Examples of CPU-bound tasks are intensive mathematical calculations, data processing, and training machine learning models.
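One rough way to identify a CPU-bound task is to compare the CPU time a call uses with the wall-clock time it takes. The sketch below is an illustration under stated assumptions: `cpu_bound_task`, `io_like_task`, and `cpu_share` are hypothetical helpers, and `time.sleep` stands in for waiting on a disk or network. A share close to 1 suggests the task is CPU-bound, while a share close to 0 suggests it spends most of its time waiting.

```python
import time


def cpu_bound_task(n):
    """Sum of squares: keeps the CPU busy for the whole call."""
    return sum(i * i for i in range(n))


def io_like_task(seconds):
    """Sleeping stands in for waiting on a disk or network."""
    time.sleep(seconds)


def cpu_share(func, *args):
    """Return CPU time divided by wall-clock time for one call of func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used


if __name__ == '__main__':
    print(f'CPU-bound share: {cpu_share(cpu_bound_task, 2_000_000):.2f}')
    print(f'Waiting share:   {cpu_share(io_like_task, 0.5):.2f}')
```

The exact numbers depend on the machine and on what else is running, so treat them as a hint rather than a precise measurement.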

An image visualizing multiprocessing

Global Interpreter Lock

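The global interpreter lock (GIL) allows only one thread at a time to execute Python bytecode within a single interpreter. Multiprocessing sidesteps this limit because each process has its own interpreter and its own GIL, so the processes can run in parallel and use the CPU fully. The shared-memory problems we saw with threads mostly disappear as well: processes do not write to the same memory by default, so race conditions arise only when you explicitly share state between them.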
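To see the effect of the GIL, you can run the same CPU-bound function through a thread pool and a process pool and compare the timings. This is a minimal sketch, assuming a standard CPython build with the GIL enabled and a machine with more than one core; `count_primes`, `timed_run`, and the workload size are illustrative choices, not part of the original series.

```python
import concurrent.futures
import time


def count_primes(limit):
    """Count primes below limit by trial division (deliberately slow)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_run(executor_class, limits):
    """Run count_primes over limits with the given pool; return (results, seconds)."""
    start = time.perf_counter()
    with executor_class() as executor:
        results = list(executor.map(count_primes, limits))
    return results, time.perf_counter() - start


if __name__ == '__main__':
    limits = [200_000] * 4  # hypothetical workload: four equal CPU-bound tasks

    for pool in (concurrent.futures.ThreadPoolExecutor,
                 concurrent.futures.ProcessPoolExecutor):
        results, seconds = timed_run(pool, limits)
        print(f'{pool.__name__}: {results} in {seconds:.2f} second(s)')
```

Under those assumptions, the process pool usually finishes noticeably faster, because the threads take turns holding the GIL while the processes compute at the same time. The exact gap depends on your core count.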

Multithreading and Multiprocessing

Let’s discuss the similarities and differences between the two.

  • Multiprocessing uses parallelism and multiple processes, while Multithreading uses concurrency and multiple threads.
  • Understanding Multiprocessing requires a closer look at CPUs, cores, and processes than Multithreading does.
  • The operating system schedules both processes and threads, but in Multithreading the global interpreter lock lets only one thread execute Python code at a time.
  • In Multiprocessing, multiple workers (processes) each execute their own tasks independently, whereas in Multithreading, multiple workers (threads) handle tasks inside a single process.
  • A process is independent of other processes, while a thread depends on the process it belongs to.
  • A process has its own memory space and Python interpreter, whereas a thread shares the same memory space with the other threads in its process.
  • Multiprocessing is used for tasks that rely heavily on the CPU. Multithreading is used for tasks that spend most of their time waiting for input/output operations to complete.
  • Race conditions and deadlocks can occur in Multithreading, but they are less common in Multiprocessing because processes do not share memory by default.

Implementation

ProcessPoolExecutor

I won’t explain each step again, as that would be repetition. Everything is similar to multithreading, except that we use ProcessPoolExecutor() instead.

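import concurrent.futures
import time


def do_something(seconds):
    # Only 8 and 12 simulate long-running work; every other value prints at once.
    if seconds in (8, 12):
        print(f'Sleeping {seconds} second(s)...')
        time.sleep(seconds)
        print(f'Done sleeping...{seconds}')
    else:
        print(seconds)


if __name__ == '__main__':
    # This guard is required: worker processes may re-import this module on start.
    start = time.perf_counter()

    # By default the pool creates one worker process per available CPU.
    with concurrent.futures.ProcessPoolExecutor() as executor:
        secs = range(30)
        # Leaving the with block waits for every submitted task to finish.
        executor.map(do_something, secs)

    end = time.perf_counter()
    elapsed_time = end - start
    print(f'Finished in {round(elapsed_time, 2)} second(s)')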

This was my result.
Finished in 12.82 second(s)

You should get a similar result. I ran the program multiple times and obtained different values, but they were all in the same range, with just a slight difference after the decimal point. The total is close to 12 seconds because the 8-second and 12-second sleeps run in different worker processes at the same time, so the longest task sets the overall run time (assuming the pool has at least two workers). Now, let’s focus on something important. Remember, each process is independent and has at least one thread of its own. Across processes, the work runs in parallel; within each worker process, the tasks assigned to it run one after another.

The program is a bit slower than the multithreading version, but the difference is minimal. The extra time comes from starting separate processes, each with its own interpreter, and from passing arguments and results between them.

In summary, multiple workers (processes) handle the printing of the numbers in the example code above.

However, this isn’t something we would typically do in the real world, because printing a handful of numbers doesn’t need speeding up. We only used it for experimentation. When you need to perform complex calculations or work with large datasets, as in data science or training machine learning models, leverage the power of parallelism and multiprocessing to speed things up.
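As a rough sketch of what a more realistic workload might look like, the example below splits a large calculation into chunks and lets a pool of processes compute the partial results. The function names (`sum_of_squares`, `make_chunks`, `parallel_sum_of_squares`) and the sizes are assumptions made for illustration; a real dataset would replace the simple range of numbers used here.

```python
import concurrent.futures


def sum_of_squares(bounds):
    """Return the sum of i * i for i from start up to (not including) stop."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def make_chunks(total, chunk_size):
    """Split range(total) into (start, stop) pairs of at most chunk_size items."""
    return [(start, min(start + chunk_size, total))
            for start in range(0, total, chunk_size)]


def parallel_sum_of_squares(total, chunk_size):
    """Compute the sum of squares below total using a pool of processes."""
    chunks = make_chunks(total, chunk_size)
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # map yields the partial sums in the same order as the chunks.
        return sum(executor.map(sum_of_squares, chunks))


if __name__ == '__main__':
    total = 10_000_000  # hypothetical dataset size
    print(parallel_sum_of_squares(total, chunk_size=1_000_000))
```

Each chunk is large enough that the computation outweighs the cost of sending it to a worker. With very small chunks, that overhead can make the parallel version slower than a plain loop.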

Conclusion

We can draw the following conclusion from this discussion:

  • A process is an instance of a computer program.
  • A process contains one main thread and can start additional threads.
  • A process is independent of other processes.
  • Multiprocessing is achieved by executing multiple processes.
  • Multiprocessing helps us achieve parallelism.
  • We use multiprocessing for tasks that rely heavily on the CPU.

Top comments (4)

Nitin Surya

For I/o or network intensive tasks, async/await is also another paradigm that can be used to speed up applications.

Praise Idowu

Yeah, I agree with you

sc0v0ne

Very good !!!

Praise Idowu

Thanks for reading