Samuel Abolo


Multiprocessing in Python (Part 2)

FACE DETECTION WITH PYTHON AND MULTIPROCESSING

In this post, we will write a simple script to illustrate how handy multiprocessing and multithreading can be in real-world situations.
The program loads images from our computer, detects all human faces in them, draws rectangles around the faces, and uploads the results to a server.

Tools

  1. Python
  2. OpenCV
  3. Flask for our server
  4. The Python requests module to push our images

OUR FACE DETECTION ALGORITHM

This code handles all the local processing. To understand more about how the face detection works, check out the Real Python tutorial at https://realpython.com/face-recognition-with-python/


import cv2

from requests import post

# Create the haar cascade
cascPath = "haarcascade_frontalface_default.xml"
faceCascade = cv2.CascadeClassifier(cascPath)

def draw_face(image):

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Detect faces in the image
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1,
        minNeighbors=5,
        minSize=(30, 30)
    )

    # Draw a rectangle around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(image, (x, y), (x+w, y+h), (0, 255, 0), 2)

    return image

def read_image(image_path):
    return cv2.imread(image_path)

def write_image(image, image_path="output.jpg"):
    fn = f"out-{image_path}"
    cv2.imwrite(fn, image)
    return fn

def upload_image(fn):
    url = "http://localhost:5000/upload"

    # Open the file in a context manager so the handle is closed after the upload
    with open(fn, "rb") as f:
        return post(url, files={'file': f}).status_code


THE SIMPLE SERVER

This is a simple WSGI server written in Flask, to allow us to upload our files.

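# app.py
import os

from flask import Flask, request

app = Flask(__name__)

@app.post("/upload")
def upload_file():

    file = request.files['file']

    # Create the upload directory if it does not exist yet
    os.makedirs("./temp", exist_ok=True)
    file.save(f"./temp/{file.filename}")

    return {"status": True}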


if __name__ == '__main__':
    app.run()


Run this in a new terminal window

$ python app.py

THE NORMAL WAY OF PROCESSING (Synchronously)

If we were to test this function with 1000 images while running our Flask server locally, it would take a lot of time because the images are processed synchronously.

Synchronous execution means that each task has to wait for the previous task to finish before it starts.

Synchronous execution is only really suitable when a task needs the data produced by the task before it.
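To make the dependency point concrete, here is a minimal sketch. The functions `read`, `detect`, and `save` are hypothetical stand-ins (not the OpenCV functions from this post) that only pass small dictionaries around. Within one image each step needs the previous step's result, but nothing is shared between images.

```python
def read(path):
    """Stand-in for loading an image: returns a fake 'image' record."""
    return {"path": path, "faces_drawn": False}

def detect(image):
    """Stand-in for face detection: needs the output of read()."""
    return {**image, "faces_drawn": True}

def save(image):
    """Stand-in for writing to disk: needs the output of detect()."""
    return f"out-{image['path']}"

def process_one(path):
    # Each step consumes the previous step's result, so these three
    # calls have to run one after another.
    return save(detect(read(path)))

def process_all(paths):
    # Nothing computed for one path is used by another, so this loop
    # is sequential only by choice, not by necessity.
    return [process_one(p) for p in paths]

print(process_all(["a.jpg", "b.jpg"]))  # ['out-a.jpg', 'out-b.jpg']
```

The steps inside `process_one` are a good fit for synchronous execution; the loop in `process_all` is where concurrency can help.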


import time

def normal_way(image_paths):
    for image_path in image_paths:
        image = read_image(image_path)
        image = draw_face(image)
        fn = write_image(image, image_path)
        upload_image(fn)

if __name__ == '__main__':
    # Trying this 100 times to simulate 100 images
    image_paths = ["a.jpg"] * 100

    start = time.time()
    normal_way(image_paths)
    stop = time.time()

    # Time taken for synchronous: 309.1052303314209
    print(f"Time taken for synchronous: {stop-start}")

At this rate (about 3.09 seconds per image), 1000 images would take approximately 3091.05 seconds, or roughly 51.5 minutes, assuming the time grows linearly with the number of images.
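The estimate is a simple linear extrapolation from the measured run. As a rough sketch (the helper name `projected_seconds` is made up for illustration, and linear scaling is an assumption, not something measured here):

```python
def projected_seconds(measured_seconds, measured_count, target_count):
    """Linearly extrapolate a measured batch time to a different batch size."""
    per_image = measured_seconds / measured_count
    return per_image * target_count

measured = 309.1052303314209  # synchronous run over 100 images, from the timing above
estimate = projected_seconds(measured, 100, 1000)
print(f"{estimate:.2f} s, about {estimate / 60:.1f} min")  # 3091.05 s, about 51.5 min
```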

OUR OPTIMISED CODE (ASYNCHRONOUS EXECUTION)

Asynchronous programming is essentially the opposite of synchronous programming: the next task can start concurrently or in parallel with the current one, without waiting for it to finish.

There are many ways to achieve this: An Event Loop, Multithreading, Multiprocessing, etc. For the purpose of this tutorial, we will be using Multithreading and Multiprocessing only.

We will be using multithreading for the IO-bound functions: reading the images from disk, writing them to disk, and uploading them to the server.
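To see why threads suit IO-bound work, here is a small sketch that replaces the real upload with `time.sleep`. The function `fake_upload`, the file names, and the 0.05 second delay are all hypothetical. While one thread waits, the others can make progress, so eight waits overlap instead of adding up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_upload(name, delay=0.05):
    """Hypothetical stand-in for an IO call: the thread just waits."""
    time.sleep(delay)  # a sleeping thread lets other threads run
    return f"uploaded {name}"

def upload_all(names, workers=8):
    # map() returns results in the same order as the inputs
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_upload, names))

names = [f"img-{i}.jpg" for i in range(8)]
start = time.perf_counter()
results = upload_all(names)
elapsed = time.perf_counter() - start

# Run one after another, the eight waits would take about 0.4 seconds
print(results[0], f"{elapsed:.2f}s")
```

The exact timing depends on the machine, but it should land well below the sequential total.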

We will be using multiprocessing for the CPU-intensive function, which is the function that detects and draws rectangles on the faces.
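For CPU-bound work the picture is different: the task is busy computing rather than waiting, so separate processes are used to spread the work across cores. The sketch below uses a toy prime counter in place of face detection; `count_primes` and `count_in_parallel` are illustrative names, not part of this post's code.

```python
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit):
    """CPU-bound toy task: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def count_in_parallel(limits, workers=None):
    # workers=None lets the executor pick a default based on the CPU count
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))

if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module
    print(count_in_parallel([10, 100, 1000]))  # [4, 25, 168]
```

Arguments and results are pickled to travel between processes, so the function passed to the pool has to be defined at module level.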


from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def optimised_solution(image_paths: list):

    # Define our pool executors
    with ThreadPoolExecutor() as thread_pool:
        with ProcessPoolExecutor() as process_pool:

            # We use multithreading to read our images
            images = thread_pool.map(read_image, image_paths)

            # We use multiprocessing to process our images
            processed_images = process_pool.map(draw_face, images)

            # We use multithreading to write our images to disk,
            # passing each image's own path so outputs do not overwrite each other
            files = thread_pool.map(write_image, processed_images, image_paths)

            # And finally, we use multithreading to push our images to our server
            responses = thread_pool.map(upload_image, files)

            print("Processed: ", all(response == 200 for response in responses))


if __name__ == '__main__':
    # Trying this 100 times to simulate 100 images
    image_paths = ["a.jpg"] * 100

    start = time.time()
    optimised_solution(image_paths)
    stop = time.time()

    # Time taken for Optimised 1: 147.17485308647156
    print(f"Time taken for Optimised 1: {stop-start}")

KEY TAKEAWAYS

  • As the timings show, the version optimised with multithreading and multiprocessing finished in about half the time: roughly 147 seconds against 309 seconds, a speedup of about 2.1x.

  • Our program can now scale with the number of CPU cores, thanks to the parallel computing concepts discussed in Part 1.

  • If we were to use an asynchronous (ASGI) server, our program could be optimised further, but let's leave that for another day.

THE FULL CODE

import cv2
import time

from requests import post

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor



cascPath = "haarcascade_frontalface_default.xml"

# Create the haar cascade
faceCascade = cv2.CascadeClassifier(cascPath)

def draw_face(image):
    # More Info: https://realpython.com/face-recognition-with-python/

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Detect faces in the image
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1,
        minNeighbors=5,
        minSize=(30, 30)
    )

    # Draw a rectangle around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(image, (x, y), (x+w, y+h), (0, 255, 0), 2)

    return image

def read_image(image_path):
    return cv2.imread(image_path)

def write_image(image, image_path="output.jpg"):
    fn = f"out-{image_path}"
    cv2.imwrite(fn, image)
    return fn

def upload_image(fn):
    url = "http://localhost:5000/upload"

    # Open the file in a context manager so the handle is closed after the upload
    with open(fn, "rb") as f:
        return post(url, files={'file': f}).status_code


def normal_way(image_paths):
    for image_path in image_paths:
        image = read_image(image_path)
        image = draw_face(image)
        fn = write_image(image, image_path)
        upload_image(fn)

def optimised_solution(image_paths: list):

    # Define our pool executors
    with ThreadPoolExecutor() as thread_pool:
        with ProcessPoolExecutor() as process_pool:

            # We use multithreading to read our images
            images = thread_pool.map(read_image, image_paths)

            # We use multiprocessing to process our images
            processed_images = process_pool.map(draw_face, images)

            # We use multithreading to write our images to disk,
            # passing each image's own path so outputs do not overwrite each other
            files = thread_pool.map(write_image, processed_images, image_paths)

            # And finally, we use multithreading to push our images to our server
            responses = thread_pool.map(upload_image, files)

            print("Processed: ", all(response == 200 for response in responses))


if __name__ == '__main__':
    # Trying this 100 times to simulate 100 images
    image_paths = ["a.jpg"] * 100

    start = time.time()
    normal_way(image_paths)
    stop = time.time()

    # Time taken for synchronous: 309.1052303314209
    print(f"Time taken for synchronous: {stop-start}")

    start = time.time()
    optimised_solution(image_paths)
    stop = time.time()

    # Time taken for Optimised 1: 147.17485308647156
    print(f"Time taken for Optimised 1: {stop-start}")


CONCLUSION

  • Multiprocessing and multithreading can be combined to boost performance significantly; in this example the run time was roughly halved.

  • Synchronous execution can only get faster with a faster CPU, whereas multiprocessing also lets us scale by adding more CPU cores.
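As a back-of-the-envelope illustration of scaling with more CPUs, the sketch below uses an idealised model in which tasks run in waves, one task per worker, with no overhead. The 3.0 seconds per image is a hypothetical figure and `ideal_batch_seconds` is a made-up helper; real runs fall short of this because of process start-up, pickling images between processes, and time spent on IO.

```python
import math
import os

def ideal_batch_seconds(n_tasks, seconds_per_task, workers):
    """Idealised wall time: tasks run in waves of `workers`, with no overhead."""
    waves = math.ceil(n_tasks / workers)
    return waves * seconds_per_task

per_image = 3.0  # hypothetical cost of one image, in seconds
for workers in (1, 2, 4, 8):
    print(workers, ideal_batch_seconds(100, per_image, workers))

# The number of CPUs Python reports on this machine (may be None)
print("CPUs reported:", os.cpu_count())
```

Under this model, 100 images drop from 300 seconds on one worker to 39 seconds on eight, which is the kind of headroom extra cores can offer in the best case.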

Feel free to ask any questions in the comment section, or reach me at ikabolo59@gmail.com and on Twitter

Thanks

Top comments (2)

Erin Bensinger

Hey, awesome article! It looks like the link to your email at the end is broken — wanted to let you know so you can adjust and readers can reach you 😊

Samuel Abolo

Thank you so much Erin, will update right away. 🤗