FACE DETECTION WITH PYTHON AND MULTIPROCESSING
In this post, we will write a simple script to illustrate how handy multiprocessing and multithreading can be in real-world situations.
The program loads images from our computer, detects the human faces in them, draws a rectangle around each face, and uploads the results to a server.
Tools
- Python
- OpenCV
- Flask for our server
- The Python requests module to push our images
OUR FACE DETECTION ALGORITHM
This code handles all the local processing. To understand more about how the face detection works, check out the Real Python tutorial at https://realpython.com/face-recognition-with-python/
import cv2
from requests import post

# Create the Haar cascade (the XML file is expected in the working directory)
cascPath = "haarcascade_frontalface_default.xml"
faceCascade = cv2.CascadeClassifier(cascPath)


def draw_face(image):
    # The detector runs on a grayscale copy of the image
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Detect faces in the image
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1,
        minNeighbors=5,
        minSize=(30, 30)
    )
    # Draw a rectangle around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(image, (x, y), (x+w, y+h), (0, 255, 0), 2)
    return image


def read_image(image_path):
    return cv2.imread(image_path)


def write_image(image, image_path="output.jpg"):
    fn = f"out-{image_path}"
    cv2.imwrite(fn, image)
    return fn


def upload_image(fn):
    url = "http://localhost:5000/upload"
    # Open the file in a with-block so the handle is closed after the upload
    with open(fn, "rb") as f:
        return post(url, files={'file': f}).status_code
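One thing to watch in write_image: the output name is built by putting "out-" in front of the whole path, which only works when image_path is a bare file name such as a.jpg. Below is a minimal sketch of a variant that prefixes the file name only, using just the standard library. The helper name output_path is hypothetical and not part of the original script.

```python
from pathlib import Path


def output_path(image_path, prefix="out-"):
    """Return image_path with the prefix added to the file name only.

    Hypothetical helper: the directory part is kept as-is, so
    "photos/a.jpg" becomes "photos/out-a.jpg" rather than "out-photos/a.jpg".
    """
    path = Path(image_path)
    # as_posix() keeps forward slashes on every platform
    return path.with_name(prefix + path.name).as_posix()


print(output_path("a.jpg"))
print(output_path("photos/a.jpg"))
```

With a helper like this, write_image could call output_path(image_path) instead of formatting the string itself.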
THE SIMPLE SERVER
This is a simple Flask (WSGI) application that accepts our file uploads.
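# app.py
import os
from flask import Flask, request

app = Flask(__name__)

# file.save() fails if the target directory is missing, so create it up front
os.makedirs("temp", exist_ok=True)


@app.post("/upload")
def upload_file():
    file = request.files['file']
    file.save(f"./temp/{file.filename}")
    return {"status": True}


if __name__ == '__main__':
    app.run()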
Run this in a new terminal window:
$ python app.py
THE NORMAL WAY OF PROCESSING (SYNCHRONOUS EXECUTION)
If we were to run this on 1000 images, with our Flask server running locally, it would take a long time because each image is processed synchronously.
Synchronous execution means that the next task has to wait for the current task to finish before it starts.
It is only required when the next task needs the data produced by the current one.
import time


def normal_way(image_paths):
    for image_path in image_paths:
        image = read_image(image_path)
        image = draw_face(image)
        fn = write_image(image, image_path)
        upload_image(fn)


if __name__ == '__main__':
    # Trying this 100 times to simulate 100 images
    image_paths = ["a.jpg"] * 100
    start = time.time()
    normal_way(image_paths)
    stop = time.time()
    # Time taken for synchronous: 309.1052303314209
    print(f"Time taken for synchronous: {stop-start}")
Assuming the time grows linearly with the number of images, 1000 images would take approximately 3091.05 seconds (about 51.5 minutes).
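The estimate above is a simple linear extrapolation from the measured run. A small sketch of the arithmetic, assuming every image costs roughly the same (the function name estimate_seconds is only for illustration):

```python
# Measured above: 100 images took about 309.1 seconds synchronously.
measured_seconds = 309.1052303314209
measured_images = 100

# Assumption: every image costs roughly the same, so time grows linearly.
seconds_per_image = measured_seconds / measured_images


def estimate_seconds(n_images):
    """Linear extrapolation of the synchronous run time."""
    return seconds_per_image * n_images


estimate = estimate_seconds(1000)
print(f"{seconds_per_image:.2f} s per image")
print(f"{estimate:.2f} s for 1000 images ({estimate / 60:.1f} minutes)")
```

Real runs will drift from this figure, since image sizes, disk caching and server load all vary.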
OUR OPTIMISED CODE (ASYNCHRONOUS EXECUTION)
Asynchronous execution is the opposite of synchronous execution: the next task can start concurrently or in parallel with the current one.
There are many ways to achieve this: an event loop, multithreading, multiprocessing, and so on. For the purpose of this tutorial, we will use multithreading and multiprocessing only.
We will use multithreading for the IO-bound functions: reading the images from disk, writing them to disk, and uploading them to the server.
We will use multiprocessing for the CPU-intensive function, which detects the faces and draws rectangles on them.
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor


def optimised_solution(image_paths: list):
    # Define our pool executors
    with ThreadPoolExecutor() as thread_pool:
        with ProcessPoolExecutor() as process_pool:
            # We use multithreading to read our images
            images = thread_pool.map(read_image, image_paths)
            # We use multiprocessing to process our images
            processed_images = process_pool.map(draw_face, images)
            # We use multithreading to write our images to disk,
            # passing image_paths so each output keeps its own name
            files = thread_pool.map(write_image, processed_images, image_paths)
            # And finally, we use multithreading to push our images to our server
            responses = thread_pool.map(upload_image, files)
            print("Processed: ", all([response == 200 for response in responses]))


if __name__ == '__main__':
    # Trying this 100 times to simulate 100 images
    image_paths = ["a.jpg"] * 100
    start = time.time()
    optimised_solution(image_paths)
    stop = time.time()
    # Time taken for Optimised 1: 147.17485308647156
    print(f"Time taken for Optimised 1: {stop-start}")
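To see in isolation why the IO-bound steps are handed to a thread pool, here is a minimal, self-contained sketch. It does not touch OpenCV or the server: fake_upload is a hypothetical stand-in that simply sleeps, which is roughly what a thread does while it waits on a disk or a network.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_upload(name):
    """Stand-in for an IO-bound call: the thread just waits."""
    time.sleep(0.1)
    return f"uploaded {name}"


names = [f"img-{i}.jpg" for i in range(5)]

# One after the other: the waits add up (roughly 5 * 0.1 seconds).
start = time.perf_counter()
sequential = [fake_upload(name) for name in names]
sequential_time = time.perf_counter() - start

# In a thread pool the waits overlap, so the total is much shorter.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as pool:
    # map() yields results in the same order as the inputs
    threaded = list(pool.map(fake_upload, names))
threaded_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

CPU-bound pure-Python work gains little from threads in standard CPython because of the global interpreter lock, which is the usual reason for sending that kind of work to a process pool instead.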
KEY TAKEAWAYS
As we can see, the solution optimised with multithreading and multiprocessing took about half the time to finish (roughly 147 seconds versus 309 seconds).
Our program can now scale with the number of CPU cores, thanks to the parallel computing concept discussed in part 1.
If we were to use an asynchronous server (ASGI), our program could be optimised further, but let's leave that for another day.
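The "about half the time" observation can be checked from the two timings reported in the code comments above. A quick sketch of the arithmetic:

```python
sync_seconds = 309.1052303314209        # synchronous run reported above
optimised_seconds = 147.17485308647156  # optimised run reported above

# How many times faster the optimised run was
speedup = sync_seconds / optimised_seconds
# Share of the synchronous time that the optimised run needed
fraction = optimised_seconds / sync_seconds

print(f"speedup: {speedup:.2f}x")
print(f"optimised run took {fraction:.0%} of the synchronous time")
```

These numbers come from a single run on one machine, so treat them as an indication and not as a benchmark.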
THE FULL CODE
import cv2
import time
from requests import post
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

cascPath = "haarcascade_frontalface_default.xml"
# Create the Haar cascade
faceCascade = cv2.CascadeClassifier(cascPath)


def draw_face(image):
    # More Info: https://realpython.com/face-recognition-with-python/
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Detect faces in the image
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1,
        minNeighbors=5,
        minSize=(30, 30)
    )
    # Draw a rectangle around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(image, (x, y), (x+w, y+h), (0, 255, 0), 2)
    return image


def read_image(image_path):
    return cv2.imread(image_path)


def write_image(image, image_path="output.jpg"):
    fn = f"out-{image_path}"
    cv2.imwrite(fn, image)
    return fn


def upload_image(fn):
    url = "http://localhost:5000/upload"
    # Open the file in a with-block so the handle is closed after the upload
    with open(fn, "rb") as f:
        return post(url, files={'file': f}).status_code


def normal_way(image_paths):
    for image_path in image_paths:
        image = read_image(image_path)
        image = draw_face(image)
        fn = write_image(image, image_path)
        upload_image(fn)


def optimised_solution(image_paths: list):
    # Define our pool executors
    with ThreadPoolExecutor() as thread_pool:
        with ProcessPoolExecutor() as process_pool:
            # We use multithreading to read our images
            images = thread_pool.map(read_image, image_paths)
            # We use multiprocessing to process our images
            processed_images = process_pool.map(draw_face, images)
            # We use multithreading to write our images to disk,
            # passing image_paths so each output keeps its own name
            files = thread_pool.map(write_image, processed_images, image_paths)
            # And finally, we use multithreading to push our images to our server
            responses = thread_pool.map(upload_image, files)
            print("Processed: ", all([response == 200 for response in responses]))


if __name__ == '__main__':
    # Trying this 100 times to simulate 100 images
    image_paths = ["a.jpg"] * 100

    start = time.time()
    normal_way(image_paths)
    stop = time.time()
    # Time taken for synchronous: 309.1052303314209
    print(f"Time taken for synchronous: {stop-start}")

    start = time.time()
    optimised_solution(image_paths)
    stop = time.time()
    # Time taken for Optimised 1: 147.17485308647156
    print(f"Time taken for Optimised 1: {stop-start}")
CONCLUSION
Multiprocessing and multithreading can be combined to boost our code's performance considerably.
Synchronous execution only gets faster with faster CPUs, whereas multiprocessing lets us scale with more CPUs.
Feel free to ask any questions in the comment section, or reach me at ikabolo59@gmail.com or on Twitter.
Thanks