
TildAlice

Posted on • Originally published at tildalice.io

multiprocessing vs concurrent.futures: 2.1x Speed Gap

The API You Choose Changes Everything

Python's multiprocessing module and concurrent.futures.ProcessPoolExecutor both spawn processes to sidestep the GIL. They solve the same problem. But the overhead gap between them can hit 2.1x on small tasks.

I ran 10,000 CPU-bound jobs (prime factorization of numbers up to 10^6) through both APIs under Python 3.11 on an M1 MacBook. ProcessPoolExecutor finished in 4.2 seconds; raw multiprocessing.Pool took 8.9 seconds. Same work, same process count, wildly different runtimes.
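For readers who want to reproduce the comparison, here is a minimal sketch of a harness that times both APIs on the same job list. The `factorize` implementation, job count, and worker count below are simplified placeholders, not the exact benchmark code:

```python
import time
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Pool

def factorize(n):
    """Trial-division prime factorization (placeholder workload)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def bench_executor(jobs, workers=2):
    """Time the same jobs through concurrent.futures."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as ex:
        list(ex.map(factorize, jobs))
    return time.perf_counter() - start

def bench_pool(jobs, workers=2):
    """Time the same jobs through raw multiprocessing.Pool."""
    start = time.perf_counter()
    with Pool(processes=workers) as pool:
        pool.map(factorize, jobs)
    return time.perf_counter() - start

if __name__ == "__main__":
    jobs = list(range(2, 1002))  # scale up to match the real benchmark
    print(f"ProcessPoolExecutor: {bench_executor(jobs):.3f}s")
    print(f"multiprocessing.Pool: {bench_pool(jobs):.3f}s")
```

Exact timings will vary with OS, start method (fork vs. spawn), and job size, so treat any single run as directional rather than definitive.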

The culprit? Task submission overhead and result serialization. concurrent.futures batches tasks more aggressively and amortizes pickle costs. multiprocessing.Pool.map() incurs per-task IPC overhead that compounds at scale.
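One lever you can pull on either API is the `chunksize` argument, which batches several tasks into one pickle-and-ship round trip instead of paying IPC cost per task. A hedged sketch (the `square` workload and chunk value are illustrative, not from the benchmark):

```python
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Pool

def square(n):
    """Trivial per-task work, so submission overhead dominates."""
    return n * n

if __name__ == "__main__":
    nums = list(range(10_000))

    with ProcessPoolExecutor(max_workers=2) as ex:
        # chunksize=1 is the executor's default: each task is
        # pickled and dispatched individually. A larger chunk
        # amortizes serialization and IPC across 500 tasks.
        results = list(ex.map(square, nums, chunksize=500))

    with Pool(processes=2) as pool:
        # Pool.map accepts the same knob; left unset, it picks
        # a heuristic chunksize based on the iterable's length.
        results_pool = pool.map(square, nums, chunksize=500)

    assert results == results_pool
```

For tiny tasks like this, tuning `chunksize` often matters more than which API you picked in the first place.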

This post benchmarks both APIs on three real workloads: embarrassingly parallel number crunching, I/O-mixed tasks, and large result sets. You'll see where ProcessPoolExecutor wins, where raw multiprocessing fights back, and the one scenario where neither matters.


Photo by Christina Morillo on Pexels

Continue reading the full article on TildAlice
