Python is often accused of being “slow.” While it’s true that Python isn’t as fast as C or Rust in raw computation, with the right techniques, you can significantly speed up your Python code—especially if you're dealing with I/O-heavy workloads.
In this post, we’ll dive into:

- When and how to use `threading` in Python.
- How it differs from `multiprocessing`.
- How to identify I/O-bound and CPU-bound workloads.
- Practical examples that can boost your app’s performance.
Let’s thread the needle.
## 🧠 Understanding I/O-Bound vs CPU-Bound
Before choosing between threading or multiprocessing, you must understand the type of task you're optimizing:
| Type | Description | Example | Best Tool |
|---|---|---|---|
| I/O-bound | Spends most time waiting for external resources | Web scraping, file downloads | `threading`, `asyncio` |
| CPU-bound | Spends most time performing heavy computations | Image processing, ML inference | `multiprocessing` |
> 💡 **Rule of thumb:** If your program is slow because it’s *waiting*, use threads. If it’s slow because it’s *calculating*, use processes.
## 🧵 Using Threading in Python

Python’s Global Interpreter Lock (GIL) limits true parallelism for CPU-bound threads, but for I/O-bound tasks, `threading` can bring a huge speed boost: while one thread waits on a network response, the GIL is released so other threads can run.
### Example: Threading for I/O-bound Tasks

```python
import threading
import requests
import time

urls = [
    'https://example.com',
    'https://httpbin.org/delay/2',
    'https://httpbin.org/get'
]

def fetch(url):
    print(f"Fetching {url}")
    response = requests.get(url)
    print(f"Done: {url} - Status {response.status_code}")

start = time.time()
threads = []
for url in urls:
    t = threading.Thread(target=fetch, args=(url,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f"Total time: {time.time() - start:.2f} seconds")
```
🕒 Run sequentially, these requests take roughly the *sum* of their response times (the `/delay/2` endpoint alone adds ~2 seconds). With threads, the total is roughly the time of the *slowest* request—about 2 seconds—showing real speedup.
### 💡 Threading Caveats

- Threads share memory → race conditions are possible.
- Use `threading.Lock()` to protect shared resources.
- Ideal for I/O, but not effective for CPU-heavy work.
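To make the race-condition caveat concrete, here’s a minimal sketch (counter and function names are illustrative): several threads increment a shared counter, and the lock guarantees each `+= 1` happens atomically.

```python
import threading

counter = 0
lock = threading.Lock()

def safe_increment(n):
    global counter
    for _ in range(n):
        # Without the lock, concurrent += on a shared int can lose updates,
        # because read-modify-write is not atomic across threads.
        with lock:
            counter += 1

threads = [threading.Thread(target=safe_increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 — every increment was counted
```

Dropping the `with lock:` line may still print 400000 on some runs (the GIL makes small races hard to hit), but the code would no longer be correct by design.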
## 🧮 Multiprocessing for CPU-Bound Tasks

For CPU-heavy workloads, the GIL becomes a bottleneck. That’s where the `multiprocessing` module comes in. It spawns separate processes, each with its own Python interpreter.
### Example: CPU-bound Task with Multiprocessing

```python
from multiprocessing import Process, cpu_count
import math
import time

def compute():
    print("Process starting")
    for _ in range(10**6):
        math.sqrt(12345.6789)

if __name__ == "__main__":
    start = time.time()
    processes = []
    for _ in range(cpu_count()):
        p = Process(target=compute)
        processes.append(p)
        p.start()

    for p in processes:
        p.join()

    print(f"Total time: {time.time() - start:.2f} seconds")
```
Here, every CPU core runs the computation in parallel in its own process, sidestepping the GIL—a massive boost for computationally expensive tasks. Note that each process above repeats the *same* loop; to truly divide one workload, split the input across the processes.
## 🔍 How to Tell if a Task is CPU-Bound or I/O-Bound

Use observation or profiling tools:

**Visual inspection**

- Waiting on API calls, file reads → I/O-bound
- Math loops, data crunching → CPU-bound

**Profiling tools**

```shell
pip install line_profiler
kernprof -l script.py
python -m line_profiler script.py.lprof
```
Or use `cProfile`:

```shell
python -m cProfile myscript.py
```
Check where time is spent: in I/O calls or computation.
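A quick stdlib-only heuristic (a sketch, not part of the original post): compare wall-clock time (`time.perf_counter`) against CPU time (`time.process_time`). If the wall clock runs far ahead of the CPU clock, the task spent most of its time waiting.

```python
import time

def classify(func):
    """Rough heuristic: wall time far exceeding CPU time suggests I/O-bound."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    # Waiting (sleep, network, disk) consumes wall time but little CPU time.
    return "I/O-bound" if cpu < wall * 0.5 else "CPU-bound"

print(classify(lambda: time.sleep(0.2)))         # sleeping mimics waiting on I/O
print(classify(lambda: sum(range(2_000_000))))   # a busy arithmetic loop
```

The 0.5 threshold is arbitrary; real profilers give a much finer breakdown, but this two-line comparison is often enough for a first guess.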
## 🧰 Bonus: concurrent.futures for Clean Code

Instead of manually managing threads or processes, use:

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
```

Thread pool for I/O:

```python
with ThreadPoolExecutor(max_workers=5) as executor:
    executor.map(fetch, urls)
```

Process pool for CPU:

```python
with ProcessPoolExecutor() as executor:
    # map passes each item as an argument, so compute must accept one here,
    # e.g. def compute(_): ...
    executor.map(compute, range(cpu_count()))
```
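When you need results back as tasks finish (rather than in submission order), `submit` plus `as_completed` works well. A minimal sketch—`square` is a stand-in for a real I/O or compute task:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(n):
    return n * n  # stand-in for a real fetch or computation

with ThreadPoolExecutor(max_workers=4) as executor:
    # Map each future back to its input so results can be matched up.
    futures = {executor.submit(square, n): n for n in range(5)}
    results = {}
    for future in as_completed(futures):
        results[futures[future]] = future.result()

print(results)
```

`as_completed` yields futures in completion order, so a slow task never blocks you from consuming the fast ones; `future.result()` also re-raises any exception the task hit, instead of swallowing it.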
## ✅ Final Thoughts

Python isn’t inherently slow—it just needs the right tools.

| Task Type | Use This |
|---|---|
| I/O-bound | `threading`, `asyncio`, `ThreadPoolExecutor` |
| CPU-bound | `multiprocessing`, `ProcessPoolExecutor` |
Start small, profile your code, and choose the right parallelization strategy. Your app—and your users—will thank you.