If you're new to Python and wondering how to make your programs faster or handle multiple tasks at once, this guide is for you. We'll explore the basics of processes and threads, and how to use them in Python with practical examples.
🧠 What Are Processes and Threads?
Understanding the difference between processes and threads is key to writing efficient Python programs.
Process:
- A process is an independent program running in its own memory space.
- It doesn’t share data with other processes unless explicitly told to (e.g., using queues or pipes).
- Think of each process as a separate tab in your browser—each runs independently.
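The point about queues and pipes deserves a quick sketch. A `multiprocessing.Queue` is one explicit channel for passing data between processes; the worker function and the message below are just placeholders for illustration:

```python
from multiprocessing import Process, Queue

def worker(q):
    # Put a result on the queue so the parent process can read it
    q.put("hello from child")

if __name__ == "__main__":
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    print(q.get())  # → hello from child
    p.join()
```

Without the queue, anything the child process computes stays in its own memory and is lost when it exits.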
Thread:
- A thread is a lightweight unit of execution within a process.
- Multiple threads in the same process share memory and resources.
- Threads are great for tasks that wait a lot (like downloading files), because they can switch while waiting.
🧪 Example Explained
```python
import threading

def task():
    print("Running in thread")

thread1 = threading.Thread(target=task)
thread2 = threading.Thread(target=task)
thread1.start()
thread2.start()
```
This code creates two threads that run the same function. They execute concurrently: the interpreter switches between them, so the order of their output depends on CPU scheduling.
🧵 Multithreading in Python (Practical Example)
Multithreading is ideal for I/O-bound tasks, which spend time waiting for external resources.
🧪 Example Explained
```python
import threading
import requests

def download(url):
    response = requests.get(url)
    print(f"{url}: {len(response.text)} bytes")

urls = ["https://example.com", "https://httpbin.org"]
threads = [threading.Thread(target=download, args=(url,)) for url in urls]

for t in threads:
    t.start()
for t in threads:
    t.join()
```
- `requests.get(url)` waits for a response from the server.
- While one thread is waiting, another can start downloading.
- `start()` begins the thread, and `join()` waits for it to finish.
⚙️ Multiprocessing in Python (Practical Example)
Multiprocessing is best for CPU-bound tasks, which require heavy computation.
🧪 Example Explained
```python
from multiprocessing import Process

def square(n):
    print(n * n)

# The __main__ guard is required on platforms that spawn
# processes (Windows, macOS) rather than fork them
if __name__ == "__main__":
    numbers = [1, 2, 3]
    processes = [Process(target=square, args=(n,)) for n in numbers]

    for p in processes:
        p.start()
    for p in processes:
        p.join()
```
- Each `Process` runs in its own memory space.
- Python creates a separate process for each number.
- This sidesteps the Global Interpreter Lock (GIL), which prevents threads in a single process from running Python bytecode in parallel and therefore limits multithreading for CPU-bound tasks.
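To see the GIL's effect for yourself, a rough timing sketch like the one below can help. The workload size is arbitrary, and actual timings will vary by machine, but on CPython the process pool typically finishes this CPU-bound loop well ahead of the thread pool:

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def count(n):
    # Pure-Python loop: CPU-bound, so the GIL serializes it across threads
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == "__main__":
    work = [5_000_000] * 4

    start = time.perf_counter()
    with ThreadPoolExecutor() as ex:
        list(ex.map(count, work))
    print(f"threads:   {time.perf_counter() - start:.2f}s")

    start = time.perf_counter()
    with ProcessPoolExecutor() as ex:
        list(ex.map(count, work))
    print(f"processes: {time.perf_counter() - start:.2f}s")
```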
🧰 ThreadPoolExecutor vs ProcessPoolExecutor
These are part of the `concurrent.futures` module and simplify working with threads and processes.
🧪 Example Explained
```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def task(n):
    return n * n

if __name__ == "__main__":
    # Thread pool
    with ThreadPoolExecutor() as executor:
        results = executor.map(task, [1, 2, 3])
        print(list(results))

    # Process pool (needs the __main__ guard on spawning platforms)
    with ProcessPoolExecutor() as executor:
        results = executor.map(task, [1, 2, 3])
        print(list(results))
```
- `ThreadPoolExecutor` creates a pool of threads and assigns tasks to them.
- `ProcessPoolExecutor` creates a pool of processes.
- `executor.map()` runs the function on each item in the list.
- These tools handle starting, joining, and managing resources automatically.
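Besides `map()`, the executors also offer `submit()`, which schedules a single call and returns a `Future` object immediately; `as_completed()` then yields each future as soon as its result is ready, in whatever order the tasks finish. A small sketch:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def task(n):
    return n * n

with ThreadPoolExecutor() as executor:
    # submit() schedules one call and returns a Future right away
    futures = [executor.submit(task, n) for n in [1, 2, 3]]
    # as_completed() yields futures in completion order, not submission order
    for future in as_completed(futures):
        print(future.result())
```

This pattern is handy when tasks take very different amounts of time and you want to process results as they arrive.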
🌐 Web Scraping with Multithreading
Multithreading speeds up web scraping by fetching multiple pages concurrently.
🧪 Example Explained
```python
import threading
import requests
from bs4 import BeautifulSoup

def scrape(url):
    html = requests.get(url).text
    soup = BeautifulSoup(html, 'html.parser')
    print(f"{url}: {soup.title.string}")

urls = ["https://example.com", "https://httpbin.org"]
threads = [threading.Thread(target=scrape, args=(url,)) for url in urls]

for t in threads:
    t.start()
for t in threads:
    t.join()
```
- Each thread downloads and parses a webpage.
- `BeautifulSoup` extracts the title from the HTML.
- Threads run concurrently, reducing total scraping time.
🏭 Real-World Use Case with Multiprocessing
Multiprocessing is perfect for tasks like image processing, simulations, or data analysis.
🧪 Example Explained
```python
from multiprocessing import Pool
import time

def heavy_task(n):
    time.sleep(1)
    return n * n

if __name__ == "__main__":
    with Pool() as pool:
        results = pool.map(heavy_task, [1, 2, 3, 4])
    print(results)
```
- `Pool()` creates a pool of worker processes.
- `map()` distributes the tasks across the processes.
- `time.sleep(1)` stands in for a slow task; a real CPU-heavy workload would do computation here.
- The tasks run in parallel, so the total time is much shorter than running them sequentially.
🧩 Summary Table Explained
| Feature | Multithreading | Multiprocessing |
|---|---|---|
| Best for | I/O-bound tasks (e.g., web scraping, file I/O) | CPU-bound tasks (e.g., data crunching, simulations) |
| Memory | Shared among threads | Separate for each process |
| Speed | Faster for I/O due to overlapping waits | Faster for computation due to parallel CPU usage |
| Python module | `threading`, `concurrent.futures` | `multiprocessing`, `concurrent.futures` |
This table helps you choose the right tool:
- Use threads when your program waits a lot.
- Use processes when your program computes a lot.