Before diving into the interview questions and practice set, check out these articles for a deeper understanding of multiprocessing in Python:
- Python Multiprocessing: Start Methods, Pools, and Communication.
- Inter Process Communication in Python Multiprocessing (With Examples).
- Race Conditions, Deadlocks, and Synchronisation in Python Multiprocessing.
Practice Questions
CPU-bound example: Calculate factorials or primes across multiple processes
from multiprocessing import Process, current_process

def factorial(number):
    result = 1
    for i in range(1, number + 1):
        result *= i
    print(f"Current process ID: {current_process().ident}, name: {current_process().name}")
    print(result)

if __name__ == "__main__":
    p1 = Process(target=factorial, args=(5,))
    p2 = Process(target=factorial, args=(7,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
Output:
Current process ID: 11271, name: Process-2
5040
Current process ID: 11270, name: Process-1
120
Use a Pool to parallelise computations
from multiprocessing import current_process, Pool

def factorial(number):
    result = 1
    for i in range(1, number + 1):
        result *= i
    print(f"Current process ID: {current_process().ident}, name: {current_process().name}")
    return f"{number}! = {result}"

if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5]
    with Pool() as pool:
        results = pool.map(factorial, numbers)
    for result in results:
        print(result)
Output:
Current process ID: 11860, name: SpawnPoolWorker-3
Current process ID: 11860, name: SpawnPoolWorker-3
Current process ID: 11860, name: SpawnPoolWorker-3
Current process ID: 11860, name: SpawnPoolWorker-3
Current process ID: 11860, name: SpawnPoolWorker-3
1! = 1
2! = 2
3! = 6
4! = 24
5! = 120
Share data across processes with Manager.dict() or Value
"""
from multiprocessing import Process, Value, Manager
def get_count(shared_dict):
shared_dict["count"] = shared_dict.get("count", 0) + 1
if __name__ == "__main__":
with Manager() as manager:
shared_dict = manager.dict()
processes = [Process(target=get_count, args=(shared_dict,)) for _ in range(3)]
for p in processes: p.start()
for p in processes: p.join()
print("Final dict:", dict(shared_dict))
# Output:
# Final dict: {'count': 2}
"""
Using Value and Manager.dict() together
from multiprocessing import Process, Value, Manager
import ctypes

def worker(shared_counter, shared_dict):
    # increment the shared counter safely under its built-in lock
    for _ in range(1000):
        with shared_counter.get_lock():
            shared_counter.value += 1
    # update the shared dictionary (this read-modify-write is still
    # unsynchronised, so with many workers an update can occasionally be lost)
    shared_dict["tasks"] = shared_dict.get("tasks", 0) + 1

if __name__ == "__main__":
    with Manager() as manager:
        shared_dict = manager.dict()
        shared_counter = Value(ctypes.c_int, 0)  # integer in shared memory
        processes = [Process(target=worker, args=(shared_counter, shared_dict)) for _ in range(5)]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        print("Final counter value:", shared_counter.value)  # 5000, guaranteed by the lock
        print("Final dict:", dict(shared_dict))  # normally tasks=5
Output:
Final counter value: 5000
Final dict: {'tasks': 5}
Interview Questions
Trade-offs: higher memory cost than threads
In Python, using multiprocessing consumes more memory than multithreading because each process runs its own Python interpreter with a separate memory space. This independence allows true parallel execution across multiple CPU cores, but it comes at the cost of higher memory usage. Threads, on the other hand, are lightweight since they share memory within the same process, but they are limited by the Global Interpreter Lock (GIL). The primary trade-off is between memory efficiency and parallel performance.
Overhead of process creation
Creating a process is significantly more expensive than creating a thread. On platforms like Windows or macOS, Python’s default spawn start method launches a fresh interpreter and re-imports modules, which adds startup overhead. This means that for very small or short-lived tasks, the time spent creating processes can outweigh the benefits of parallelism. However, for long-running or CPU-heavy tasks, this overhead becomes negligible compared to the performance gain of using multiple cores.
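A minimal sketch to see this overhead yourself, timing a batch of processes against a batch of threads running a no-op task (the worker count and the no-op task are illustrative; absolute numbers vary by platform and start method):
import time
from multiprocessing import Process
from threading import Thread

def noop():
    pass  # deliberately trivial, so we mostly measure startup cost

def timed(worker_cls, n=20):
    start = time.perf_counter()
    workers = [worker_cls(target=noop) for _ in range(n)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"20 processes: {timed(Process):.3f}s")  # typically far slower...
    print(f"20 threads:   {timed(Thread):.3f}s")   # ...than 20 threads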
Multiprocessing vs Multithreading in terms of performance and memory use
Multiprocessing and multithreading serve different purposes. Multiprocessing excels in CPU-bound workloads because processes bypass the GIL, allowing true parallel execution across cores. The trade-off is higher memory usage and process management overhead. Multithreading, however, is better for I/O-bound tasks such as file operations, networking, or database queries because threads share the same memory space, making them lightweight. But since threads are subject to the GIL, they cannot achieve parallelism in CPU-intensive tasks.
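A quick way to demonstrate this difference is to run the same CPU-bound function on a process pool and a thread pool (a rough sketch; the pool size and workload are arbitrary illustrations):
import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool

def cpu_task(n):
    # purely CPU-bound; the GIL stops threads from running these in parallel
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    work = [2_000_000] * 8
    for label, pool_cls in [("processes", Pool), ("threads", ThreadPool)]:
        start = time.perf_counter()
        with pool_cls(4) as pool:
            pool.map(cpu_task, work)
        print(f"{label}: {time.perf_counter() - start:.2f}s")
On a multi-core machine the process pool usually finishes several times faster, while the thread pool runs the tasks essentially one at a time.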
Use cases for Multiprocessing
Multiprocessing is commonly used when applications need to fully utilise the CPU for heavy computations. Examples include training machine learning models, large-scale data processing like ETL pipelines, image and video processing with libraries like OpenCV, and simulation or numerical workloads. It is also used for background workers in production systems, such as resizing images or handling resource-intensive tasks outside of the main web server process.
Be able to explain IPC clearly (Queues vs Pipes)
In multiprocessing, inter-process communication (IPC) is essential because processes do not share memory directly. Two common mechanisms are queues and pipes. A Queue is a thread- and process-safe structure that allows multiple producers and consumers, making it ideal for scenarios like task pipelines. A Pipe, on the other hand, provides a direct one-to-one connection between two processes. It is lightweight and faster, but not suitable for multiple producers or consumers. The choice depends on whether you need many-to-many or one-to-one communication.
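A short sketch contrasting the two (the message strings and process counts are illustrative):
from multiprocessing import Process, Queue, Pipe

def producer(queue, worker_id):
    # any number of producers can safely share one Queue
    queue.put(f"task from producer {worker_id}")

def pipe_worker(conn):
    conn.send("hello over the pipe")  # one-to-one connection
    conn.close()

if __name__ == "__main__":
    # Queue: many producers, one or many consumers
    queue = Queue()
    producers = [Process(target=producer, args=(queue, i)) for i in range(3)]
    for p in producers:
        p.start()
    for p in producers:
        p.join()
    for _ in range(3):
        print(queue.get())

    # Pipe: a direct channel between exactly two endpoints
    parent_conn, child_conn = Pipe()
    worker = Process(target=pipe_worker, args=(child_conn,))
    worker.start()
    print(parent_conn.recv())
    worker.join()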
Real-world usage: background workers, ML model training, data crunching
In real-world systems, multiprocessing is often applied to tasks where computation is the bottleneck. For example, background worker systems offload heavy operations like image resizing, data aggregation, or sending emails. Machine learning tasks such as model training or parallel hyperparameter tuning take advantage of multiple cores. Similarly, video encoding/decoding or data-crunching workloads in analytics pipelines are typical examples where multiprocessing significantly improves performance.
If you need to handle 1 million I/O tasks, will you use multiprocessing or asyncio?
If the requirement is to handle one million I/O tasks, multiprocessing is not the right choice because it will consume excessive memory and suffer from process creation overhead. Since the workload is I/O-bound rather than CPU-bound, asyncio or multithreading is a much better fit. Async I/O allows the program to scale to thousands or even millions of connections with minimal overhead because it does not block on I/O operations. The general rule of thumb is: use multiprocessing for CPU-bound tasks and use asyncio or threads for I/O-bound tasks.
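A minimal asyncio sketch of this pattern (the sleep stands in for a real network call, and the task count and Semaphore cap of 1,000 are illustrative placeholders):
import asyncio

async def handle(task_id, limit):
    async with limit:  # cap how many I/O operations are in flight at once
        await asyncio.sleep(0.01)  # stand-in for an awaited network call
        return task_id

async def main():
    limit = asyncio.Semaphore(1000)
    # 100,000 tasks here for illustration; the same pattern extends to millions
    results = await asyncio.gather(*(handle(i, limit) for i in range(100_000)))
    print("completed:", len(results))

if __name__ == "__main__":
    asyncio.run(main())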
What are the start methods in multiprocessing, and how do they differ?
Python’s multiprocessing supports three start methods: fork, spawn, and forkserver. On Unix, fork is the default and is the fastest because it clones the parent process. However, it can sometimes lead to subtle bugs due to state being copied over, especially with threads or network connections. spawn starts a fresh Python interpreter, which is slower but safer, and it’s the default on Windows and macOS. forkserver is a compromise where a separate server process starts child processes on demand. Understanding these is important because the choice affects performance, memory usage, and stability.
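A small example of selecting a start method explicitly via get_context (choosing spawn here is just an illustration):
import multiprocessing as mp

def worker():
    print("child is", mp.current_process().name)

if __name__ == "__main__":
    print("available methods:", mp.get_all_start_methods())
    # get_context returns a start-method-specific API without mutating global state
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=worker)
    p.start()
    p.join()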
What are race conditions in multiprocessing, and how do you prevent them?
Race conditions occur when multiple processes try to update shared state at the same time without proper synchronisation. For example, incrementing a shared counter without a lock may result in incorrect results because updates overlap. To prevent race conditions, Python provides synchronisation primitives such as Lock, RLock, Semaphore, and Event. By protecting critical sections with these tools, only one process at a time can access shared data safely. Choosing the right synchronisation primitive depends on whether you need exclusive access, limited access, or coordination among processes.
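A short sketch of the lost-update problem and its fix (iteration counts are arbitrary; the unsafe version usually, but not always, prints less than the expected total):
from multiprocessing import Process, Value

def unsafe_increment(counter):
    for _ in range(100_000):
        counter.value += 1  # read-modify-write without a lock: updates can be lost

def safe_increment(counter):
    for _ in range(100_000):
        with counter.get_lock():  # only one process in the critical section at a time
            counter.value += 1

if __name__ == "__main__":
    for target in (unsafe_increment, safe_increment):
        counter = Value("i", 0)
        processes = [Process(target=target, args=(counter,)) for _ in range(4)]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        print(f"{target.__name__}: {counter.value}")  # expected 400000; unsafe is usually less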
How do you debug multiprocessing issues?
Debugging multiprocessing programs is harder than debugging single-process code because exceptions in child processes don’t always propagate clearly to the parent. Common techniques include using logging instead of print, because logging is process-safe and supports timestamps and process IDs. Another approach is to add timeouts to Queue.get() or Process.join() to detect deadlocks. Switching to multiprocessing.set_start_method('spawn') also helps isolate issues caused by forking. In production, structured logging and monitoring of worker processes are crucial for catching subtle bugs.
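A minimal sketch combining two of these techniques, process-aware logging plus a timeout on Queue.get() (the 5-second timeout is an arbitrary illustration):
import logging
from multiprocessing import Process, Queue

# include the process name and PID in every log line
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(processName)s (pid %(process)d) %(message)s",
)

def worker(queue):
    logging.info("worker started")
    queue.put("done")

if __name__ == "__main__":
    queue = Queue()
    p = Process(target=worker, args=(queue,))
    p.start()
    try:
        print(queue.get(timeout=5))  # a timeout turns a silent deadlock into an error
    except Exception as exc:
        logging.error("no result within 5s: %r", exc)
    p.join(timeout=5)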
What is the role of shared memory in multiprocessing?
Shared memory allows processes to directly access the same block of memory instead of copying large datasets between them. In Python 3.8+, the multiprocessing.shared_memory module enables this. It is especially useful in data-heavy workloads like image or array processing, where copying gigabytes of data to each process would be inefficient. However, since shared memory introduces the risk of race conditions, it must usually be combined with locks or semaphores to ensure data consistency.
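A small sketch using the shared_memory module (the block size and the byte value written are arbitrary):
from multiprocessing import Process, shared_memory

def child(name):
    shm = shared_memory.SharedMemory(name=name)  # attach to the existing block
    shm.buf[0] = 42  # write directly; no copying, no pickling
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=16)
    p = Process(target=child, args=(shm.name,))
    p.start()
    p.join()
    print(shm.buf[0])  # 42, written by the child
    shm.close()
    shm.unlink()  # free the block once no process needs it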
What are common causes of deadlocks in multiprocessing?
Deadlocks typically occur when processes are circularly waiting for resources, so none of them can proceed. In multiprocessing, this can happen if a process acquires a lock and never releases it, or if a Queue.get() call blocks forever because no data is being put in. Another scenario is when processes are waiting for each other’s results using join() in a circular dependency. The solution usually involves careful design: always releasing locks, using non-blocking calls or timeouts, and sending sentinel values to signal termination.
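A short sketch of the sentinel pattern, with a timeout as a safety net (the 10-second timeout is arbitrary):
from multiprocessing import Process, Queue

SENTINEL = None  # a value that can never be real data

def consumer(queue):
    while True:
        item = queue.get(timeout=10)  # a timeout prevents waiting forever
        if item is SENTINEL:          # sentinel tells the consumer to stop
            break
        print("processing", item)

if __name__ == "__main__":
    queue = Queue()
    p = Process(target=consumer, args=(queue,))
    p.start()
    for item in [1, 2, 3]:
        queue.put(item)
    queue.put(SENTINEL)  # always signal termination so the consumer can exit
    p.join()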
How do you choose between Pool and ProcessPoolExecutor?
Both Pool and ProcessPoolExecutor provide abstractions for managing multiple worker processes, but they differ in style and flexibility. Pool is part of the multiprocessing module and is more traditional, offering functions like map and apply. ProcessPoolExecutor, from the concurrent.futures module, provides a modern interface with submit and as_completed, making it easier to integrate with async code and better for structured concurrency. In interviews, you can say that ProcessPoolExecutor is generally preferred for cleaner, more Pythonic code, unless you are working with legacy multiprocessing code.
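A minimal ProcessPoolExecutor sketch using submit and as_completed (square is a placeholder task):
from concurrent.futures import ProcessPoolExecutor, as_completed

def square(n):
    return n * n

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as executor:
        futures = {executor.submit(square, n): n for n in range(10)}
        for future in as_completed(futures):  # yields futures as they finish
            n = futures[future]
            print(f"{n}^2 = {future.result()}")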
How do you handle serialisation issues in multiprocessing?
Since processes don’t share memory directly, data passed between them must be serialised. By default, Python uses pickle, but not all objects are pickleable, such as open file handles or sockets. For more complex objects, libraries like cloudpickle can be used because they handle functions and lambdas better. Serialisation overhead can also become a performance bottleneck when passing large data structures frequently, so in such cases, shared memory or memory-mapped files might be better solutions.
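A small sketch of the limitation (the exact exception raised for an unpickleable lambda varies slightly across Python versions, hence the broad except clause):
import pickle
from multiprocessing import Pool

def double(n):
    return n * 2  # module-level functions are pickleable by reference

if __name__ == "__main__":
    pickle.dumps(double)  # fine
    try:
        pickle.dumps(lambda n: n * 2)  # lambdas cannot be pickled
    except (pickle.PicklingError, AttributeError) as exc:
        print("lambda is not pickleable:", exc)

    with Pool(2) as pool:
        print(pool.map(double, [1, 2, 3]))  # [2, 4, 6]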
Can you give a real-world example of using multiprocessing?
A good real-world example is training machine learning models. Suppose you’re performing hyperparameter tuning where you need to train multiple models with different parameters. Each training task is CPU-intensive and independent, making it ideal for multiprocessing. By creating a pool of worker processes, each process can train one model in parallel, significantly reducing the total training time. Another example is video encoding, where multiple frames or chunks of video can be processed simultaneously using separate processes.