🐢 Is your Python code stuck in the slow lane? What if you could tell it to stop waiting around and get more done? 🤔
The secret lies in the world of concurrency! While it's a deep and powerful topic, we're starting with a game-changer: the threading package. 🧵
In this article, we're diving deep into how you can use threading to supercharge your programs, making those sluggish I/O operations—like downloading files, reading databases, or calling APIs—blazingly fast! ⚡
Get ready to unlock a new level of performance. Let's untangle the threads! 🔓
🤔 What on Earth are Processes & Threads?
Let's break it down without the scary jargon!
Imagine your computer is a giant kitchen 🧑🍳👩🍳. This kitchen's goal is to cook meals (aka run your programs).
What's a PROCESS? 🍳
Technical Jargon Buster: A process is an instance of a program.
Fun Explanation:
A process is like a single chef getting a recipe and their own private kitchen station to work in. This station has its own oven, bowls, ingredients, and tools. No other chef can use them!
Say my laptop has 4 CPUs. That means my kitchen has 4 separate stations, so 4 chefs can cook 4 different recipes (processes) at the exact same time! More stations (CPUs) = more chefs working = a faster kitchen! 🚀
The Catch: Hiring a new chef and building them a whole new station (creating a process) takes a lot of time and effort. It's powerful, but not always the quickest solution.
What's a THREAD? 🧵
Technical Jargon Buster: A thread is an entity within a process that can be scheduled.
Fun Explanation:
A thread is like a single task a chef is doing. One chef (process) can have multiple threads! They can be chopping veggies 🥕 (thread 1) while the water is boiling 💨 (thread 2).
- They Share Everything! All the threads (tasks) for one chef share the same station—the same oven, the same bowl of sugar, the same knives. This makes it super easy for them to collaborate!
- It's Super Efficient! Telling your chef to start another task (creating a thread) is way faster than hiring a whole new chef and building a new station (creating a process).
🚨 The Python Plot Twist: The GIL (Global Interpreter Lock)
So if threads are so great, why isn't Python code lightning fast all the time? Enter the GIL, Python's quirky bodyguard. 🕵️♂️
Imagine our chef is using a single, magical recipe book. The GIL is a rule that says:
"Only ONE hand (thread) can turn the pages of the recipe book at a time!" 📖➡️🤚
- Why? To prevent chaos! If two hands tried to read and change the recipe at the same time, the instructions could get messed up (this is called a race condition and it corrupts data).
- The Result: Even though our chef can do multiple tasks, they can only follow one step from the recipe at any single moment. They quickly switch between tasks—chopping, then stirring, then checking the oven—so fast it feels simultaneous, but it's not truly parallel.
So, if the GIL only allows one thread at a time... why do we even use threads in Python? 🤯
Ah-ha! That's the million-dollar question! The answer is the key to unlocking real speed in your programs.
Stay tuned for the next section, where we'll crack the GIL's code and learn exactly when threading makes Python FLY! ⚡
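Before we get there, here's a tiny self-contained experiment you can run to see the punchline for yourself. It's a sketch, not from the original article: `time.sleep` stands in for a real network wait, and sleeping threads release the GIL, so the threaded version finishes in roughly the time of ONE wait instead of four.

```python
import threading
import time

def slow_io(task_id):
    # Pretend this is a network call or a database query.
    # While sleeping, this thread releases the GIL so others can run.
    time.sleep(0.5)

# Version 1: do the 4 "downloads" one after another
start = time.perf_counter()
for i in range(4):
    slow_io(i)
sequential = time.perf_counter() - start

# Version 2: do all 4 at once with threads
start = time.perf_counter()
threads = [threading.Thread(target=slow_io, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait until every thread has finished
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

On my machine the sequential run takes about 2 seconds and the threaded run about 0.5 seconds, because all four waits overlap.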
🏁 Understanding Multithreading with a Classic Example! 🧵⚡
Hey everyone! Let's break down this classic example of multithreading in Python. It demonstrates a common pitfall and its superhero solution: the Lock. 🦸♂️
But first, hold on! Let's understand the villain of our story...
🤯 What is a Race Condition?
Imagine a race condition is like two people trying to update the same Google Doc cell at the exact same time. ✍️💥
Person A reads the number: 0.
Person B also reads the number: 0 (before Person A can save their change).
Person A adds 1 and saves: the doc now says 1.
Person B adds 1 and saves: the doc overwrites Person A's work and now says 1.
Final result: 1
Expected result: 2 (because two people each added 1)
This chaos is exactly what happens between threads! They "race" to update a shared resource, and the result is incorrect. Our code above brilliantly replicates this chaos and shows us how to fix it!
🔎 Code Breakdown: Taming the Race
1. **The Guardian of the Gate: `if __name__ == '__main__':`** This isn't just a random line, it's a security guard for your code! 🛡️
`__name__` is a special Python variable.
When you run the file directly (`python script.py`), Python sets `__name__` to `'__main__'`.
If someone imports your script as a module, `__name__` becomes the module's name instead.
So, this line means: "Only run the code below if this is the main file being executed, not if it's being imported." This is a crucial best practice, especially with threads!
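Here's a minimal sketch of the guard in action (the file name `guard_demo.py` is just for illustration):

```python
# guard_demo.py (hypothetical file name)
def greet():
    print("Hello from the main file!")

if __name__ == '__main__':
    # Runs when you execute the file directly:  python guard_demo.py
    # Skipped when another script does:         import guard_demo
    greet()
```

An importer gets access to `greet()` without any code running as a side effect, which is exactly what you want when that side effect would be "spawn some threads."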
2. **The Mission: The increase() Function**
This function has one job: grab the shared database_value, add 1 to it, and put it back. Simple, right? Not when two threads are doing it at once!
3. **Summoning the Threads: The Cast of Characters**
thread1 = Thread(target=increase, args=(lock,))
thread2 = Thread(target=increase, args=(lock,))
Thread(): This is how we create a new thread (a new worker). 🧑💻🧑💻
target=increase: Tells the thread which function to run.
args=(lock,): Passes the same Lock object to both threads. Heads up! The comma inside (lock,) is essential. It tells Python it's a tuple with one item. Without it, it's just lock in parentheses.
To recap the setup: increase() is the function we defined, and its only job is to bump the database value by 1. The key part inside the if __name__ == '__main__': block is:
thread1 = Thread(target=increase, args=(lock,))
thread2 = Thread(target=increase, args=(lock,))
This gives us two threads, thread1 and thread2, both aimed at the same target function!
🎭 Act 1: The Problem (No Lock = Chaos!)
Let's play out the scenario WITHOUT the lock.acquire() and lock.release() lines:
- Thread 1 enters the increase() function.
- It reads database_value (0) into its local_copy.
- It increments local_copy to 1.
- time.sleep(0.1) is called! 😴 Thread 1 hits the pause button and goes to sleep.
- The operating system sees Thread 1 is sleeping and switches to Thread 2.
- Thread 2 enters the increase() function.
- It reads database_value (which is still 0 because Thread 1 hasn't saved yet!).
- It increments its local_copy to 1.
- time.sleep(0.1) is called! 😴 Thread 2 also goes to sleep.
- Thread 1 wakes up and saves its value (1) to database_value.
- Thread 2 wakes up and saves its value (1) to database_value, overwriting the update from Thread 1!
The Final Tragedy: database_value = 1 ❌
What we wanted: database_value = 2 ✅
This is the infamous race condition!
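You can reproduce the tragedy on your own machine with this stripped-down sketch (same idea as the article's code, just with the lock removed):

```python
import time
from threading import Thread

database_value = 0

def increase_unsafe():
    global database_value
    local_copy = database_value  # both threads read 0 here...
    local_copy += 1
    time.sleep(0.1)              # ...then both go to sleep holding stale copies
    database_value = local_copy  # the second write clobbers the first

t1 = Thread(target=increase_unsafe)
t2 = Thread(target=increase_unsafe)
t1.start()
t2.start()
t1.join()
t2.join()

print(database_value)  # 1, not 2 -- the lost update in action
```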
🦸♂️ Act 2: The Solution (With Lock = Order!)
Now, let's use the Lock! The lock.acquire() and lock.release() lines are the heroes.
- Thread 1 enters the function and immediately calls lock.acquire(). It grabs the lock! 🔒 "It's my turn!"
- It does its work (read, increment) and then goes to sleep. It still holds the lock while sleeping.
- Thread 2 enters the function and tries to call lock.acquire(). 🚫 The lock is taken! Thread 2 is now BLOCKED and forced to wait.
- Thread 1 wakes up, saves the value (1), and calls lock.release(), freeing the lock. 🔓 "I'm done!"
- Thread 2, which was waiting, can now acquire the lock and proceed.
- Thread 2 reads database_value (which is now correctly 1).
- It increments it to 2, sleeps, and saves the result.
The Victory: database_value = 2 ✅
🎯 Key Takeaway
time.sleep(0.1) mocks a real-world slow operation (like waiting for a database response or a network API call). 🌐
The Lock ensures that only one thread at a time can execute the critical section of code (the part that touches the shared data).
Correctness is maintained because the lock forces threads to wait for their turn, preventing them from reading dirty or intermediate data.
Using a Lock is like giving a single key to a shared room. Only the person with the key can enter and use the room. Everyone else has to wait outside until the key is returned! 🗝️
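One finishing touch worth knowing: a Lock also works as a context manager, which is the idiomatic way to guarantee the key always gets returned, even if the code inside blows up (a sketch, not from the original article):

```python
from threading import Lock

lock = Lock()
counter = {'value': 0}

def safe_increment():
    # 'with lock:' acquires on entry and releases on exit,
    # even if an exception is raised inside the block
    with lock:
        counter['value'] += 1

safe_increment()
print(counter['value'])  # 1
```

With `with lock:`, you can never forget a matching `lock.release()`, so a crashed thread can't walk off with the key and deadlock everyone else.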
🧠 Leveling Up: Threading vs. Async Await - Why Learn Both?
First off, pat yourself on the back! 👏 If you looked at the threading example and thought, "Hang on, this feels familiar... isn't this what async/await does?", then you're already thinking like a senior engineer. That's an incredible connection to make!
You're absolutely right. Threading and Async both solve the same core problem: making I/O-bound code faster. They are two different tools from the toolbox, both designed to stop your program from sitting idle, twiddling its thumbs, while it waits for slow external services (databases, APIs, file systems). ⏳➡️⚡
But they are not the same thing. Let's break down the "why".
🏗️ **The Great Illusion: How Do They Work?**
- Threading: The "Multiple Chefs" Approach 🧑🍳 Imagine threading is like hiring multiple chefs for a kitchen.
- The OS is the kitchen manager. It can force a chef to stop chopping veggies immediately and put another chef on the stove. This is called preemptive multitasking.
- It's powerful but heavy. Each new chef (thread) needs their own resources and space, and switching between them (context switching) takes time and effort.
- There's always communication overhead (like our Lock) to make sure two chefs don't burn the same sauce.
- Async: The "Single Master Chef" Approach 🧙♂️ Now, imagine async is a single, incredibly efficient master chef.
- This chef starts a task (e.g., putting water on to boil). Instead of waiting and staring at the pot, they immediately check the recipe book to see what else they can do (e.g., chop vegetables). 🥕
- They are cooperatively switching tasks. They themselves decide when to pause one task and switch to another. This is called cooperative multitasking.
- It's incredibly lightweight. There's only one chef, so no communication overhead, no context switching cost. But it requires every task to be well-behaved and say "I'm going to wait now, someone else can go."
![Threading vs. Async: multiple OS-managed threads compared to a single cooperatively scheduled thread](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/q2df5ohuhjtrvcet90sp.webp)
This image perfectly shows the difference: Threading uses multiple OS-managed threads (multiple lanes of traffic with a traffic cop), while Async uses a single thread with cooperative scheduling (cars politely yielding to each other on a single lane).
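The master-chef model maps directly onto Python's asyncio. Here's a minimal sketch of the idea (the coroutine names are my own invention, not from the article):

```python
import asyncio

async def boil_water():
    # 'await' is the chef saying "I'm waiting now, someone else can go"
    await asyncio.sleep(0.2)
    return "water boiled"

async def chop_veggies():
    await asyncio.sleep(0.1)
    return "veggies chopped"

async def kitchen():
    # Both coroutines run concurrently on a single thread;
    # gather() returns their results in the order they were passed in
    return await asyncio.gather(boil_water(), chop_veggies())

results = asyncio.run(kitchen())
print(results)  # ['water boiled', 'veggies chopped']
```

Notice there are no locks here: with only one thread and explicit `await` points, the chef can never be interrupted mid-step.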
🤔 So Why Did We Just Spend Time Learning Threading?!
This is the million-dollar question! If Async is so lightweight and efficient, why does threading even exist? Here’s the deal:
🧪 The Legacy Codebase Reality
You hit the nail on the head. The world runs on legacy code. Mountains of enterprise software, scripts, and systems were built before asyncio became mature in Python. These systems use threading and it works. Rewriting them entirely in async would be a massive, expensive, and risky project. Understanding threading is essential for maintaining and updating a huge portion of the software that powers the world today.
🛠️ The Right Tool for the Job
While async is fantastic for I/O-bound tasks (network calls, file ops), the threading mindset also leads naturally to true parallelism for CPU-bound tasks across multiple cores... if you can sidestep the GIL. How? By using multiprocessing (which creates separate processes) or by offloading heavy number-crunching to libraries like numpy that release the GIL. Async can't do that.
🧠 Conceptual Foundation
Threading teaches you the fundamental problems of concurrency. 🎓
Race Conditions
Deadlocks
Locks & Synchronization
Shared State
These are universal concepts. If you understand the pain of managing a Lock() in threading, you deeply understand why async was invented to avoid that pain. It makes you a better async programmer because you know what problems it's solving under the hood.
⚔️ It's Not a Total Replacement
There are scenarios where threading is still a simpler or more appropriate solution than a full async framework, especially for small scripts or when you need to run background tasks in a framework like Django that isn't built on async.
So, you learned threading not just to use it, but to understand the core problem. You now know why modern tools like Async were created. It's like learning to drive a manual transmission before an automatic—it gives you a deeper understanding of how the engine works, making you a better driver overall.
Think of it this way: Threading is the theory. Async is one of the modern applications of that theory. You can't truly master one without understanding the other. 🚀
🎉 Conclusion & What's Next?
And... that's a wrap! 🎬
A massive thank you for joining me on this deep dive into the world of Python threading. 🙏 Your time and curiosity are what make writing this so rewarding. I truly hope you're walking away with that satisfying "Aha! 💡" feeling and some new tools for your coding toolkit.
🚀 Your Coding Superpower
You've just leveled up. You now understand a concept that trips up many developers. The next time you see a Lock() or hear about a "race condition," you can nod knowingly instead of sweating nervously. That's a big win!
🔜 What's on the Horizon?
We've mastered the concurrency puzzle with threading, but the adventure doesn't stop here! The world of parallel execution in Python has more exciting chapters:
⚡ Multiprocessing: The true key to unlocking the full power of your multi-core CPU for CPU-bound tasks (like number crunching or image processing), completely bypassing the GIL!
🤖 The Async Await Universe: A deeper look into the modern, lightweight alternative to threading for I/O-bound chaos.
🧠 Advanced Threading Patterns: Exploring thread pools, queues, and other sophisticated ways to manage your threads like a pro.
Each of these is a fascinating topic in itself, and I can't wait to explore them with you in future articles!
Keep the Connection Alive!
As always, I'll be here, constantly breaking down complex tech topics into bite-sized, fun-to-learn pieces. Stay tuned for more articles daily!
Until the next one, happy coding! May your bugs be minor and your coffee be strong. ☕
Cheers,
Meeth
Backend Engineer