I read a few posts that made me question whether I understood these concepts or could explain them clearly. There were lots of diagrams, and at least for me, too many words. That's not their problem, it's mine. That's a little bit embarrassing as I've written code that applies all three.
Here's a stab at simplifying it. Feel free to offer corrections, although the intent is to make these concepts easier to understand so that someone can learn about them in more detail, not to include all of those details up front.
While concurrency, parallelism, and multithreading are not the same thing, I think the biggest confusion is mixing those three related concepts with asynchronous execution.
Another confusion is that in the context of .NET code the words "concurrent" and "parallel" differ from their use elsewhere. That's unfortunate. I've added some clarifications at the end of this post.
Different threads are doing different things at the same time. A simple example is a web application which may start processing one request on one thread and then, if another request comes in while it's still processing the first one, start processing the next one on another thread.
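As a rough illustration, here is a minimal sketch that fakes two "requests" with two explicitly created threads. Real web servers normally hand requests to a thread pool rather than creating threads like this, and `RequestDemo`, `HandleRequest`, and the request names are hypothetical names made up for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class RequestDemo
{
    // Simulates handling one request and records which thread did the work.
    private static void HandleRequest(string name, ConcurrentDictionary<string, int> threadIds)
    {
        threadIds[name] = Environment.CurrentManagedThreadId;
        Thread.Sleep(100); // stand-in for real request processing
    }

    // Starts a second "request" while the first one is still in progress.
    public static ConcurrentDictionary<string, int> HandleTwoRequests()
    {
        var threadIds = new ConcurrentDictionary<string, int>();
        var first = new Thread(() => HandleRequest("request-1", threadIds));
        var second = new Thread(() => HandleRequest("request-2", threadIds));

        first.Start();
        second.Start(); // begins before the first has finished
        first.Join();
        second.Join();
        return threadIds;
    }
}
```

Calling `RequestDemo.HandleTwoRequests()` returns the thread ID that handled each request, and the two IDs differ because each request ran on its own thread.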
I need to perform 100 of some task. If I divide up that work between multiple threads that work simultaneously, I'll finish faster.
```csharp
void ProcessThingsInParallel(IEnumerable<ThingINeedToProcess> things)
{
    Parallel.ForEach(things, thing => thing.Process());
}
```
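To try this out end to end, something like the following should work. It is only a sketch: the `ThingINeedToProcess` class here is a hypothetical stand-in whose `Process` method just sleeps briefly, and `ParallelDemo` is a made-up wrapper name.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the type being processed.
public class ThingINeedToProcess
{
    public bool Processed { get; private set; }

    public void Process()
    {
        Thread.Sleep(10); // stand-in for real work
        Processed = true;
    }
}

public static class ParallelDemo
{
    public static List<ThingINeedToProcess> ProcessHundredThings()
    {
        var things = Enumerable.Range(0, 100)
            .Select(_ => new ThingINeedToProcess())
            .ToList();

        // The library decides how many threads to use and which items each one gets.
        Parallel.ForEach(things, thing => thing.Process());
        return things;
    }
}
```

`Parallel.ForEach` does not return until every item has been processed, so the caller gets back a fully processed list.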
This gets mixed up with the other two, likely because it has something to do with threads. But it's entirely different.
Async describes how individual threads are used.
- If I execute a stored procedure using `SqlCommand.ExecuteNonQuery` and the procedure runs for ten seconds, the thread used to call that method will do nothing for ten seconds except wait.
- If I execute it using `await SqlCommand.ExecuteNonQueryAsync` in an `async` method, that thread can do other things for ten seconds. When the command is finished executing, that thread (or another one - we don't really care) will pick up where the first thread left off.
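The difference between the two bullets can be sketched without a database. In this example `Thread.Sleep` and `Task.Delay` are stand-ins for the blocking and asynchronous SQL calls, the wait is shortened to half a second, and `AsyncDemo` and its method names are hypothetical.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // Blocking version: the calling thread does nothing but wait.
    public static void RunBlocking()
    {
        Thread.Sleep(500); // stand-in for SqlCommand.ExecuteNonQuery
    }

    // Async version: the calling thread is released at the await.
    public static async Task RunAsync()
    {
        await Task.Delay(500); // stand-in for ExecuteNonQueryAsync
    }

    // Starts the async work, then counts how much other work gets done
    // before it completes.
    public static async Task<int> CountWorkDoneWhileWaiting()
    {
        Task pending = RunAsync(); // returns at its first await, without blocking
        int otherWork = 0;
        while (!pending.IsCompleted)
        {
            otherWork++;           // the caller was not blocked, so it can do this
            await Task.Delay(50);
        }
        await pending;             // observe completion (and any exception)
        return otherWork;
    }
}
```

With `RunBlocking` the caller could not have counted anything until the wait was over. With `RunAsync` the counter is greater than zero by the time the simulated command finishes.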
These are not comprehensive definitions or detailed technical descriptions. In my experience, sometimes we need simpler explanations before trying to process the big articles with lots of diagrams. Or maybe it's just me that needs that.
Perhaps when this is polished up I can add a second post with some of the details with which I didn't want to clutter this one.
I really wanted to make this post short and simple. It wasn't meant to be.
In the context of .NET applications, concurrency is almost always associated with execution on simultaneous threads. That's probably not intentional.
Consider `ConcurrentQueue` and other collections in the same namespace. The description is:
Represents a thread-safe first in-first out (FIFO) collection.
If our concurrency was not achieved using multiple threads we would not need a thread-safe collection. We would just use a
Use of the word "concurrent" in the namespace and classes is accurate - the word means "simultaneous, at the same time." But the result is that when working with .NET, concurrency and multithreading have become intertwined.
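To make the thread-safety point concrete, here is a small sketch in which many threads add to one `ConcurrentQueue<int>` at the same time. `QueueDemo` is a hypothetical name, and `Parallel.For` is used only as a convenient way to get simultaneous writers.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class QueueDemo
{
    public static int EnqueueFromManyThreads()
    {
        var queue = new ConcurrentQueue<int>();

        // Several threads may call Enqueue at the same moment.
        Parallel.For(0, 10_000, i => queue.Enqueue(i));

        // No item is lost, so the count matches the number of Enqueue calls.
        return queue.Count;
    }
}
```

A plain `Queue<T>` makes no such guarantee under simultaneous writes, which is why the thread-safe collections exist.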
If "concurrency" means multithreading then it's not related to `async/await`. If it means doing multiple things at once then `async/await` supports concurrency by allowing a single thread to start one process and then do something else instead of waiting for the first process to finish.
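Here is a sketch of that second meaning: two slow operations are started one after the other and then awaited together, so their waits overlap without the code creating any threads itself. `FetchAsync` is a hypothetical stand-in for a slow I/O call, simulated with `Task.Delay`.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class OverlapDemo
{
    // Hypothetical stand-in for a slow I/O call.
    private static async Task<string> FetchAsync(string name)
    {
        await Task.Delay(300);
        return name;
    }

    public static async Task<TimeSpan> RunBothAsync()
    {
        var stopwatch = Stopwatch.StartNew();

        Task<string> first = FetchAsync("first");   // started, not awaited yet
        Task<string> second = FetchAsync("second"); // started while the first is pending

        await Task.WhenAll(first, second);
        return stopwatch.Elapsed; // roughly 300 ms rather than 600 ms
    }
}
```

Awaiting each call before starting the next would take about twice as long, because the second wait could not begin until the first had ended.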
Parallelism broadly means achieving concurrency by distributing work across multiple CPUs. So in .NET discussions when we talk about concurrency we mean parallelism.
In the .NET world, when we talk about parallelism we're often referring to a subset, a particular application of parallelism. It's when we have a very specific set of computations to perform and we distribute it across multiple threads.
Why? Look at Microsoft's Task Parallel Library and you'll see that most links referring to "parallel programming" lead to `Parallel.ForEach` or other implementations which start with a set of tasks and distribute them.
This is not to imply that all .NET developers are confused about these concepts. It's just that in documentation or StackOverflow discussions we tend to use the terms differently.
Consider this paragraph from the description of a book entitled Concurrency In .NET, emphasis mine:
Unlock the incredible performance built into your multi-processor machines. Concurrent applications run faster because they spread work across processor cores, performing several tasks at the same time. Modern tools and techniques on the .NET platform, including parallel LINQ, functional programming, asynchronous programming, and the Task Parallel Library, offer powerful alternatives to traditional thread-based concurrency.
Okay, now I know it's not just me. Concurrency is equated with threads executing on different processors, and thread-based concurrency is "traditional." What's more, the Task Parallel Library is called an alternative to thread-based concurrency when it's explicitly a way to implement concurrency using multiple threads. I think I know what he means, but we've blurred some meanings.
Even referring to ".NET" is a bit vague at this point because we used to equate it with software running on multi-CPU Windows computers, but now it runs on all sorts of things.
Now my clarification is longer than the original post and involves five definitions of three terms. I'm really sorry.