Kamil Bugno

Posted on May 4, 2022

Parallel programming in C#

Do you know what parallel programming is, how it differs from asynchronous programming and why it can enhance your application? If not, let's delve into this incredibly interesting topic.

What is parallel programming?

As the name suggests, it means that certain pieces of code can run at the same time. Parallelism is something you already know from everyday life. If you made a coffee or a tea this morning, you probably didn't wait until the water boiled before getting the cup out of the cupboard. Most likely you did several things at once: you poured water into the kettle, turned it on, and while it was boiling you prepared a cup and put in the right amount of coffee or tea. Thanks to this, preparing your morning drink was faster and you saved time.

To translate the above situation into the programming world, we need terms such as CPU and thread. Look at this picture:

As you can see, we have two CPUs and some threads (the black rectangular elements). Two operations happen at the same time: boiling the water and preparing the cup. Note that making a coffee or a tea in parallel assumes that more than one CPU (in practice, more than one CPU core) is available to execute our tasks simultaneously.

It is easy to confuse parallelism with asynchronism because these things seem to be similar, but there is a huge difference between them - let's explore it.

Parallelism vs. asynchronism

Parallelism means that we do two or more things at the same time, and it requires at least two CPU cores. Asynchronism can be achieved with a single CPU core and means that your threads don't actively wait for some external operation to finish. Async programming is also used more frequently for I/O operations such as connecting to a database, accessing a file, or calling an API, whereas parallel programming is more focused on heavy CPU operations such as complex calculations. What is more, asynchronism can be achieved without parallelism: JavaScript runs your code on a single thread, so it relies on asynchronous operations rather than on running pieces of that code in parallel. If you are interested in the details of asynchronous programming, read one of my posts that explains it.
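The difference can be sketched in a few lines. This is only an illustration under stated assumptions: FetchAsync and SumOfSquares are hypothetical stand-ins (they do not come from the rest of this post), with Task.Delay simulating an I/O wait and a plain loop simulating CPU-bound work.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncVsParallelDemo
{
    // I/O-bound stand-in (hypothetical): Task.Delay simulates waiting for a
    // database or an API. No thread is blocked while the delay is pending.
    public static async Task<string> FetchAsync(string name)
    {
        await Task.Delay(100);
        return "fetched " + name;
    }

    // CPU-bound stand-in (hypothetical): the loop itself keeps a core busy.
    public static long SumOfSquares(int n)
    {
        long sum = 0;
        for (var i = 1; i <= n; i++)
        {
            sum += (long)i * i;
        }
        return sum;
    }

    public static async Task Main()
    {
        // Asynchronism: both "requests" are pending at once, and the waiting
        // itself needs no extra CPU.
        string[] responses = await Task.WhenAll(FetchAsync("users"), FetchAsync("orders"));
        Console.WriteLine(string.Join(", ", responses));

        // Parallelism: independent calculations can be spread across cores.
        var inputs = new[] { 1_000, 2_000, 3_000 };
        var results = new long[inputs.Length];
        Parallel.For(0, inputs.Length, i => { results[i] = SumOfSquares(inputs[i]); });
        Console.WriteLine(string.Join(", ", results));
    }
}
```

The first half would behave the same on a single core, while the second half only gets faster when more than one core is available.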

Pros and cons of parallel programming

Parallel programming is not something magical that brings only benefits to our application. Like most things, it has its advantages and disadvantages. Let's take a look at them:
✅ As we mentioned earlier, parallelism can reduce the time of our operations because we are able to execute several actions at the same time.
✅ We can use our resources more efficiently when we write code with parallelism in mind.
❌ Parallel code is usually more complex to write, read, and debug.
❌ Parallelism is strongly tied to the hardware: the way our code executes can differ depending on the underlying machine.

You should already understand the basic concept of parallelism, so let's go straight to how it can be achieved in C#.

Parallel programming in C#

Let's assume that we have several heavy computations that we would like to conduct. In our scenario, I prepared a special method for it:



public static int HeavyComputation(string name)
{
    Console.WriteLine("Start: " + name);
    var timer = new Stopwatch();
    timer.Start();
    var result = 0;
    for (var i = 0; i < 10_000_000; i++)
    {
        var a = ((i + 1_500) / (i + 30)) * (i + 10);
        result += (a % 10) - 120;
    }
    timer.Stop();
    Console.WriteLine("End: "+ name + ' ' + timer.ElapsedMilliseconds);
    return result;
}

The math operations simulate our heavy computation. Without parallel programming, when we have several computations to perform, we run them in sequence:



static void Main()
{
    var timer = new Stopwatch(); 
    timer.Start();
    HeavyComputation("A");
    HeavyComputation("B");
    HeavyComputation("C");
    HeavyComputation("D");
    HeavyComputation("E");
    timer.Stop();
    Console.WriteLine("All: " + timer.ElapsedMilliseconds);
}

The output is as follows:



Start: A
End: A 106
Start: B
End: B 106
Start: C
End: C 102
Start: D
End: D 110
Start: E
End: E 114
All: 575

The example shows that the total time is similar to or greater than the sum of the times of the individual actions: All >= A + B + C + D + E.

Using `Parallel.Invoke` method

There is a way to improve this performance: we can use the Task Parallel Library. Our HeavyComputation method remains unchanged, but we change the way we call it by using Parallel.Invoke:



static void Main()
{
    var timer = new Stopwatch(); 
    timer.Start();
    Parallel.Invoke(
        () => HeavyComputation("A"),
        () => HeavyComputation("B"),
        () => HeavyComputation("C"),
        () => HeavyComputation("D"),
        () => HeavyComputation("E")
    );
    timer.Stop();
    Console.WriteLine("All: " + timer.ElapsedMilliseconds);
}

The output is as follows:



Start: B
Start: A
Start: D
Start: C
Start: E
End: B 103
End: C 109
End: E 111
End: A 112
End: D 112
All: 200

Based on the output, there are several things that I would like to discuss:

time - as you can see, in the above example the total time is less than the sum of the times of the individual actions: All < A + B + C + D + E. There is only one conclusion: our computations ran at the same time. Keep in mind that Parallel.Invoke doesn't guarantee parallel execution because, as mentioned earlier, that depends on other things such as the hardware; it only makes it possible for your code to run simultaneously.
order of operations - Parallel.Invoke doesn't guarantee that the actions execute in the order in which we passed them to the method, which is why our actions ran in a different order.

It is also good to know that there are three more useful methods that can help us with executing our code in parallel:

Parallel.For
Parallel.ForEach
Parallel.ForEachAsync - it was added in .NET 6
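As a minimal sketch of the first two methods (the class and method names here are hypothetical, chosen only for this illustration): Parallel.For writes each result into its own array slot, and Parallel.ForEach collects results in a thread-safe ConcurrentBag<T>, which is then sorted because the iteration order is not guaranteed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelLoopsDemo
{
    // Parallel.For: every iteration writes only to its own slot,
    // so no additional synchronization is needed.
    public static int[] SquaresWithFor(int count)
    {
        var squares = new int[count];
        Parallel.For(0, count, i => { squares[i] = i * i; });
        return squares;
    }

    // Parallel.ForEach: the same idea over any IEnumerable<T>.
    // The bag is thread-safe but unordered, so we sort before returning.
    public static List<int> LengthsWithForEach(IEnumerable<string> words)
    {
        var lengths = new ConcurrentBag<int>();
        Parallel.ForEach(words, word => { lengths.Add(word.Length); });
        return lengths.OrderBy(length => length).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", SquaresWithFor(5)));   // 0,1,4,9,16
        Console.WriteLine(string.Join(",",
            LengthsWithForEach(new[] { "tea", "coffee", "water" }))); // 3,5,6
    }
}
```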

Parallelism with asynchronism

Even though the parallel solution may seem correct, it can be improved even further because it has one drawback: it is executed synchronously. What exactly does that mean? I think that a visual representation can clarify it:

our current solution (with some simplification) looks as follows: the first CPU runs the Main method and executes our computations (A-E) on separate CPUs. Importantly, during the computations the calling thread is tied up until all of them finish.
in the async version, the picture is as follows: during the heavy computation, the first CPU is free and can be used as needed.

How do we write code that combines parallelism and asynchronism? The answer is not very complicated:



static async Task Main()
{
    var timer = new Stopwatch();
    timer.Start();
    await Task.Run(() =>
    {
        Parallel.Invoke(
            () => HeavyComputation("A"),
            () => HeavyComputation("B"),
            () => HeavyComputation("C"),
            () => HeavyComputation("D"),
            () => HeavyComputation("E"));
    });
    timer.Stop();
    Console.WriteLine("All: " + timer.ElapsedMilliseconds);
}

We just wrapped the Parallel.Invoke call inside Task.Run. Thanks to it we can use the await/async keywords and enhance the solution with asynchronous programming. Although everything looks fine now, there is still one thing to focus on: our computation returns some value, but so far, we haven't used it. Let's change that!

Save values from computations

In C# we have special collections designed for concurrent use. One of them is ConcurrentBag<T>, which is thread-safe and stores an unordered collection of objects. Our final solution will look like this:



static async Task Main()
{
    var timer = new Stopwatch();
    timer.Start();

    var myData = new ConcurrentBag<int>();

    await Task.Run(() =>
    {
        Parallel.Invoke(
            () => { myData.Add(HeavyComputation("A")); },
            () => { myData.Add(HeavyComputation("B")); },
            () => { myData.Add(HeavyComputation("C")); },
            () => { myData.Add(HeavyComputation("D")); },
            () => { myData.Add(HeavyComputation("E")); });
    });
    timer.Stop();
    Console.WriteLine("All: " + timer.ElapsedMilliseconds);

    Console.WriteLine(string.Join(",", myData));
}

We simply add new values to our ConcurrentBag collection and display it. Besides ConcurrentBag, there are more thread-safe entities that can store our data:

ConcurrentStack<T>
ConcurrentQueue<T>
ConcurrentDictionary<TKey,TValue>

Keep in mind that concurrent collections carry synchronization overhead, so they are slower than their non-concurrent equivalents when our solution doesn't actually use much parallelism. We have the most to gain from them when our code is indeed heavily concurrent.
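Because ConcurrentBag<T> is unordered, we cannot tell which value came from which operation. One possible way around this, sketched below, is to key the results by name in a ConcurrentDictionary<TKey,TValue>. The Compute method is a hypothetical, cheap stand-in for HeavyComputation, and RunAll is a helper name made up for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class NamedResultsDemo
{
    // Hypothetical stand-in for HeavyComputation: cheap and deterministic.
    public static int Compute(string name) => name.Length * 10;

    public static ConcurrentDictionary<string, int> RunAll(params string[] names)
    {
        var results = new ConcurrentDictionary<string, int>();

        // Build one action per name; each action stores its result under its own key.
        Action[] actions = names
            .Select(name => (Action)(() => { results[name] = Compute(name); }))
            .ToArray();

        Parallel.Invoke(actions);
        return results;
    }

    public static void Main()
    {
        var results = RunAll("A", "BB", "CCC");
        foreach (var name in results.Keys.OrderBy(key => key))
        {
            Console.WriteLine(name + " = " + results[name]);
        }
    }
}
```

The dictionary's indexer is safe to call from several threads at once, so no explicit locking is needed here.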

Exception handling

Let's modify our code to throw an exception at some point:



try
{
    await Task.Run(() =>
    {
        Parallel.Invoke(
            () => { myData.Add(HeavyComputation("A")); },
            () =>
            {
                Console.WriteLine("Starting B");
                throw new Exception("B");
                myData.Add(HeavyComputation("B"));
            },
            () => { myData.Add(HeavyComputation("C")); },
            () => { myData.Add(HeavyComputation("D")); },
            () => { myData.Add(HeavyComputation("E")); });
    });
}
catch (Exception e)
{
    Console.WriteLine("Exception: " + e);
}

I added a try/catch block and raised an exception in operation B; everything else remains unchanged. There are two things to keep in mind when it comes to exceptions:
a) When one parallel action raises an exception, it doesn't mean that the other operations will be stopped. They continue executing despite the exception. The output of the above code is as follows:



Start: C
Start: A
Starting B
Start: E
Start: D
End: C 125
...

It shows that after the failure of operation B, the system still executed operations E and D. Taking this into consideration, you can assume that parallel operations are isolated from each other as far as exceptions are concerned.
b) When we catch the exception, you can see that we are dealing with AggregateException:

The AggregateException contains a list of all exceptions thrown by our parallel operations (in our case only one) and their details (in our case "B"). Importantly, thanks to AggregateException no exception is lost, even when several are raised at the same time.
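To see this in practice, here is a small sketch (the class and method names are hypothetical) in which two of three actions fail. Both exceptions end up in the InnerExceptions collection of the caught AggregateException.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AggregateExceptionDemo
{
    public static List<string> RunAndCollectMessages()
    {
        var messages = new List<string>();
        try
        {
            Parallel.Invoke(
                () => { throw new InvalidOperationException("first"); },
                () => { /* this action succeeds */ },
                () => { throw new ArgumentException("second"); });
        }
        catch (AggregateException e)
        {
            // Every exception thrown by the actions is available here.
            foreach (var inner in e.InnerExceptions)
            {
                messages.Add(inner.GetType().Name + ": " + inner.Message);
            }
        }

        // The order of the inner exceptions is not something to rely on, so sort.
        messages.Sort(StringComparer.Ordinal);
        return messages;
    }

    public static void Main()
    {
        foreach (var message in RunAndCollectMessages())
        {
            Console.WriteLine(message);
        }
    }
}
```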

There are two interesting methods for handling AggregateException:

Flatten() - sometimes a parallel operation needs to run some nested parallel operations. As the name suggests, Flatten converts all nested exceptions into a single-level list of exceptions.
Handle() - it lets you invoke a handler on each of the contained exceptions.

Let's see a full example that uses both Flatten() and Handle():



try
{
    await Task.Run(() =>
    {
        Parallel.Invoke(
            () => { myData.Add(HeavyComputation("A")); },
            () =>
            {
                //some nested operations
                Parallel.Invoke(
                        () =>
                        {
                            Console.WriteLine("Starting B1");
                            throw new InvalidTimeZoneException("B1");
                            myData.Add(HeavyComputation("B1"));
                        },
                        () => { myData.Add(HeavyComputation("B2")); },
                        () =>
                        {

                            Console.WriteLine("Starting B3");
                            throw new ArgumentNullException("B3");
                            myData.Add(HeavyComputation("B3"));
                        });
            },
            () => {
                Console.WriteLine("Starting C");
                throw new BadImageFormatException("C");
                myData.Add(HeavyComputation("C"));
            },
            () => { myData.Add(HeavyComputation("D")); },
            () => { myData.Add(HeavyComputation("E")); });
    });
}
catch (AggregateException e)
{
    e.Flatten().Handle(myException =>
    {
        var result = myException switch
        {
            InvalidTimeZoneException _ => HandleInvalidTimeZoneException(),
            ArgumentNullException _ => HandleArgumentNullException(),
            BadImageFormatException _ => HandleBadImageFormatException(),
            _ => false
        };
        return result;
    });
}

In the code above, we throw three types of exceptions: BadImageFormatException, ArgumentNullException, and InvalidTimeZoneException. Some of the parallel operations are nested, so in the catch block we use Flatten(). Using Handle() lets us deal with each exception independently. There are three helper methods:



private static bool HandleInvalidTimeZoneException()
{
    Console.WriteLine("HandleInvalidTimeZoneException");
    //...
    return true;
}

private static bool HandleArgumentNullException()
{
    Console.WriteLine("HandleArgumentNullException");
    //...
    return true;
}

private static bool HandleBadImageFormatException()
{
    Console.WriteLine("HandleBadImageFormatException");
    //...
    return true;
}

Each of them handles one specific exception type and returns true. For the Handle method, true means that the exception was handled and should not be rethrown. This is why the switch expression has the _ => false arm: any exception that is not BadImageFormatException, ArgumentNullException, or InvalidTimeZoneException is rethrown, wrapped in a new AggregateException.
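The rethrow behavior is easy to check in isolation. The sketch below (with hypothetical names) builds an AggregateException by hand, handles only ArgumentNullException, and reports what Handle() rethrows.

```csharp
using System;
using System.Linq;

public static class HandleDemo
{
    public static string[] UnhandledTypeNames()
    {
        var source = new AggregateException(
            new ArgumentNullException("value"),
            new InvalidOperationException("not handled"));

        try
        {
            // true = handled, false = rethrow in a new AggregateException.
            source.Handle(ex => ex is ArgumentNullException);
            return Array.Empty<string>();
        }
        catch (AggregateException rest)
        {
            // Only the exceptions for which the predicate returned false arrive here.
            return rest.InnerExceptions.Select(ex => ex.GetType().Name).ToArray();
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", UnhandledTypeNames())); // InvalidOperationException
    }
}
```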

So far, our example was not complex. In real life scenarios, it may be different, and we will need some synchronization. Let's take a look at it.

Synchronization

Synchronization is a response to a situation when a lot of threads access the same data. Why might this situation be problematic? Let's assume that we want to perform some computation in parallel and store the sum of the results in a variable:



var finalResult = 0;

await Task.Run(() =>
{
    Parallel.For(0, 20, i => 
    { 
         finalResult += HeavyComputation(i.ToString()); 
    });
});

We used the Parallel.For method, which is a loop whose iterations can be executed at the same time. The first argument is the inclusive starting index (0 in our case), the second is the exclusive end index (20 in our case, so i runs from 0 to 19), and the third is an action that receives the current index (the i variable). Every parallel operation adds its result to the finalResult variable, and here is the issue: we end up with a race condition.

What is a race condition?

A race condition usually occurs when several threads share the same data and want to modify it. Let's see how it can play out in our case, assuming that HeavyComputation always returns 10:

Thread X enters the line finalResult += HeavyComputation(i.ToString());. It reads the finalResult value, which is currently zero.
Thread Y enters the same line, reads the finalResult value, which is still 0, performs the computation, and changes the value to 10.
Thread X finishes its computation and adds ten to the value it read earlier. Because it read finalResult before thread Y modified it, the final value is ten instead of twenty.

Synchronization can deal with the race conditions by, among other things, using locks.

Locks

Locks are exclusive locking constructs: only one thread at a time can be in a section of code protected by the lock. All other threads are blocked until the section is free and the next one can enter. Because of this, the code you protect with a lock should execute quickly to avoid bottlenecks. Let's see how it can be implemented in C#:



var finalResult = 0;
var syncRoot = new object();

await Task.Run(() =>
{
    Parallel.For(0, 20, i =>
    {
        var localResult = HeavyComputation(i.ToString());
        lock (syncRoot)
        {
            // only one thread at a time can be here
            finalResult += localResult;
        }
    });
});

We used the lock keyword, which introduces a section that only one thread can be in at a time. As you can see, the section contains only the addition to finalResult; the computation is done beforehand and, thanks to this, can still run in parallel. As mentioned earlier, the code inside lock shouldn't be time-consuming; otherwise it becomes a bottleneck and we won't really use the power of parallelism.
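For a plain integer sum, a lock is not the only option. As a hedged alternative that is not part of the solution above, the Interlocked class can perform the addition atomically. In this sketch, Compute is a hypothetical stand-in for HeavyComputation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InterlockedSumDemo
{
    // Hypothetical stand-in for HeavyComputation: cheap and deterministic.
    public static int Compute(int i) => i * 2;

    public static int SumInParallel(int count)
    {
        var total = 0;
        Parallel.For(0, count, i =>
        {
            // The computation still runs in parallel...
            var localResult = Compute(i);
            // ...and only the addition is performed as one atomic operation.
            Interlocked.Add(ref total, localResult);
        });
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumInParallel(20)); // 380
    }
}
```

Interlocked only covers simple operations on a single variable; when several values must change together, a lock is still the more suitable tool.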

One of the risks of using lock is a deadlock: a situation in which threads wait for each other indefinitely and the application cannot continue working. To reduce the risk, it is good to have a separate, dedicated syncRoot object for each independent piece of shared state and to minimize the nesting of locks.
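When nesting cannot be avoided, one common technique (an addition to the advice above, not something the earlier code needs) is to always acquire the locks in the same order. The sketch below uses a hypothetical Account class and orders the two locks by Id, so two opposite transfers cannot end up waiting for each other.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical type used only for this illustration.
public sealed class Account
{
    public Account(int id, int balance) { Id = id; Balance = balance; }
    public int Id { get; }
    public int Balance { get; set; }
    public object SyncRoot { get; } = new object();
}

public static class LockOrderingDemo
{
    public static void Transfer(Account from, Account to, int amount)
    {
        // Always lock the account with the lower Id first, regardless of
        // the direction of the transfer.
        var first = from.Id < to.Id ? from : to;
        var second = ReferenceEquals(first, from) ? to : from;

        lock (first.SyncRoot)
        {
            lock (second.SyncRoot)
            {
                from.Balance -= amount;
                to.Balance += amount;
            }
        }
    }

    public static int[] RunOpposingTransfers(int iterations)
    {
        var a = new Account(1, 100);
        var b = new Account(2, 100);

        // Even iterations move money a -> b, odd iterations move it b -> a.
        Parallel.For(0, iterations, i =>
        {
            if (i % 2 == 0) Transfer(a, b, 1);
            else Transfer(b, a, 1);
        });

        return new[] { a.Balance, b.Balance };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", RunOpposingTransfers(1000))); // 100,100
    }
}
```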

It is worth mentioning that under the hood lock uses the Monitor.Enter and Monitor.Exit operations. Let's see what the compiler generates for the above code (shown here as decompiled C# rather than raw Intermediate Language):



//...
int num = HeavyComputation(i.ToString());
object obj = syncRoot;
bool lockTaken = false;
try
{
    Monitor.Enter(obj, ref lockTaken);
    finalResult += num;
}
finally
{
    if (lockTaken)
    {
        Monitor.Exit(obj);
    }
}

When Monitor.Enter completes without an exception, lockTaken is set to true. Then our code runs (in this case finalResult += num;), and in the finally block we exit the monitor. The finally section is extremely important: it protects us from a situation in which an exception leaves a thread that entered the lock block without ever exiting it.

Keep in mind that await is not allowed inside a lock block; otherwise you get the compiler error: cannot await in the body of a lock statement. Why is there this limitation? A lock is owned by the thread that acquired it and must be released by that same thread, but after an await the rest of the method may resume on a different thread.
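If mutual exclusion is needed around code that awaits, one commonly used alternative (my suggestion here, outside the lock-based examples above) is SemaphoreSlim with WaitAsync, which is not tied to a particular thread. In this sketch all names are hypothetical, and Task.Delay stands in for real asynchronous work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncLockDemo
{
    // One permit: at most one caller at a time can be inside the protected section.
    private static readonly SemaphoreSlim Gate = new SemaphoreSlim(1, 1);
    private static int _total;

    public static async Task AddAsync(int value)
    {
        await Gate.WaitAsync();
        try
        {
            // Read, await, then write: without the gate, concurrent callers
            // could overwrite each other's updates.
            var current = _total;
            await Task.Delay(1);
            _total = current + value;
        }
        finally
        {
            // Release in finally, for the same reason lock uses try/finally.
            Gate.Release();
        }
    }

    public static async Task<int> RunAsync(int count)
    {
        _total = 0;
        var tasks = Enumerable.Range(1, count).Select(i => AddAsync(i)).ToArray();
        await Task.WhenAll(tasks);
        return _total;
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync(10)); // 55
    }
}
```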

Summary

In this post I tried to explain the fundamentals of parallel computing: we learned why it is important, what the advantages and disadvantages of parallelism are, and how to write correct concurrent code in C# using the Task Parallel Library and locks. I hope that you now know all the essentials to start your journey into parallelism!