Sıddık AÇIL

Simple .NET Core Benchmark for Whitespace Count

I am open to job offers, feel free to contact me for any vacancies abroad.

In this installment, I will take you through a simple benchmark using BenchmarkDotNet. We are going to implement and compare several functions that take different approaches to counting the whitespace characters in a string, all of them built on the static method char.IsWhiteSpace. Let us get to work :)

Methods

Most of these methods will be familiar to any programmer, except perhaps the unsafe one.

Naive For Loop with Index Accessor


    public int IndexAccessorVersion()
    {
        int cnt = 0;
        for(int i = 0;i < test.Length;i++)
        {
            if (char.IsWhiteSpace(test[i]))
                cnt++;
        }
        return cnt;
    }

Foreach Loop


    public int ForeachVersion()
    {
        int cnt = 0;
        foreach(char c in test)
        {
            if (char.IsWhiteSpace(c))
                cnt++;
        }
        return cnt;
    }

LINQ Expression on String


    public int LINQStringVersion() => test.Count(c => char.IsWhiteSpace(c));

LINQ Expression on Char Array


    public int LINQCharArrVersion() => test.ToCharArray().Where(c => char.IsWhiteSpace(c)).Count();

Unsafe Looping without Unrolling

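An unsafe function is a function marked with the unsafe keyword, which lets managed code work with raw pointers. With the fixed keyword we can pin the string and obtain a pointer to its first character, then use pointer arithmetic to iterate and dereference each address to read the character stored there. Note that the project must allow unsafe code (the AllowUnsafeBlocks option) for this to compile.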


    public unsafe int UnsafeVersion()
    {
        int cnt = 0;
        fixed(char* p = test)
        {
            for (int i = 0; i < test.Length; i+=1) 
            {
                if (char.IsWhiteSpace( *(p+i) )) 
                    cnt++;
            }
        }
        return cnt;
    }
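
Before benchmarking, it can be worth checking that the variants actually agree on the count. The following is a minimal sketch of such a check, not part of the original benchmark: the class name WhitespaceSanityCheck and its method names are made up for illustration, and the methods take the string as a parameter instead of reading the test property. It covers only the four safe variants so that it compiles without enabling unsafe code.

```csharp
using System;
using System.Linq;

// Hypothetical helper class (not from the article) that compares the safe variants.
public static class WhitespaceSanityCheck
{
    // Builds the same kind of input as the benchmark: the sentence doubled ten times.
    public static string BuildInput()
    {
        string s = "Split your lungs with blood and thunder when you see the white whale";
        for (int i = 0; i < 10; i++)
        {
            s += s;
        }
        return s;
    }

    // Naive for loop with the index accessor.
    public static int CountByIndex(string s)
    {
        int cnt = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsWhiteSpace(s[i]))
                cnt++;
        }
        return cnt;
    }

    // Foreach loop over the characters.
    public static int CountByForeach(string s)
    {
        int cnt = 0;
        foreach (char c in s)
        {
            if (char.IsWhiteSpace(c))
                cnt++;
        }
        return cnt;
    }

    // LINQ directly on the string.
    public static int CountByLinqString(string s) => s.Count(c => char.IsWhiteSpace(c));

    // LINQ on an explicit char array copy.
    public static int CountByLinqCharArray(string s) =>
        s.ToCharArray().Where(c => char.IsWhiteSpace(c)).Count();

    // True when every variant returns the same count as the index-based loop.
    public static bool AllAgree(string s)
    {
        int expected = CountByIndex(s);
        return CountByForeach(s) == expected
            && CountByLinqString(s) == expected
            && CountByLinqCharArray(s) == expected;
    }

    public static void Main()
    {
        string input = BuildInput();
        Console.WriteLine($"Length: {input.Length}");
        Console.WriteLine($"Whitespace count: {CountByIndex(input)}");
        Console.WriteLine($"All variants agree: {AllAgree(input)}");
    }
}
```

If unsafe code is enabled in the project, the pointer-based version can be added to AllAgree in the same way.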

Benchmark

Setup

We need to do 3 things:

  1. Create a simple class that holds a string and contains the methods above. These methods should be marked with the [Benchmark] attribute.

    
    [MemoryDiagnoser] // Report Allocations and GC Generations
    public class WhiteSpaceCounter
    {
        private string test {get; set;}
        [Benchmark]
        public int LINQCharArrVersion() => test.ToCharArray().Where(c => char.IsWhiteSpace(c)).Count();
        [Benchmark]
        public int ForeachVersion()
        {
            int cnt = 0;
            foreach(char c in test)
            {
                if (char.IsWhiteSpace(c))
                    cnt++;
            }
            return cnt;
        }
    
    }
    
    
  2. This string should be initialized in the constructor.

    
    public WhiteSpaceCounter()
    {
        test = "Split your lungs with blood and thunder when you see the white whale";
        // Double the string ten times, giving 1024 copies of the sentence.
        for (int i = 0; i < 10; i++)
        {
            test += test;
        }
    }
    
    
  3. BenchmarkRunner's static Run method should be invoked with our class as its generic type argument.

    
    BenchmarkRunner.Run<WhiteSpaceCounter>();
    
    

Results

Switch the build configuration to Release and run the project.

[Figure: Benchmark Results]

As you can see, the LINQ version on the string is the slowest. We can speed it up with an explicit conversion to a char array, at the cost of more memory. The naive loop and the foreach version are both faster than LINQ. Finally, the unsafe version, as you might expect, is the fastest.
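
The extra memory used by the char-array variant comes from ToCharArray copying the whole string into a new array. As a rough illustration, separate from the MemoryDiagnoser numbers above, the sketch below measures that copy with GC.GetAllocatedBytesForCurrentThread, which is available on recent .NET versions. The class name CharArrayAllocationSketch and its method name are made up for this example.

```csharp
using System;

// Hypothetical helper (not from the article) that measures the cost of ToCharArray.
public static class CharArrayAllocationSketch
{
    // Returns the number of bytes allocated on the current thread
    // while copying the string into a new char array.
    public static long MeasureToCharArrayBytes(string s)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        char[] copy = s.ToCharArray();
        long after = GC.GetAllocatedBytesForCurrentThread();
        GC.KeepAlive(copy); // keep the array reachable until the measurement is done
        return after - before;
    }

    public static void Main()
    {
        // Same kind of input as the benchmark: the sentence doubled ten times.
        string test = "Split your lungs with blood and thunder when you see the white whale";
        for (int i = 0; i < 10; i++)
        {
            test += test;
        }

        long bytes = MeasureToCharArrayBytes(test);
        Console.WriteLine($"String length: {test.Length} chars");
        Console.WriteLine($"Allocated by ToCharArray: {bytes} bytes");
    }
}
```

Since a char is two bytes, the copy should cost at least twice the string length in bytes, plus a small array header. The exact figure depends on the runtime.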

Related SO Post(s)

Stack Overflow: "What is the fastest way to iterate through individual characters in a string in C#?"



This is it for now. Any corrections are welcome.
