DEV Community

Discussion on: The newly announced future of .NET - unifying all the things

Collapse
 
mindplay profile image
Rasmus Schultz

Since you mention ImageSharp, take a look at this article:

devblogs.microsoft.com/dotnet/net-...

There is a radical performance difference between ImageSharp and something like SkiaSharp, which outsources to C++.

People outsource this kind of work to C++ for good reasons - and languages like C# offer run-time interop with C and C++ binaries for some of the same reasons.

My main point isn't so much whether bytecode VM languages need to outsource - more the fact that they do. Languages like C#, JavaScript, PHP, Ruby, Python and so on, all have a bunch of run-time APIs that were written in C or C++, most of it inaccessible to the developers who code in those languages.

C# isn't much different in that regard. While, yes, the C# compiler itself was written in C#, the CLR and all the low-level types etc. are C code, mostly inaccessible to developers who work in C#.

You could argue that CLR isn't part of the C# language - but C# isn't really complete without the ability to run it, and I'd argue the same for any language that isn't fully bootstrapped and self-contained.

From what I've seen, VM-based languages (with JIT and garbage-collection and many other performance factors) aren't really suitable for very low-level stuff, which always gets outsourced.

Something common like loading/saving/resizing images is an excellent example of something that practically always gets outsourced - and whenever anybody shows up claiming they've built a "fast" native alternative, these typically turn out to be anywhere from 2 to 10 times slower, usually with much higher memory footprint.

From my point of view, languages like C#, JavaScript and PHP are all broadly in the same performance category: they're fast enough. But not fast enough to stand alone without outsourcing some of the heavy lifting to another language.

And yeah, maybe that's just the cost of high level languages. 🤷‍♂️

Thread Thread
 
turnerj profile image
James Turner • Edited

That article you've linked to is an interesting read and you're right about the CLR, just had a look at the CoreCLR project on GitHub and it is 2/3rds C# and 1/3rd C++ (seems to be the JIT and GC).

With the article and the ImageSharp comparison, I am curious to how much faster it would be now 2 years on with newer versions of .NET Core (there have been significant improvements) as well as any algorithmic updates that may have occurred - might see if I can run the test suite on my laptop. (EDIT: See bottom of comment for test results)

That C# isn't (in various tests and applications) as fast as C++ doesn't mean it always is or has to be that way. Like, it isn't that C++ code is executed on the processor, it is machine code. If C# and C++ generated the same machine code, arguably they should be identical in performance.

C# usually can't generate the same machine code due in part to overhead in protecting us from ourselves as well as things like GC etc. C# though does have the ability to work with pointers and operate in "unsafe" manners which may very well generate the exact same code.

All that said, I do expect though that it is naturally easier to write fast code in C++ than C# for that exact reason - you are always working with pointers and the very lowest level of value manipulation. The average C# developer likely would never use the various unsafe methods in C# to get close to the same level of access.

Just an aside, it seems like the tables of data and the graphs displayed in that article you linked to don't actually match each other. 🤷‍♂️


EDIT: I did run the tests from the article you linked me as they had it all configured up on GitHub. Short answer: Skia is still faster but the results are interesting.

Note 1: I disabled the tests of the other image libraries as I was only curious about ImageSharp.
Note 2: If you want to replicate it on your machine, you will need to change the Nuget package referenced for ImageSharp as it isn't available. I am using the latest beta version instead. This means one line needs to be altered in the tests as a property doesn't exist - you'll see if you compile it.

BenchmarkDotNet=v0.11.5, OS=Windows 10.0.17134.706 (1803/April2018Update/Redstone4)
Intel Core i7-6700HQ CPU 2.60GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
Frequency=2531249 Hz, Resolution=395.0619 ns, Timer=TSC
.NET Core SDK=2.2.101
  [Host]            : .NET Core 2.2.0 (CoreCLR 4.6.27110.04, CoreFX 4.6.27110.04), 64bit RyuJIT
  .Net Core 2.2 CLI : .NET Core 2.2.0 (CoreCLR 4.6.27110.04, CoreFX 4.6.27110.04), 64bit RyuJIT

Job=.Net Core 2.2 CLI  Jit=RyuJit  Platform=X64  
Toolchain=.NET Core 2.2  IterationCount=5  WarmupCount=5  

|                                Method |     Mean |      Error |    StdDev | Ratio | RatioSD |     Gen 0 | Gen 1 | Gen 2 |  Allocated |
|-------------------------------------- |---------:|-----------:|----------:|------:|--------:|----------:|------:|------:|-----------:|
|   'System.Drawing Load, Resize, Save' | 594.9 ms | 113.475 ms | 29.469 ms |  1.00 |    0.00 |         - |     - |     - |   79.96 KB |
|       'ImageSharp Load, Resize, Save' | 318.8 ms |  10.496 ms |  2.726 ms |  0.54 |    0.02 |         - |     - |     - |  1337.7 KB |
| 'SkiaSharp Canvas Load, Resize, Save' | 273.5 ms |  10.264 ms |  2.665 ms |  0.46 |    0.02 | 1000.0000 |     - |     - | 4001.79 KB |
| 'SkiaSharp Bitmap Load, Resize, Save' | 269.1 ms |   6.619 ms |  1.719 ms |  0.45 |    0.02 | 1000.0000 |     - |     - | 3995.29 KB |

So SkiaSharp actually uses 3x the memory for the ~26% performance improvement.

Thread Thread
 
jimbobsquarepants profile image
James Jackson-South

The performance differences between the two libraries boil down to the performance of our jpeg decoder. Until .NET Core 3.0 the equivalent hardware intrinsics APIs simply haven't been available to C# to allow equivalent performance but that is all about to change.

devblogs.microsoft.com/dotnet/hard...

The ImageSharp decoder currently has to perform its IDCT (Inverse Discrete Cosine Transforms) operations using the generic Vector<T> struct and co from System.Numeric.Vectors, we have to move into single precision floating point in order to do so and then alter our results to make them less correct and more inline with the equivalent integral hardware intrinsics driven operations found in libjpeg-turbo (This is what Skia and others use to decode jpegs). This, of course takes more time to do and we want to do better.

With .NET Core 3.0 and beyond we will be finally be able to use the very same approach other libraries use and achieve equivalent performance with the added bonus of a less cryptic API and easier installation story.

Incidentally our Resize algorithms not only offer equivalent performance to Skia but also do a better job, yielding higher quality output and better handling edge cases that trip up other graphics APIs.

It's perfectly possible to write very high performance code with C# - Many great APIs already exist to help do so Span<T>, Memory<T> etc and there is a great focus at Microsoft currently to further improve the performance story.

Thread Thread
 
turnerj profile image
James Turner

I'm writing an article about maximising performance in .NET which starts off with some simple things but moves onto things like using Span<T>, unsafe operations or even the dense matrix you helped explain to me on Twitter.

When I saw that blog post about hardware intrinsics, that got me even more excited to share the big performance gains that can be had.

There definitely hasn't been a better time to be working with .NET!