DEV Community

Joni 【ジョニー】

Posted on • Originally published at Medium on

Evaluating “ReadLine using System.IO.Pipelines” Performance in C# — Part 2

Read a string line by line using the System.IO.Pipelines API in C#

In part 1 of this series, I concluded:

In terms of speed, it is surprisingly slower than the ordinary ReadLine version given the string length ≤ 80 (perhaps I am doing it wrong? Let me know! I am still learning!). It is starting to shine, getting faster and faster if the string length ≥ 90. (270% 🚀 faster for string length = 1000).

I decided to take a closer look and added more patterns to the benchmarks. Instead of blindly using SequenceReader, the new code checks the ReadOnlySequence.IsSingleSegment property: when the buffer is a single segment it takes the fast ReadOnlySpan path, and otherwise it falls back to the slower SequenceReader.

// _stream and NewLine are members of the benchmark class and are not shown here.
public async Task<string> ReadLineUsingPipelineVer2Async()
{
    var reader = PipeReader.Create(_stream, new StreamPipeReaderOptions(leaveOpen: true));
    string str;
    while (true)
    {
        ReadResult result = await reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;
        // ProcessLine slices off every complete line, so buffer now starts at the unread tail.
        str = ProcessLine(ref buffer);
        // Consumed up to buffer.Start, examined up to buffer.End.
        reader.AdvanceTo(buffer.Start, buffer.End);
        if (result.IsCompleted) break;
    }
    await reader.CompleteAsync();
    return str;
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static string ProcessLine(ref ReadOnlySequence<byte> buffer)
{
    string str = null;
    if (buffer.IsSingleSegment)
    {
        // Fast path: scan the single span directly.
        var span = buffer.FirstSpan;
        int consumed;
        while (span.Length > 0)
        {
            var newLine = span.IndexOf(NewLine);
            if (newLine == -1) break; // no complete line left in this span
            var line = span.Slice(0, newLine);
            str = Encoding.UTF8.GetString(line);
            // simulate string processing (assumes every line has at least 5 characters)
            str = str.AsSpan().Slice(0, 5).ToString();
            consumed = line.Length + NewLine.Length;
            span = span.Slice(consumed);
            buffer = buffer.Slice(consumed);
        }
    }
    else
    {
        // Slow path: the data spans several segments, so let SequenceReader stitch them together.
        var sequenceReader = new SequenceReader<byte>(buffer);
        while (!sequenceReader.End)
        {
            while (sequenceReader.TryReadTo(out var line, NewLine))
            {
                str = Encoding.UTF8.GetString(line);
                // simulate string processing (assumes every line has at least 5 characters)
                str = str.AsSpan().Slice(0, 5).ToString();
            }
            // Keep only the unread tail, then move the reader to the end to leave the outer loop.
            buffer = buffer.Slice(sequenceReader.Position);
            sequenceReader.Advance(buffer.Length);
        }
    }
    return str;
}

The idea is to get a Span once and then pull as many lines as possible out of it before moving on to the next segment of the sequence. Thanks to the generous Reddit user u/scalablecory (apparently a member of the .NET team?!) for pointing this out.
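The code above only takes the span path when the whole buffer is a single segment. As a rough sketch of what "one span per segment" could look like for multi-segment buffers, the example below walks the sequence segment by segment. This is not the benchmarked code: `Chunk`, `SegmentLineReader` and `ReadLines` are hypothetical names, it assumes a single `'\n'` byte as the line ending (so a delimiter never straddles two segments), and the carry-over buffer adds a copy that has not been measured.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, used only to hand-build a multi-segment sequence for the demo.
public sealed class Chunk : ReadOnlySequenceSegment<byte>
{
    public Chunk(string text) => Memory = Encoding.UTF8.GetBytes(text);

    public Chunk Append(string text)
    {
        var next = new Chunk(text) { RunningIndex = RunningIndex + Memory.Length };
        Next = next;
        return next;
    }
}

public static class SegmentLineReader
{
    // Assumption: lines end with a single '\n' byte.
    private const byte LineFeed = (byte)'\n';

    public static List<string> ReadLines(ReadOnlySequence<byte> buffer, out SequencePosition consumed)
    {
        var lines = new List<string>();
        // Holds the start of a line that began in an earlier segment.
        var carry = new ArrayBufferWriter<byte>();

        foreach (ReadOnlyMemory<byte> segment in buffer)
        {
            ReadOnlySpan<byte> span = segment.Span; // fetch the span once per segment
            int index;
            while ((index = span.IndexOf(LineFeed)) != -1)
            {
                if (carry.WrittenCount > 0)
                {
                    // Finish the line that started in a previous segment.
                    carry.Write(span.Slice(0, index));
                    lines.Add(Encoding.UTF8.GetString(carry.WrittenSpan));
                    carry.Clear();
                }
                else
                {
                    lines.Add(Encoding.UTF8.GetString(span.Slice(0, index)));
                }
                span = span.Slice(index + 1);
            }
            // Keep the unfinished tail for the next segment.
            if (!span.IsEmpty) carry.Write(span);
        }

        // Everything except the unfinished tail has been turned into lines.
        consumed = buffer.GetPosition(buffer.Length - carry.WrittenCount);
        return lines;
    }
}
```

For two segments containing "ab\ncd" and "ef\ngh", `ReadLines` returns "ab" and "cdef", and `consumed` points just before "gh", which stays in the buffer for the next read.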

Here is the result:

As you can see, it performs well in every test case, rendering my previous conclusion obsolete!!!

And here is the gist version:

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.20190
Intel Core i5-9400F CPU 2.90GHz (Coffee Lake), 1 CPU, 6 logical and 6 physical cores
.NET Core SDK=5.0.100-preview.4.20258.7
  [Host]     : .NET Core 5.0.0 (CoreCLR 5.0.20.25106, CoreFX 5.0.20.25106), X64 RyuJIT
  DefaultJob : .NET Core 5.0.0 (CoreCLR 5.0.20.25106, CoreFX 5.0.20.25106), X64 RyuJIT

| Method | LineNumber | LineCharMultiplier | Mean | Error | StdDev | Ratio | Code Size | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ReadLineUsingStringReaderAsync | 20 | 1 | 1.617 μs | 0.0083 μs | 0.0073 μs | 1.00 | 0.36 KB | 1.3428 | 0.0153 | - | 6.17 KB |
| ReadLineUsingPipelineAsync | 20 | 1 | 3.285 μs | 0.0191 μs | 0.0169 μs | 2.03 | 0.4 KB | 0.4196 | - | - | 1.94 KB |
| ReadLineUsingPipelineVer2Async | 20 | 1 | 1.565 μs | 0.0036 μs | 0.0034 μs | 0.97 | 0.4 KB | 0.4215 | - | - | 1.94 KB |
| ReadLineUsingStringReaderAsync | 20 | 2 | 1.834 μs | 0.0028 μs | 0.0025 μs | 1.00 | 0.36 KB | 1.4095 | 0.0172 | - | 6.48 KB |
| ReadLineUsingPipelineAsync | 20 | 2 | 3.389 μs | 0.0061 μs | 0.0051 μs | 1.85 | 0.4 KB | 0.4883 | - | - | 2.25 KB |
| ReadLineUsingPipelineVer2Async | 20 | 2 | 1.644 μs | 0.0027 μs | 0.0021 μs | 0.90 | 0.4 KB | 0.4883 | - | - | 2.25 KB |
| ReadLineUsingStringReaderAsync | 20 | 8 | 3.467 μs | 0.0099 μs | 0.0077 μs | 1.00 | 0.36 KB | 1.9875 | 0.0229 | - | 9.13 KB |
| ReadLineUsingPipelineAsync | 20 | 8 | 3.772 μs | 0.0168 μs | 0.0157 μs | 1.09 | 0.4 KB | 0.9995 | - | - | 4.59 KB |
| ReadLineUsingPipelineVer2Async | 20 | 8 | 1.902 μs | 0.0035 μs | 0.0032 μs | 0.55 | 0.4 KB | 0.9995 | - | - | 4.59 KB |
| ReadLineUsingStringReaderAsync | 20 | 1000 | 292.398 μs | 2.1848 μs | 2.0437 μs | 1.00 | 0.36 KB | 211.9141 | 0.4883 | - | 950.83 KB |
| ReadLineUsingPipelineAsync | 20 | 1000 | 86.472 μs | 0.1001 μs | 0.0781 μs | 0.30 | 0.4 KB | 85.8154 | 5.7373 | - | 395.79 KB |
| ReadLineUsingPipelineVer2Async | 20 | 1000 | 87.354 μs | 0.3165 μs | 0.2806 μs | 0.30 | 0.4 KB | 85.8154 | 0.1221 | - | 395.79 KB |

Notice that there are fewer test cases this time; I dropped some of the LineCharMultiplier variations, as I don't think we need them.

You can find the source code in my GitHub repository.

Conclusion

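  • Both Pipelines versions use less memory (for example, 1.94 KB vs. 6.17 KB allocated at LineCharMultiplier = 1).
  • In terms of speed, the Ver2 version runs at roughly 1.03× to 3.3× 🚀🚀🚀 the speed of the StringReader version (Ratio 0.97 down to 0.30), depending on the string length. The original Pipelines version only pulls ahead at LineCharMultiplier = 1000.
  • Less GC pressure (a good thing) for the Pipelines versions: fewer Gen 0 collections throughout and, for Ver2, fewer Gen 1 collections as well.
  • The Pipelines version takes a lot more code to write!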
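To see where the speed range comes from, the Mean column for each LineCharMultiplier can be divided out by hand: the StringReader mean over the Ver2 mean gives the speedup factor, which is the reciprocal of the Ratio column. The 333% figure corresponds to 1/0.30 with the rounded ratio; the unrounded means give a slightly higher value.

```latex
\begin{aligned}
\text{speedup} &= \frac{\text{Mean}_{\text{StringReader}}}{\text{Mean}_{\text{Ver2}}} \\
\text{LineCharMultiplier} = 1:&\quad \frac{1.617}{1.565} \approx 1.03 \\
\text{LineCharMultiplier} = 2:&\quad \frac{1.834}{1.644} \approx 1.12 \\
\text{LineCharMultiplier} = 8:&\quad \frac{3.467}{1.902} \approx 1.82 \\
\text{LineCharMultiplier} = 1000:&\quad \frac{292.398}{87.354} \approx 3.35
\end{aligned}
```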

DISCLAIMER: Your mileage may vary. As with all performance work, each of the scenarios chosen for your application should be measured, measured and measured. There is no silver bullet.
