Evaluating “ReadLine using System.IO.Pipelines” Performance in C# — Part 2
Read string line by line using System.IO.Pipelines API in C#
In part 1 of this series, I concluded:
In terms of speed, it is surprisingly slower than the ordinary ReadLine version given the string length ≤ 80 (perhaps I am doing it wrong? Let me know! I am still learning!). It is starting to shine, getting faster and faster if the string length ≥ 90. (270% 🚀 faster for string length = 1000).
I decided to take a closer look and added more variants to the benchmarks. Instead of blindly using SequenceReader, the new code mixes a fast ReadOnlySpan path with the slower SequenceReader path, choosing between them by inspecting the ReadOnlySequence.IsSingleSegment property.
using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading.Tasks;

// _stream and NewLine are members of the benchmark class, defined outside this excerpt.
public async Task<string> ReadLineUsingPipelineVer2Async()
{
    var reader = PipeReader.Create(_stream, new StreamPipeReaderOptions(leaveOpen: true));
    string str;
    while (true)
    {
        ReadResult result = await reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;
        // ProcessLine slices the buffer down to the unconsumed remainder.
        str = ProcessLine(ref buffer);
        // Everything before buffer.Start is consumed; everything up to buffer.End is examined.
        reader.AdvanceTo(buffer.Start, buffer.End);
        if (result.IsCompleted) break;
    }
    await reader.CompleteAsync();
    return str;
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static string ProcessLine(ref ReadOnlySequence<byte> buffer)
{
    string str = null;
    if (buffer.IsSingleSegment)
    {
        // Fast path: the whole buffer is one contiguous span.
        var span = buffer.FirstSpan;
        int consumed;
        while (span.Length > 0)
        {
            var newLine = span.IndexOf(NewLine);
            if (newLine == -1) break;
            var line = span.Slice(0, newLine);
            str = Encoding.UTF8.GetString(line);
            // simulate string processing
            str = str.AsSpan().Slice(0, 5).ToString();
            consumed = line.Length + NewLine.Length;
            span = span.Slice(consumed);
            buffer = buffer.Slice(consumed);
        }
    }
    else
    {
        // Slow path: the buffer spans several segments, so use SequenceReader.
        var sequenceReader = new SequenceReader<byte>(buffer);
        while (!sequenceReader.End)
        {
            while (sequenceReader.TryReadTo(out var line, NewLine))
            {
                str = Encoding.UTF8.GetString(line);
                // simulate string processing
                str = str.AsSpan().Slice(0, 5).ToString();
            }
            // Keep only the unread tail, then move the reader to the end to exit the loop.
            buffer = buffer.Slice(sequenceReader.Position);
            sequenceReader.Advance(buffer.Length);
        }
    }
    return str;
}
The idea is to get a Span once and then pull as many lines as possible out of it before moving on to the next segment of the sequence. Thanks to the generous Reddit user u/scalablecory (apparently a member of the .NET team?!) for pointing this out.
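To see both paths in isolation, here is a minimal sketch that is not part of the benchmark. It assumes lines end with a single `\n` byte, and the names `LineSplitSketch`, `Segment`, `FromChunks` and `ReadLines` are made up for illustration. `FromChunks` builds a single-segment or multi-segment `ReadOnlySequence<byte>` by hand, so each branch can be exercised without a pipe.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Text;

public static class LineSplitSketch
{
    // Assumption: a line ends with a single '\n' byte.
    private const byte NewLine = (byte)'\n';

    // Hypothetical helper type: one link in a chain of buffers.
    private sealed class Segment : ReadOnlySequenceSegment<byte>
    {
        public Segment(ReadOnlyMemory<byte> memory) => Memory = memory;

        public Segment Append(ReadOnlyMemory<byte> memory)
        {
            var next = new Segment(memory) { RunningIndex = RunningIndex + Memory.Length };
            Next = next;
            return next;
        }
    }

    // Builds a sequence with one segment per chunk (a single chunk gives a single segment).
    public static ReadOnlySequence<byte> FromChunks(params string[] chunks)
    {
        var first = new Segment(Encoding.UTF8.GetBytes(chunks[0]));
        var last = first;
        for (int i = 1; i < chunks.Length; i++)
            last = last.Append(Encoding.UTF8.GetBytes(chunks[i]));
        return new ReadOnlySequence<byte>(first, 0, last, last.Memory.Length);
    }

    // Returns every complete line and slices buffer down to the unread remainder.
    public static List<string> ReadLines(ref ReadOnlySequence<byte> buffer)
    {
        var lines = new List<string>();
        if (buffer.IsSingleSegment)
        {
            // Fast path: take the span once and keep searching inside it.
            ReadOnlySpan<byte> span = buffer.FirstSpan;
            int consumedTotal = 0;
            while (true)
            {
                int newLine = span.IndexOf(NewLine);
                if (newLine == -1) break;
                lines.Add(Encoding.UTF8.GetString(span.Slice(0, newLine)));
                span = span.Slice(newLine + 1);
                consumedTotal += newLine + 1;
            }
            buffer = buffer.Slice(consumedTotal);
        }
        else
        {
            // Slow path: a line may straddle two segments, so let SequenceReader stitch it.
            var reader = new SequenceReader<byte>(buffer);
            while (reader.TryReadTo(out ReadOnlySequence<byte> line, NewLine))
                lines.Add(Encoding.UTF8.GetString(line.ToArray()));
            buffer = buffer.Slice(reader.Position);
        }
        return lines;
    }
}
```

Calling `ReadLines` on `FromChunks("ab\ncd", "e\nfg")` takes the slow path, because the second line is split across two segments, while `FromChunks("ab\ncd\nef")` takes the fast path. In both cases the trailing bytes without a newline stay in `buffer` for the next read.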
Here is the result:
As you can see, it performs well under every test case, rendering my previous conclusion obsolete!!!
And here is the gist version:
BenchmarkDotNet=v0.12.1, OS=Windows 10.0.20190
Intel Core i5-9400F CPU 2.90GHz (Coffee Lake), 1 CPU, 6 logical and 6 physical cores
.NET Core SDK=5.0.100-preview.4.20258.7
[Host] : .NET Core 5.0.0 (CoreCLR 5.0.20.25106, CoreFX 5.0.20.25106), X64 RyuJIT
DefaultJob : .NET Core 5.0.0 (CoreCLR 5.0.20.25106, CoreFX 5.0.20.25106), X64 RyuJIT
Method | LineNumber | LineCharMultiplier | Mean | Error | StdDev | Ratio | Code Size | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|---|---|
ReadLineUsingStringReaderAsync | 20 | 1 | 1.617 μs | 0.0083 μs | 0.0073 μs | 1.00 | 0.36 KB | 1.3428 | 0.0153 | - | 6.17 KB |
ReadLineUsingPipelineAsync | 20 | 1 | 3.285 μs | 0.0191 μs | 0.0169 μs | 2.03 | 0.4 KB | 0.4196 | - | - | 1.94 KB |
ReadLineUsingPipelineVer2Async | 20 | 1 | 1.565 μs | 0.0036 μs | 0.0034 μs | 0.97 | 0.4 KB | 0.4215 | - | - | 1.94 KB |
ReadLineUsingStringReaderAsync | 20 | 2 | 1.834 μs | 0.0028 μs | 0.0025 μs | 1.00 | 0.36 KB | 1.4095 | 0.0172 | - | 6.48 KB |
ReadLineUsingPipelineAsync | 20 | 2 | 3.389 μs | 0.0061 μs | 0.0051 μs | 1.85 | 0.4 KB | 0.4883 | - | - | 2.25 KB |
ReadLineUsingPipelineVer2Async | 20 | 2 | 1.644 μs | 0.0027 μs | 0.0021 μs | 0.90 | 0.4 KB | 0.4883 | - | - | 2.25 KB |
ReadLineUsingStringReaderAsync | 20 | 8 | 3.467 μs | 0.0099 μs | 0.0077 μs | 1.00 | 0.36 KB | 1.9875 | 0.0229 | - | 9.13 KB |
ReadLineUsingPipelineAsync | 20 | 8 | 3.772 μs | 0.0168 μs | 0.0157 μs | 1.09 | 0.4 KB | 0.9995 | - | - | 4.59 KB |
ReadLineUsingPipelineVer2Async | 20 | 8 | 1.902 μs | 0.0035 μs | 0.0032 μs | 0.55 | 0.4 KB | 0.9995 | - | - | 4.59 KB |
ReadLineUsingStringReaderAsync | 20 | 1000 | 292.398 μs | 2.1848 μs | 2.0437 μs | 1.00 | 0.36 KB | 211.9141 | 0.4883 | - | 950.83 KB |
ReadLineUsingPipelineAsync | 20 | 1000 | 86.472 μs | 0.1001 μs | 0.0781 μs | 0.30 | 0.4 KB | 85.8154 | 5.7373 | - | 395.79 KB |
ReadLineUsingPipelineVer2Async | 20 | 1000 | 87.354 μs | 0.3165 μs | 0.2806 μs | 0.30 | 0.4 KB | 85.8154 | 0.1221 | - | 395.79 KB |
Note that there are fewer test cases this time; I removed some LineCharMultiplier variations, as I don’t think we need them.
You can find the source code in my GitHub repository.
Conclusion
- Pipelines versions are better in terms of memory usage (using less memory).
- In terms of speed, the Ver2 Pipelines version is about 1.03× to 3.3× 🚀🚀🚀 as fast as the StringReader baseline (Ratio 0.97 down to 0.30), depending on the string length.
- Less GC pressure (a good thing) for Pipelines versions (Gen 0, Gen 1).
- The Pipelines version takes much more code to write!
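For anyone who wants to double-check the speed figures, the Ratio column is the method's mean divided by the baseline's mean, so the speedup factor is simply the inverse. Here is a small sketch of that arithmetic; the `SpeedupSketch` class and its method names are made up for illustration, and the means are copied from the table above.

```csharp
using System;
using System.Globalization;

public static class SpeedupSketch
{
    // How many times as fast the candidate is compared with the baseline.
    public static double Speedup(double baselineMean, double candidateMean)
        => baselineMean / candidateMean;

    // Formats the factor with two decimals, independent of the machine's culture.
    public static string Describe(double baselineMean, double candidateMean)
        => Speedup(baselineMean, candidateMean).ToString("F2", CultureInfo.InvariantCulture) + "x";
}
```

With the StringReader and Ver2 means in microseconds, `Describe(1.617, 1.565)` gives `1.03x` for LineCharMultiplier = 1 and `Describe(292.398, 87.354)` gives `3.35x` for LineCharMultiplier = 1000.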
DISCLAIMER: Your mileage may vary. As with all performance work, each of the scenarios chosen for your application should be measured, measured and measured. There is no silver bullet.