Oleksandr Viktor

Posted on Apr 2

NPoco vs UkrGuru.Sql: When Streaming Beats Buffering

#dotnet #sqlserver #bench

When we talk about database performance in .NET, we often compare ORMs as if they were interchangeable. In practice, the API shape matters just as much as the implementation.

In this post, I benchmark NPoco and UkrGuru.Sql using BenchmarkDotNet, focusing on a very common task: reading a large table from SQL Server. The interesting part is not which library wins, but why the numbers differ so much.

TL;DR: In this benchmark, streaming rows with IAsyncEnumerable<T> was about 2.5× faster than NPoco's list load, allocated less, and triggered no Gen1 or Gen2 collections.

Test Scenario

The setup is intentionally simple and realistic.

Database: SQL Server
Table: Customers
Dataset: SampleStoreLarge (large enough to stress allocations)
Columns:
- CustomerId
- FullName
- Email
- CreatedAt

All benchmarks execute the same SQL:

SELECT CustomerId, FullName, Email, CreatedAt FROM Customers

No filters, no joins: just a straight read of four columns.

Benchmark Code

using System.Linq;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using Microsoft.Data.SqlClient;
using NPoco;
using UkrGuru.Sql;

// MemoryDiagnoser produces the Gen0/Gen1/Gen2 and Allocated columns reported below.
[MemoryDiagnoser]
public class SqlBenchmark
{
    private const string ConnectionString =
        "Server=(local);Database=SampleStoreLarge;Trusted_Connection=True;TrustServerCertificate=True;";

    private const string CommandText =
        "SELECT CustomerId, FullName, Email, CreatedAt FROM Customers";

    // Buffered read with NPoco: FetchAsync materializes the full result into a list.
    [Benchmark]
    public async Task<int> NPoco_LoadList()
    {
        using var connection = new SqlConnection(ConnectionString);
        await connection.OpenAsync();

        using var db = new Database(connection);

        var list = await db.FetchAsync<Customer>(CommandText);
        return list.Count;
    }

    // Buffered read with UkrGuru.Sql: the whole result is loaded before counting.
    [Benchmark]
    public async Task<int> UkrGuru_LoadList()
    {
        await using var connection = await DbHelper.CreateConnectionAsync(ConnectionString);

        var list = await connection.ReadAsync<Customer>(CommandText);
        return list.Count();
    }

    // Streamed read with UkrGuru.Sql: rows are consumed one at a time and never collected.
    [Benchmark]
    public async Task<int> UkrGuru_StreamRows()
    {
        int count = 0;

        await using var command = await DbHelper.CreateCommandAsync(
            CommandText,
            connectionString: ConnectionString);

        await foreach (var _ in command.ReadAsync<Customer>())
            count++;

        return count;
    }
}

All benchmarks were run in Release mode with BenchmarkDotNet.
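
The benchmark class refers to a Customer type that the listing does not show. A minimal sketch of what it might look like follows. It assumes the property names mirror the selected columns, and the column types (an int key, two strings, a DateTime) are guesses, since the post does not state them.

```csharp
using System;

// Hypothetical row type for the Customers query; the original post does not show it.
// Property names mirror the selected column names so that name-based mapping can bind them.
// The types are assumptions: CustomerId could just as well be bigint or uniqueidentifier.
public class Customer
{
    public int CustomerId { get; set; }
    public string FullName { get; set; } = string.Empty;
    public string Email { get; set; } = string.Empty;
    public DateTime CreatedAt { get; set; }
}
```

Keeping the type a plain class with settable properties means the same definition can be shared by all three benchmark methods, so mapping cost is compared on equal terms.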

Results (Execution Time)

Method	Mean	StdDev	Median
NPoco_LoadList	8.23 ms	0.33 ms	8.22 ms
UkrGuru_LoadList	5.30 ms	0.57 ms	5.34 ms
UkrGuru_StreamRows	3.29 ms	0.14 ms	3.22 ms

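At first glance, streaming is already about 2.5× faster than NPoco (3.29 ms vs 8.23 ms) and about 1.6× faster than UkrGuru's own list-based read (3.29 ms vs 5.30 ms). But the real story starts when we look at memory.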

Results (Memory & GC)

Method	Gen0	Gen1	Gen2	Allocated
NPoco_LoadList	367	258	109	4.39 MB
UkrGuru_LoadList	203	188	70	2.33 MB
UkrGuru_StreamRows	164	–	–	2.08 MB

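This table explains almost everything. The Gen0, Gen1, and Gen2 columns are BenchmarkDotNet's garbage-collection counts per 1,000 operations, and a dash means no collections of that generation were observed.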

What’s Actually Being Measured?

NPoco_LoadList

Uses FetchAsync<T>()
Fully materializes a List<Customer>
Allocates buffers and intermediate objects

✅ Idiomatic NPoco usage

❌ No streaming in this call path (FetchAsync buffers the whole result)

NPoco optimizes for developer productivity, not minimal allocations. That’s a valid trade‑off, but it shows up clearly in GC pressure.

UkrGuru_LoadList

Also builds a full list
Uses a leaner mapping pipeline
Roughly half the allocations of NPoco (2.33 MB vs 4.39 MB)

✅ Same algorithm as NPoco

✅ Less overhead

This is a fair apples-to-apples comparison with NPoco’s approach.

UkrGuru_StreamRows

Uses IAsyncEnumerable<T>
Processes rows one at a time
No list allocation
No Gen2 collections

✅ True async streaming

✅ Lowest mean execution time in this benchmark

✅ Most stable GC behavior

This is not a micro‑optimization — it’s a different execution model.
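
The difference between the two execution models can be sketched without a database. The example below uses a hypothetical in-memory row source, ReadRowsAsync, in place of a real data reader. It is not UkrGuru.Sql or NPoco code, and the Row type and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed record Row(int Id, string Name);

public static class ExecutionModels
{
    // Hypothetical row source standing in for a data reader: yields one row at a time.
    public static async IAsyncEnumerable<Row> ReadRowsAsync(int rowCount)
    {
        for (int i = 0; i < rowCount; i++)
        {
            await Task.Yield(); // simulate an asynchronous read
            yield return new Row(i, "name-" + i);
        }
    }

    // Buffered: every row stays reachable until the caller drops the list.
    public static async Task<List<Row>> LoadListAsync(int rowCount)
    {
        var list = new List<Row>();
        await foreach (var row in ReadRowsAsync(rowCount))
            list.Add(row);
        return list;
    }

    // Streamed: each row becomes eligible for collection once the loop moves on.
    public static async Task<int> CountStreamedAsync(int rowCount)
    {
        int count = 0;
        await foreach (var _ in ReadRowsAsync(rowCount))
            count++;
        return count;
    }

    public static async Task Main()
    {
        var list = await LoadListAsync(1000);
        var streamed = await CountStreamedAsync(1000);
        Console.WriteLine($"{list.Count} buffered, {streamed} streamed");
    }
}
```

Both methods see the same rows in the same order. The only difference is how long each row stays alive, and that lifetime is what the Gen1 and Gen2 columns reflect.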

Why Streaming Wins

The biggest improvement is not raw speed — it’s memory behavior.

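Fewer allocated bytes (2.08 MB vs 2.33 MB for the equivalent list read)
No Gen1 collections reported, so rows die young instead of being promoted
No Gen2 collections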

That matters a lot under real load: ASP.NET requests, background workers, message consumers, etc.

Streaming doesn’t just run faster — it scales better.

About Fairness

This benchmark is not trying to prove that one ORM is “better” than another.

It compares three distinct patterns:

Buffered list materialization (NPoco)
Buffered list materialization with fewer abstractions
True async streaming

Comparing streaming to buffering is not “ORM vs ORM” — it’s algorithm vs algorithm.

When Should You Use Each?

Use NPoco when:

You want simple, expressive data access
Loading lists is acceptable
Developer time matters more than raw throughput

Use streaming (e.g. UkrGuru.Sql) when:

Result sets are large
Latency and GC pressure matter
You want full control over execution
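
One practical consequence of streaming is that rows can be handled in bounded batches instead of holding the whole result set. Below is a minimal sketch, assuming a hypothetical Batch helper over any IAsyncEnumerable<T>. Neither Batch nor the Numbers stand-in source is part of UkrGuru.Sql or NPoco.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncBatching
{
    // Hypothetical helper: groups a stream into lists of at most `size` items.
    // As with any iterator, the argument check runs when enumeration starts.
    public static async IAsyncEnumerable<IReadOnlyList<T>> Batch<T>(
        this IAsyncEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        var batch = new List<T>(size);
        await foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size); // fresh list, so the caller may keep the old one
            }
        }

        if (batch.Count > 0)
            yield return batch; // final, possibly shorter, batch
    }

    // Stand-in for a streamed query result.
    public static async IAsyncEnumerable<int> Numbers(int count)
    {
        for (int i = 0; i < count; i++)
        {
            await Task.Yield();
            yield return i;
        }
    }

    public static async Task Main()
    {
        await foreach (var batch in Numbers(10).Batch(4))
            Console.WriteLine($"batch of {batch.Count}");
    }
}
```

With this shape, peak memory is bounded by the batch size rather than the table size. The trade-off is that a reader-backed stream typically keeps its connection open until enumeration finishes, so slow per-batch work holds the connection longer.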

Final Thoughts

Benchmarks don’t just measure libraries — they measure abstractions and APIs.

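If your workload is dominated by large reads, switching from buffered lists to async streaming can cut both execution time and GC pressure substantially. Here, within the same library, mean time dropped from 5.30 ms to 3.29 ms and Gen1 and Gen2 collections disappeared.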

Choose the tool that matches your data access pattern, not just the one you’re used to.
