What Makes Norm Micro ORM for .NET Fast As Raw DataReader

Norm has been one of my favorite side projects, and recently it got very serious.

Basically, it's just yet another micro ORM library for .NET (a Dapper clone) with a few extra tricks.

Originally, I wrote it so I could take advantage of modern C# features such as tuples and async streaming, for example:

// Map a single row to a tuple and deconstruct it into variables:
var (id, foo, bar) = connection.Single<int, string, string>("select id, foo, bar from my_table limit 1");

// Map to enumerable of named tuples:
IEnumerable<(int id, string foo, string bar)> results = connection.Read<int, string, string>("select id, foo, bar from my_table");

// Asynchronously stream values directly from database
await foreach(var (id, foo, bar) in connection.ReadAsync<int, string, string>("select id, foo, bar from my_table"))
{
    //...
}

// etc...

That works fine. However, in order to be a real micro ORM, it needs a proper object mapper that maps the results of your query to your classes.

And it did. However, the first versions, although fairly decent, were not nearly as performant as the famous mapper from Dapper, which is labeled the "King of Micro ORMs".

Recently, I got some insights and ideas that I wanted to try out, so I rewrote the Norm mapper from scratch.

The results surprised even me: I wasn't expecting performance so fast that it is basically indistinguishable from a raw DataReader.

How is that possible?

Let's take a deep dive into Norm.

Reading the data

Norm implements one basic extension method that is used for reading data in preparation for mapping:

public static IEnumerable<(string name, object value)[]> Read(this DbConnection connection, string command) 

As we can see, it returns an enumerable that yields an array of name and value tuples for each row.

It does not create a list of any kind; it simply uses yield to return a value when the enumeration is triggered.

Each enumerated item is an array of fields in the form of a name and value tuple, where name is the field name and value is the actual field value of type object (which requires casting later).

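As a hedged illustration, here is a minimal sketch of how such a Read extension could look with plain ADO.NET and yield (a simplified approximation assuming the System.Data and System.Data.Common namespaces, not Norm's actual source):

public static IEnumerable<(string name, object value)[]> Read(this DbConnection connection, string command)
{
    using var cmd = connection.CreateCommand();
    cmd.CommandText = command;
    if (connection.State != ConnectionState.Open)
    {
        connection.Open();
    }
    using var reader = cmd.ExecuteReader();
    while (reader.Read())
    {
        // materialize one row as an array of (name, value) tuples
        var row = new (string name, object value)[reader.FieldCount];
        for (var i = 0; i < reader.FieldCount; i++)
        {
            row[i] = (reader.GetName(i), reader.GetValue(i));
        }
        // yielded lazily, no intermediate list is ever created
        yield return row;
    }
}
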
The actual mapping is implemented as an extension on that same structure:

public static IEnumerable<T> Map<T>(this IEnumerable<(string name, object value)[]> tuples)

Similarly, this mapping extension also yields the mapped results from the enumeration, rather than creating a list of any kind.

That's why, when working with Norm, you have to use the LINQ extension ToList to create an actual list, which triggers the enumeration and the mapping:

// build the enumerator, does not start reading 
// you can also use connection.Query<MyClass>(query);
var enumeration = connection.Read(query).Map<MyClass>();
// starts the actual reading from the database and creates a list
var results = enumeration.ToList();

This is neat because now I can build my LINQ expressions before any serialization or mapping takes place.

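For example, assuming the MyClass shown in the next section, a filter can be composed on top of the enumeration, and nothing is read from the database until ToList is called:

var query = "select id, foo, bar from my_table";
var firstTen = connection
    .Read(query)
    .Map<MyClass>()
    .Where(c => c.Foo != null)
    .Take(10)
    .ToList(); // reading and mapping happen only here
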
So now, all we have to do is map the (string name, object value)[] array to an actual instance.

Mapping the data

Mapping, in a nutshell, is just copying values from this (string name, object value)[] array to a class instance.

Let's say for example that we have a simple class:

public class MyClass 
{
    public int Id { get; init; }
    public string Foo { get; init; }
    public string Bar { get; init; }
}

And naturally, the database query returns id, foo, and bar.

In order to map those values, we would have to compare names for each iteration step to map id to Id, foo to Foo, and bar to Bar.

But what if I already knew that the field id is always the first value, foo is always the second, and so on? That would be much more efficient, because we wouldn't have to compare name strings on each iteration.

That's mapping by position, which is much more efficient than the mapping by name that is normally used.

However, it is a suboptimal solution, because if we switch the order in the query (or in the class, for that matter) while leaving the names unchanged, it will introduce errors and confusion.

For example, trying to map the query select foo, id, bar to the class in this example would break the program.

But what if we do the mapping by name only on the first record and remember the position indexes, so we can reuse them in all later iterations for the same type? We would then be doing mapping by name only once per type. That would be fast, right?

And that is precisely how Norm mapping works.

This also gives me extra breathing room: I can now do more complex mappings, for example ones that use camel case or normal case naming, without sacrificing performance, because mapping by name is now pretty cheap.

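To make that concrete, here is a hedged sketch of "map by name once, then by position" (illustrative only, using System.Linq and System.Reflection; the actual Norm mapper also caches this information per type and uses setter delegates, as described next):

public static IEnumerable<T> Map<T>(this IEnumerable<(string name, object value)[]> tuples) where T : class, new()
{
    var properties = typeof(T).GetProperties();
    Dictionary<int, PropertyInfo> positions = null; // built from the first record only

    foreach (var row in tuples)
    {
        if (positions == null)
        {
            positions = new Dictionary<int, PropertyInfo>();
            for (var i = 0; i < row.Length; i++)
            {
                // name comparison happens only here, for the first record
                var property = properties.FirstOrDefault(p =>
                    string.Equals(p.Name, row[i].name, StringComparison.OrdinalIgnoreCase));
                if (property != null)
                {
                    positions[i] = property;
                }
            }
        }

        var instance = new T();
        foreach (var (index, property) in positions)
        {
            // every later record is mapped purely by position
            var value = row[index].value;
            property.SetValue(instance, value is DBNull ? null : value);
        }
        yield return instance;
    }
}
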
Still, in order for the mapping algorithm to be efficient, there are a couple of other things that need to be remembered and reused on each iteration.

For example, the set operation for each field in the class. Norm creates and remembers a set delegate for each property or field:

var method = Delegate.CreateDelegate(typeof(Action<T, TProp?>), property.GetSetMethod(true));

This delegate is later invoked when setting property values on instances:

((Action<T, TProp?>)method).Invoke(instance, (TProp?)value);

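As a standalone, hedged illustration (using a hypothetical Person class with an ordinary setter rather than Norm's internals), creating and calling such a cached setter delegate looks roughly like this; the per-row call is a plain delegate invocation instead of reflection:

public class Person
{
    public string Name { get; set; }
}

// done once per type and property:
var nameProperty = typeof(Person).GetProperty(nameof(Person.Name));
var setName = (Action<Person, string>)Delegate.CreateDelegate(
    typeof(Action<Person, string>), nameProperty.GetSetMethod(true));

// done for every row:
var person = new Person();
setName(person, "value read from the data reader"); // no reflection on this call
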
Object construction

The algorithm described above worked pretty well and efficiently.

Except, when I tested the record type, for example:

public record MyRecord(int Id, string Foo, string Bar);

This record is basically (almost) the same as a normal POCO class, except that it doesn't have a parameterless constructor. It only has one constructor, and that constructor receives three parameters.

In order to construct such an instance on each iteration, we would have to pass an array of three default object values to that constructor.

This is basically why mapping a plain POCO class was faster than mapping a record type: POCO classes have a default parameterless constructor.

And no, caching constructor delegates and default parameter arrays didn't help much either.

So, I had to rethink the object construction part.

Before the iteration even begins, I create one blueprint instance using reflection:

var defaultCtor = type.GetConstructors()[0];
var blueprint = (T)defaultCtor.Invoke(Enumerable.Repeat<object>(default, defaultCtor.GetParameters().Length).ToArray());

Now that I have a blueprint instance containing the default values, I also create a delegate for the Object.MemberwiseClone method:

var clone = (Func<T, object>)Delegate.CreateDelegate(typeof(Func<T, object>), type.GetMethod("MemberwiseClone", BindingFlags.Instance | BindingFlags.NonPublic));

This method is a protected method on the root object type that creates a shallow copy of that object.

With the expression above, I have a delegate that receives one parameter of type T and returns a shallow copy of that instance as an object.

So, now I can do fast cloning from my blueprint like this:

var instance = (T)clone.Invoke(blueprint);

And that was it. This approach gives really fast object creation for my mapping algorithm.

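For the overall picture, here is a condensed, hedged sketch of how these pieces might fit together per row; blueprint, clone, and a position-indexed setters dictionary are assumed to have been prepared up front, as described above (the real implementation differs in the details):

private static IEnumerable<T> MapRows<T>(
    IEnumerable<(string name, object value)[]> rows,
    T blueprint,
    Func<T, object> clone,
    Dictionary<int, Action<T, object>> setters)
{
    foreach (var row in rows)
    {
        // shallow-copy the prebuilt blueprint instead of invoking a constructor
        var instance = (T)clone(blueprint);

        // setters were resolved by name on the first record only
        foreach (var (index, setter) in setters)
        {
            var value = row[index].value;
            setter(instance, value is DBNull ? null : value);
        }

        yield return instance;
    }
}
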
Final words

As I said, I knew it would be fast, but I wasn't really expecting it to be as fast as a raw data reader.

You can see benchmarks for yourself.

You can also clone or download the code and try it yourself.

However, that being said, if you wish to optimize your data access code, reviewing your indexes and queries would be a much wiser choice.

Still, it's nice to have data-mapping routines that are as fast as using a generic data reader.
