Arijeet Ganguli

Building the Fastest .NET Object Mapper

How compiled expression trees and cycle analysis beat Mapster, AutoMapper, and hand-tuned reflection.


Every .NET project eventually needs object mapping. User → UserDto. Order → OrderResponse. You've done it a thousand times. And you've probably reached for AutoMapper or Mapster — because why write dest.Name = src.Name fifty times?

But there's a problem nobody talks about.

The Friday Incident

A developer on our team added a Parent property to a Node class. Standard tree structure. The mapper tried to map Node.Parent.Parent.Parent... until the stack overflowed. Production went down. No exception was caught — StackOverflowException is uncatchable in .NET.

We looked at every major mapping library. None of them had cycle detection. Not AutoMapper. Not Mapster. Not PanoramicData.Mapper.

So we built one that does.

But We Didn't Want to Be the Slowest

The first version used reflection. It was safe — cycle detection worked, max-depth enforcement worked, configuration validation caught missing mappings at startup. But it was slow. Really slow.

| Method | Mean | vs Manual |
| --- | --- | --- |
| Manual Mapping | ~17 ns | baseline |
| Mapster | ~23 ns | 1.4x |
| AutoMapper | ~63 ns | 3.7x |
| Mapture v0 (reflection) | ~773 ns | 45x |

45x slower than hand-written code. That's not a rounding error.

The Rewrite: Compiled Expression Trees

The insight: reflection is only slow when you do it on every call. What if you did reflection once — at configuration time — and compiled the result into a native delegate?

That's exactly what System.Linq.Expressions lets you do:

// At configuration time, Mapture builds this:
var srcNameProp  = typeof(User).GetProperty(nameof(User.Name));
var srcAgeProp   = typeof(User).GetProperty(nameof(User.Age));
var destNameProp = typeof(UserDto).GetProperty(nameof(UserDto.Name));
var destAgeProp  = typeof(UserDto).GetProperty(nameof(UserDto.Age));

var param = Expression.Parameter(typeof(User));
var body = Expression.MemberInit(
    Expression.New(typeof(UserDto)),
    Expression.Bind(destNameProp, Expression.Property(param, srcNameProp)),
    Expression.Bind(destAgeProp, Expression.Property(param, srcAgeProp))
);
var lambda = Expression.Lambda<Func<User, UserDto>>(body, param);
Func<User, UserDto> compiled = lambda.Compile();

// On every Map() call, Mapture just does:
return compiled(source);

The compiled delegate is essentially the same IL that the JIT would produce for hand-written new UserDto { Name = src.Name, Age = src.Age }. No reflection. No dictionary lookups. No allocations beyond the destination object.

The Trick: Separate Fast and Slow Paths

Here's the key optimization most mappers miss. Most types don't have cycles. User → UserDto will never cause infinite recursion. Only self-referencing types like Node → NodeDto (where Node has a Parent property of type Node) can cycle.
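For concreteness, here is the shape of such a self-referencing pair (illustrative definitions, matching the Parent property from the incident above):

```csharp
// A tree node whose Parent points back up the tree — mapping this
// naively recurses Parent.Parent.Parent… until the stack overflows.
public class Node
{
    public string Name { get; set; } = "";
    public Node? Parent { get; set; }
}

public class NodeDto
{
    public string Name { get; set; } = "";
    public NodeDto? Parent { get; set; }
}
```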

So at configuration time, Mapture runs a graph traversal on your type maps:

User → UserDto          → acyclic (fast path)
Order → OrderDto        → acyclic (fast path)  
Node → NodeDto          → CYCLIC (needs tracking)
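A configuration-time cycle check like this can be sketched as a depth-first walk over the reference-type properties of each mapped type (this is an illustration of the idea, not Mapture's actual source):

```csharp
using System;
using System.Collections.Generic;

static class CycleAnalysis
{
    // Returns true if mapping `type` could revisit a type already on the
    // current mapping path, i.e. the type graph contains a cycle.
    public static bool IsCyclic(Type type, HashSet<Type>? path = null)
    {
        path ??= new HashSet<Type>();
        if (!path.Add(type)) return true;          // type already on path → cycle

        foreach (var prop in type.GetProperties())
        {
            var t = prop.PropertyType;
            // Only reference types can form cycles; string is a leaf.
            if (t.IsClass && t != typeof(string) && IsCyclic(t, path))
                return true;
        }

        path.Remove(type);                          // backtrack for sibling branches
        return false;
    }
}

// Example types: Flat is acyclic, Selfy references itself.
class Flat { public string Name { get; set; } = ""; }
class Selfy { public Selfy? Parent { get; set; } }
```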

Acyclic types get a pure delegate — no HashSet<object>, no depth counter, zero overhead. Cyclic types get wrapped with a visited set and a depth counter. You get safety where you need it and raw speed everywhere else.
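The guard that wraps the cyclic path could look something like this (the class name and shape are assumptions for illustration, not Mapture's internal implementation):

```csharp
using System;
using System.Collections.Generic;

// Tracking state for a cyclic type pair. Reference equality matters:
// two distinct-but-equal nodes must not be mistaken for a genuine
// revisit of the same instance.
sealed class CycleGuard
{
    private readonly HashSet<object> _visited =
        new(ReferenceEqualityComparer.Instance);
    private readonly int _maxDepth;
    private int _depth;

    public CycleGuard(int maxDepth) => _maxDepth = maxDepth;

    // False means "stop here and map the member as null": either this
    // instance is already on the current mapping path, or depth ran out.
    // Callers must only call Exit after a successful Enter.
    public bool Enter(object source)
    {
        if (_depth >= _maxDepth || !_visited.Add(source)) return false;
        _depth++;
        return true;
    }

    public void Exit(object source)
    {
        _visited.Remove(source);
        _depth--;
    }
}
```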

Three More Tricks

1. Embedded nested delegates. When Order has an Address property, the compiled delegate for Address → AddressDto is embedded directly into the Order → OrderDto expression tree as a constant. No type-pair lookup per nested object.

2. Typed delegates. Instead of Func<object, object> (which boxes value types), Mapture caches Func<TSource, TDestination> per generic type pair. Zero boxing.

3. Thread-static cache. The hot path doesn't even hit a ConcurrentDictionary. It reads from a [ThreadStatic] slot in a static generic class TypePairCache<TSource, TDestination>. That's a single field read — essentially free.
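The static-generic cache idea can be sketched in a few lines (the field name is an assumption): the CLR gives every closed (TSource, TDestination) pair its own storage, so the hot-path lookup compiles down to a single thread-local field read.

```csharp
using System;

// Each closed instantiation, e.g. TypePairCache<User, UserDto>, gets its
// own field — no dictionary, no hashing, no locking on the hot path.
static class TypePairCache<TSource, TDestination>
{
    [ThreadStatic]
    public static Func<TSource, TDestination>? Map;
}

// Hypothetical types for illustration.
class User { public string Name { get; set; } = ""; }
class UserDto { public string Name { get; set; } = ""; }
```

On a cache miss (first use on a thread), the slot would be filled from the shared configuration; every call after that is a plain field read.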

The Result

| Rank | Method | Mean | vs Manual | Allocated |
| --- | --- | --- | --- | --- |
| 🥇 | Manual Mapping | ~17 ns | baseline | 96 B |
| 🥈 | Mapture | ~25 ns | 1.5x | 96 B |
| 🥉 | Mapster | ~27 ns | 1.6x | 96 B |
| 4 | AutoMapper | ~68 ns | 4.0x | 96 B |
| 5 | PanoramicData.Mapper | ~283 ns | 16.9x | 272 B |

From 773 ns to 25 ns. From last place to first.

And it still has cycle detection, max-depth enforcement, and configuration validation — features none of the faster alternatives offer.

The ~8 ns Gap

Why not zero overhead? Two things:

  • Delegate invocation (~2–3 ns): calling a compiled Func<T,R> is slightly slower than inlined code
  • Cache lookup (~3–5 ns): reading the thread-static delegate slot

For context, a single database query takes 500,000–5,000,000 ns. The 8 ns gap is undetectable in any real application.

Migration from AutoMapper

Mapture uses the same API patterns. Most migrations are a find-and-replace:

// Before
using AutoMapper;
services.AddAutoMapper(typeof(Startup));

// After
using Mapture;
services.AddMapture(typeof(Startup));

Same Profile, CreateMap, ForMember, Ignore, ReverseMap. The API was designed so you can migrate in under 30 minutes.

Try It

dotnet add package Mapture
dotnet add package Mapture.Extensions.DependencyInjection

GitHub: github.com/arijeetganguli/Mapture
NuGet: nuget.org/packages/Mapture

Targets: .NET Framework 4.8, .NET Standard 2.0, .NET 8, .NET 10.

MIT licensed. Zero telemetry. 81 tests across 3 frameworks.


Benchmarks measured with BenchmarkDotNet on .NET 10.0, X64 RyuJIT AVX2. Source code and benchmark project included in the repository.


Top comments (1)

Daniel Balcarek

Interesting work and benchmarks.

Honestly, in the AI era, for new projects I’d probably just generate mappings using AI. It takes only a few seconds, produces reusable extension methods, and avoids adding another dependency to the project altogether.