How compiled expression trees and cycle analysis beat Mapster, AutoMapper, and hand-tuned reflection.
Every .NET project eventually needs object mapping. `User` → `UserDto`. `Order` → `OrderResponse`. You've done it a thousand times. And you've probably reached for AutoMapper or Mapster, because why write `dest.Name = src.Name` fifty times?
But there's a problem nobody talks about.
## The Friday Incident
A developer on our team added a `Parent` property to a `Node` class. Standard tree structure. The mapper tried to map `Node.Parent.Parent.Parent...` until the stack overflowed. Production went down. No exception was caught: a `StackOverflowException` cannot be caught by user code in .NET.
We looked at every major mapping library. None of them had cycle detection. Not AutoMapper. Not Mapster. Not PanoramicData.Mapper.
So we built one that does.
## But We Didn't Want to Be the Slowest
The first version used reflection. It was safe — cycle detection worked, max-depth enforcement worked, configuration validation caught missing mappings at startup. But it was slow. Really slow.
| Method | Mean | vs Manual |
|---|---|---|
| Manual Mapping | ~17 ns | baseline |
| Mapster | ~23 ns | 1.4x |
| AutoMapper | ~63 ns | 3.7x |
| Mapture v0 (refl.) | ~773 ns | 45x |
45x slower than hand-written code. That's not a rounding error.
## The Rewrite: Compiled Expression Trees
The insight: reflection is only slow when you do it on every call. What if you did reflection once — at configuration time — and compiled the result into a native delegate?
That's exactly what `System.Linq.Expressions` lets you do:

```csharp
// At configuration time, Mapture builds this.
// destNameProp, srcNameProp, etc. are PropertyInfo values
// discovered once via reflection while configuring the map.
var param = Expression.Parameter(typeof(User));
var body = Expression.MemberInit(
    Expression.New(typeof(UserDto)),
    Expression.Bind(destNameProp, Expression.Property(param, srcNameProp)),
    Expression.Bind(destAgeProp, Expression.Property(param, srcAgeProp))
);
var lambda = Expression.Lambda<Func<User, UserDto>>(body, param);
Func<User, UserDto> compiled = lambda.Compile();

// On every Map() call, Mapture just does:
return compiled(source);
```
The compiled delegate is essentially the same IL the JIT would produce for hand-written `new UserDto { Name = src.Name, Age = src.Age }`. No reflection. No dictionary lookups. No allocations beyond the destination object.
## The Trick: Separate Fast and Slow Paths
Here's the key optimization most mappers miss. Most types don't have cycles. `User` → `UserDto` will never cause infinite recursion. Only self-referencing types like `Node` → `NodeDto` (where `Node` has a `Node Parent` property) can cycle.
So at configuration time, Mapture runs a graph traversal on your type maps:
```
User  → UserDto   → acyclic (fast path)
Order → OrderDto  → acyclic (fast path)
Node  → NodeDto   → CYCLIC  (needs tracking)
```
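The analysis itself can be sketched as a depth-first search over property types. A minimal, hypothetical version (names are illustrative, not Mapture's actual code; a real analysis would also walk collection element types):

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical configuration-time analysis: a type is "cyclic" if a DFS
// over its class-typed properties can reach a type already on the
// current path. Collections are ignored here for brevity.
public static class CycleAnalysis
{
    public static bool IsCyclic(Type type) => Visit(type, new HashSet<Type>());

    private static bool Visit(Type type, HashSet<Type> path)
    {
        if (!path.Add(type)) return true; // back edge: cycle found
        foreach (var prop in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            var t = prop.PropertyType;
            if (t.IsClass && t != typeof(string) && Visit(t, path))
                return true;
        }
        path.Remove(type); // backtrack: keep only the current DFS path
        return false;
    }
}
```

Run once per registered type pair at configuration time, this decides which delegate each pair gets; the hot path never pays for the analysis.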
Acyclic types get a pure delegate: no `HashSet<object>`, no depth counter, zero overhead. Cyclic types get wrapped with a visited set and a depth counter. You get safety where you need it and raw speed everywhere else.
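A hand-written sketch of what the cyclic slow path might look like for the `Node` pair (names and the depth default are hypothetical, not Mapture's internals; `ReferenceEqualityComparer` requires .NET 5+):

```csharp
using System;
using System.Collections.Generic;

// Self-referencing pair, as in the Node example above.
public class Node    { public string Name = ""; public Node? Parent; }
public class NodeDto { public string Name = ""; public NodeDto? Parent; }

// Hypothetical slow-path wrapper: tracks visited source objects by
// reference identity and enforces a maximum depth. Only cyclic type
// pairs pay this cost; acyclic pairs use the bare compiled delegate.
public static class CyclicNodeMapper
{
    private const int MaxDepth = 64; // illustrative default

    public static NodeDto? Map(Node? src, HashSet<object>? visited = null, int depth = 0)
    {
        if (src is null) return null;
        if (depth > MaxDepth)
            throw new InvalidOperationException($"Max mapping depth {MaxDepth} exceeded.");

        visited ??= new HashSet<object>(ReferenceEqualityComparer.Instance);
        if (!visited.Add(src))
            return null; // already being mapped: break the cycle here

        return new NodeDto
        {
            Name   = src.Name,
            Parent = Map(src.Parent, visited, depth + 1)
        };
    }
}
```

Reference equality matters here: overriding `Equals` on the source type must not fool the visited set.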
## Three More Tricks
1. **Embedded nested delegates.** When `Order` has an `Address` property, the compiled delegate for `Address` → `AddressDto` is embedded directly into the `Order` → `OrderDto` expression tree as a constant. No type-pair lookup per nested object.
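The technique can be sketched with `System.Linq.Expressions` directly. Here the nested delegate is baked into the outer tree as a `ConstantExpression` and invoked inline (types and names are illustrative, not Mapture's code):

```csharp
using System;
using System.Linq.Expressions;

class Address    { public string City { get; set; } = ""; }
class AddressDto { public string City { get; set; } = ""; }
class Order      { public Address Address { get; set; } = new(); }
class OrderDto   { public AddressDto Address { get; set; } = new(); }

static class EmbeddedDelegateDemo
{
    public static readonly Func<Order, OrderDto> MapOrder = Build();

    static Func<Order, OrderDto> Build()
    {
        // Nested delegate, compiled once at configuration time.
        Func<Address, AddressDto> mapAddress = a => new AddressDto { City = a.City };

        // Embed it as a ConstantExpression and invoke it inline:
        // no type-pair lookup per nested object at map time.
        var src = Expression.Parameter(typeof(Order), "src");
        var body = Expression.MemberInit(
            Expression.New(typeof(OrderDto)),
            Expression.Bind(
                typeof(OrderDto).GetProperty(nameof(OrderDto.Address))!,
                Expression.Invoke(
                    Expression.Constant(mapAddress),
                    Expression.Property(src, nameof(Order.Address)))));
        return Expression.Lambda<Func<Order, OrderDto>>(body, src).Compile();
    }
}
```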
2. **Typed delegates.** Instead of `Func<object, object>` (which boxes value types), Mapture caches a `Func<TSource, TDestination>` per generic type pair. Zero boxing.
3. **Thread-static cache.** The hot path doesn't even hit a `ConcurrentDictionary`. It reads from a `[ThreadStatic]` slot in a static generic class, `TypePairCache<TSource, TDestination>`. That's a single field read, essentially free.
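A sketch of the pattern (hypothetical names, not Mapture's internals): the CLR gives every closed generic instantiation its own static fields, so the lookup compiles down to a single field read, and the typed `Func<TSource, TDestination>` avoids the boxing a `Func<object, object>` cache would cause:

```csharp
using System;

// Per-(TSource, TDestination) cache: each closed instantiation of this
// generic static class gets its own static field, so the hot path is a
// single thread-local field read with no dictionary lookup.
static class TypePairCache<TSource, TDestination>
{
    [ThreadStatic] private static Func<TSource, TDestination>? _map;

    // slowLookup runs once per thread per type pair (e.g. a dictionary
    // lookup against the global configuration); afterwards the slot is hot.
    public static Func<TSource, TDestination> Get(Func<Func<TSource, TDestination>> slowLookup)
        => _map ??= slowLookup();
}

class User    { public string Name { get; set; } = ""; }
class UserDto { public string Name { get; set; } = ""; }
```

Usage is a single call per map operation, e.g. `TypePairCache<User, UserDto>.Get(...)(source)`.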
## The Result
| Rank | Method | Mean | vs Manual | Allocated |
|---|---|---|---|---|
| 🥇 | Manual Mapping | ~17 ns | baseline | 96 B |
| 🥈 | Mapture | ~25 ns | 1.5x | 96 B |
| 🥉 | Mapster | ~27 ns | 1.6x | 96 B |
| 4 | AutoMapper | ~68 ns | 4.0x | 96 B |
| 5 | PanoramicData.Mapper | ~283 ns | 16.9x | 272 B |
From 773 ns to 25 ns. From last place to first.
And it still has cycle detection, max-depth enforcement, and configuration validation — features none of the faster alternatives offer.
## The ~8 ns Gap
Why not zero overhead? Two things:
- Delegate invocation (~2–3 ns): calling a compiled `Func<T,R>` is slightly slower than an inlined call
- Cache lookup (~3–5 ns): reading the thread-static delegate slot
For context, a single database query takes roughly 0.5–5 ms (500,000–5,000,000 ns). An 8 ns gap is undetectable in any real application.
## Migration from AutoMapper
Mapture uses the same API patterns. Most migrations are a find-and-replace:
```csharp
// Before
using AutoMapper;
services.AddAutoMapper(typeof(Startup));

// After
using Mapture;
services.AddMapture(typeof(Startup));
```
Same `Profile`, `CreateMap`, `ForMember`, `Ignore`, `ReverseMap`. The API was designed so you can migrate in under 30 minutes.
## Try It
```shell
dotnet add package Mapture
dotnet add package Mapture.Extensions.DependencyInjection
```
GitHub: github.com/arijeetganguli/Mapture
NuGet: nuget.org/packages/Mapture
Targets: .NET Framework 4.8, .NET Standard 2.0, .NET 8, .NET 10.
MIT licensed. Zero telemetry. 81 tests across 3 frameworks.
Benchmarks measured with BenchmarkDotNet on .NET 10.0, X64 RyuJIT AVX2. Source code and benchmark project included in the repository.