Steve Bjorg for LambdaSharp

Benchmarking .NET JSON Serializers on AWS Lambda

Virtually all .NET code on AWS Lambda has to deal with JSON serialization. Historically, Newtonsoft Json.NET has been the go-to library. More recently, System.Text.Json was introduced in .NET Core 3. Both libraries use reflection to build their serialization logic. The newest technique, source generation, was introduced in .NET 6 and uses a compile-time approach that avoids reflection.

So, now we have three approaches to choose from, which raises the question: is there a clear winner, or is it more nuanced?

For these benchmarks, the code deserializes a fairly bloated JSON data structure taken from the GitHub API documentation and then returns an empty response.

Newtonsoft Json.NET

This library has been around for so long and has been so popular that it broke the download counter when it exceeded 2 billion on nuget.org. The counter has been fixed since, but this impressive milestone remains!

using System.IO;
using System.Threading.Tasks;
using Amazon.Lambda.Core;
using Amazon.Lambda.Serialization.Json;

[assembly:LambdaSerializer(typeof(JsonSerializer))]

namespace Benchmark.NewtonsoftJson {

    public sealed class Function {

        //--- Methods ---
        public async Task<Stream> ProcessAsync(Root request) {
            return Stream.Null;
        }
    }
}

Minimum Cold Start Duration

The 4 fastest cold start durations use the x86-64 architecture and ReadyToRun. The fastest uses Tiered Compilation as well. Enabling the PreJIT option always increases the cold start duration, but those configurations still make the top 4.

| Architecture | Memory Size | Tiered | Ready2Run | PreJIT | Init (ms) | Cold Used (ms) | Total Cold Start (ms) |
|---|---|---|---|---|---|---|---|
| x86_64 | 1769 MB | no | yes | no | 262.942 | 186.097 | 449.039 |
| x86_64 | 1769 MB | no | yes | yes | 317.328 | 151.456 | 468.784 |
| x86_64 | 1769 MB | yes | yes | no | 236.714 | 170.028 | 406.742 |
| x86_64 | 1769 MB | yes | yes | yes | 295.209 | 137.727 | 432.936 |

Newtonsoft Json.NET - Cold Start Duration


Minimum Execution Cost

I'll admit, I was a bit surprised here. I would have expected ARM64 to be the obvious choice since the execution cost is 20% lower. However, that was not the case. Instead, we have a 50/50 split with x86-64 winning ever so slightly.

Also interesting is that the cheapest execution cost always uses the PreJIT option. That makes intuitive sense, since this option shifts some cost from the first INVOKE phase to the free INIT phase and otherwise carries only a small overhead penalty.

Similarly, Tiered Compilation is disabled for all because it introduces additional overhead during the warm INVOKE phases.

Most fascinating to me is that ARM64 is cheaper with 512 MB memory, while x86-64 is cheaper with 256 MB. This is probably just an oddity, but it serves to highlight that nothing is ever obvious and why benchmarking the actual code is so important!
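The Cost column can be reasoned about with Lambda's billing model: compute is billed per GB-second, ARM64 is roughly 20% cheaper per GB-second than x86-64, each request incurs a small flat charge, and the INIT phase is free. Here is a minimal sketch of that model; the constants are assumed us-east-1 prices at the time of writing, not values taken from the benchmark harness:

```python
# Sketch of AWS Lambda's billing model, expressed in micro-dollars (µ$).
# The prices are assumptions (us-east-1 at the time of writing); the
# actual benchmark numbers may differ slightly due to per-ms rounding.

GB_SECOND_X86 = 16.6667  # µ$ per GB-second on x86-64
GB_SECOND_ARM = 13.3334  # µ$ per GB-second on ARM64 (~20% cheaper)
REQUEST_CHARGE = 0.2     # µ$ per request ($0.20 per million)

def invocation_cost(memory_mb: float, billed_ms: float, arm: bool = False) -> float:
    """Cost of a single invocation in µ$; the INIT phase is not billed."""
    rate = GB_SECOND_ARM if arm else GB_SECOND_X86
    return (memory_mb / 1024) * (billed_ms / 1000) * rate + REQUEST_CHARGE

# Example: one warm invocation of a 256 MB x86-64 function billed for 4 ms
warm_cost = invocation_cost(256, 4)
```

Note that at single-digit-millisecond warm durations the flat request charge dominates the compute charge, which helps explain why the Cost column varies so little across configurations.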

| Architecture | Memory Size | Tiered | Ready2Run | PreJIT | Init (ms) | Cold Used (ms) | Total Warm Used (100) (ms) | Cost (µ$) |
|---|---|---|---|---|---|---|---|---|
| arm64 | 256 MB | no | yes | yes | 346.884 | 1598.711 | 406.117 | 26.88279408 |
| arm64 | 512 MB | no | yes | yes | 348.615 | 753.974 | 238.541 | 26.81680042 |
| x86_64 | 256 MB | no | yes | yes | 317.574 | 1186.12 | 377.718 | 26.71600553 |
| x86_64 | 512 MB | no | yes | yes | 314.298 | 562.768 | 234.544 | 26.84427746 |

Newtonsoft Json.NET - Execution Cost and Total Warm Execution Time


System.Text.Json - Reflection

System.Text.Json was introduced in .NET Core 3. The initial release was not feature-rich enough to be a compelling choice. However, that is no longer the case. By .NET 5, all my concerns were addressed, and it has been my preferred choice since. Sadly, we had to wait until .NET 6, which is LTS, for it to become supported on AWS Lambda.

using System.IO;
using System.Threading.Tasks;
using Amazon.Lambda.Core;
using Amazon.Lambda.Serialization.SystemTextJson;

[assembly:LambdaSerializer(typeof(DefaultLambdaJsonSerializer))]

namespace Benchmark.SystemTextJson {

    public sealed class Function {

        //--- Methods ---
        public async Task<Stream> ProcessAsync(Root request) {
            return Stream.Null;
        }
    }
}

Minimum Cold Start Duration

Similar to Json.NET, the 4 fastest cold start durations use the x86-64 architecture. Unlike the previous benchmark, all of them have Tiered Compilation enabled. ReadyToRun provides an ever so slight benefit, but not much. That's likely because the JSON serialization code lives in the .NET base libraries, which already ship precompiled. Same as before, PreJIT makes things slower, but it's still among the 4 fastest configurations.

| Architecture | Memory Size | Tiered | Ready2Run | PreJIT | Init (ms) | Cold Used (ms) | Total Cold Start (ms) |
|---|---|---|---|---|---|---|---|
| x86_64 | 1769 MB | yes | no | no | 231.55 | 97.37 | 328.92 |
| x86_64 | 1769 MB | yes | no | yes | 276.791 | 74.063 | 350.854 |
| x86_64 | 1769 MB | yes | yes | no | 226.864 | 93.64 | 320.504 |
| x86_64 | 1769 MB | yes | yes | yes | 273.615 | 71.244 | 344.859 |

System.Text.Json - Reflection - Cold Start Duration


Minimum Execution Cost

Identical to the Json.NET benchmark, the 4 cheapest execution costs disable Tiered Compilation and enable the PreJIT option. Also, results are evenly split between ARM64 and x86-64.

Again, the optimal configuration uses the x86-64 architecture with ReadyToRun enabled. However, this time, all 4 optimal configurations agree on 256 MB for memory.

| Architecture | Memory Size | Tiered | Ready2Run | PreJIT | Init (ms) | Cold Used (ms) | Total Warm Used (100) (ms) | Cost (µ$) |
|---|---|---|---|---|---|---|---|---|
| arm64 | 256 MB | no | no | yes | 335.019 | 977.84 | 344.601 | 24.60815771 |
| arm64 | 256 MB | no | yes | yes | 330.424 | 966.123 | 347.232 | 24.57787356 |
| x86_64 | 256 MB | no | no | yes | 302.287 | 688.363 | 341.735 | 24.49208483 |
| x86_64 | 256 MB | no | yes | yes | 293.871 | 679.57 | 299.889 | 24.28108858 |

System.Text.Json - Reflection - Execution Cost and Total Warm Execution Time


System.Text.Json - Source Generator

New in .NET 6 is the ability to generate the JSON serialization code during compilation instead of relying on reflection at runtime.

Personally, as someone who cares a lot about performance, I find source generators a really exciting addition to our developer toolbox. However, I don't consider this iteration to be production ready, because it is missing some features I rely on. In particular, the lack of custom type converters to override the default JSON serialization behavior is a blocker for me. That said, for some smaller projects, it might be viable. My biggest recommendation here is to thoroughly validate the output to ensure any behavior changes are caught during development.

using System.IO;
using System.Text.Json.Serialization;
using System.Threading.Tasks;
using Amazon.Lambda.Core;
using Amazon.Lambda.Serialization.SystemTextJson;
using Benchmark.SourceGeneratorJson;

[assembly: LambdaSerializer(typeof(SourceGeneratorLambdaJsonSerializer<FunctionSerializerContext>))]

namespace Benchmark.SourceGeneratorJson;

[JsonSerializable(typeof(Root))]
public partial class FunctionSerializerContext : JsonSerializerContext { }

public sealed class Function {

    //--- Methods ---
    public async Task<Stream> ProcessAsync(Root request) {
        return Stream.Null;
    }
}

Minimum Cold Start Duration

This time, the 4 fastest cold starts all use Tiered Compilation and ReadyToRun. Since source generators emit more code to JIT, it makes sense that these options, whose whole purpose is to reduce startup JIT work, improve cold start performance. Also, unlike the previous benchmarks, ARM64 and x86-64 are now competing for the top spot. PreJIT again slows things down a bit, but still makes it into the top 4.

Despite ARM64 finally making an appearance in the Minimum Cold Start Duration benchmark, the x86-64 architecture still secures the top two spots.

| Architecture | Memory Size | Tiered | Ready2Run | PreJIT | Init (ms) | Cold Used (ms) | Total Cold Start (ms) |
|---|---|---|---|---|---|---|---|
| arm64 | 1769 MB | yes | yes | no | 249.244 | 65.429 | 314.673 |
| arm64 | 1769 MB | yes | yes | yes | 276.097 | 60.221 | 336.318 |
| x86_64 | 1769 MB | yes | yes | no | 240.88 | 53.104 | 293.984 |
| x86_64 | 1769 MB | yes | yes | yes | 265.776 | 46.327 | 312.103 |

System.Text.Json - Source Generator - Cold Start Duration


Minimum Execution Cost

The results for this benchmark are a bit more complicated to parse. For the first time, the options are not split symmetrically. Instead, ARM64 secures 3 of the 4 cheapest spots, and the same is true for the PreJIT option and the 256 MB memory configuration.

Similar to the Json.NET benchmark, the cheapest configurations use ReadyToRun and, as for all execution cost benchmarks, Tiered Compilation is disabled.

| Architecture | Memory Size | Tiered | Ready2Run | PreJIT | Init (ms) | Cold Used (ms) | Total Warm Used (100) (ms) | Cost (µ$) |
|---|---|---|---|---|---|---|---|---|
| arm64 | 256 MB | no | yes | no | 287.093 | 702.015 | 294.423 | 23.52147561 |
| arm64 | 256 MB | no | yes | yes | 311.507 | 660.822 | 295.178 | 23.38668193 |
| arm64 | 512 MB | no | yes | yes | 312.017 | 315.322 | 204.109 | 23.66288998 |
| x86_64 | 256 MB | no | yes | yes | 294.279 | 519.965 | 298.581 | 23.61061349 |

System.Text.Json - Source Generator - Execution Cost and Total Warm Execution Time


Summary

Here are our observed lower bounds for the JSON serialization libraries, as well as the baseline performance on .NET 6 for comparison. I've omitted .NET Core 3.1 since I no longer consider it a viable target runtime. However, you can explore the full result set in the interactive Google spreadsheet.

  • Baseline for .NET 6
    • Cold start duration: 223 ms
    • Execution cost: 21.94 µ$
  • Newtonsoft Json.NET
    • Cold start duration: 433 ms
    • Execution cost: 26.72 µ$
  • System.Text.Json - Reflection
    • Cold start duration: 321 ms
    • Execution cost: 24.28 µ$
  • System.Text.Json - Source Generator
    • Cold start duration: 294 ms
    • Execution cost: 23.39 µ$

It shouldn't be a surprise that Json.NET, which has been around for a long time, has accumulated a lot of cruft. Json.NET is truly a Swiss army knife for serialization and this flexibility comes at a cost. It adds at least 210 ms to our cold start duration and it's also the most expensive JSON library to run.

The newer System.Text.Json library has a compelling performance and value benefit over Json.NET. It only adds 100 ms to our cold start duration and is 9% cheaper to run compared to Json.NET.

However, the clear winner is the new JSON source generator, with only 70 ms of cold start overhead compared to our baseline. Its cost is also 12% lower than Json.NET's. That said, its missing features may mean it's not a good choice just yet.
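As a sanity check, the overheads and percentages quoted above follow directly from the summary figures (numbers copied from the list above):

```python
# Summary figures from the list above: (cold start ms, execution cost µ$)
baseline = (223, 21.94)    # .NET 6 baseline
newtonsoft = (433, 26.72)  # Newtonsoft Json.NET
reflection = (321, 24.28)  # System.Text.Json - Reflection
srcgen = (294, 23.39)      # System.Text.Json - Source Generator

# Cold start overhead over the baseline, in ms
overheads = [lib[0] - baseline[0] for lib in (newtonsoft, reflection, srcgen)]
# → 210 ms, ~100 ms, ~70 ms

# Cost savings relative to Json.NET, in percent
savings = [(1 - lib[1] / newtonsoft[1]) * 100 for lib in (reflection, srcgen)]
# → ~9%, ~12%
```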

When it comes to minimizing cold start duration, the more memory, the better. These benchmarks used 1,769 MB, which unlocks most of the available vCPU performance, but not all of it. Full vCPU performance is achieved at 3,008 MB, which almost doubles the cost for a 10% improvement (source).

For minimizing cost, 256 MB seems to be the preferred choice. Tiered Compilation should always be disabled, while ReadyToRun is beneficial. The odd thing about this configuration is that ReadyToRun produces code comparable to Tier0 (i.e. quick JIT without inlining, hoisting, or any of that delicious performance stuff). With Tiered Compilation disabled, our code will never be optimized further, as far as I know.
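For reference, here is a sketch of how these switches are typically toggled on a .NET 6 Lambda project. ReadyToRun and Tiered Compilation are standard MSBuild properties in the .csproj; PreJIT on the managed .NET runtime is controlled through the AWS_LAMBDA_DOTNET_PREJIT environment variable. Whether this matches the benchmark harness exactly is my assumption.

```xml
<!-- Sketch: .csproj properties for the cheapest configurations above -->
<PropertyGroup>
  <!-- ReadyToRun: precompile IL to native code at publish time -->
  <PublishReadyToRun>true</PublishReadyToRun>
  <!-- Disable tiered compilation; code is compiled once and never re-optimized -->
  <TieredCompilation>false</TieredCompilation>
</PropertyGroup>
```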

What's Next

For the next post, I'm going to investigate the overhead introduced by the AWS SDK. Since most Lambda functions will use it, I thought it would be useful to understand what the initialization cost is.
