DEV Community

Vitor R. F. Ribeiro

Benchmarking two different approaches for collection validation

I recently ran into a situation at work where I needed to validate a collection of permission IDs (providedIds) against a collection of valid permission IDs (validIds). If at least one ID is invalid, the whole collection fails validation.

So I used the LINQ Except extension method to find the IDs in providedIds that don't exist in validIds. If there are any, providedIds is not valid.

public bool ExceptApproach(IEnumerable<int> permissionIds, IEnumerable<int> validPermissionsIds)
{
    var invalidPermissions = permissionIds.Except(validPermissionsIds);

    return !invalidPermissions.Any();
}
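One detail about Except is worth keeping in mind: as far as I know it is lazily evaluated, but it has to read the whole second sequence into a set before it can yield its first result. The sketch below illustrates this with a hypothetical Counting helper (my own name, not part of the original code) that counts how many elements are pulled from each sequence. Enumerable.Range(2, 100) stands for the IDs 2 to 101.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: passes the sequence through and counts
// how many elements the consumer actually pulls from it.
IEnumerable<int> Counting(IEnumerable<int> source, int[] pulled)
{
    foreach (var item in source)
    {
        pulled[0]++;
        yield return item;
    }
}

var providedPulled = new int[1];
var validPulled = new int[1];

// IDs 1..100 checked against valid IDs 2..101: the very first ID is invalid.
var provided = Counting(Enumerable.Range(1, 100), providedPulled);
var valid = Counting(Enumerable.Range(2, 100), validPulled);

// Same expression as in ExceptApproach.
bool isValid = !provided.Except(valid).Any();

Console.WriteLine($"isValid={isValid}");
Console.WriteLine($"pulled from provided: {providedPulled[0]}");
Console.WriteLine($"pulled from valid: {validPulled[0]}");
```

Any() stops after the first invalid ID, so only one element should be pulled from the provided IDs, while all 100 valid IDs are read to build the set first.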

But then it occurred to me that all I need to know is whether there is a single invalid ID in my collection. So I tried a different approach:

public bool AnyApproach(IEnumerable<int> permissionIds, IEnumerable<int> validPermissionsIds)
{
    var atLeastOneInvalidPermission = permissionIds.Any(id => !validPermissionsIds.Contains(id));

    return !atLeastOneInvalidPermission;
}

But is it really better? To answer that question, I looked for a way to benchmark it and decided to use BenchmarkDotNet. Here is what I got:


BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.928 (2004/?/20H1)
Intel Core i5-9400F CPU 2.90GHz (Coffee Lake), 1 CPU, 6 logical and 6 physical cores
.NET Core SDK=5.0.101
  [Host]     : .NET Core 5.0.1 (CoreCLR 5.0.120.57516, CoreFX 5.0.120.57516), X64 RyuJIT
  DefaultJob : .NET Core 5.0.1 (CoreCLR 5.0.120.57516, CoreFX 5.0.120.57516), X64 RyuJIT

| Method         | permissionIds | validPermissionsIds |        Mean |     Error |    StdDev |
|--------------- |-------------- |-------------------- |------------:|----------:|----------:|
| ExceptApproach | [1,2,3]       | [1,2,3]             |   199.40 ns |  1.181 ns |  1.104 ns |
| ExceptApproach | [1,2,3]       | [2,3,4]             |   162.11 ns |  2.729 ns |  2.553 ns |
| ExceptApproach | Range(1, 100) | Range(1, 100)       | 3,787.92 ns | 59.032 ns | 55.219 ns |
| ExceptApproach | Range(1, 100) | Range(2, 101)       | 2,242.24 ns |  7.806 ns |  6.920 ns |
| AnyApproach    | [1,2,3]       | [1,2,3]             |    83.33 ns |  0.294 ns |  0.260 ns |
| AnyApproach    | [1,2,3]       | [2,3,4]             |    41.22 ns |  0.479 ns |  0.425 ns |
| AnyApproach    | Range(1, 100) | Range(1, 100)       | 2,398.31 ns | 22.912 ns | 21.432 ns |
| AnyApproach    | Range(1, 100) | Range(2, 101)       |    57.68 ns |  0.770 ns |  0.721 ns |

AnyApproach is faster than ExceptApproach for small collections. When [1,2,3] is validated against [1,2,3] (a valid collection), AnyApproach took less than half the time (83.33 ns vs. 199.40 ns). When [1,2,3] is validated against [2,3,4] (an invalid collection), AnyApproach was about four times faster (41.22 ns vs. 162.11 ns), because the first element was already invalid.

For larger collections, the two approaches were closer when the collection was valid: 3,787.92 ns for ExceptApproach vs. 2,398.31 ns for AnyApproach, so AnyApproach was still faster. For an invalid collection, AnyApproach was significantly faster (57.68 ns vs. 2,242.24 ns), again because the first element was already invalid.
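A possible reason the gap narrows for large valid collections is that Contains on an array or list scans it linearly for every ID, while Except pays once to build a set. A third variant could combine both ideas: build the set once, then stop at the first invalid ID. This is only a sketch; HashSetApproach is a hypothetical name, and I have not benchmarked it, so it may or may not beat the two approaches above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical third variant (not part of the benchmark above):
// build the lookup set once, then stop at the first invalid id.
static bool HashSetApproach(IEnumerable<int> permissionIds, IEnumerable<int> validPermissionsIds)
{
    var validSet = new HashSet<int>(validPermissionsIds);

    // All() returns false as soon as one id is missing from the set.
    return permissionIds.All(id => validSet.Contains(id));
}

var ids = new[] { 1, 2, 3 };

Console.WriteLine(HashSetApproach(ids, new[] { 1, 2, 3 })); // True
Console.WriteLine(HashSetApproach(ids, new[] { 2, 3, 4 })); // False
```

For very small inputs, building the set may cost more than it saves, so this would need its own benchmark rows before drawing any conclusion.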

In conclusion, I think using AnyApproach is better than ExceptApproach. I know I should have included more scenarios, like "what if only the last permission ID is invalid?" or "what if only the element at the center of the collection is invalid?", but for the situation I have at work, this is enough to go on.
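The missing scenarios could be generated along these lines. This is a minimal sketch: WithInvalidAt is a hypothetical helper, and -1 is an assumed ID that never appears among the valid ones. It only checks that both approaches agree on the result; the arrays could then be passed to the benchmark methods as additional arguments.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The two approaches from the article, condensed to expression bodies.
static bool ExceptApproach(IEnumerable<int> permissionIds, IEnumerable<int> validPermissionsIds)
    => !permissionIds.Except(validPermissionsIds).Any();

static bool AnyApproach(IEnumerable<int> permissionIds, IEnumerable<int> validPermissionsIds)
    => !permissionIds.Any(id => !validPermissionsIds.Contains(id));

// Hypothetical helper: ids 1..count, with the id at invalidIndex
// replaced by -1 (assumed to never be a valid id).
static int[] WithInvalidAt(int count, int invalidIndex)
{
    var ids = Enumerable.Range(1, count).ToArray();
    ids[invalidIndex] = -1;
    return ids;
}

var validIds = Enumerable.Range(1, 100).ToArray();

var scenarios = new[]
{
    (Name: "first id invalid", Ids: WithInvalidAt(100, 0)),
    (Name: "middle id invalid", Ids: WithInvalidAt(100, 49)),
    (Name: "last id invalid", Ids: WithInvalidAt(100, 99)),
};

foreach (var (name, ids) in scenarios)
{
    Console.WriteLine($"{name}: Except={ExceptApproach(ids, validIds)}, Any={AnyApproach(ids, validIds)}");
}
```

Both approaches should report each of these collections as invalid; the interesting part would be how the timings change as the invalid ID moves toward the end.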

*For those wondering, I edited the permissionIds and validPermissionsIds columns because I didn't find a way to include custom names for the values I used. Here's the full code.
