Matt Eland

Posted on Jan 27, 2020 • Originally published at killalldefects.com on Jan 27, 2020

LINQ SelectMany in Depth

#csharp #dotnet #linq

In this article, I’ll walk through the various overloads and usages of LINQ’s SelectMany methods.

SelectMany is in many ways the opposite of GroupBy, which I covered last time in this series on LINQ. While GroupBy takes a single collection and transforms it into multiple child collections, SelectMany flattens child collections into a single merged collection.

So, how is this flattening actually useful?

Basic SelectMany Operations

Imagine you had a collection of books. Each book may have one or more characters in it (think people, not letters).

Suppose I wanted to walk through my entire library and list every character, regardless of book. That's more or less what SelectMany does for a collection.

Put another way, SelectMany takes an enumerable property on each item in a collection and flattens all of those nested collections into a single sequence.

The simplest form of this code looks like this:

var people = books.SelectMany(b => b.Characters);

This operation returns a flat sequence of characters that might look something like this (comments added for ease of understanding):

[
  // Characters from Sphere
  "Harry",
  "Norman",
  "Beth",
  "Jerry",
  // Characters from Jurassic Park
  "Malcolm",
  "Grant",
  "Satler",
  "Nedry",
  "Hammond",
  "Gennaro",
  "Tim",
  "Lex"
]

Note that the resulting collection is a single flat collection, not multiple groups of collections. This makes SelectMany nearly the inverse operation to the GroupBy method.

The function we passed to SelectMany to identify the nested collection is called the collection selector.
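For a version you can compile, here is a minimal sketch of the same call with sample data. The Book record, the BasicSelectManyDemo class, and its method names are assumptions made for illustration (the article only implies that each book exposes a Characters collection), and the data is trimmed to a few names per book.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model (requires C# 9 or later for records).
public record Book(string Title, List<string> Characters);

public static class BasicSelectManyDemo
{
    // Trimmed sample data; titles and names are taken from the article's output.
    public static List<Book> SampleBooks() => new List<Book>
    {
        new Book("Sphere", new List<string> { "Harry", "Norman", "Beth", "Jerry" }),
        new Book("Jurassic Park", new List<string> { "Malcolm", "Grant", "Tim", "Lex" }),
    };

    public static List<string> AllCharacters(IEnumerable<Book> books) =>
        // b => b.Characters is the collection selector: for each book it
        // picks the nested collection whose items should be flattened.
        books.SelectMany(b => b.Characters).ToList();
}
```

Calling `string.Join(", ", BasicSelectManyDemo.AllCharacters(BasicSelectManyDemo.SampleBooks()))` should give `Harry, Norman, Beth, Jerry, Malcolm, Grant, Tim, Lex`. The ToList call is there only to materialize the result; SelectMany itself returns a lazily evaluated IEnumerable<string>.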

Adding Result Selectors

LINQ also provides an additional overload that gives us something called a result selector.

The result selector is a simple function that transforms each item that would be returned into something different. It takes two inputs: the parent item (the book, using the analogy from before) and a child item from that parent's nested collection (a character in the same example).

Take a look at what this looks like, using (b, c) as inputs to a function that returns a new anonymous type containing a book and character:
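A minimal sketch of that call, assuming a hypothetical Book record with a Title string and a Characters list (the property names are inferred from the sample output, and the wrapper class and method are illustrative only):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model (requires C# 9 or later for records).
public record Book(string Title, List<string> Characters);

public static class ResultSelectorDemo
{
    public static List<object> BookCharacterPairs(IEnumerable<Book> books)
    {
        var pairs = books.SelectMany(
            b => b.Characters,                                 // collection selector
            (b, c) => new { Book = b.Title, Character = c });  // result selector

        // Anonymous types cannot be named in a method signature,
        // so this sketch hands them back as plain objects.
        return pairs.ToList<object>();
    }
}
```

Inside a single method you would normally keep `pairs` as-is and read `p.Book` and `p.Character` directly; the conversion to object exists only so the sketch can return its result.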

The resulting collection now provides a bit more context on each character, as listed below in this abbreviated sample:

[
  {
    "Book": "Sphere",
    "Character": "Harry"
  },
  // Some results omitted
  {
    "Book": "Jurassic Park",
    "Character": "Lex"
  }
]

Note that you don’t need to return an object at all. Since characters are strings in our example, we can just as easily do some string formatting operations as follows:
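As a sketch, again assuming a hypothetical Book record with Title and Characters (the class and method names are illustrative only), the result selector can return an interpolated string instead of an object:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model (requires C# 9 or later for records).
public record Book(string Title, List<string> Characters);

public static class StringResultSelectorDemo
{
    public static List<string> CharacterLabels(IEnumerable<Book> books) =>
        books.SelectMany(
            b => b.Characters,             // collection selector
            (b, c) => $"{c} ({b.Title})")  // result selector builds "Name (Title)"
        .ToList();
}
```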

This simpler code results in a far more concise and readable output:

[
  "Harry (Sphere)",
  "Norman (Sphere)",
  "Beth (Sphere)",
  "Jerry (Sphere)",
  "Malcolm (Jurassic Park)",
  "Grant (Jurassic Park)",
  "Satler (Jurassic Park)",
  "Nedry (Jurassic Park)",
  "Hammond (Jurassic Park)",
  "Gennaro (Jurassic Park)",
  "Tim (Jurassic Park)",
  "Lex (Jurassic Park)"
]

Pretty cool, right?

As we saw, not only can SelectMany flatten nested collections into a single collection, it can also transform or map the objects in those collections into different objects as needed.

Index-Based Overloads

Sometimes you need to know the index of an item in the source collection. This should be somewhat rare and typically involves cases where you have to join together two different sources of data.

For this case, LINQ provides index-aware overloads of both of the methods we've discussed so far. Each one adds a second parameter to the collection selector: the zero-based integer index of the current item in the source collection.

To take a look at what this might look like, see the following example:
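A minimal sketch, assuming the character lists live in a separate collection named characters that is aligned with books by position (so characters[0] belongs to books[0]). The Book record, the class, and the method name are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: here a book only knows its title (requires C# 9 or later).
public record Book(string Title);

public static class IndexedSelectManyDemo
{
    // Assumes characters has one entry per book, in the same order as books;
    // a shorter characters list would throw ArgumentOutOfRangeException.
    public static List<string> CharacterLabels(
        IReadOnlyList<Book> books,
        IReadOnlyList<List<string>> characters) =>
        books.SelectMany(
            (b, i) => characters[i],       // collection selector: look up by book index
            (b, c) => $"{c} ({b.Title})")  // result selector
        .ToList();
}
```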

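Here the b parameter in the collection selector corresponds to the Book object, while the i parameter is the zero-based index of that book in the source collection. Note that characters is not a property of the book: it is a separate collection of character lists that lives outside of books and lines up with it by position. We grab the matching character list out of that collection by index, and SelectMany then applies its result selector to each item in that list.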

Again, this is a somewhat uncommon overload to use but it can be helpful in cases where your data is fragmented across multiple collections.

Closing Thoughts

In my opinion, SelectMany is much more useful than the inverse operation, GroupBy.

I would strongly consider SelectMany anytime you need to flatten nested lists into a single collection.

Additionally, the ability to flatten and transform a collection in a single method call is concise (at some cost to readability) and can remove the need to chain a subsequent Select call to transform the result collection.

Ultimately, I feel you will use SelectMany on an infrequent yet reliable basis for its utility value alone when dealing with nested collections. The mapping functions may be less needed, but still important in key scenarios.

If anything in this article was confusing, please let me know or check out Microsoft’s own documentation on the method group.

If you’ve found a use for SelectMany I haven’t covered here, I’d love to add it to my own bag of tricks. Please leave a comment and let me know what you’ve found.

The post LINQ SelectMany in Depth appeared first on Kill All Defects.

Top comments (4)

Jesse Phillips • Jan 27 '20

If I understand your final index example should look like

books.SelectMany(
   (b, i) => b.Characters[i]...)

Or we could replace the result collector.

books.SelectMany(
   (b, i) => $"{b.Characters[i]} ({b.Title})")

Matt Eland • Jan 27 '20

No, what I have there is correct and works properly.

Jesse Phillips • Jan 27 '20 • Edited

I don't understand where characters comes from then. It looks undefined to me.

Matt Eland • Jan 27 '20

Imagine it as a List<string> that comes from another file or data source outside of books