Nícolas Bassini

Posted on May 18

Do you really know what yield return does?

#dotnet #csharp #webdev #programming

The use of yield return in C# has always been a bit of a mystery to me, but recently I’ve come to understand a bit better how it works and when it can be useful.

Definition

Put simply, yield return allows you to generate values on demand as if they were part of a collection (IEnumerable), without having to manually create or populate a list. Items are returned one by one, only when needed, without storing them all in memory beforehand. Its main advantage lies in supporting lazy evaluation.

Basic usage example

Imagine you're working on a system that processes data from a large source (maybe a .csv file to be imported or even a .txt log file). In such cases, yield return can optimize memory usage, since it lets you load only one line at a time and begin processing immediately, without waiting to read the entire file.

IEnumerable<string> ReadFileLines(string filePath)
{
    using StreamReader reader = new(filePath);
    string? line;
    int lineNumber = 1;
    while ((line = reader.ReadLine()) != null)
    {
        Console.WriteLine($"Reading line {lineNumber} now.");
        yield return line;
        lineNumber++;
    }
}

foreach (string line in ReadFileLines("./test_file.txt"))
{
    Console.WriteLine($"Extracted line: {line}");
    Console.WriteLine("Performing extremely complex operations on the extracted line...");
    int charCount = line.Length;
    Console.WriteLine($"Result: the line has {charCount} characters");
}

Output:

Reading line 1 now.
Extracted line: line 1 a
Performing extremely complex operations on the extracted line...
Result: the line has 8 characters
Reading line 2 now.
Extracted line: line 2 abc
Performing extremely complex operations on the extracted line...
Result: the line has 10 characters
Reading line 3 now.
Extracted line: line 3 abcdefghi
Performing extremely complex operations on the extracted line...
Result: the line has 16 characters

Lazy evaluation

After that explanation, you might be wondering: “Okay, I can iterate items one by one, but what exactly does lazy evaluation mean?” At least, that’s what I asked myself.

Looking at the output again, notice how the lines inside ReadFileLines are only executed after code outside the method starts consuming the values. In other words, each iteration only happens when requested by the consumer.
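To make that ordering visible, here is a minimal sketch that drives an iterator by hand with MoveNext instead of foreach. The Produce method and the log list are made up for this illustration and are not part of the file example above.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();

// Hypothetical iterator that records when each part of its body runs.
IEnumerable<int> Produce()
{
    log.Add("body started");
    yield return 1;
    log.Add("resumed after 1");
    yield return 2;
    log.Add("finished");
}

IEnumerable<int> sequence = Produce();
log.Add("after call");              // the body has not run yet

using IEnumerator<int> e = sequence.GetEnumerator();
e.MoveNext();                       // runs the body up to the first yield return
log.Add($"got {e.Current}");
e.MoveNext();                       // resumes right after the first yield return
log.Add($"got {e.Current}");
e.MoveNext();                       // runs to the end of the body and returns false

Console.WriteLine(string.Join(" | ", log));
// after call | body started | got 1 | resumed after 1 | got 2 | finished
```

A foreach loop does the same thing behind the scenes: it calls MoveNext once per iteration, so the body only advances when the consumer asks for the next value.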

A more extreme example would be creating an "infinite list", such as this one using the Fibonacci sequence:

IEnumerable<int> GenerateFibonacciLazy()
{
    int a = 0, b = 1;
    while (true)
    {
        yield return a;
        (a, b) = (b, a + b);
    }
}

IEnumerable<int> GenerateFibonacciEager()
{
    long limit = 999999999999999999;
    int a = 0, b = 1;
    var results = new List<int>();

    for (int i = 0; i < limit; i++)
    {
        results.Add(a);
        (a, b) = (b, a + b);
    }

    return results;
}

foreach (int result in GenerateFibonacciLazy().Take(10))
{
    Console.WriteLine(result);
}

There are two Fibonacci implementations here: GenerateFibonacciLazy and GenerateFibonacciEager. Eager evaluation is what we usually see: it loads all data into memory before returning.

If we run this exact code, the output will be:

0
1
1
2
3
5
8
13
21
34

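But if we change the call from GenerateFibonacciLazy to GenerateFibonacciEager and run the application with dotnet run, even though we’re using .Take(10), after a few seconds we'll see the error below, because the eager method tries to fill the whole list before .Take(10) receives a single item: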

Out of memory.

A simple analogy from the restaurant world:

With eager execution, the chef places every dish on the table at once, taking up space and using all the ingredients upfront.
With lazy execution, the chef only prepares a dish when the customer asks for it, saving space and resources.

When not to use yield return

After reading this far, you might think that yield return is the answer to all your problems, and from now on, anytime you need to return more than three items, you'll use it by default. I felt the same way for a few minutes.

But when you look closer, not everything is sunshine and rainbows. Like a kitchen sponge, yield return has two sides: a soft one that makes life easier, and a rougher one that you need to handle carefully or risk scratching your mom’s brand-new nonstick pan.

Here are some cases where yield return might not be the best choice:

When you need to iterate the result more than once: if you call .ToList() on a yield result, it is materialized only once, but if you iterate the result directly multiple times, the method is re-executed from scratch each time.
When you need indexed access ([]): yield return doesn’t create an indexed structure.
When the iterator logic depends on external mutable state: this can lead to subtle and hard-to-find bugs.
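
The first two cases can be seen in a small sketch. GetNumbers and the executions counter are hypothetical names used only for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int executions = 0;

// Hypothetical iterator that counts how often its body starts running.
IEnumerable<int> GetNumbers()
{
    executions++;
    yield return 1;
    yield return 2;
    yield return 3;
}

IEnumerable<int> lazy = GetNumbers();
int firstSum = lazy.Sum();      // first pass runs the body
int secondSum = lazy.Sum();     // second pass runs the body again
Console.WriteLine(executions);  // 2

// lazy[2] would not compile: IEnumerable<int> has no indexer.
List<int> materialized = GetNumbers().ToList(); // runs the body one more time
int total = materialized.Sum(); // reads the list, no re-execution
int last = materialized[2];     // indexed access works on the list
Console.WriteLine(executions);  // 3
Console.WriteLine(last);        // 3
```

If the body is expensive (a file read, a database query), materializing once with .ToList() is usually the safer choice when the result is needed more than once. That leaves the third case, iterator logic that depends on external mutable state.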

This last point can be a bit tricky, so let’s break it down.

Imagine a product service with a method GetProductsByMinPrice() that uses a _minPrice field to filter products. If _minPrice changes after the method is called but before iteration starts, the filtering changes. Even worse, if it changes during iteration, some items are checked against one value and others against a different one, which leads to unpredictable behavior.

Example:

var service = new ProductService(100);
var products = new List<Product> {
    new Product("A", 90),
    new Product("B", 110),
    new Product("C", 120),
    new Product("D", 115),
};

var filtered = service.GetProductsByMinPrice(products);

int iteration = 0;
foreach (var product in filtered)
{
    Console.WriteLine(product.Name);
    iteration++;
    // simulate state change during iteration
    if (iteration == 2)
    {
        service.UpdateMinPrice(120);
    }
}

Output:

B // meets original min price >= 100
C // also checked against the original min price >= 100; the update to 120 happens right after it is printed

Note how product “D” (price 115) is skipped: by the time the iterator reaches it, the minimum price has already been raised to 120 mid-iteration.
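
The example does not show ProductService itself. The sketch below is one way it could look, written only to reproduce the behavior described here: the Product record, its Price property, and the bodies of the constructor, UpdateMinPrice and GetProductsByMinPrice are assumptions, not the original implementation. It also shows that materializing with .ToList() before the state changes takes a stable snapshot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var products = new List<Product>
{
    new Product("A", 90),
    new Product("B", 110),
    new Product("C", 120),
    new Product("D", 115),
};

// Lazy: the filter reads _minPrice every time the loop asks for the next item.
var service = new ProductService(100);
var seen = new List<string>();
foreach (var product in service.GetProductsByMinPrice(products))
{
    seen.Add(product.Name);
    if (seen.Count == 2)
    {
        service.UpdateMinPrice(120);
    }
}
Console.WriteLine(string.Join(", ", seen)); // B, C

// Materialized: the whole filter runs before the state changes.
var fresh = new ProductService(100);
var snapshot = fresh.GetProductsByMinPrice(products).ToList();
fresh.UpdateMinPrice(120);
Console.WriteLine(string.Join(", ", snapshot.Select(p => p.Name))); // B, C, D

// Assumed shape of the types used in the article's example.
public record Product(string Name, decimal Price);

public class ProductService
{
    private decimal _minPrice;

    public ProductService(decimal minPrice) => _minPrice = minPrice;

    public void UpdateMinPrice(decimal minPrice) => _minPrice = minPrice;

    public IEnumerable<Product> GetProductsByMinPrice(IEnumerable<Product> products)
    {
        foreach (var product in products)
        {
            // The field is read here, at iteration time, not when the method is called.
            if (product.Price >= _minPrice)
            {
                yield return product;
            }
        }
    }
}
```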

This happens because yield return does not execute immediately. Instead, the compiler transforms the method into a state machine class that implements IEnumerable<T> and IEnumerator<T>. This class retains local variables and the current execution point, resuming from where it left off on each iteration.

The state machine concept is quite deep, and I won’t try to explain it fully.

But at a high level, the state machine retains local state and reevaluates external variables on each iteration, making behavior harder to predict if those external variables change.
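
As a rough picture of that idea, here is a hand-written enumerator that behaves like a yield-based filter. It is a simplified sketch, not the code the compiler actually generates, and MinFilterEnumerator, its fields and the readMin delegate (which stands in for reading an external field) are all hypothetical names.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

int min = 100;
var prices = new List<int> { 90, 110, 120, 115 };
var e = new MinFilterEnumerator(prices, () => min);

var yielded = new List<int>();
while (e.MoveNext())
{
    yielded.Add(e.Current);
    if (yielded.Count == 2)
    {
        min = 120; // external state changes mid-iteration
    }
}
Console.WriteLine(string.Join(", ", yielded)); // 110, 120

sealed class MinFilterEnumerator : IEnumerator<int>
{
    private readonly IReadOnlyList<int> _source;
    private readonly Func<int> _readMin; // stands in for an external field
    private int _index;                  // retained state: position in the loop

    public MinFilterEnumerator(IReadOnlyList<int> source, Func<int> readMin)
    {
        _source = source;
        _readMin = readMin;
    }

    public int Current { get; private set; }

    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        // Resume from the saved position instead of starting over.
        while (_index < _source.Count)
        {
            int candidate = _source[_index++];
            if (candidate >= _readMin()) // external value is re-read on every step
            {
                Current = candidate;
                return true;
            }
        }
        return false;
    }

    public void Reset() => throw new NotSupportedException();

    public void Dispose() { }
}
```

The position (_index) survives between MoveNext calls, while the minimum is looked up again on every step, which is why 115 is rejected once min becomes 120.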

Other use cases

While yield return is a tool for building iterators, it isn’t limited to “classic” iteration methods. An interesting example is returning validation errors from a model:

public IEnumerable<string> ValidateUser(UserDTO user)
{
    if (string.IsNullOrWhiteSpace(user.Name))
        yield return "Name is required.";

    if (user.Age < 18)
        yield return "Minimum age is 18.";

    if (!user.Email.Contains("@"))
        yield return "Invalid email.";
}

In this example we return a sequence of user validation errors, which makes it easy to check several conditions in one method.
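
A short usage sketch, assuming a simple UserDTO record with Name, Age and Email (its real shape is not shown in this post), with the method rewritten as a local function so the snippet runs on its own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

IEnumerable<string> ValidateUser(UserDTO user)
{
    if (string.IsNullOrWhiteSpace(user.Name))
        yield return "Name is required.";

    if (user.Age < 18)
        yield return "Minimum age is 18.";

    if (!user.Email.Contains("@"))
        yield return "Invalid email.";
}

var newUser = new UserDTO("", 16, "not-an-email");

// Any() asks for a single item, so validation stops at the first error found.
bool hasErrors = ValidateUser(newUser).Any();

// ToList() runs every check once and keeps the messages.
List<string> errors = ValidateUser(newUser).ToList();

Console.WriteLine(hasErrors);    // True
Console.WriteLine(errors.Count); // 3

// Assumed shape of the DTO, for illustration only.
record UserDTO(string Name, int Age, string Email);
```

Because the method is lazy, the caller decides how much validation work happens: a quick yes/no check with Any(), or the full list of messages.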

Yield break

The yield break statement explicitly ends the iteration. For example, we can enhance our previous validation method:

public IEnumerable<string> ValidateUser(UserDTO user)
{
    if (user == null)
    {
        yield return "User cannot be null.";
        yield break; // exit early, nothing to validate
    }

    if (string.IsNullOrWhiteSpace(user.Name))
        yield return "Name is required.";

    if (user.Age < 18)
        yield return "Minimum age is 18.";

    if (!user.Email.Contains("@"))
        yield return "Invalid email.";
}

This way, we avoid throwing unnecessary exceptions and keep control flow simple.
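
yield break is also handy inside a loop, to end a sequence as soon as some condition is met. A minimal sketch, using a hypothetical TakeUntilNegative helper (LINQ's TakeWhile covers the same need; this version only exists to show the statement at work):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: yields items until the first negative number appears.
IEnumerable<int> TakeUntilNegative(IEnumerable<int> source)
{
    foreach (int n in source)
    {
        if (n < 0)
        {
            yield break; // ends the sequence; later items are never read
        }

        yield return n;
    }
}

List<int> result = TakeUntilNegative(new[] { 3, 1, -1, 4 }).ToList();
Console.WriteLine(string.Join(", ", result)); // 3, 1
```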
