The use of yield return in C# has always been a bit of a mystery to me, but recently I’ve come to better understand how it works and when it can be useful.
Definition
Put simply, yield return allows you to generate values on demand as if they were part of a collection (IEnumerable), without having to manually create or populate a list. Items are returned one by one, only when needed, without storing them all in memory beforehand. Its main advantage lies in supporting lazy evaluation.
Basic usage example
Imagine you're working on a system that processes data from a large source (maybe a .csv file to be imported or even a .txt log file). In such cases, yield return can optimize memory usage, since it lets you load only one line at a time and begin processing immediately, without waiting to read the entire file.
// Iterator method: its body runs only while a caller is enumerating the result.
IEnumerable<string> ReadFileLines(string filePath)
{
    using StreamReader reader = new(filePath);
    string? line;
    int lineNumber = 1;
    while ((line = reader.ReadLine()) != null)
    {
        Console.WriteLine($"Reading line {lineNumber} now.");
        yield return line; // hand one line to the caller, then pause here
        lineNumber++;
    }
}

// Each loop iteration pulls exactly one line from the file.
foreach (string line in ReadFileLines("./test_file.txt"))
{
    Console.WriteLine($"Extracted line: {line}");
    Console.WriteLine("Performing extremely complex operations on the extracted line...");
    int charCount = line.Length;
    Console.WriteLine($"Result: the line has {charCount} characters");
}
Output:
Reading line 1 now.
Extracted line: line 1 a
Performing extremely complex operations on the extracted line...
Result: the line has 9 characters
Reading line 2 now.
Extracted line: line 2 abc
Performing extremely complex operations on the extracted line...
Result: the line has 11 characters
Reading line 3 now.
Extracted line: line 3 abcdefghi
Performing extremely complex operations on the extracted line...
Result: the line has 17 characters
Lazy evaluation
After that explanation, you might be wondering: “Okay, I can iterate items one by one, but what exactly does lazy evaluation mean?” At least, that’s what I asked myself.
Looking at the output again, notice how the lines inside ReadFileLines are only executed after code outside the method starts consuming the values. In other words, each iteration only happens when requested by the consumer.
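Here is a minimal sketch of that deferral, separate from the file example. CountToThree and the log list are names I made up for this illustration. The idea is that calling the method only creates the iterator, and nothing in its body runs until the foreach asks for the first value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper names, used only for this illustration.
var log = new List<string>();

IEnumerable<int> CountToThree()
{
    log.Add("iterator body started");
    for (int i = 1; i <= 3; i++)
    {
        log.Add($"producing {i}");
        yield return i; // pause here until the consumer asks for the next value
    }
}

// Calling the method only creates the iterator object; the body has not run yet.
IEnumerable<int> numbers = CountToThree();
log.Add("iterator created");

foreach (int n in numbers)
{
    log.Add($"consumed {n}");
}

Console.WriteLine(string.Join(Environment.NewLine, log));
// iterator created
// iterator body started
// producing 1
// consumed 1
// producing 2
// consumed 2
// producing 3
// consumed 3
```

Note that "iterator created" is logged before "iterator body started", even though CountToThree() was called first.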
A more extreme example would be creating an "infinite list", such as this one using the Fibonacci sequence:
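// Lazy: produces the next number only when the caller asks for it.
// Note: int silently overflows after the 47th value, which is fine for this short demo.
IEnumerable<int> GenerateFibonacciLazy()
{
    int a = 0, b = 1;
    while (true)
    {
        yield return a;
        (a, b) = (b, a + b);
    }
}

// Eager: tries to build the whole list before returning anything.
IEnumerable<int> GenerateFibonacciEager()
{
    long limit = 999999999999999999;
    int a = 0, b = 1;
    var results = new List<int>();
    // The counter is a long so it can be compared with limit without wrapping around.
    for (long i = 0; i < limit; i++)
    {
        results.Add(a);
        (a, b) = (b, a + b);
    }
    return results;
}

foreach (int result in GenerateFibonacciLazy().Take(10))
{
    Console.WriteLine(result);
}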
There are two Fibonacci implementations here: GenerateFibonacciLazy and GenerateFibonacciEager. Eager evaluation is what we usually see: it loads all data into memory before returning.
If we run this exact code, the output will be:
0
1
1
2
3
5
8
13
21
34
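But if we change the call from GenerateFibonacciLazy to GenerateFibonacciEager and run the application with dotnet run, the eager method tries to fill the entire list before returning. So even though we’re using .Take(10), it never gets a chance to run, and after a few seconds we'll see: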
Out of memory.
A simple analogy from the restaurant world:
- With eager execution, the chef places every dish on the table at once, taking up space and using all ingredients upfront.
- With lazy execution, the chef only prepares a dish when the customer asks for it, saving space and resources.
When not to use yield return
After reading this far, you might think that yield return is the answer to all your problems, and from now on, anytime you need to return more than three items, you'll use it by default. I felt the same way for a few minutes.
But when you look closer, not everything is sunshine and rainbows. Like a kitchen sponge, yield return also has two sides: one soft, which makes life easier, and one rougher, which you need to handle carefully or risk scratching your mom’s brand-new nonstick pan.
Here are some cases where yield return might not be the best choice:
- When you need to iterate the result more than once
  If you call .ToList() on the result, the items are materialized once and can be reused. But if you iterate the lazy sequence directly several times, the method body is re-executed from scratch each time.
- When you need indexed access ([])
  yield return doesn’t create an indexed structure.
- When the iterator logic depends on external mutable state
  This can lead to subtle and hard-to-find bugs.
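To make the first two items in the list more concrete, here is a small sketch. GetNumbers and the runs counter are hypothetical names added only to count how often the iterator body starts. It assumes nothing beyond LINQ's Sum() and ToList().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int runs = 0; // illustration only: counts how many times the iterator body starts

IEnumerable<int> GetNumbers()
{
    runs++;
    yield return 1;
    yield return 2;
    yield return 3;
}

// Enumerating the lazy sequence twice runs the method body twice.
IEnumerable<int> lazy = GetNumbers();
int lazyTotal = lazy.Sum() + lazy.Sum();
Console.WriteLine($"Runs after two lazy passes: {runs}"); // 2

// ToList() runs the body once; later passes and indexing read from the list.
List<int> materialized = GetNumbers().ToList();
int listTotal = materialized.Sum() + materialized.Sum();
int second = materialized[1]; // indexed access works on the list, not on the lazy sequence
Console.WriteLine($"Runs after ToList and two passes: {runs}"); // 3
```

If the body did something expensive, such as reading a file or querying a database, that extra run in the lazy case is where the cost would show up.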
The last point in the list, external mutable state, can be a bit tricky, so let’s break it down.
Imagine a product service with a method GetProductsByMinPrice() that uses a _minPrice field to filter products. If _minPrice is changed after the method is called but before iteration starts, the filter uses the new value. Or even worse, if it changes during iteration, some results will be filtered with one value and others with a different one, leading to unpredictable behavior.
Example:
var service = new ProductService(100);
var products = new List<Product> {
    new Product("A", 90),
    new Product("B", 110),
    new Product("C", 120),
    new Product("D", 115),
};
var filtered = service.GetProductsByMinPrice(products);
int iteration = 0;
foreach (var product in filtered)
{
    Console.WriteLine(product.Name);
    iteration++;
    // simulate state change during iteration
    if (iteration == 2)
    {
        service.UpdateMinPrice(120);
    }
}
Output:
B // meets original min price >= 100
C // meets updated min price >= 120
Note how product “D” (price 115) is skipped: it met the original minimum of 100, but the minimum was raised to 120 mid-iteration.
This happens because a method that contains yield return does not execute when it is called. Instead, the compiler transforms the method into a state machine class that implements IEnumerator. This class retains local variables and the current execution point, resuming from where it left off on each iteration.
The state machine concept is quite deep, and I won’t try to explain it fully.
But at a high level, the state machine retains local state and reads external state, such as fields, again each time it resumes, which makes behavior harder to predict if that external state changes.
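One possible way to guard against this is to copy the field into a local in a regular (non-iterator) method and pass that copy to the iterator. The sketch below rests on assumptions: the article never shows Product or ProductService, so both definitions are hypothetical, and GetProductsByMinPriceSnapshot, Filter and Run are names I invented for the comparison.

```csharp
using System;
using System.Collections.Generic;

var products = new List<Product>
{
    new Product("A", 90),
    new Product("B", 110),
    new Product("C", 120),
    new Product("D", 115),
};

// Runs the same loop as the example above and collects the names it sees.
List<string> Run(Func<ProductService, IEnumerable<Product>> getFiltered)
{
    var service = new ProductService(100);
    var names = new List<string>();
    int iteration = 0;
    foreach (Product product in getFiltered(service))
    {
        names.Add(product.Name);
        iteration++;
        if (iteration == 2)
        {
            service.UpdateMinPrice(120); // state change during iteration
        }
    }
    return names;
}

List<string> live = Run(s => s.GetProductsByMinPrice(products));
List<string> snapshot = Run(s => s.GetProductsByMinPriceSnapshot(products));
Console.WriteLine(string.Join(", ", live));     // B, C
Console.WriteLine(string.Join(", ", snapshot)); // B, C, D

// Hypothetical types: the article does not show these definitions.
public record Product(string Name, decimal Price);

public class ProductService
{
    private decimal _minPrice;

    public ProductService(decimal minPrice) => _minPrice = minPrice;

    public void UpdateMinPrice(decimal minPrice) => _minPrice = minPrice;

    // Iterator: reads _minPrice every time it resumes, so later changes leak in.
    public IEnumerable<Product> GetProductsByMinPrice(IEnumerable<Product> source)
    {
        foreach (Product p in source)
        {
            if (p.Price >= _minPrice) yield return p;
        }
    }

    // Not an iterator: runs immediately, copies the field, then hands the copy over.
    public IEnumerable<Product> GetProductsByMinPriceSnapshot(IEnumerable<Product> source)
    {
        decimal minPrice = _minPrice;
        return Filter(source, minPrice);

        static IEnumerable<Product> Filter(IEnumerable<Product> items, decimal min)
        {
            foreach (Product p in items)
            {
                if (p.Price >= min) yield return p;
            }
        }
    }
}
```

The wrapper matters here: copying the field inside the iterator itself would not help, because that copy would also be deferred until the first item is requested.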
Other use cases
While yield return is a tool for building iterators, it isn’t limited to “classic” iteration methods. An interesting example is returning validation errors from a model:
public IEnumerable<string> ValidateUser(UserDTO user)
{
    if (string.IsNullOrWhiteSpace(user.Name))
        yield return "Name is required.";
    if (user.Age < 18)
        yield return "Minimum age is 18.";
    // Guard against a missing email before calling Contains on it.
    if (string.IsNullOrWhiteSpace(user.Email) || !user.Email.Contains("@"))
        yield return "Invalid email.";
}
In this example we return a sequence of validation errors, which makes it easy to check several conditions in one pass.
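Because the errors are produced lazily, the caller decides how much validation actually runs. The sketch below follows the method above, with a few assumptions. UserDTO is not shown in the article, so the record here is a hypothetical stand-in. The checksRun counter is added only to show how far the iterator got.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int checksRun = 0; // illustration only: counts how many checks were evaluated

IEnumerable<string> ValidateUser(UserDTO user)
{
    checksRun++;
    if (string.IsNullOrWhiteSpace(user.Name))
        yield return "Name is required.";
    checksRun++;
    if (user.Age < 18)
        yield return "Minimum age is 18.";
    checksRun++;
    if (string.IsNullOrWhiteSpace(user.Email) || !user.Email.Contains("@"))
        yield return "Invalid email.";
}

var candidate = new UserDTO("", 16, "not-an-email");

// Any() stops at the first error, so only the first check runs.
bool hasErrors = ValidateUser(candidate).Any();
int checksAfterAny = checksRun;
Console.WriteLine($"Has errors: {hasErrors}, checks run: {checksAfterAny}"); // True, 1

// ToList() drains the iterator, so every check runs.
checksRun = 0;
List<string> allErrors = ValidateUser(candidate).ToList();
Console.WriteLine($"Errors: {allErrors.Count}, checks run: {checksRun}"); // 3, 3

// Hypothetical shape of UserDTO; the article does not define it.
public record UserDTO(string? Name, int Age, string? Email);
```

So a "does this user have any problem at all?" check can bail out early, while a form that displays every message can ask for the full list.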
Yield break
The yield break statement explicitly ends the iteration. For example, we can enhance our previous validation method:
public IEnumerable<string> ValidateUser(UserDTO user)
{
    if (user == null)
    {
        yield return "User cannot be null.";
        yield break; // exit early, nothing to validate
    }
    if (string.IsNullOrWhiteSpace(user.Name))
        yield return "Name is required.";
    if (user.Age < 18)
        yield return "Minimum age is 18.";
    // Guard against a missing email before calling Contains on it.
    if (string.IsNullOrWhiteSpace(user.Email) || !user.Email.Contains("@"))
        yield return "Invalid email.";
}
This way, we avoid throwing unnecessary exceptions and keep control flow simple.
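yield break is not limited to guard clauses at the top of a method. As a small sketch, it can also end a sequence partway through a loop, for example when a sentinel value shows up. ReadUntilNegative is a hypothetical helper, not something from the article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: yields values until the first negative number appears.
IEnumerable<int> ReadUntilNegative(IEnumerable<int> source)
{
    foreach (int value in source)
    {
        if (value < 0)
        {
            yield break; // end the sequence; the remaining items are never visited
        }
        yield return value;
    }
}

List<int> result = ReadUntilNegative(new[] { 3, 1, 4, -1, 5, 9 }).ToList();
Console.WriteLine(string.Join(", ", result)); // 3, 1, 4
```

A plain return statement is not allowed inside an iterator method, which is why yield break exists as the way to stop early.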