Introduction
Have you ever tried to load a CSV file with a million rows into memory and ended up with something like this?
PHP Fatal error: Allowed memory size of 134217728 bytes exhausted
Even with a higher memory_limit, the entire dataset still sits in memory while you process it line by line.
The solution? Lazy evaluation — a technique where data is generated and processed only when it's actually needed.
In PHP, you can achieve this in two ways: with generators (yield) and via the Iterator API. Let’s explore both.
What Is Lazy Evaluation?
Normally, when you create an array, PHP loads all elements into memory at once:
function getNumbersArray(int $count): array {
    $result = [];
    for ($i = 1; $i <= $count; $i++) {
        $result[] = $i;
    }
    return $result;
}
foreach (getNumbersArray(5) as $number) {
    echo $number . PHP_EOL;
}
Here, the entire array [1, 2, 3, 4, 5] is stored in memory.
Now let’s try a lazy approach:
function getNumbersGenerator(int $count): Generator {
    for ($i = 1; $i <= $count; $i++) {
        yield $i;
    }
}
foreach (getNumbersGenerator(5) as $number) {
    echo $number . PHP_EOL;
}
📌 The key difference: the generator doesn't store everything; it yields one element at a time, as foreach requests it.
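To make the "one element at a time" behaviour visible, here is a minimal sketch that records when each value is produced and when it is consumed. The function name tracedNumbers and the $log array are made up for this illustration.

```php
<?php
// $log records the order of events so the interleaving is visible.
$log = [];

// Hypothetical helper: same idea as getNumbersGenerator(), plus logging.
function tracedNumbers(int $count, array &$log): Generator {
    for ($i = 1; $i <= $count; $i++) {
        $log[] = "produce $i";
        yield $i;
    }
}

foreach (tracedNumbers(3, $log) as $number) {
    $log[] = "consume $number";
}

echo implode(PHP_EOL, $log) . PHP_EOL;
```

Each "produce" entry is followed directly by its "consume" entry, which shows that the next value is only computed once the loop asks for it.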
Reading a Huge CSV Without Pain
Imagine you have a data.csv file that’s 2 GB in size. If you load it in one go with file(), or collect every row returned by fgetcsv() into an array, you'll likely run out of memory.
But with a generator, it's easy:
function readCsv(string $filename): Generator {
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open file $filename");
    }
    try {
        // Read and yield one row at a time instead of collecting them.
        while (($row = fgetcsv($handle)) !== false) {
            yield $row;
        }
    } finally {
        // Close the file even if the caller stops iterating early.
        fclose($handle);
    }
}
foreach (readCsv('data.csv') as $row) {
    // Process the row
    // For example: echo implode(', ', $row) . PHP_EOL;
}
📊 Memory usage: even with a 2 GB CSV, only one row is stored in memory at a time, so usage stays in the kilobytes.
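Because a generator is just an iterable, lazy steps can be chained. The sketch below filters rows and stops after the first two matches without reading the rest of the file. The helper names (csvRows, whereColumn, take) and the sample data are assumptions for illustration, not part of the article's repository.

```php
<?php
// Hypothetical helper: yields CSV rows one at a time.
function csvRows(string $filename): Generator {
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open file $filename");
    }
    try {
        // Length, separator, enclosure and escape are passed explicitly.
        while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            yield $row;
        }
    } finally {
        // Runs even if the consumer stops iterating early.
        fclose($handle);
    }
}

// Hypothetical helper: keeps rows whose column equals $value.
function whereColumn(iterable $rows, int $column, string $value): Generator {
    foreach ($rows as $row) {
        if (($row[$column] ?? null) === $value) {
            yield $row;
        }
    }
}

// Hypothetical helper: yields at most $limit items, then stops.
function take(iterable $items, int $limit): Generator {
    if ($limit <= 0) {
        return;
    }
    foreach ($items as $item) {
        yield $item;
        if (--$limit === 0) {
            return;
        }
    }
}

// Small sample file so the sketch runs on its own.
$file = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($file, "1,red\n2,blue\n3,red\n4,red\n");

$firstTwoRed = [];
foreach (take(whereColumn(csvRows($file), 1, 'red'), 2) as $row) {
    $firstTwoRed[] = $row[0];
}
unlink($file);

echo implode(', ', $firstTwoRed) . PHP_EOL;
```

Every stage pulls one row from the stage before it, so the fourth row of the sample file is never read.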
Benchmark: Array vs Generator
$startMemory = memory_get_usage();
$array = range(1, 1_000_000); // creates an array with a million numbers
echo "Array: " . (memory_get_usage() - $startMemory) / 1024 / 1024 . " MB" . PHP_EOL;
unset($array);
$startMemory = memory_get_usage();
function bigGenerator(): Generator {
    for ($i = 1; $i <= 1_000_000; $i++) {
        yield $i;
    }
}
foreach (bigGenerator() as $n) {
    // Just iterate
}
// Note: this reading is taken after the loop, once the generator has been released.
echo "Generator: " . (memory_get_usage() - $startMemory) / 1024 / 1024 . " MB" . PHP_EOL;
On my machine:
Array: 120 MB
Generator: 0.5 MB
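The benchmark above reads memory after the loop has finished. As a rough alternative, this sketch samples memory_get_usage() on every iteration and keeps the highest reading, so it reflects what is held while the data is being consumed. The helper peakWhileIterating and the smaller element count are assumptions for illustration, and the exact figures will depend on your PHP version.

```php
<?php
// Hypothetical helper: builds the data inside the measured region and
// returns the highest memory_get_usage() increase seen while iterating.
function peakWhileIterating(callable $makeItems): int {
    $baseline = memory_get_usage();
    $peak = $baseline;
    foreach ($makeItems() as $item) {
        $peak = max($peak, memory_get_usage());
    }
    return $peak - $baseline;
}

function numbers(int $count): Generator {
    for ($i = 1; $i <= $count; $i++) {
        yield $i;
    }
}

$eagerBytes = peakWhileIterating(fn () => range(1, 200_000));
$lazyBytes  = peakWhileIterating(fn () => numbers(200_000));

printf("Array: %.2f MB%s", $eagerBytes / 1024 / 1024, PHP_EOL);
printf("Generator: %.2f MB%s", $lazyBytes / 1024 / 1024, PHP_EOL);
```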
The Iterator API
Generators are quick and simple. But sometimes you need more control — preserving state, managing custom keys, or switching data sources dynamically.
That's where the Iterator API comes in.
Example: Custom Iterator
class RangeIterator implements Iterator {
    private int $start;
    private int $end;
    private int $current;
    public function __construct(int $start, int $end) {
        $this->start = $start;
        $this->end = $end;
        $this->current = $start;
    }
    public function current(): int {
        return $this->current;
    }
    public function key(): int {
        return $this->current;
    }
    public function next(): void {
        $this->current++;
    }
    public function rewind(): void {
        $this->current = $this->start;
    }
    public function valid(): bool {
        return $this->current <= $this->end;
    }
}
foreach (new RangeIterator(1, 5) as $num) {
    echo $num . PHP_EOL;
}
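One practical benefit of implementing Iterator is that the object plugs into PHP's built-in SPL iterators. The sketch below wraps a compact copy of the range class (renamed SimpleRange here so the example runs on its own) in CallbackFilterIterator and LimitIterator; the class name and the numbers are illustrative.

```php
<?php
// Compact stand-in for the RangeIterator above.
final class SimpleRange implements Iterator {
    private int $start;
    private int $end;
    private int $current;

    public function __construct(int $start, int $end) {
        $this->start = $start;
        $this->end = $end;
        $this->current = $start;
    }
    public function current(): int { return $this->current; }
    public function key(): int { return $this->current; }
    public function next(): void { $this->current++; }
    public function rewind(): void { $this->current = $this->start; }
    public function valid(): bool { return $this->current <= $this->end; }
}

// Keep even numbers only...
$evens = new CallbackFilterIterator(
    new SimpleRange(1, 1_000_000),
    fn (int $value): bool => $value % 2 === 0
);

// ...and stop after the first three matches.
$firstThree = iterator_to_array(new LimitIterator($evens, 0, 3), false);

echo implode(', ', $firstThree) . PHP_EOL;
```

The range claims a million values, but only the first handful are ever computed because each wrapper pulls values on demand.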
When to Use What?
Scenario | Recommended Approach |
---|---|
Need to stream data on demand | Generator |
Need complex logic or internal state | Iterator API |
Reading large files or DB streams | Generator |
Multiple iterations with preserved state | Iterator API |
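The last row of the table deserves a quick illustration: a generator is single-pass, so looping over a finished one throws an exception. A possible middle ground, sketched here under the assumption that re-running the source is acceptable, is an IteratorAggregate whose getIterator() returns a fresh generator each time. The names oneToThree and ReiterableRange are made up for this example.

```php
<?php
function oneToThree(): Generator {
    yield 1;
    yield 2;
    yield 3;
}

$generator = oneToThree();
$firstPass = iterator_to_array($generator, false);

$secondPassError = null;
try {
    foreach ($generator as $value) {
        // Never reached: a finished generator cannot be traversed again.
    }
} catch (Exception $e) {
    $secondPassError = $e->getMessage();
}

// Hypothetical wrapper: every loop gets a brand-new generator.
final class ReiterableRange implements IteratorAggregate {
    private int $start;
    private int $end;

    public function __construct(int $start, int $end) {
        $this->start = $start;
        $this->end = $end;
    }

    public function getIterator(): Generator {
        for ($i = $this->start; $i <= $this->end; $i++) {
            yield $i;
        }
    }
}

$range = new ReiterableRange(1, 3);
$passes = [
    iterator_to_array($range, false),
    iterator_to_array($range, false),
];

echo "Second pass over the generator failed: $secondPassError" . PHP_EOL;
echo "Passes over ReiterableRange: " . count($passes) . PHP_EOL;
```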
Real Production Case
We were parsing an automotive sales API that returned hundreds of thousands of records.
Originally, we loaded everything into an array — the script used 1–2 GB of memory.
After switching to a generator:
function fetchCars(): Generator {
    $page = 1;
    do {
        $data = apiRequest('cars', ['page' => $page]);
        foreach ($data['items'] as $car) {
            yield $car;
        }
        $page++;
    } while (!empty($data['items']));
}
📉 Memory usage dropped from 2 GB to just 10 MB — with no noticeable change in execution time.
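To see the pagination pattern run end to end, here is a sketch with a fake apiRequest() that serves five records in pages of two and counts its calls. The stub, the ApiStats class, and the record names are all invented for this example; a real apiRequest() would perform the HTTP call.

```php
<?php
// Hypothetical call counter for the fake API below.
final class ApiStats {
    public static int $requests = 0;
}

// Stand-in for apiRequest(): returns 5 fake records in pages of 2.
function apiRequest(string $resource, array $params): array {
    ApiStats::$requests++;
    $all = ['car-1', 'car-2', 'car-3', 'car-4', 'car-5'];
    return ['items' => array_slice($all, ($params['page'] - 1) * 2, 2)];
}

function fetchCars(): Generator {
    $page = 1;
    while (true) {
        $data = apiRequest('cars', ['page' => $page]);
        $items = $data['items'] ?? [];
        if ($items === []) {
            return; // an empty or missing page ends the stream
        }
        foreach ($items as $car) {
            yield $car;
        }
        $page++;
    }
}

$seen = [];
foreach (fetchCars() as $car) {
    $seen[] = $car;
    if (count($seen) === 3) {
        break; // stop early: later pages are never requested
    }
}

echo 'Fetched ' . count($seen) . ' cars with ' . ApiStats::$requests . ' requests' . PHP_EOL;
```

Only the pages that are actually consumed get requested, which is where both the memory savings and the option to stop early come from.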
🔍 For Geeks: How Generators Work Under the Hood
1. A Generator Is an Object
In PHP, a generator is an object of class Generator, which implements Iterator (and therefore Traversable).
It can:
- store function state;
- pause execution at yield;
- resume from the same spot.
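These points can be checked directly. In this sketch (the function runningTotal is a made-up example), the local variable $total keeps its value between yields, and the returned object passes the instanceof checks.

```php
<?php
// Hypothetical example: $total is ordinary local state that survives each pause.
function runningTotal(array $amounts): Generator {
    $total = 0;
    foreach ($amounts as $amount) {
        $total += $amount;
        yield $total;
    }
}

$generator = runningTotal([10, 20, 30]);

$isGenerator   = $generator instanceof Generator;
$isIterator    = $generator instanceof Iterator;
$isTraversable = $generator instanceof Traversable;

$totals = iterator_to_array($generator, false);

var_dump($isGenerator, $isIterator, $isTraversable);
echo implode(', ', $totals) . PHP_EOL;
```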
2. Execution Flow
- Calling the generator function doesn’t run the code; it returns a Generator object.
- The first foreach or current() call executes until the first yield.
- Each yield pauses execution and returns a value.
- next() resumes from the paused point.
- When the function ends, the generator is marked as completed.
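The flow above can be traced by driving a generator by hand instead of using foreach. This is a small sketch; the function steps and the $events log exist only for this illustration.

```php
<?php
$events = [];

// Hypothetical generator that logs each stage of its own execution.
function steps(array &$events): Generator {
    $events[] = 'body started';
    yield 'a';
    $events[] = 'resumed after a';
    yield 'b';
    $events[] = 'body finished';
    return 'done';
}

$generator = steps($events);
$events[] = 'generator created';    // the body has not run yet

$first = $generator->current();     // runs up to the first yield
$generator->next();                 // resumes, runs up to the second yield
$second = $generator->current();
$generator->next();                 // resumes, runs to the end of the function
$finished = !$generator->valid();   // true once the function has ended
$returned = $generator->getReturn();

echo implode(PHP_EOL, $events) . PHP_EOL;
```

Note that 'generator created' is logged before 'body started': calling steps() only builds the object.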
3. Zend Engine Internals
If you dump the opcodes of a generator function with VLD (Vulcan Logic Disassembler), each yield corresponds to a YIELD instruction that:
- saves the call stack;
- stores variable context;
- yields control back to the caller.
4. Compared to Arrays
- Arrays store all elements in memory at once.
- Generators hold only the current element (zval) and overwrite it with each step.
- You can process a million elements using just a few hundred kilobytes.
5. Infinite Generator Example
function counter(): Generator {
    $i = 0;
    while (true) {
        yield $i++;
    }
}
foreach (counter() as $num) {
    if ($num > 5) break;
    echo $num . PHP_EOL;
}
You couldn’t do this with an array — memory would run out quickly.
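Instead of a manual break, an infinite generator can also be capped with SPL's LimitIterator. The sketch below does this with a Fibonacci sequence; the fibonacci function is an illustrative example rather than code from the article's repository.

```php
<?php
// Hypothetical infinite generator: never terminates on its own.
function fibonacci(): Generator {
    [$a, $b] = [0, 1];
    while (true) {
        yield $a;
        [$a, $b] = [$b, $a + $b];
    }
}

// LimitIterator stops pulling values once ten have been read.
$firstTen = iterator_to_array(new LimitIterator(fibonacci(), 0, 10), false);

echo implode(', ', $firstTen) . PHP_EOL;
```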
🔗 Source Code
You can try everything yourself — the code used in this article is open source:
👉 github.com/phpner/phpner-php-lazy-evaluation-demo
It includes:
- CSV benchmark: array (eager) vs generator (lazy)
- NDJSON stream simulation
- Memory and time profiling
- CLI-friendly tools
- Sample data generator
- PHPUnit tests
📊 Benchmark results (1,000,001 rows)
Method | Time | Memory used | Peak diff | Rows |
---|---|---|---|---|
Array (eager) | 1.401s | 120 B | 395.92 MB | 1,000,001 |
Generator | 1.012s | 0 B | 0.00 MB | 1,000,001 |
These results show that lazy generators can significantly reduce memory usage when processing large datasets like CSVs.
🖼️ Visual results
[Chart: peak memory usage]
[Chart: execution time]
Feel free to fork the repo, try the benchmarks, or adapt it to your own data processing tasks!
Conclusion
✅ Generators and the Iterator API are must-haves for modern PHP development.
They let you process millions of records without running out of memory.
Use generators for simple streaming logic, and the Iterator API when you need more control.
💬 Do you use generators in production? Share your use cases in the comments!