DEV Community

Oleksandr Vasyliev

Lazy Evaluation in PHP: Real‑World Memory Savings with Generators

Introduction

Have you ever tried to load a CSV file with a million rows into memory and ended up with something like this?

PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted

Even with a higher memory_limit, the entire dataset still sits in memory while you process it line by line.

The solution? Lazy evaluation — a technique where data is generated and processed only when it's actually needed.

In PHP, you can achieve this in two ways: with generators (yield) and via the Iterator API. Let’s explore both.

What Is Lazy Evaluation?

Normally, when you create an array, PHP loads all elements into memory at once:

function getNumbersArray(int $count): array {
    $result = [];
    for ($i = 1; $i <= $count; $i++) {
        $result[] = $i;
    }
    return $result;
}

foreach (getNumbersArray(5) as $number) {
    echo $number . PHP_EOL;
}

Here, the entire array [1, 2, 3, 4, 5] is stored in memory.

Now let’s try a lazy approach:

function getNumbersGenerator(int $count): Generator {
    for ($i = 1; $i <= $count; $i++) {
        yield $i;
    }
}

foreach (getNumbersGenerator(5) as $number) {
    echo $number . PHP_EOL;
}

📌 The key difference: the generator doesn't store everything — it yields one element at a time, as foreach requests it.
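
To make the "one element at a time" behaviour visible, here is a minimal sketch that logs when the producer and the consumer each run. The names `tracedNumbers` and `$log` are made up for this illustration and are not part of the article's code.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration: records each step in $log
// so the order of producer and consumer activity can be inspected.
function tracedNumbers(int $count, array &$log): Generator {
    for ($i = 1; $i <= $count; $i++) {
        $log[] = "produce $i";
        yield $i;
    }
}

$log = [];
foreach (tracedNumbers(3, $log) as $number) {
    $log[] = "consume $number";
}

// The log alternates: produce 1, consume 1, produce 2, consume 2, ...
echo implode(PHP_EOL, $log) . PHP_EOL;
```

With an array-returning version, all the "produce" entries would come first and the "consume" entries after them.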

Reading a Huge CSV Without Pain

Imagine you have a 2 GB data.csv file. If you load it with file(), or collect every fgetcsv() row into an array, you'll likely run out of memory.

But with a generator, it's easy:

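function readCsv(string $filename): Generator {
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open file $filename");
    }

    try {
        // fgetcsv() parses one line per call and returns false at end of file
        while (($row = fgetcsv($handle)) !== false) {
            yield $row;
        }
    } finally {
        // Runs even if the caller stops iterating early
        fclose($handle);
    }
}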

foreach (readCsv('data.csv') as $row) {
    // Process the row
    // For example: echo implode(', ', $row) . PHP_EOL;
}

📊 Memory usage: even with a 2 GB CSV, only one row is stored in memory at a time, so usage stays in the kilobytes.
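
Because each row is produced on demand, generators can also be chained into a lazy pipeline. The sketch below is one possible way to do that: `csvRows`, `filterRows` and `takeRows` are hypothetical helper names, and a tiny temporary file stands in for the 2 GB CSV.

```php
<?php
declare(strict_types=1);

// Same idea as readCsv() above. Passing the escape argument explicitly
// ('' requires PHP 7.4+) avoids relying on its default value.
function csvRows(string $filename): Generator {
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open file $filename");
    }
    try {
        while (($row = fgetcsv($handle, 0, ',', '"', '')) !== false) {
            yield $row;
        }
    } finally {
        fclose($handle);
    }
}

// Lazily keep only the rows for which $predicate returns true.
function filterRows(iterable $rows, callable $predicate): Generator {
    foreach ($rows as $row) {
        if ($predicate($row)) {
            yield $row;
        }
    }
}

// Lazily stop after $limit rows; the source is not read any further.
function takeRows(iterable $rows, int $limit): Generator {
    if ($limit <= 0) {
        return;
    }
    foreach ($rows as $row) {
        yield $row;
        if (--$limit === 0) {
            return;
        }
    }
}

// Demo data: a small temporary CSV standing in for the large file.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id,city\n1,Kyiv\n2,Lviv\n3,Kyiv\n4,Kyiv\n");

$kyivRows = filterRows(csvRows($path), fn(array $row): bool => ($row[1] ?? null) === 'Kyiv');

$firstTwo = [];
foreach (takeRows($kyivRows, 2) as $row) {
    $firstTwo[] = $row;
}

unset($kyivRows); // destroys the suspended generators, which closes the file
unlink($path);

print_r($firstTwo);
```

No intermediate array of matching rows is ever built, and reading stops as soon as the second match has been delivered.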

Benchmark: Array vs Generator

$startMemory = memory_get_usage();

$array = range(1, 1_000_000); // creates an array with a million numbers

echo "Array: " . (memory_get_usage() - $startMemory) / 1024 / 1024 . " MB" . PHP_EOL;

unset($array);

$startMemory = memory_get_usage();

function bigGenerator(): Generator {
    for ($i = 1; $i <= 1_000_000; $i++) {
        yield $i;
    }
}

foreach (bigGenerator() as $n) {
    // Just iterate
}

echo "Generator: " . (memory_get_usage() - $startMemory) / 1024 / 1024 . " MB" . PHP_EOL;
Enter fullscreen mode Exit fullscreen mode

On my machine:

Array: 120 MB
Generator: 0.5 MB
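
memory_get_usage() reports what is allocated at the moment of the call, so a reading taken after a loop has finished can miss the high-water mark. As a complementary sketch, the helper below compares peak memory instead. `peakGrowth` and `lazyNumbers` are hypothetical names, and memory_reset_peak_usage() is only available on PHP 8.2 or later, so the code guards for it.

```php
<?php
declare(strict_types=1);

// Returns how far peak memory rose (in bytes) while $task ran.
// Without memory_reset_peak_usage() (PHP < 8.2) the result is only
// meaningful if $task exceeds every earlier peak in the script.
function peakGrowth(callable $task): int {
    if (function_exists('memory_reset_peak_usage')) {
        memory_reset_peak_usage();
    }
    $before = memory_get_peak_usage();
    $task();
    return memory_get_peak_usage() - $before;
}

function lazyNumbers(int $count): Generator {
    for ($i = 1; $i <= $count; $i++) {
        yield $i;
    }
}

$count = 1_000_000;

// The generator runs first so the fallback path (no reset) stays fair.
$generatorBytes = peakGrowth(function () use ($count): void {
    foreach (lazyNumbers($count) as $n) {
        // just iterate
    }
});

$arrayBytes = peakGrowth(function () use ($count): void {
    $numbers = range(1, $count);
    foreach ($numbers as $n) {
        // just iterate
    }
});

printf("Generator peak growth: %.2f MB%s", $generatorBytes / 1048576, PHP_EOL);
printf("Array peak growth:     %.2f MB%s", $arrayBytes / 1048576, PHP_EOL);
```

The absolute numbers depend on the PHP version and build, so treat them as relative indicators rather than fixed figures.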

The Iterator API

Generators are quick and simple. But sometimes you need more control — preserving state, managing custom keys, or switching data sources dynamically.

That's where the Iterator API comes in.

Example: Custom Iterator

class RangeIterator implements Iterator {
    private int $start;
    private int $end;
    private int $current;

    public function __construct(int $start, int $end) {
        $this->start = $start;
        $this->end = $end;
        $this->current = $start;
    }

    public function current(): int {
        return $this->current;
    }

    public function key(): int {
        return $this->current;
    }

    public function next(): void {
        $this->current++;
    }

    public function rewind(): void {
        $this->current = $this->start;
    }

    public function valid(): bool {
        return $this->current <= $this->end;
    }
}

foreach (new RangeIterator(1, 5) as $num) {
    echo $num . PHP_EOL;
}

When to Use What?

| Scenario | Recommended Approach |
| --- | --- |
| Need to stream data on demand | Generator |
| Need complex logic or internal state | Iterator API |
| Reading large files or DB streams | Generator |
| Multiple iterations with preserved state | Iterator API |
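
One practical reason behind the "multiple iterations" row: a generator object can be traversed only once. The sketch below shows that limitation and one common workaround, a class implementing IteratorAggregate whose getIterator() method is itself a generator. `oneToThree` and `NumberRange` are hypothetical names used only for this illustration.

```php
<?php
declare(strict_types=1);

function oneToThree(): Generator {
    yield 1;
    yield 2;
    yield 3;
}

// A generator object can be traversed only once.
$generator = oneToThree();
$firstPass = iterator_to_array($generator);

$secondPassError = null;
try {
    foreach ($generator as $value) {
        // never reached: the generator is already finished
    }
} catch (Exception $e) {
    $secondPassError = $e->getMessage();
}

// Hypothetical wrapper: every foreach calls getIterator(),
// which starts a brand-new generator.
final class NumberRange implements IteratorAggregate {
    private int $start;
    private int $end;

    public function __construct(int $start, int $end) {
        $this->start = $start;
        $this->end = $end;
    }

    public function getIterator(): Generator {
        for ($i = $this->start; $i <= $this->end; $i++) {
            yield $i;
        }
    }
}

$range = new NumberRange(1, 3);
$passes = [iterator_to_array($range), iterator_to_array($range)];

echo 'Second pass over the generator failed: ' . $secondPassError . PHP_EOL;
echo 'Passes over NumberRange: ' . json_encode($passes) . PHP_EOL;
```

This keeps the brevity of yield while still allowing the object to be iterated repeatedly, which may be enough when a full hand-written Iterator is not needed.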

Real Production Case

We were parsing an automotive sales API that returned hundreds of thousands of records.

Originally, we loaded everything into an array — the script used 1–2 GB of memory.

After switching to a generator:

function fetchCars(): Generator {
    $page = 1;
    do {
        $data = apiRequest('cars', ['page' => $page]);
        foreach ($data['items'] as $car) {
            yield $car;
        }
        $page++;
    } while (!empty($data['items']));
}

📉 Memory usage dropped from 2 GB to just 10 MB — with no noticeable change in execution time.
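
The pagination pattern can be tried without a real API. In the sketch below, `$fetchPage` is a stand-in for the apiRequest() call, and the fake API, its response shape, and the car names are all invented for the demo. It also shows a side benefit of laziness: pages that the consumer never reaches are never requested.

```php
<?php
declare(strict_types=1);

// $fetchPage stands in for apiRequest('cars', ['page' => $page]).
// Assumption: it returns ['items' => [...]] and an empty list past the last page.
function fetchCars(callable $fetchPage): Generator {
    $page = 1;
    while (true) {
        $data = $fetchPage($page);
        $items = $data['items'] ?? [];
        if ($items === []) {
            return; // no more pages
        }
        foreach ($items as $car) {
            yield $car;
        }
        $page++;
    }
}

// Fake API for the demo: three pages of two cars each, with a call log.
$requestedPages = [];
$fakeApi = function (int $page) use (&$requestedPages): array {
    $requestedPages[] = $page;
    $pages = [
        1 => ['Audi A4', 'BMW 320'],
        2 => ['Ford Focus', 'Honda Civic'],
        3 => ['Kia Ceed', 'Mazda 3'],
    ];
    return ['items' => $pages[$page] ?? []];
};

$seen = [];
foreach (fetchCars($fakeApi) as $car) {
    $seen[] = $car;
    if (count($seen) === 3) {
        break; // stop early: page 3 is never requested
    }
}

echo 'Cars seen: ' . implode(', ', $seen) . PHP_EOL;
echo 'Pages requested: ' . implode(', ', $requestedPages) . PHP_EOL;
```

Only one page of results is held in memory at a time, which is where the savings in a case like this come from.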

🔍 For Geeks: How Generators Work Under the Hood

1. A Generator Is an Object

In PHP, a generator is an object of class Generator, implementing both Iterator and Traversable.

It can:

  • store function state;
  • pause execution at yield;
  • resume from the same spot.

2. Execution Flow

  • Calling the generator function doesn’t run the code — it returns a Generator object.
  • The first foreach or current() call executes until the first yield.
  • Each yield pauses execution and returns a value.
  • next() resumes from the paused point.
  • When the function ends, the generator is marked as completed.
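
The steps above can be followed by driving a generator by hand instead of with foreach. This is a small sketch; the function name `steps` and the `$log` array are made up for the illustration.

```php
<?php
declare(strict_types=1);

function steps(array &$log): Generator {
    $log[] = 'started';
    yield 'a';
    $log[] = 'resumed after a';
    yield 'b';
    $log[] = 'finished';
}

$log = [];
$gen = steps($log);
$log[] = 'created';          // nothing inside steps() has run yet

$first = $gen->current();    // runs the body up to the first yield
$gen->next();                // resumes and runs up to the second yield
$second = $gen->current();
$gen->next();                // resumes and runs to the end of the function
$done = !$gen->valid();      // true: the generator has completed

echo implode(' -> ', $log) . PHP_EOL;
```

The log starts with "created", which confirms that calling steps() only builds the Generator object and runs none of the function body.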

3. Zend Engine Internals

If you dump a generator function's opcodes with VLD (Vulcan Logic Disassembler), each yield shows up as a YIELD instruction (ZEND_YIELD in the engine) that:

  • saves the call stack;
  • stores variable context;
  • yields control back to the caller.

4. Compared to Arrays

  • Arrays store all elements in memory at once.
  • Generators hold only the current element (zval) and overwrite it with each step.
  • You can process a million elements using just a few hundred kilobytes.

5. Infinite Generator Example

function counter(): Generator {
    $i = 0;
    while (true) {
        yield $i++;
    }
}

foreach (counter() as $num) {
    if ($num > 5) break;
    echo $num . PHP_EOL;
}

You couldn’t do this with an array — memory would run out quickly.
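
As an alternative to breaking out of the loop by hand, SPL's LimitIterator can cap an infinite generator. This is a minimal sketch that reuses the counter() function from above.

```php
<?php
declare(strict_types=1);

function counter(): Generator {
    $i = 0;
    while (true) {
        yield $i++;
    }
}

// LimitIterator (SPL) starts at offset 0 and stops after 5 items.
$firstFive = [];
foreach (new LimitIterator(counter(), 0, 5) as $num) {
    $firstFive[] = $num;
}

echo implode(', ', $firstFive) . PHP_EOL;
```

The limit lives in the consumer, so counter() itself stays unbounded and reusable with other limits.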

🔗 Source Code

You can try everything yourself — the code used in this article is open source:

👉 github.com/phpner/phpner-php-lazy-evaluation-demo

It includes:

  • CSV benchmark: array (eager) vs generator (lazy)
  • NDJSON stream simulation
  • Memory and time profiling
  • CLI-friendly tools
  • Sample data generator
  • PHPUnit tests

📊 Benchmark results (1,000,001 rows)

| Method | Time | Memory used | Peak diff | Rows |
| --- | --- | --- | --- | --- |
| Array (eager) | 1.401s | 120 B | 395.92 MB | 1,000,001 |
| Generator | 1.012s | 0 B | 0.00 MB | 1,000,001 |

These results show that lazy generators can significantly reduce memory usage when processing large datasets like CSVs.

🖼️ Visual results

📈 Peak memory usage: [chart: Peak memory]

📉 Execution time: [chart: Execution time]

Feel free to fork the repo, try the benchmarks, or adapt it to your own data processing tasks!

Conclusion

✅ Generators and the Iterator API are must-haves for modern PHP development.

They let you process millions of records without running out of memory.

Use generators for simple streaming logic, and the Iterator API when you need more control.

💬 Do you use generators in production? Share your use cases in the comments!

Top comments (1)

grant horwood

i use generators a lot, and this is a solid, concise and well-written article. thumbs up applied!