DEV Community

Aaron Rose

The Secret Life of Python: The Lazy Baker

Why your Python code runs out of memory, and how yield fixes it.


The library smelled faintly of ozone and burning plastic. Timothy was sitting in front of his laptop, but he wasn't typing. He was staring blankly at a spinning cursor. The fan was screaming.

"It crashed again," Timothy groaned.

Margaret marked her place in her book and looked over. "What are you trying to build?"

"I need to simulate a dataset," Timothy explained. "I need one hundred million random sensor readings to test my new analysis script. I wrote a function to generate them and put them in a list."

He pointed to the frozen code on his screen.

# Timothy's "Eager" Baker
def produce_readings(n):
    results = []
    print("Starting production...")
    for i in range(n):
        # Imagine complex calculation here
        results.append(i * 1.5) 
    print("Production complete!")
    return results

# Asking for 100 million items
data = produce_readings(100_000_000)

# process() is Timothy's analysis function, defined elsewhere
for reading in data:
    process(reading)


"I hit Run," Timothy said, "and then the computer freezes. It never even gets to the 'process' loop. It just runs out of memory."

Margaret nodded knowingly. "You have hired the Eager Baker."

The Warehouse Problem

"Imagine a bakery," Margaret began. "You walk in and ask for ten thousand loaves of bread."

"The Eager Baker," she continued, "says 'Yes, sir!' He immediately mixes ten thousand dough balls. He bakes ten thousand loaves. He piles them up. He fills the counter, then the floor, then the back room. The bakery is bursting. He cannot give you a single loaf until all ten thousand are finished."

"That is what your list is doing," Margaret said. "You asked for one hundred million numbers. Python is trying to build them all, store them all in RAM, and then let you use the first one. Your computer does not have a warehouse big enough."

"So I can't process big data?" Timothy asked.

"You can," Margaret smiled. "But you must fire the Eager Baker. You need the Lazy Baker."

The Magic of Yield

Margaret took the keyboard. "The Lazy Baker is smart. When you ask for ten thousand loaves, he does not bake them. He simply remembers the recipe."

She deleted Timothy’s list and wrote a strange new function.

# Margaret's "Lazy" Baker (Generator)
def produce_readings_lazy(n):
    print("Starting production...")
    for i in range(n):
        yield i * 1.5
    print("Production complete!")


"Where is the list?" Timothy asked. "Where is results.append?"

"There is no list," Margaret said. "There is only yield. This keyword turns the function into a Generator."

"Watch what happens when we ask for the data."

# Create the generator object
baker = produce_readings_lazy(100_000_000)

print(f"Baker object: {baker}")

# Ask for the first loaf
print(next(baker))

# Ask for the second loaf
print(next(baker))


Output:

Baker object: <generator object produce_readings_lazy at 0x...>
Starting production...
0.0
1.5


"Do you see?" Margaret whispered. "When we created baker, nothing happened. None of the function's code ran. Not a single reading was computed. The Baker was waiting."

"When we called next(baker)," she explained, "he woke up. He ran until he hit yield. He handed you one value, and then he paused. He froze his state right there, remembering exactly where he left off."
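You can watch the pause happen with a tiny generator of your own. This is a minimal sketch (the print statements are just for tracing), showing that each next() runs the function only as far as the next yield:

```python
def chatty_baker():
    print("Mixing dough...")
    yield "loaf 1"
    print("Back at the oven...")
    yield "loaf 2"

baker = chatty_baker()
print(next(baker))  # prints "Mixing dough..." and then "loaf 1"
print(next(baker))  # prints "Back at the oven..." and then "loaf 2"
```

The second print between the yields only runs when you come back for the second loaf — proof that the function froze mid-body and resumed exactly where it left off.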

The Memory Audit

"So he bakes them one at a time?" Timothy asked.

"On demand," Margaret confirmed. "He bakes one, hands it to you, and waits for you to come back. He never stores more than one item at a time."

"Prove it," Timothy said, his skepticism returning. "Show me the memory difference."

Margaret imported the sys module to weigh the objects.

import sys

# The Eager List (Small scale to avoid crashing)
eager_list = [i * 1.5 for i in range(1_000_000)]

# The Lazy Generator (Same scale)
lazy_gen = (i * 1.5 for i in range(1_000_000))

print(f"List Size in Bytes:      {sys.getsizeof(eager_list)}")
print(f"Generator Size in Bytes: {sys.getsizeof(lazy_gen)}")


She hit Run.

List Size in Bytes:      8448728
Generator Size in Bytes: 104


Timothy stared. "The list is 8 megabytes. The generator is... 104 bytes? That's smaller than a tweet."

"Because the list holds a million numbers," Margaret said. "The generator only holds the recipe for how to get the next number. It doesn't matter if you ask for a million items or a trillion."
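You can check that claim directly: a generator for ten items weighs the same as one for a trillion, because both store only the recipe (exact byte counts vary slightly between Python versions):

```python
import sys

small_order = (i * 1.5 for i in range(10))
huge_order = (i * 1.5 for i in range(10**12))

# Same size either way: the generator stores a recipe, not results
print(sys.getsizeof(small_order), sys.getsizeof(huge_order))
```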

The Infinite & The Chain

"He can even bake forever," Margaret added. "If you wrote while True: yield n, he would never run out of dough. He would bake infinitely, as long as you only asked for one at a time."
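An infinite baker looks like this. `itertools.islice` lets you take just a few loaves from an endless stream without ever materializing it (a minimal sketch):

```python
from itertools import islice

def endless_readings():
    n = 0
    while True:        # the Lazy Baker never stops...
        yield n * 1.5
        n += 1

# ...but memory stays flat, because we only take what we need
print(list(islice(endless_readings(), 5)))  # [0.0, 1.5, 3.0, 4.5, 6.0]
```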

"Is there a catch?"

"One catch," Margaret warned. "The Lazy Baker has no shelf space. Once he hands you a loaf, he forgets it. You cannot ask for 'loaf number 5' again unless you restart the whole process. Generators are a one-way street."
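The one-way street is easy to demonstrate: once a generator is exhausted, it stays empty.

```python
gen = (x * 2 for x in range(3))

print(list(gen))  # [0, 2, 4] -- the first pass eats every loaf
print(list(gen))  # []        -- the shelf is bare on the second pass
```

If you need to walk the data twice, either recreate the generator or materialize it into a list once.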

"But," she noted, typing quickly, "you can chain them together into pipelines."

# The Pipeline: data flows through, one item at a time
# (read_sensor and process stand in for Timothy's real functions)
raw_data = (read_sensor() for _ in range(1_000_000))
clean_data = (x for x in raw_data if x > 0)
final_data = (process(x) for x in clean_data)


"No lists," Timothy marveled. "Just a stream."
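Here is a runnable version of the same pipeline, with `read_sensor` and `process` stubbed out as hypothetical stand-ins for Timothy's real functions. Note that defining the three generators computes nothing; work only happens when values are pulled through the chain:

```python
import random
from itertools import islice

def read_sensor():
    # Hypothetical stand-in for a real sensor read
    return random.uniform(-1.0, 1.0)

def process(x):
    # Hypothetical stand-in for Timothy's analysis step
    return round(x, 3)

raw_data = (read_sensor() for _ in range(1_000_000))
clean_data = (x for x in raw_data if x > 0)
final_data = (process(x) for x in clean_data)

# Nothing has been computed yet; pull three values through the whole chain
print(list(islice(final_data, 3)))
```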

Margaret’s Cheat Sheet

Margaret opened her notebook to a new page titled "Lazy Evaluation."

  • The Eager Approach (Lists): Computes everything upfront. Fast to access repeatedly (my_list[5]), but eats memory.
  • The Lazy Approach (Generators): Computes one item at a time.
  • The yield Keyword: Pauses a function and returns a value, saving its state.
  • Generator Expressions: (x for x in data). Like list comprehensions, but with ().
  • The Trade-off: Generators save massive amounts of memory, but they are "Single Use Only." You cannot index them (gen[0]) or iterate over them twice.
  • Use Cases: Massive datasets, infinite streams, and data pipelines.
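The cheat sheet fits in two lines: a generator expression plugs straight into a function like sum() with no list in between — just drop the brackets.

```python
# Eager: builds the full million-element list in memory before summing
total_eager = sum([i * 1.5 for i in range(1_000_000)])

# Lazy: streams values into sum() one at a time
total_lazy = sum(i * 1.5 for i in range(1_000_000))

print(total_eager == total_lazy)  # True -- same answer, different memory bill
```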

Timothy swapped his eager function for the Lazy Baker, replaced his brackets with parentheses, and hit Run. The fan on his laptop instantly quieted down. The data began to flow.

"I never thought I'd say this," Timothy grinned. "But I love laziness."

"Efficiency, Timothy," Margaret corrected, sipping her tea. "We call it efficiency."


In the next episode, Margaret and Timothy will face "The Magic Bookshelf"—where they discover a data structure that doesn't index by numbers, but by anything you want.


Aaron Rose is a software engineer and technology writer at tech-reader.blog and the author of Think Like a Genius.
