A 400MB Memory Leak from 12 Lines of Cache Code
I watched a production service climb from 200MB to 4GB over 6 hours. The culprit? A dictionary-based cache that never forgot anything.
```python
# The silent killer
class ImageProcessor:
    _cache = {}  # This grows forever

    @classmethod
    def get_processed(cls, image_id, raw_image):
        if image_id not in cls._cache:
            cls._cache[image_id] = expensive_process(raw_image)
        return cls._cache[image_id]
```
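To see why this leaks, here's a minimal sketch (names hypothetical, with a trivial stand-in for the expensive processing step). Every result stays pinned in the dict even after the caller is done with it:

```python
import gc

class ProcessedImage:
    """Stand-in for an expensive processing result."""

cache = {}

def get_processed(image_id):
    if image_id not in cache:
        cache[image_id] = ProcessedImage()  # stand-in for expensive_process
    return cache[image_id]

for i in range(1000):
    img = get_processed(i)
    del img      # the caller discards its reference...

gc.collect()
print(len(cache))  # ...but all 1000 results are still strongly referenced
```

The dict itself is the strong reference, so garbage collection can never reclaim a single entry.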
The fix was three lines. Here's the working version first, then I'll explain why the naive approach fails so spectacularly:
```python
import weakref

class ImageProcessor:
    _cache = weakref.WeakValueDictionary()

    @classmethod
    def get_processed(cls, image_id, processed_image):
        # Only caches while the caller holds a reference
        cls._cache[image_id] = processed_image
        return processed_image

    @classmethod
    def get_cached(cls, image_id):
        return cls._cache.get(image_id)  # Returns None if GC'd
```
That's it. A WeakValueDictionary holds only weak references to its values, so it never prevents garbage collection. When the last strong reference to a cached object disappears, the entry evicts itself. No manual cleanup, no LRU bookkeeping, no TTL tracking.
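The eviction behavior is easy to verify directly. A minimal sketch (class name hypothetical; note that instances of many builtins like `list`, `dict`, and `bytes` can't be weakly referenced, so a user-defined class is used here):

```python
import gc
import weakref

class ProcessedImage:
    """Instances of user-defined classes support weak references."""

cache = weakref.WeakValueDictionary()

img = ProcessedImage()
cache["thumb-1"] = img
assert cache.get("thumb-1") is img  # alive while a strong reference exists

del img       # drop the last strong reference
gc.collect()  # CPython frees it immediately; collect() helps on other implementations

print(cache.get("thumb-1"))  # None -- the entry evicted itself
```

If you need to cache objects that don't support weak references, a bounded cache such as `functools.lru_cache` is the usual alternative.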