DEV Community

Nithin Bharadwaj

**Python Memory Management: Practical Techniques to Optimize Performance and Prevent Leaks**


Let's talk about Python and memory. I've been there—watching an application slow to a crawl, seeing the system memory fill up, and wondering where it all went wrong. Python makes it easy to build things quickly, but that convenience can come at a cost when you're dealing with a lot of data or need your program to run for a long time. It's like having a wonderfully organized but sometimes forgetful assistant.

The good news is you can take control. You can understand where memory goes and make smart choices to keep your programs fast and lean. I want to share some practical ways to do that. We'll start by figuring out how to see what's happening, then look at ways to make things better.

First, you need to see the problem before you can fix it. Python has a built-in tool called tracemalloc. It's like turning on a security camera inside your program's memory warehouse. It shows you exactly where memory is being allocated, line by line.

You start it, take snapshots at different points in your code, and then compare them. Did creating that new list of a million items use more memory than you thought? The snapshot will tell you. Is memory slowly increasing over time, suggesting a leak? The comparison will point to the file and line number where the allocations are happening that aren't being cleaned up.

Here's a simple way to use it. I often wrap it in a class to keep things tidy.

```python
import tracemalloc

class MemoryProfiler:
    def __init__(self):
        tracemalloc.start()
        self.snapshots = []

    def take_snapshot(self, label):
        snapshot = tracemalloc.take_snapshot()
        self.snapshots.append((label, snapshot))
        print(f"Snapshot '{label}' taken.")
        return snapshot

    def compare(self, label1, label2):
        # Find the two snapshots by label
        snap1 = None
        snap2 = None
        for label, snap in self.snapshots:
            if label == label1:
                snap1 = snap
            if label == label2:
                snap2 = snap

        if snap1 and snap2:
            print(f"\nComparing '{label2}' to '{label1}':")
            stats = snap2.compare_to(snap1, 'lineno')
            for stat in stats[:5]:  # Show top 5 differences
                print(f"{stat.size_diff / 1024:.1f} KiB added at {stat.traceback}")

# Use it in your code
profiler = MemoryProfiler()
profiler.take_snapshot("start")

# Do something that uses memory
my_big_list = [i for i in range(1000000)]

profiler.take_snapshot("after list creation")
profiler.compare("start", "after list creation")
```

This will print out where the memory for that list of a million integers came from. It's the first step. You can't optimize what you can't measure.

Now, let's say your profiling shows you are creating and destroying a lot of small, short-lived objects. Every time you create an object in Python, it asks the system for a little piece of memory. This request has some overhead. If you're doing it thousands of times a second, that overhead adds up.

One way to handle this is with an object pool. Instead of constantly asking for new memory and then throwing it away, you keep a pool of pre-created objects. When you need one, you check it out from the pool. When you're done, you return it. The memory is reused.

Think of it like a library of tools. Instead of buying a new hammer every time you need one and throwing it away after, you just borrow the library's hammer and return it when you're done.

```python
class ObjectPool:
    def __init__(self, create_func, initial_size=10):
        self.create_func = create_func
        self.pool = []
        for _ in range(initial_size):
            self.pool.append(self.create_func())

    def get(self):
        if self.pool:
            return self.pool.pop()  # Take one from the pool
        else:
            return self.create_func()  # Pool is empty, make a new one

    def put(self, obj):
        # Optional: reset the object's state here
        self.pool.append(obj)

# Example: A simple connection object
def create_connection():
    return {"connected": False, "data": None}

pool = ObjectPool(create_connection, initial_size=5)

# Get connections from the pool
conn1 = pool.get()
conn2 = pool.get()
print(f"Pool size after getting 2: {len(pool.pool)}")  # Will be 3

# Use them...
conn1["connected"] = True

# Return them when done
pool.put(conn1)
pool.put(conn2)
print(f"Pool size after returning: {len(pool.pool)}")  # Back to 5
```

This is very useful for things like network connections, database cursors, or any object that is expensive to create. The memory for those objects is allocated once at the start and just circles around.

Another common source of memory use is data duplication. Sometimes you have a large piece of data and you copy it when you don't really need to. Slicing a list creates a new list, and so do concatenation and sorted(). (Passing an object to a function does not copy it, though—Python passes references, so the cost comes from operations that build new objects, not from the call itself.)

The key is to use views instead of copies where possible. Libraries like NumPy are brilliant at this. A slice of a NumPy array is just a view into the original data, not a copy.

```python
import numpy as np

# Create a large array
big_array = np.arange(10000000)  # 10 million numbers
print(f"Original array size: {big_array.nbytes / 1024 / 1024:.2f} MB")

# A slice is a VIEW, not a copy
slice_view = big_array[5000000:5000010]
print(f"Slice is a view: {slice_view.base is big_array}")  # True

# Changing the view changes the original
slice_view[0] = 9999
print(big_array[5000000])  # This is now 9999

# If you need an actual copy, be explicit
slice_copy = big_array[5000000:5000010].copy()
print(f"Copy is a view: {slice_copy.base is big_array}")  # False
```

For your own data structures, think about whether you can share data. Could you pass an index or a reference instead of the whole object? For large, immutable data, look at sys.getrefcount() to understand how many references you have. Too many copies floating around is a sure way to fill up memory.
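Here's a small sketch of what sharing looks like in practice. Note that sys.getrefcount is CPython-specific, and its result includes the temporary reference created by the call itself, so the exact number is informative rather than precise:

```python
import sys

payload = list(range(1_000_000))  # one large object

# Storing it in another structure (or passing it to a function) adds a
# reference, not a copy -- every name points at the same list.
records = {"a": payload, "b": payload}
print(records["a"] is records["b"])  # True: shared, not duplicated

# getrefcount includes the temporary reference made by the call itself,
# so here we see at least: payload, records["a"], records["b"], + temp.
print(sys.getrefcount(payload))
```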

Python has a built-in janitor called the garbage collector. Its job is to find objects that are no longer needed and free up their memory. Most of the time, it works fine on its own. But sometimes it needs a nudge, or you need to understand its behavior to avoid problems.

The main issue it solves is the circular reference. This is when object A has a reference to object B, and object B has a reference back to object A. Even if nothing else in your program is using them, they're keeping each other alive. The garbage collector can find and break these cycles.

You can interact with it using the gc module.

```python
import gc

# See the current state
print(f"Garbage collector is enabled: {gc.isenabled()}")
print(f"Current counts (Gen 0, Gen 1, Gen 2): {gc.get_count()}")
print(f"Collection thresholds: {gc.get_threshold()}")

# Create a deliberate circular reference
class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

root = Node("root")
child = Node("child")
root.children.append(child)
child.parent = root  # Circular reference created!

# Now, let's delete our references to this little family
del root
del child

# The objects still exist in memory because they reference each other.
# The garbage collector can find them.
print(f"\nObjects collected: {gc.collect()}")
# gc.collect() returns the number of unreachable objects it found.

# You can also manually control how often it runs.
# The default thresholds (700, 10, 10) mean:
# - Run a Gen 0 collection after 700 net new object allocations.
# - Run a Gen 1 collection after 10 Gen 0 collections.
# - Run a Gen 2 collection after 10 Gen 1 collections.
# For a long-running service, you might increase these to reduce CPU overhead.
# gc.set_threshold(1000, 15, 15)
```

For most applications, you don't need to touch the garbage collector. But if you're building a long-running service like a web server, you might profile it and find that the garbage collector is running too often and using CPU time. Tweaking the thresholds can help. Conversely, if you have a batch process that creates millions of temporary objects, forcing a collection at key points can prevent your memory usage from ballooning.
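As a rough sketch of that batch-process idea—the `collect_every` checkpoint interval here is a made-up tuning knob you'd pick by profiling, not a recommendation:

```python
import gc

def process_batches(batches, collect_every=10):
    """Process batches of data, forcing a full GC at periodic checkpoints."""
    results = []
    for i, batch in enumerate(batches, start=1):
        # Stand-in for real work that creates many short-lived objects
        temporaries = [{"value": x, "squared": x * x} for x in batch]
        results.append(sum(t["squared"] for t in temporaries))
        if i % collect_every == 0:
            gc.collect()  # reclaim any leftover cycles before the next batch
    return results

batches = [range(1000)] * 25
print(len(process_batches(batches)))  # 25
```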

Sometimes, the standard ways Python stores data are not the most efficient for your specific case. A list of a million integers has overhead for each integer object. A dictionary, while incredibly fast, also has overhead.

This is where the built-in array module or libraries like numpy come in. They store data in tight, contiguous blocks of memory, just like languages such as C do. This uses far less memory; and with NumPy, which operates on those blocks directly in compiled code, numerical operations get much faster too.

```python
import array
import sys

# List of integers
int_list = list(range(1000000))
print(f"Size of list object: {sys.getsizeof(int_list) / 1024 / 1024:.2f} MB")
# The real footprint is much higher: each integer is a separate object
# with its own per-object overhead.

# Array of integers: one contiguous block of C ints
int_array = array.array('i', range(1000000))  # 'i' is a signed C int
print(f"Size of array: {sys.getsizeof(int_array) / 1024 / 1024:.2f} MB")
# For array.array, sys.getsizeof includes the data buffer itself.

# Compare performance
import timeit
time_list = timeit.timeit('sum(lst)', setup='lst = list(range(10000))', number=1000)
time_array = timeit.timeit('sum(arr)', setup="import array; arr = array.array('i', range(10000))", number=1000)
print(f"\nSum time, list: {time_list:.4f}s")
print(f"Sum time, array: {time_array:.4f}s")
# Note: sum() over an array.array can actually be slower than over a list,
# because each C int must be boxed into a Python int as it's read.
# The array module's win is memory; for fast math, reach for NumPy.
```

For custom objects, you can use __slots__. Normally, a Python object stores its attributes in a dictionary called __dict__. This is flexible—you can add attributes anytime—but it uses memory. If you have a simple object where you know all the attributes ahead of time, using __slots__ tells Python not to create the __dict__. This can save a significant amount of memory if you have millions of these objects.

```python
class RegularUser:
    def __init__(self, user_id, name):
        self.user_id = user_id
        self.name = name

class SlotUser:
    __slots__ = ['user_id', 'name']
    def __init__(self, user_id, name):
        self.user_id = user_id
        self.name = name

# Create a lot of them
users_regular = [RegularUser(i, f"User{i}") for i in range(100000)]
users_slots = [SlotUser(i, f"User{i}") for i in range(100000)]

# Check memory (roughly)
import sys
total_regular = sum(sys.getsizeof(u) + sys.getsizeof(u.__dict__) for u in users_regular[:1000])
total_slots = sum(sys.getsizeof(u) for u in users_slots[:1000])
print(f"Approx memory for 1000 regular users: {total_regular / 1024:.1f} KiB")
print(f"Approx memory for 1000 slot users: {total_slots / 1024:.1f} KiB")

# Note: A slot user cannot have new attributes added dynamically.
# users_slots[0].new_attr = 5  # This would raise an AttributeError.
```

This is a trade-off. You gain memory efficiency but lose the flexibility of adding arbitrary attributes. Use it for data-carrying objects in large collections.

When you're working with truly massive datasets that don't fit in memory, you need a different strategy. You process the data in chunks. Read a piece, work on it, save the result, discard it, and move to the next piece.

Python's generators are perfect for this. A generator doesn't hold all its values in memory at once. It produces them one at a time, on demand.

```python
def read_large_file_in_chunks(file_path, chunk_size=1024*1024):  # 1MB chunks
    """A generator to read a file piece by piece."""
    with open(file_path, 'r') as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:
                break
            yield chunk

def process_chunk(chunk):
    # Do some processing on the chunk
    return len(chunk)  # Simple example: just return the size

# Use it
total_length = 0
for chunk in read_large_file_in_chunks("huge_data.txt"):
    total_length += process_chunk(chunk)

print(f"Total processed length: {total_length}")

# This never loads the whole file into memory.
```

The same idea applies to databases (use cursors to fetch rows in batches), APIs (paginate requests), and list comprehensions (use generator expressions).
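For the database case, here's a minimal sketch using the standard library's sqlite3 and cursor.fetchmany(); the events table and the batch size are just illustrative:

```python
import sqlite3

# An in-memory database standing in for a real one
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, value INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 ((i, i * 2) for i in range(10000)))

def iter_rows(connection, batch_size=500):
    """Yield rows one at a time, fetching only batch_size rows into memory."""
    cursor = connection.execute("SELECT id, value FROM events")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield from rows

# Only 500 rows are ever held in memory at once
total = sum(value for _, value in iter_rows(conn))
print(total)  # 99990000
```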

```python
# List comprehension - builds the whole list in memory
big_list = [x * 2 for x in range(10000000)]  # Uses memory

# Generator expression - produces values one by one
big_gen = (x * 2 for x in range(10000000))  # Uses almost no memory
for value in big_gen:
    if value > 100:
        break  # We never created the full list
```

When you use a library like Pandas, you can use chunksize in read_csv to get an iterator over DataFrames. For custom data pipelines, structuring your code around generators is one of the most effective ways to control memory.
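A quick sketch of the Pandas version; the CSV is generated on the fly here purely so the example is self-contained:

```python
import csv
import os
import tempfile

import pandas as pd

# Write a small CSV to process (stand-in for a file too big for memory)
path = os.path.join(tempfile.mkdtemp(), "big.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["value"])
    for i in range(10000):
        writer.writerow([i])

# chunksize turns read_csv into an iterator of small DataFrames,
# so only 1000 rows are in memory at a time
total = 0
for chunk in pd.read_csv(path, chunksize=1000):
    total += chunk["value"].sum()

print(total)  # 49995000
```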

The final technique is about being smart with what you keep around. Python has a module called weakref. It lets you create a reference to an object that does not keep the object alive. If all other "normal" references to the object disappear, the object can be garbage collected even if weak references to it still exist.

This is useful for caches. You might want to cache an expensive-to-create object, but you don't want the cache to force that object to stay in memory if it's not being used anywhere else in your program.

```python
import weakref

class ExpensiveObject:
    def __init__(self, name):
        self.name = name
        print(f"ExpensiveObject {name} created")
    def __del__(self):
        print(f"ExpensiveObject {self.name} destroyed")

# Create it
obj = ExpensiveObject("Primary")
# Make a strong reference (keeps it alive)
strong_ref = obj
# Make a weak reference
weak_ref = weakref.ref(obj)

print(f"Via weakref: {weak_ref().name}")  # Call the ref to get the object

# Delete the strong references
del obj
del strong_ref

# The object should be gone now
import gc
gc.collect()  # Force collection to see the effect immediately

print(f"Via weakref after deletion: {weak_ref()}")  # Returns None
```

A WeakKeyDictionary or WeakValueDictionary uses this concept for its keys or values. When the key or value object is deleted from the rest of the program, it automatically disappears from the dictionary too. This prevents your cache from becoming a memory leak.

```python
from weakref import WeakValueDictionary

class ExpensiveData:
    """Wrapper class: plain str and int objects can't be weakly referenced,
    so cached values need to be instances of a regular class."""
    def __init__(self, value):
        self.value = value

cache = WeakValueDictionary()

def get_expensive_data(key):
    if key in cache:
        print("Cache hit!")
        return cache[key]
    print("Cache miss, creating...")
    data = ExpensiveData(f"Expensive data for {key}")  # Simulate expensive creation
    cache[key] = data  # The cache holds only a weak reference
    return data

# Use it
val1 = get_expensive_data("user_1")  # Miss, creates
print(f"Got: {val1.value}")
val2 = get_expensive_data("user_1")  # Hit!
print(f"Got: {val2.value}")

# Now, if we delete all other references to the data...
del val1
del val2
import gc; gc.collect()

# The cache entry is automatically gone
print(f"Cache now has 'user_1'? {'user_1' in cache}")
```

This is an advanced pattern, but it's incredibly powerful for building robust, long-running applications where you want caching but are afraid of memory growth.

So, where do you start? I always start with measurement. Use tracemalloc or a memory profiler to see where your memory is going. Is it one big list? Many small objects? Then pick the appropriate technique. For lots of numeric data, use arrays. For many identical objects, consider __slots__ or an object pool. For data that doesn't fit, use generators. For caches, consider weak references.

The goal isn't to use every technique in every program. The goal is to understand the tools so you can pick the right one when you face a real memory challenge. Python gives you the power to write simple code quickly, and with these techniques, you also get the power to make that code efficient and scalable when you need it to be. Just take it one step at a time. Start by looking. The rest will follow.
