DEV Community

Aaron Rose

Python's Most Famous Gotcha: The Mutable Default Argument

You’re writing a simple Python function. It takes an item and adds it to a list, which defaults to an empty list if none is provided. It seems straightforward, but then this happens:

def add_to_list(item, target=[]):  # 🚨 This is the famous bug!
    target.append(item)
    return target

list_1 = add_to_list('a')  # Returns ['a']
list_2 = add_to_list('b')  # Returns ['a', 'b']... wait, what?

print(list_1 is list_2)    # True - They're the same object!

Why does list_2 remember the item from the first call? If you’ve encountered this, welcome to the club. This isn't a bug in your logic; it's a nuanced behavior in Python that has tripped up countless developers. Let's demystify it.

The Root Cause: Definition Time vs. Execution Time

The confusion stems from a critical point: Python evaluates default arguments only once—when the function is defined, not each time it is called.

When the interpreter executes the def statement, it creates the function object and evaluates any default values right then and there. That empty list [] is created a single time and attached to the function object.

Think of it like this:

# This happens at FUNCTION DEFINITION time
_default_value = []  # The list is born here

def add_to_list(item, target=_default_value):
    target.append(item)
    return target

Every time you call add_to_list() without a target argument, it uses that one original list. No new list is created per call.
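
If you want to see where that single list actually lives, one way is to inspect the function's `__defaults__` attribute, the tuple of default values that Python stores on the function object. A minimal sketch, reusing the buggy function from above:

```python
def add_to_list(item, target=[]):
    target.append(item)
    return target

# The default list is stored on the function object itself.
print(add_to_list.__defaults__)  # ([],)

add_to_list('a')
add_to_list('b')

# The stored default has been mutated by the two calls above.
print(add_to_list.__defaults__)  # (['a', 'b'],)
```

The tuple itself never changes; the list inside it does, which is why the "default" appears to grow between calls.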

Visualizing the Problem

Let's break down the process step-by-step. The key is to understand that the default list is a single object shared across all function calls.

  1. Function Definition

    • When Python defines the function, it creates the default list [] in memory.
    • Let's say this list is assigned the unique identity (ID) 12345.
  2. First Call (list_1 = add_to_list('a'))

    • The function uses the list at ID 12345.
    • It appends 'a' to it.
    • It returns this now-modified list.
    • The variable list_1 is now a reference to this list.
    • State of the List (id: 12345): ['a']
  3. Second Call (list_2 = add_to_list('b'))

    • The function again uses the same original list at ID 12345.
    • It appends 'b' to it, making the list ['a', 'b'].
    • It returns this list.
    • The variable list_2 is now also a reference to this same list.
    • State of the List (id: 12345): ['a', 'b']

Now, both list_1 and list_2 are references to the single list object at ID 12345. This is why the identity check list_1 is list_2 returns True. They are not just similar lists; they are different names for the exact same object in memory.

You can always verify this yourself using the id() function:

print(id(list_1)) # e.g., 140241231415040
print(id(list_2)) # e.g., 140241231415040 (The same number!)
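
Sharing one object has a practical consequence worth seeing once: a change made through any of the names shows up everywhere, including in later calls. The sketch below is only an illustration (the 'surprise' value and list_3 are made up for the demo):

```python
def add_to_list(item, target=[]):
    target.append(item)
    return target

list_1 = add_to_list('a')
list_2 = add_to_list('b')

# Mutating through one name is visible through the other...
list_1.append('surprise')
print(list_2)  # ['a', 'b', 'surprise']

# ...and leaks into later calls that rely on the default.
list_3 = add_to_list('c')
print(list_3)  # ['a', 'b', 'surprise', 'c']
print(id(list_1) == id(list_2) == id(list_3))  # True
```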

Why Is This The Behavior?

You might wonder why Python was designed this way. The reason usually given is efficiency: def is an executable statement, and evaluating default arguments once when it runs is cheaper than re-evaluating them on every single call, especially for simple, frequently called functions. The trade-off favors performance over avoiding a potential pitfall.

This behavior only causes problems with mutable objects (lists, dictionaries, sets). With immutable objects (integers, strings, tuples), it's harmless because they can't be modified in place. Inside the function you can only rebind the parameter name to a different object, which doesn't affect the original default.

# No problem with immutable defaults
def increment(count=0):
    count += 1  # This reassigns 'count'; it doesn't modify the default '0'
    return count

print(increment())  # 1
print(increment())  # 1 (the same default 0 is reused, but it was never modified)
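
One subtlety is that the same `+=` operator behaves differently depending on the type. The sketch below contrasts three hypothetical helpers (bump, collect, and safe are names invented for this illustration): with an int or a tuple, `+=` builds a new object and rebinds the local name, while with a list it extends the existing object in place.

```python
def bump(count=0):
    count += 1          # int: creates a new int and rebinds the local name
    return count

def collect(items=[]):
    items += ['x']      # list: += extends the existing list in place
    return items

def safe(items=()):
    items += ('x',)     # tuple: immutable, so a new tuple is created
    return items

print(bump(), bump())   # 1 1
print(collect())        # ['x']
print(collect())        # ['x', 'x']  <- the default list was mutated
print(safe(), safe())   # ('x',) ('x',)
```

So an immutable default such as an empty tuple can be a reasonable choice when the function only reads from the argument.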

The Pythonic Solution: Use None as a Sentinel

The standard idiom to avoid this pitfall is simple and elegant: use None as your default and create the mutable object inside the function.

def add_to_list_fixed(item, target=None):
    if target is None:  # ✅ Check for None with `is`
        target = []     # A new list is created here, at execution time
    target.append(item)
    return target

list_1 = add_to_list_fixed('a') # Creates a new list, returns ['a']
list_2 = add_to_list_fixed('b') # Creates a new list, returns ['b']

print(list_1, list_2)  # ['a'] ['b']  ← Expected behavior!
print(list_1 is list_2) # False - They are different objects.

Why this works: The function body is executed on every call. So, every time you call the function without a target, the code target = [] runs, creating a brand new list.
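
It may also help to see the other branch: when a caller does pass a list, the None check is skipped and the function appends to the caller's list. A small sketch, where shopping is a hypothetical caller-owned list:

```python
def add_to_list_fixed(item, target=None):
    if target is None:
        target = []
    target.append(item)
    return target

shopping = ['milk']                           # hypothetical caller-owned list
result = add_to_list_fixed('eggs', shopping)

print(result)                          # ['milk', 'eggs']
print(result is shopping)              # True: the caller's list was extended in place
print(add_to_list_fixed('tea'))        # ['tea']: a fresh list, unaffected by earlier calls
print(add_to_list_fixed.__defaults__)  # (None,)
```

This is also why the check uses `is None` rather than `if not target`: the latter would silently swap out an empty list that the caller passed in on purpose.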

When Might This Behavior Be Useful?

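While usually a pitfall, this behavior can be used intentionally for advanced patterns like caching or memoization, where you want a function to keep state across calls without a global variable. The state lives in the default object attached to the function, not in an ordinary local variable.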

def generate_id(prefix, cache=[]):
    """Generates unique IDs with a prefix. Example: 'a_1', 'a_2', 'b_3'"""
    cache.append(prefix)
    return f"{prefix}_{len(cache)}"

print(generate_id('a')) # 'a_1'
print(generate_id('a')) # 'a_2' - The cache list persisted

However, this is an advanced technique and should be well documented to avoid confusing other developers.
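
For the memoization case specifically, here is a minimal sketch of what the pattern can look like. The fib function and its _cache parameter are illustrative names, not part of any library; the leading underscore is just a convention signalling that callers shouldn't pass it.

```python
def fib(n, _cache={}):
    """Return the n-th Fibonacci number, memoizing results in the shared default dict."""
    if n < 2:
        return n
    if n not in _cache:
        # Computed once, then reused by every later call.
        _cache[n] = fib(n - 1) + fib(n - 2)
    return _cache[n]

print(fib(30))                   # 832040
print(len(fib.__defaults__[0]))  # 29 cached entries (n = 2 through 30)
```

In everyday code, the standard library's functools.lru_cache decorator is usually the clearer way to get the same effect, since it makes the caching explicit.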

Your Python Rite of Passage

Encountering and understanding this behavior is a milestone. It means you're moving beyond Python's syntax and starting to understand its execution model—a sign of a growing developer.

So the next time you see a linter warn you about a "mutable default argument," you'll know exactly what it means and how to fix it. You haven't just fixed a bug; you've mastered one of Python's most famous nuances.


Aaron Rose is a software engineer and technology writer at tech-reader.blog and the author of Think Like a Genius.
