You’re writing a simple Python function. It takes an item and adds it to a list, which defaults to an empty list if none is provided. It seems straightforward, but then this happens:
def add_to_list(item, target=[]):  # 🚨 This is the famous bug!
    target.append(item)
    return target

list_1 = add_to_list('a')  # Returns ['a']
list_2 = add_to_list('b')  # Returns ['a', 'b']... wait, what?
print(list_1 is list_2)    # True - They're the same object!
Why does list_2 remember the item from the first call? If you’ve encountered this, welcome to the club. This isn't a bug in your logic; it's a nuanced behavior in Python that has tripped up countless developers. Let's demystify it.
The Root Cause: Definition Time vs. Execution Time
The confusion stems from a critical point: Python evaluates default arguments only once—when the function is defined, not each time it is called.
When the interpreter reads the def statement, it creates the function object and evaluates any default values right then and there. That empty list [] is created a single time and attached to the function object.
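One way to see this for yourself is to give a default a visible side effect. The sketch below is only an illustration: make_default, collect, and calls are hypothetical names that do not appear in the original example.

```python
calls = []  # records every time make_default() runs

def make_default():
    """Hypothetical helper: builds the default list and logs that it ran."""
    calls.append("evaluated")
    return []

# make_default() is called right here, while the def statement executes.
def collect(item, target=make_default()):
    target.append(item)
    return target

calls_after_def = len(calls)        # the default has already been evaluated

collect("a")
collect("b")
calls_after_two_calls = len(calls)  # calling collect() did not re-evaluate it

print(calls_after_def, calls_after_two_calls)  # 1 1
```

The helper runs once, before collect() is ever called, and never again.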
Think of it like this:
# This happens at FUNCTION DEFINITION time
_default_value = []  # The list is born here

def add_to_list(item, target=_default_value):
    target.append(item)
    return target
Every time you call add_to_list() without a target argument, it uses that one original list. There is no new list created per call.
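You don't have to take this on faith. Function objects expose their stored defaults through the __defaults__ attribute, so a short check like the one below shows the single list being mutated by successive calls.

```python
def add_to_list(item, target=[]):
    target.append(item)
    return target

# The default values live on the function object as a tuple.
print(add_to_list.__defaults__)  # ([],)

add_to_list('a')
add_to_list('b')

# The stored default itself has been mutated by the two calls.
print(add_to_list.__defaults__)  # (['a', 'b'],)
```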
Visualizing the Problem
Let's break down the process step-by-step. The key is to understand that the default list is a single object shared across all function calls.
- Function Definition
  - When Python defines the function, it creates the default list [] in memory.
  - Let's say this list is assigned the unique identity (ID) 12345.
- First Call (list_1 = add_to_list('a'))
  - The function uses the list at ID 12345.
  - It appends 'a' to it.
  - It returns this now-modified list.
  - The variable list_1 is now a reference to this list.
  - State of the List (id: 12345): ['a']
- Second Call (list_2 = add_to_list('b'))
  - The function again uses the same original list at ID 12345.
  - It appends 'b' to it, making the list ['a', 'b'].
  - It returns this list.
  - The variable list_2 is now also a reference to this same list.
  - State of the List (id: 12345): ['a', 'b']
Now, both list_1 and list_2 are references to the single list object at ID 12345. This is why the identity check list_1 is list_2 returns True. They are not just similar lists; they are different names for the exact same object in memory.
You can always verify this yourself using the id() function:
print(id(list_1)) # e.g., 140241231415040
print(id(list_2)) # e.g., 140241231415040 (The same number!)
Why Is This The Behavior?
You might wonder why Python was designed this way. The answer often attributed to Python’s creator, Guido van Rossum, is largely performance. A def statement is executed like any other statement, and evaluating default arguments once at that point is cheaper than re-evaluating them on every single call, especially for simple, frequently called functions. The trade-off favored performance over a potential pitfall.
This behavior only causes problems with mutable objects (lists, dictionaries, sets). With immutable objects (integers, strings, tuples), it's harmless because they can't be modified in-place. You can only reassign the variable, which doesn't affect the original default object.
# No problem with immutable defaults
def increment(count=0):
    count += 1  # This rebinds the local name 'count'; it doesn't modify the default 0
    return count

print(increment())  # 1
print(increment())  # 1 (the same default 0 is reused, but it was never modified)
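The difference between rebinding and in-place modification is easy to miss because the same += operator does both. As a rough sketch (extend is a hypothetical function added here for contrast), compare what is left on each function object afterwards:

```python
def increment(count=0):
    count += 1        # int is immutable: this rebinds the local name only
    return count

def extend(item, target=[]):
    target += [item]  # list is mutable: += extends the default list in place
    return target

increment()
increment()
print(increment.__defaults__)  # (0,)

extend('a')
extend('b')
print(extend.__defaults__)     # (['a', 'b'],)
```

So the pitfall depends on the type of the default, not on which operator you happen to use.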
The Pythonic Solution: Use None as a Sentinel
The official idiom to avoid this pitfall is simple and elegant: use None as your default and create the mutable object inside the function.
def add_to_list_fixed(item, target=None):
    if target is None:  # ✅ Check for None with `is`
        target = []     # A new list is created here, at execution time
    target.append(item)
    return target

list_1 = add_to_list_fixed('a')  # Creates a new list, returns ['a']
list_2 = add_to_list_fixed('b')  # Creates a new list, returns ['b']
print(list_1, list_2)            # ['a'] ['b'] ← Expected behavior!
print(list_1 is list_2)          # False - They are different objects.
Why this works: The function body is executed on every call. So, every time you call the function without a target, the code target = [] runs, creating a brand new list.
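It is also worth checking that the fixed version still respects a list the caller passes in, including an empty one. The sketch below uses two hypothetical caller-owned lists, shopping and inbox, to show why the check is written as `is None` rather than a truthiness test.

```python
def add_to_list_fixed(item, target=None):
    if target is None:
        target = []
    target.append(item)
    return target

shopping = ['milk']  # hypothetical caller-owned list
result = add_to_list_fixed('eggs', shopping)
print(result)              # ['milk', 'eggs']
print(result is shopping)  # True: the caller's list was used, not a new one

inbox = []           # an empty list is still a list the caller wants filled
add_to_list_fixed('x', inbox)
print(inbox)               # ['x'] (an `if not target:` check would have swapped it out)

print(add_to_list_fixed.__defaults__)  # (None,): nothing accumulates on the function
```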
When Might This Behavior Be Useful?
While usually a pitfall, this behavior can be used intentionally for advanced patterns like caching or memoization, where you want a function to keep state across calls in the default object itself rather than in a global variable.
def generate_id(prefix, cache=[]):
    """Generates unique IDs with a prefix. Example: 'a_1', 'a_2', 'b_3'"""
    cache.append(prefix)
    return f"{prefix}_{len(cache)}"

print(generate_id('a'))  # 'a_1'
print(generate_id('a'))  # 'a_2' - The cache list persisted
However, this is an advanced technique and should be well-documented to avoid confusing other developers.
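For the memoization case specifically, a minimal sketch might look like the following. The function fib and its _cache parameter are illustrative assumptions, not something from the example above; in everyday code, functools.lru_cache from the standard library is usually the clearer way to get the same effect.

```python
def fib(n, _cache={0: 0, 1: 1}):
    """Return the n-th Fibonacci number, memoized in the shared default dict."""
    if n not in _cache:
        # Results are stored in the one dict created at definition time.
        _cache[n] = fib(n - 1) + fib(n - 2)
    return _cache[n]

print(fib(30))                   # 832040
print(len(fib.__defaults__[0]))  # 31: entries for n = 0..30 persist between calls
```

The leading underscore on _cache is a convention signaling that callers are not expected to pass it.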
Your Python Rite of Passage
Encountering and understanding this behavior is a milestone. It means you're moving beyond Python's syntax and starting to understand its execution model—a sign of a growing developer.
So the next time you see a linter warn you about a "mutable default argument," you'll know exactly what it means and how to fix it. You haven't just fixed a bug; you've mastered one of Python's most famous nuances.
Aaron Rose is a software engineer and technology writer at tech-reader.blog and the author of Think Like a Genius.