DEV Community

suvhotta
Understanding Object Copying in Python

My Initial Struggle with List Copying

When I began my programming journey in Python, I encountered a scenario that many new programmers face: copying a list and passing it to a function for further modification. Here's the approach I initially used and why it didn't work as expected.


def square_nums(nums_list):
  ...

original_list = [1, 2, 3, 4, 5]

copied_list = original_list

squared_nums = square_nums(copied_list)

However, I soon realized a critical issue: any modifications made to copied_list within the function were also reflected in original_list. This was puzzling and led me to delve deeper into Python's documentation.

The Revelation: Aliases vs. Actual Copies

Upon reading the official documentation, I figured out that the way I was copying the contents of original_list was fundamentally flawed. Under Python's semantics, plain assignment does not create a copy; it only binds another name (an alias) to the same object. No matter how many aliases we create, they all point to that one object. That's why, when the object was modified through copied_list, I saw the altered values in original_list as well.
[Figure: Alias creation in Python]
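The aliasing described above can be checked directly. This is a minimal sketch; the name alias is only illustrative and does not come from the original code.

```python
original_list = [1, 2, 3, 4, 5]

# Assignment binds a second name to the same list object; nothing is copied.
alias = original_list

print(alias is original_list)  # True: both names refer to one object

# Mutating through one name is visible through the other.
alias.append(6)
print(original_list)  # [1, 2, 3, 4, 5, 6]
```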

The Correct Approach to Copying in Python

Python provides two ways of creating an actual copy:
1) Shallow Copy
2) Deep Copy
Both of these are available in the copy module provided as part of Python's standard library.
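Applied to the opening example, the fix might look like the sketch below. The body of square_nums is not shown in the original, so square_in_place is a hypothetical stand-in that is assumed to modify its argument in place.

```python
import copy


def square_in_place(nums_list):
    """Hypothetical stand-in for square_nums: squares each item in place."""
    for i, n in enumerate(nums_list):
        nums_list[i] = n * n
    return nums_list


original_list = [1, 2, 3, 4, 5]

# A real copy (not an alias) is handed to the function.
copied_list = copy.copy(original_list)
squared_nums = square_in_place(copied_list)

print(original_list)  # [1, 2, 3, 4, 5]
print(squared_nums)   # [1, 4, 9, 16, 25]
```

Because the list holds only integers, a shallow copy is enough here to keep original_list untouched.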

Shallow Copy

A shallow copy creates a new container object, but the items inside it are references to the same objects that the original container holds.

This can be verified by comparing the identities of the containers and of their individual items using the id() function.

This way of copying works best when the items are immutable.

It is performed with the copy.copy() function of the copy module.

[Figure: Shallow copy]


import copy

original_list = [1, 2, 3]

# Creates a shallow copy
copied_list = copy.copy(original_list)

# Will print True
print(id(original_list[0]) == id(copied_list[0]))

original_list[0] = 0

# Will print False
print(id(original_list[0]) == id(copied_list[0]))

In the above code, the last print statement outputs False because the assignment original_list[0] = 0 rebinds that slot of original_list to a different integer object, 0. Integers are immutable, so the object 1 itself cannot be changed in place, and copied_list[0] continues to reference it. The behaviour would be different if original_list contained mutable objects (other lists, dictionaries, etc.): an in-place change to such an object through original_list would be reflected in copied_list as well, because both lists share a reference to it.
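A small sketch of the mutable case, using a nested list chosen here purely for illustration:

```python
import copy

original_list = [1, 2, [3, 4]]
copied_list = copy.copy(original_list)

# The outer lists are different objects, but the inner list is shared.
print(original_list is copied_list)        # False
print(original_list[2] is copied_list[2])  # True

# An in-place change to the shared inner list shows up in both.
original_list[2].append(5)
print(copied_list)  # [1, 2, [3, 4, 5]]

# Rebinding a slot in one outer list does not affect the other.
original_list[2] = [0]
print(copied_list)  # [1, 2, [3, 4, 5]]
```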

Because of the issues that can arise when mutable objects are involved, the copy module has another offering: deepcopy.

Deep copy

In a deep copy, a new container object is created and copies of the objects found in the original are recursively inserted into it (whether an item is actually copied depends on its type).


import copy

original_list = [1, 2, 3, [2,3,4]]

copied_list = copy.deepcopy(original_list)

# Will print False
print(id(original_list)==id(copied_list))

# Will print True
print(id(original_list[0])==id(copied_list[0]))

# Will print True
print(id(original_list[1])==id(copied_list[1]))

# Will print True
print(id(original_list[2])==id(copied_list[2]))

# Will print False
print(id(original_list[3])==id(copied_list[3]))

The code above shows something interesting: when the items are immutable, like the integers at indexes 0, 1 and 2 of original_list, the corresponding items in copied_list refer to the very same objects. But when the item is mutable, like the nested list at index 3, copied_list gets its own separate copy.

[Figure: Deep copy]
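To see what that separation means in practice, the following sketch mutates the nested list after taking both kinds of copy. The names shallow and deep are illustrative only.

```python
import copy

original_list = [1, 2, 3, [2, 3, 4]]

shallow = copy.copy(original_list)
deep = copy.deepcopy(original_list)

# Change the nested list in place through the original.
original_list[3].append(5)

print(shallow[3])  # [2, 3, 4, 5]  (shared with the original)
print(deep[3])     # [2, 3, 4]     (independent copy)
```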

The Role of the memo Dictionary in Deepcopy

  • Avoiding Redundant Copies:
    When an object is copied using deepcopy, the memo dictionary keeps track of objects that have already been copied in that copy cycle. This is particularly important for complex objects that may refer to the same sub-object multiple times. Without memo, each reference to the same sub-object would result in a new and separate copy, potentially leading to an inefficient duplication of objects.

  • Handling Recursive Structures:
    Recursive structures are objects that, directly or indirectly, refer to themselves, for example a list that contains a reference to itself. deepcopy uses the memo dictionary to keep track of objects that have already been encountered in the copying process. If it encounters an object that’s already in the memo dictionary, it uses the existing copy from the memo rather than entering an infinite loop trying to copy the recursive reference.
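Both effects of the memo dictionary can be observed from the outside, without touching memo directly. The variable names below are illustrative.

```python
import copy

# 1) A sub-object referenced twice is copied once, and the copy is reused.
shared = [1, 2]
container = [shared, shared]
container_copy = copy.deepcopy(container)

print(container_copy[0] is container_copy[1])  # True: sharing is preserved
print(container_copy[0] is shared)             # False: it is still a new object

# 2) A list that contains itself is copied without infinite recursion.
recursive = [1, 2]
recursive.append(recursive)
recursive_copy = copy.deepcopy(recursive)

print(recursive_copy is recursive)          # False
print(recursive_copy[2] is recursive_copy)  # True: the copy refers to itself
```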

Implementation in User defined Classes

To customize how copy.copy() and copy.deepcopy() behave for our own classes, we can define the magic methods __copy__() and __deepcopy__() in the class. __deepcopy__() receives the memo dictionary as its argument.
The built-in list is itself a class, which is how it can offer the list.copy() method, which returns a shallow copy.
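As a rough sketch of how these hooks could look, here is a hypothetical Playlist class (invented for this example, not taken from any library). It is one possible way to write the two methods, not the only one.

```python
import copy


class Playlist:
    """Hypothetical class used only to illustrate the copy hooks."""

    def __init__(self, name, tracks):
        self.name = name
        self.tracks = tracks

    def __copy__(self):
        # Shallow copy: a new Playlist that shares the same tracks list.
        return Playlist(self.name, self.tracks)

    def __deepcopy__(self, memo):
        # Deep copy: register the new object in memo before copying the
        # children, then deep-copy the tracks list with the same memo.
        new = Playlist(self.name, [])
        memo[id(self)] = new
        new.tracks = copy.deepcopy(self.tracks, memo)
        return new


original = Playlist("focus", ["a", "b"])
shallow = copy.copy(original)
deep = copy.deepcopy(original)

original.tracks.append("c")

print(shallow.tracks)  # ['a', 'b', 'c']
print(deep.tracks)     # ['a', 'b']
```

Passing memo along to the nested copy.deepcopy() call keeps shared and recursive references intact, as described in the previous section.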

With this deeper insight into Python's object copying, you're now better prepared to handle the nuances of mutable and immutable objects in your coding adventures. Happy coding!
