I am officially back to strengthen my Python logic with #80DaysOfChallenges. Today, I tackled Challenge #2: How to effectively remove duplicate elements from a list while maintaining their original order. This is an excellent exercise for working with different data structures and implementing conditional logic.
💡 Key Takeaways from Day 2: The Set as Memory
The core problem here is that if we simply convert the list to a set, duplicates are removed, but the original order of elements is lost. Therefore, to preserve the sequence, we require two distinct data structures:
- List (`first_seen`): used to store the final result in the correct, preserved order.
- Set (`memory`): used for efficiently tracking elements we have already encountered. The crucial benefit of a set is its O(1) lookup time, which keeps the overall algorithm optimal and linear (O(n)).
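To see why a bare set conversion is not enough: a set does remove duplicates, but the iteration order of the result is not guaranteed to match the input. A small illustrative sketch (the sample values are my own, not from the original post):

```python
items = ["banana", "apple", "banana", "cherry"]

# Converting to a set removes duplicates, but the resulting
# order is arbitrary and may differ from the input order.
deduped = list(set(items))

# The elements are correct; only the ordering is unreliable.
assert set(deduped) == {"banana", "apple", "cherry"}
assert len(deduped) == 3
```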
Logic and Execution
We iterate through the original list. For every `item`:

- We check whether it already exists in our `memory` set.
- If it does not exist: we append it to the `first_seen` list (preserving order) and immediately add it to the `memory` set to mark it as seen for future checks.
- If it does exist: we simply ignore it and move on to the next element.
Crucial Learning Point Today: I had a moment of confusion regarding the methods! I learned that adding to the final list (`first_seen`) requires the `.append()` method, while adding to the set (`memory`) requires the `.add()` method. This difference in syntax served as a good reminder of the distinct behaviours of each data type.
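The distinction is easy to confirm in the REPL: lists grow with `.append()`, sets grow with `.add()`, and mixing them up raises an `AttributeError`:

```python
first_seen = []
memory = set()

first_seen.append("x")   # lists grow with .append()
memory.add("x")          # sets grow with .add()

# Calling the wrong method fails immediately:
try:
    memory.append("y")   # a set has no .append()
except AttributeError:
    pass                 # this branch is taken
```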
The Final Solution
```python
def remove_duplicates(items):
    """
    Removes duplicates from a list while preserving the original order.
    Uses a set for efficient tracking of seen elements.
    """
    first_seen = []   # list for storing the result (order preserved)
    memory = set()    # set for fast duplicate tracking (O(1) lookup speed)

    for item in items:
        if item not in memory:
            # First time we see this item: keep it and remember it
            first_seen.append(item)
            memory.add(item)

    return first_seen
```
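A quick usage check (the function is repeated here so the snippet runs on its own; the sample input is my own):

```python
def remove_duplicates(items):
    """Remove duplicates while preserving the original order."""
    first_seen = []
    memory = set()
    for item in items:
        if item not in memory:
            first_seen.append(item)
            memory.add(item)
    return first_seen

# Duplicates are dropped; first occurrences keep their position.
print(remove_duplicates([3, 1, 3, 2, 1, 3]))  # [3, 1, 2]
```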
🎯 Summary and Next Steps
Focusing on algorithmic efficiency rather than just obtaining a correct output is the core mission of this series. Utilizing a Set as a high-speed "tracker" alongside a List for ordered storage is a fundamental pattern in algorithmic programming.
What other solutions come to mind for this challenge? (Perhaps methods leveraging features in newer Python versions?) I would love to see your thoughts in the comments!
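One well-known alternative (not from the original post): since Python 3.7, regular `dict`s preserve insertion order, so `dict.fromkeys` gives a compact one-liner with the same behaviour:

```python
def remove_duplicates_v2(items):
    # dict keys are unique and (since Python 3.7) keep insertion order,
    # so building a dict and taking its keys deduplicates in order.
    return list(dict.fromkeys(items))

print(remove_duplicates_v2([3, 1, 3, 2, 1]))  # [3, 1, 2]
```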
Challenge Resources
You can find the full source code for today's challenge on GitHub.
- Source Code for Challenge #2: scripts/remove_duplicates.py
- Main Repository: 80-days-of-challenges
- Daily Updates: Twitter/X (@Shahrouzlogs)