DEV Community

Shahrouz Nikseresht

🐍 Day 2: The Magic of Sets, Removing Duplicates While Preserving Order

I am officially back to strengthen my Python logic with #80DaysOfChallenges. Today, I tackled Challenge #2: How to effectively remove duplicate elements from a list while maintaining their original order. This is an excellent exercise for working with different data structures and implementing conditional logic.


💡 Key Takeaways from Day 2: The Set as Memory

The core problem here is that if we simply convert the list to a set, duplicates are removed, but a set is an unordered collection, so the original order of elements is not guaranteed to survive. Therefore, to preserve the sequence, we require two distinct data structures:

  1. List (first_seen): Used to store the final result in the correct, preserved order.
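  2. Set (memory): Used for efficiently tracking elements we have already encountered. The crucial benefit of a set is its average O(1) lookup time (it is hash-based, so the elements must be hashable), which keeps the overall algorithm linear, O(n), instead of the O(n²) we would get by searching a list on every step.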
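
To see why the plain set conversion is not enough, here is a minimal sketch. The sample list and the variable names unique_unordered and seen are made up for illustration; they are not part of the challenge code.

```python
items = ["pear", "apple", "pear", "fig", "apple"]

# Converting to a set removes duplicates, but a set has no defined order,
# so the resulting sequence may differ from the original one.
unique_unordered = list(set(items))

# Sorting shows that the same three elements are there, whatever order came out.
print(sorted(unique_unordered))  # ['apple', 'fig', 'pear']

# Membership tests on a set are hash-based (O(1) on average),
# while `in` on a list scans the elements one by one (O(n)).
seen = set(items)
print("fig" in seen)   # True
print("kiwi" in seen)  # False
```

The duplicates are gone, but nothing in the result tells us that "pear" originally came first, which is why a separate list is needed for the ordered output.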

Logic and Execution

We iterate through the original list. For every item:

  • We check if it already exists in our memory (the Set).
  • If it does not exist: We append it to the first_seen list (preserving order) and immediately add it to the memory Set to mark it as seen for future checks.
  • If it does exist: We simply ignore it and move to the next element.
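
The steps above can be traced on a small input. This is only a sketch: trace_remove_duplicates and its "kept"/"skipped" labels are hypothetical names added here to make each decision visible.

```python
def trace_remove_duplicates(items):
    """Hypothetical helper: applies the steps above and records each decision."""
    first_seen = []  # ordered result
    memory = set()   # elements encountered so far
    steps = []       # (item, action) pairs, for illustration only

    for item in items:
        if item not in memory:
            first_seen.append(item)
            memory.add(item)
            steps.append((item, "kept"))
        else:
            steps.append((item, "skipped"))

    return first_seen, steps


result, steps = trace_remove_duplicates([3, 1, 3, 2, 1])
for item, action in steps:
    print(item, action)
print(result)  # [3, 1, 2]
```

Only the first occurrence of each value is kept, so the second 3 and the second 1 are skipped and the result keeps the order in which values first appeared.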

Crucial Learning Point Today: I had a moment of confusion regarding the methods! I learned that adding to the final list (first_seen) requires the .append() method, while adding to the Set (memory) requires the .add() method. The different method names are a good reminder of how the two types behave: a list is an ordered sequence, so new items go on the end, while a set has no positions at all.
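
A quick way to convince yourself of the difference is a tiny experiment like the one below. It is a sketch that reuses the names first_seen and memory purely for illustration.

```python
first_seen = []
memory = set()

first_seen.append("a")  # lists grow at the end, so position matters
first_seen.append("a")  # a list happily stores the same value twice
memory.add("a")         # sets have no positions, so the method is add()
memory.add("a")         # adding an existing element changes nothing

print(first_seen)  # ['a', 'a']
print(memory)      # {'a'}

# Sets do not define append(), and lists do not define add().
print(hasattr(memory, "append"))   # False
print(hasattr(first_seen, "add"))  # False
```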

The Final Solution

def remove_duplicates(items):
    """
    Removes duplicates from a list while preserving the original order.
    Uses a Set for efficient tracking of seen elements.
    Items must be hashable, because they are stored in a set.
    """
    first_seen = []  # List for storing the result (order preserved)
    memory = set()   # Set for fast duplicate tracking (average O(1) lookup)

    for item in items:
        if item not in memory:
            # First occurrence: keep it and mark it as seen
            first_seen.append(item)
            memory.add(item)

    return first_seen
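
Here is a short usage sketch with a few inputs. The function body is the same logic as the solution, repeated in compact form only so that the snippet runs on its own; the sample inputs are made up.

```python
def remove_duplicates(items):
    # Same logic as the solution, repeated so this snippet is self-contained.
    first_seen, memory = [], set()
    for item in items:
        if item not in memory:
            first_seen.append(item)
            memory.add(item)
    return first_seen


print(remove_duplicates(["b", "a", "b", "c", "a"]))  # ['b', 'a', 'c']

# 1, 1.0 and True compare equal and hash equally, so only the first one survives.
print(remove_duplicates([1, 1.0, True, 2]))  # [1, 2]

# Unhashable items such as lists cannot be looked up in or added to a set.
try:
    remove_duplicates([[1, 2], [1, 2]])
except TypeError as exc:
    print("TypeError:", exc)
```

The last two calls show the price of using a set as memory: "duplicate" means equal by == and hash, and every item has to be hashable.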

🎯 Summary and Next Steps

Focusing on algorithmic efficiency rather than just obtaining a correct output is the core mission of this series. Utilizing a Set as a high-speed "tracker" alongside a List for ordered storage is a fundamental pattern in algorithmic programming.

What other solutions come to mind for this challenge? (Perhaps methods leveraging features in newer Python versions?) I would love to see your thoughts in the comments!
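
One option that comes to mind, offered as a sketch rather than the definitive answer: since Python 3.7, dictionaries are guaranteed to keep insertion order, so dict.fromkeys() can do the deduplication in a single expression. The function name remove_duplicates_dict is hypothetical and is not part of the challenge repository.

```python
def remove_duplicates_dict(items):
    """Hypothetical alternative: relies on dicts keeping insertion order (Python 3.7+)."""
    # dict.fromkeys() creates one key per distinct item, in first-seen order;
    # list() then collects those keys back into a list.
    return list(dict.fromkeys(items))


print(remove_duplicates_dict([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

Like the set-based version, this is linear on average and requires hashable items; the explicit loop is arguably still the better exercise for practicing the logic.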

Challenge Resources

You can find the full source code for today's challenge on GitHub.

  • Source Code for Challenge #2: scripts/remove_duplicates.py
  • Main Repository: 80-days-of-challenges
  • Daily Updates: Twitter/X (@Shahrouzlogs)
