Satyam Gupta

Mastering Python Sets: A Definitive Guide with Examples & Use Cases

Mastering Python Sets: Your Ultimate Guide to Unordered, Unique Collections

Welcome, fellow coders! If you've been journeying through Python, you've undoubtedly fallen in love with lists and dictionaries. They are the workhorses of the language. But there's another data type hiding in Python's toolbox, often overlooked yet incredibly powerful: the Set.

Imagine you have a list of user email addresses for a newsletter, but it's full of duplicates. Or you need to quickly check if a user ID has access to a specific resource. Using a list for these tasks would be inefficient and clunky. This is where Python sets shine.

In this comprehensive guide, we're not just going to scratch the surface. We'll dive deep into the world of sets. We'll explore what they are, how to use them, and—most importantly—why and when you should use them to write cleaner, faster, and more Pythonic code. By the end, you'll wonder how you ever coded without them.

To learn professional software development courses such as Python Programming, Full Stack Development, and MERN Stack, visit and enroll today at codercrafter.in. Our structured curriculum is designed to take you from beginner to job-ready developer.

What Exactly is a Python Set?
At its core, a set is a built-in data type in Python used to store multiple items in a single variable. But it has three very distinct and defining characteristics:

Unordered: The items in a set do not have a defined order. You cannot refer to an item by index or key, and the order you see when printing a set may differ from the order in which you added the items.

Unindexed: Since they are unordered, they are also unindexed. my_set[0] will throw a TypeError.

Unique: Sets cannot contain duplicate values. If you try to add a duplicate, the set will silently ignore it. This is their superpower.

You can think of a set as a mathematical set or a bag of unique items.
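
The three properties above can be checked in a few lines. This is a minimal sketch; the variable name `colors` is only illustrative.

```python
colors = {"red", "green", "blue", "red"}  # "red" appears twice in the literal

# Unique: the duplicate is dropped when the set is built
print(len(colors))  # 3

# Unindexed: subscripting a set raises TypeError
try:
    colors[0]
except TypeError as exc:
    print(f"TypeError: {exc}")

# Adding an item that is already present leaves the set unchanged
colors.add("green")
print(len(colors))  # 3
```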

How to Create a Set
Creating a set is straightforward. You can use curly braces {} or the set() constructor.

python
# Method 1: Using curly braces {}
my_set = {"apple", "banana", "cherry"}
print(my_set)  # Output: {'cherry', 'banana', 'apple'} (order may vary)

# Method 2: Using the set() constructor
another_set = set(["apple", "banana", "cherry", "apple"]) # Note the duplicate
print(another_set)  # Output: {'cherry', 'banana', 'apple'} - duplicate removed!

# An empty set is tricky! {} creates an empty dictionary.
empty_dict = {}
print(type(empty_dict))  # Output: <class 'dict'>

empty_set = set()
print(type(empty_set))   # Output: <class 'set'>

Important Note: Sets can only contain hashable data types, which in practice means immutable built-in types. You can have sets of integers, floats, strings, and tuples (as long as everything inside the tuple is hashable too). However, you cannot have a set of lists or dictionaries, as they are mutable and therefore unhashable.

python
# This will work
valid_set = {1, 2, 3, ("tuple", "inside")}

# This will NOT work and raise a TypeError
invalid_set = {1, 2, ["list", "inside"]}
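
If your data arrives as lists, one common workaround is to convert each list to a tuple before putting it in a set. The sketch below uses a made-up `rows` list and also shows that a tuple containing a list is still unhashable.

```python
rows = [[1, 2], [3, 4], [1, 2]]  # illustrative data

# Lists are unhashable, so convert each row to a tuple first
unique_rows = {tuple(row) for row in rows}
print(len(unique_rows))  # 2

# A tuple is only hashable if all of its elements are hashable
try:
    hash((1, [2, 3]))
    tuple_is_hashable = True
except TypeError:
    tuple_is_hashable = False
print(tuple_is_hashable)  # False
```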

Why Use Sets? The Power of Uniqueness and Speed
The two main reasons to use sets are deduplication and membership testing.

Deduplication: This is the most common use case. Need to remove duplicates from a list? Convert it to a set and back. The conversion takes roughly linear time, but note that the original order of the list is not preserved.



python
my_list = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
unique_list = list(set(my_list))
print(unique_list)  # Output: [1, 2, 3, 4] (order may vary)
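
When the original order matters, a dictionary can do the deduplication instead, since dictionaries keep insertion order in Python 3.7 and later. A small sketch with made-up email addresses:

```python
emails = [
    "ann@example.com",
    "bob@example.com",
    "ann@example.com",
    "cy@example.com",
]

# dict.fromkeys keeps the first occurrence of each key, in order
unique_in_order = list(dict.fromkeys(emails))
print(unique_in_order)
# ['ann@example.com', 'bob@example.com', 'cy@example.com']
```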

Membership Testing: Checking if an element exists in a collection (element in collection) is blazingly fast with sets. Under the hood, sets use a hash table, which makes membership tests an O(1) operation on average. This is significantly faster than checking membership in a list (O(n)), especially for large collections.

python
huge_set = set(range(1000000))
huge_list = list(range(1000000))

# This is very fast
print(999999 in huge_set)   # Output: True

# This is much slower, as it checks each item one by one
print(999999 in huge_list)  # Output: True

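
To see the difference on your own machine, you can time both lookups with the standard `timeit` module. This is a rough sketch, not a rigorous benchmark; the collection size and repeat count are arbitrary choices, and absolute numbers will vary by machine.

```python
import timeit

n = 100_000
as_set = set(range(n))
as_list = list(range(n))
target = n - 1  # worst case for the list: the last element

# Time 200 membership checks against each collection
set_time = timeit.timeit(lambda: target in as_set, number=200)
list_time = timeit.timeit(lambda: target in as_list, number=200)

print(f"set:  {set_time:.6f} s")
print(f"list: {list_time:.6f} s")
```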
Diving into Set Operations: It's Like Math Class, But Fun
Python sets support mathematical set operations like union, intersection, difference, and symmetric difference. This makes them perfect for comparing groups of data.

Let's define two sets to play with:

python
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}

1. Union (| or union()): Returns a new set containing all items from both sets.

python
print(set_a | set_b)         # Output: {1, 2, 3, 4, 5, 6, 7, 8}
print(set_a.union(set_b))    # Output: {1, 2, 3, 4, 5, 6, 7, 8}

2. Intersection (& or intersection()): Returns a new set containing only the items that exist in both sets.

python
print(set_a & set_b)             # Output: {4, 5}
print(set_a.intersection(set_b)) # Output: {4, 5}

3. Difference (- or difference()): Returns a new set containing items that are in the first set but not in the second.

python
print(set_a - set_b)             # Output: {1, 2, 3}
print(set_a.difference(set_b))   # Output: {1, 2, 3}

4. Symmetric Difference (^ or symmetric_difference()): Returns a new set containing items that are in exactly one of the sets, but not both. Equivalently, it is the union minus the intersection.

python
print(set_a ^ set_b)                         # Output: {1, 2, 3, 6, 7, 8}
print(set_a.symmetric_difference(set_b))     # Output: {1, 2, 3, 6, 7, 8}
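
One practical difference between the two spellings: the methods accept any iterable, while the operators require both operands to be sets. The sketch below also shows the in-place operators, which modify the set instead of returning a new one.

```python
set_a = {1, 2, 3, 4, 5}

# Methods accept any iterable
print(set_a.union([5, 6]) == {1, 2, 3, 4, 5, 6})   # True
print(set_a.intersection(range(4, 10)) == {4, 5})  # True

# Operators require both operands to be sets
try:
    set_a | [5, 6]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print(operator_accepts_list)  # False

# In-place variants modify the set instead of returning a new one
set_a |= {6}
set_a -= {1, 2}
print(set_a == {3, 4, 5, 6})  # True
```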

Essential Set Methods to Know
Beyond operations, sets come with a host of useful methods.

add(): Adds a single element.

python
my_set.add("mango")

update(): Adds multiple elements (from iterables like lists, tuples, other sets).

python
my_set.update(["kiwi", "orange"], {"pineapple"})

remove() vs discard(): Both remove an element. The key difference? remove() raises a KeyError if the element doesn't exist, while discard() does not.

python
my_set.remove("banana")  # Fine if you know it exists; raises KeyError otherwise
my_set.discard("dragonfruit")  # Always safe; does nothing if not found

pop(): Removes and returns an arbitrary element (sets are unordered, so you cannot choose which one). Raises KeyError on an empty set.

clear(): Empties the entire set.

isdisjoint(): Returns True if two sets have no elements in common.

issubset() (<=) / issuperset() (>=): Check if one set is entirely contained within another or entirely contains another.


python
small_set = {1, 2}
print(small_set.issubset(set_a))  # Output: True
print(set_a.issuperset(small_set)) # Output: True
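
The methods that have no example above can be sketched as follows. The sets `evens` and `odds` are illustrative; the snippet also shows the strict `<` comparison, which, unlike `<=`, is False when the two sets are equal.

```python
evens = {2, 4, 6}
odds = {1, 3, 5}

print(evens.isdisjoint(odds))  # True

# <= allows equality; < is a strict (proper) subset test
print({2, 4} <= evens)  # True
print(evens <= evens)   # True
print(evens < evens)    # False

# pop() removes some element; which one is not specified
removed = evens.pop()
print(removed in {2, 4, 6}, len(evens))  # True 2

evens.clear()
print(evens)  # set()

# pop() on an empty set raises KeyError
try:
    evens.pop()
except KeyError:
    print("KeyError: pop from an empty set")
```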

Real-World Use Cases: Where Sets Save the Day
Removing Duplicates: As shown earlier, this is the classic use case. Cleaning data from CSVs, user inputs, or API responses becomes a one-liner.

Finding Common or Unique Elements:

Social Media: "Find mutual friends" is a perfect intersection.

E-commerce: "Show me items that are in my wishlist but not already in my cart" is a difference.

Data Analysis: "Find unique visitors to a website" or "Find tags common to multiple blog posts."
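
A rough sketch of these three scenarios, using made-up names and data:

```python
# Social media: mutual friends are an intersection
alice_friends = {"bob", "carol", "dave"}
erin_friends = {"carol", "dave", "frank"}
mutual_friends = alice_friends & erin_friends
print(sorted(mutual_friends))  # ['carol', 'dave']

# E-commerce: wishlist items not yet in the cart are a difference
wishlist = {"laptop", "headphones", "keyboard"}
cart = {"keyboard", "mouse"}
still_to_add = wishlist - cart
print(sorted(still_to_add))  # ['headphones', 'laptop']

# Data analysis: tags shared by every post
post_tags = [
    {"python", "sets", "tutorial"},
    {"python", "performance", "sets"},
    {"sets", "python", "beginners"},
]
common_tags = set.intersection(*post_tags)
print(sorted(common_tags))  # ['python', 'sets']
```

The results are passed through sorted() only so the printed output is predictable.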

Membership Testing: Anywhere you need to check for existence quickly.

A set of banned user IDs.

A set of valid country codes.

A collection of prime numbers for a math algorithm.
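
For instance, filtering incoming requests against a banned-ID set might look like this. The IDs and variable names are hypothetical.

```python
banned_ids = {1042, 2077, 3001}
incoming_requests = [1001, 1042, 1500, 3001, 4000]

# Each "not in" check is an average O(1) hash lookup
allowed = [uid for uid in incoming_requests if uid not in banned_ids]
print(allowed)  # [1001, 1500, 4000]
```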

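Boolean Flags without Booleans: When an algorithm needs to track which items it has already seen, a set of seen items is often simpler than a list of booleans, and it can use less memory when only a small fraction of the possible items is ever seen.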
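
A typical "seen set" pattern, sketched here with a hypothetical helper `first_duplicate`:

```python
def first_duplicate(items):
    """Return the first item that repeats, or None if all items are unique."""
    seen = set()
    for item in items:
        if item in seen:
            return item
        seen.add(item)
    return None


print(first_duplicate([3, 1, 4, 1, 5, 9, 3]))  # 1
print(first_duplicate(["a", "b", "c"]))        # None
```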

Best Practices and Common Pitfalls
Choose the Right Tool: Use a set when you care about uniqueness and membership testing. Use a list when you need an ordered sequence that allows duplicates. Use a dictionary when you need to map keys to values.

Beware of Unorderedness: Never rely on the order of elements in a set. If you need order, use a list or a tuple. If you need both order and uniqueness, consider a list-deduplication pattern or dict.fromkeys(), which works because dictionaries preserve insertion order as of Python 3.7.

in is your friend: Leverage the speed of element in my_set for conditionals and filters.

Use Set Comprehensions: Just like list comprehensions, you can create sets elegantly.

python
squared_set = {x**2 for x in range(10)} # {0, 1, 64, 4, 36, 9, 16, 49, 81, 25}
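
Set comprehensions are also handy for cleaning data while deduplicating it. A small sketch, assuming email addresses should be compared case-insensitively and without surrounding whitespace (the sample data is made up):

```python
raw_emails = ["Ann@Example.com ", "ann@example.com", "BOB@example.com"]

# Normalize each address, then let the set drop the duplicates
normalized = {email.strip().lower() for email in raw_emails}
print(sorted(normalized))  # ['ann@example.com', 'bob@example.com']
```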

Frequently Asked Questions (FAQs)
Q: Can a set contain different data types?
A: Yes! A set can contain a mix of hashable types like integers, strings, and tuples.

python
mixed_set = {1, "hello", 3.14, (1, 2)}

Q: Are sets mutable?
A: Yes, the sets we've discussed (set) are mutable and can be changed. Python also provides an immutable version called frozenset, which can be used as a dictionary key or an element inside another set.
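
A brief sketch of frozenset in both roles; the names and the count below are illustrative only.

```python
# Key a dictionary by an unordered pair of names
messages_between = {frozenset({"alice", "bob"}): 12}
print(messages_between[frozenset({"bob", "alice"})])  # 12

# A frozenset can be an element of another set
groups = {frozenset({1, 2}), frozenset({3})}
print(frozenset({2, 1}) in groups)  # True

# Mutating methods such as add() do not exist on frozenset
frozen = frozenset({1, 2})
print(hasattr(frozen, "add"))  # False
```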

Q: How do I get the length of a set?
A: Use the len() function, just like with lists and dictionaries.

python
print(len(my_set))

Q: Why did my set change order when I printed it?
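A: This is normal behavior due to the unordered nature of sets. The internal ordering is based on the hash of the elements and is not meant to be relied upon. String hashes are also randomized per interpreter run by default, so a set of strings can print in a different order each time you run your program.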
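
If you need stable output, sort the set when you display or iterate over it. A minimal sketch:

```python
fruits = {"cherry", "apple", "banana"}

# sorted() returns a new list; the set itself stays unordered
ordered = sorted(fruits)
print(ordered)  # ['apple', 'banana', 'cherry']

for fruit in ordered:
    print(fruit)
```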

Conclusion: Don't Overlook the Mighty Set
Python sets are a deceptively simple yet profoundly powerful tool. They offer an optimal solution for problems involving uniqueness, commonality, and fast lookups. By understanding and leveraging sets, you can write code that is not only more efficient but also more expressive and aligned with the problem you're trying to solve.

Mastering fundamental data structures like sets, lists, and dictionaries is crucial for any aspiring developer. If you're looking to solidify your understanding of Python and other in-demand technologies, consider exploring the professional courses offered at codercrafter.in. Our Python Programming and Full Stack Development programs are designed to give you the deep, practical knowledge needed to build a successful career in software development.
