DEV Community

Pavol Z. Kutaj


Explaining defaultdict in Python

USECASE

The aim of this page📝 is to explain the concept and usage of Python's defaultdict from the collections module, and in particular where its odd-looking name comes from. It is inspired by David Beazley's Advanced Python Mastery, see ex_2_2 > Collections.

defaultdict:

  • is a dict subclass that provides a default value for missing keys
  • avoids KeyError on d[key] by creating the missing entry automatically
  • is named for that behavior: it is a dict with a default for keys it does not have yet
  • simplifies code by removing manual "is the key there?" checks and insertions
  • takes a callable (a type or a function), not a value; it is called with no arguments to build each default
  • in the example below, list is used as the default factory
  • which means that an empty list is created automatically for each missing key
  • and makes grouping data straightforward
  • can also take a lambda when the default should be a literal value
  • example: defaultdict(lambda: 0) returns 0 for missing keys
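
The points above can be sketched side by side: a plain dict needs a manual check before each insertion, while a defaultdict calls its factory on first access. This is only a minimal sketch; the `words` list and the variable names are made up for illustration and do not come from the course material.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana"]  # hypothetical sample data

# Plain dict: reading a missing key raises KeyError.
plain = {}
try:
    plain["a"]
except KeyError:
    print("plain dict raised KeyError")

# So every key needs a manual check before the first append.
for w in words:
    if w[0] not in plain:
        plain[w[0]] = []
    plain[w[0]].append(w)

# defaultdict: list() is called with no arguments the first time a key is indexed.
grouped = defaultdict(list)
for w in words:
    grouped[w[0]].append(w)

# A defaultdict compares equal to a plain dict holding the same items.
print(plain == grouped)
```

Both loops build the same mapping; the defaultdict version just drops the check-and-insert step.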

Example Code From Advanced Python Mastery

# Sample data: one dict per holding
portfolio = [
    {'name': 'AA', 'shares': 100, 'price': 32.2},
    {'name': 'IBM', 'shares': 50, 'price': 91.1},
    {'name': 'CAT', 'shares': 150, 'price': 83.44},
    {'name': 'MSFT', 'shares': 200, 'price': 51.23},
    {'name': 'GE', 'shares': 95, 'price': 40.37},
    {'name': 'MSFT', 'shares': 50, 'price': 65.1},
    {'name': 'IBM', 'shares': 100, 'price': 70.44},
]
print("### DEFAULTDICT")
from collections import defaultdict

print("#### Group data, e.g. find all stocks with the same name")
byname = defaultdict(list)
for s in portfolio:
    byname[s["name"]].append(s)
byname

# defaultdict(<class 'list'>, {'AA': [{'name': 'AA', 'shares': 100, 'price': 32.2}], 'IBM': [{'name': 'IBM', 'shares': 50, 'price': 91.1}, {'name': 'IBM', 'shares': 100, 'price': 70.44}], 'CAT': [{'name': 'CAT', 'shares': 150, 'price': 83.44}], 'MSFT': [{'name': 'MSFT', 'shares': 200, 'price': 51.23}, {'name': 'MSFT', 'shares': 50, 'price': 65.1}], 'GE': [{'name': 'GE', 'shares': 95, 'price': 40.37}]})

print('#### Find all stocks with the name "IBM"')
byname["IBM"]
# >>> [{'name': 'IBM', 'shares': 50, 'price': 91.1}, {'name': 'IBM', 'shares': 100, 'price': 70.44}]
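
One detail worth knowing when grouping this way: only indexing with square brackets triggers the factory. The sketch below uses a cut-down, hypothetical `byname` (one IBM entry and a made-up "GOOG" key) to show that membership tests and `.get()` leave the dictionary untouched, while `byname[key]` stores a new empty list.

```python
from collections import defaultdict

byname = defaultdict(list)
byname["IBM"].append({"name": "IBM", "shares": 50})

# Membership tests do not call the factory.
print("GOOG" in byname)    # False

# .get() does not call it either; it returns None for a missing key.
print(byname.get("GOOG"))  # None

# Indexing does: an empty list is created and stored under the key.
byname["GOOG"]
print(list(byname))        # ['IBM', 'GOOG']
```

So when the goal is only to check whether a name exists, `in` or `.get()` avoids filling the dictionary with empty lists.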

Example with Lambda:

from collections import defaultdict

byname = defaultdict(lambda: 0)
print(byname["missing_key"])  # prints 0 and stores "missing_key" with that value
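
For a default of 0, the built-in `int` can serve as the factory as well, since `int()` returns 0. A small sketch under that assumption: the `holdings` list below is a hypothetical subset of the portfolio data reduced to (name, shares) pairs, and it sums shares per name.

```python
from collections import defaultdict

# Hypothetical subset of the portfolio, reduced to (name, shares) pairs
holdings = [("IBM", 50), ("MSFT", 200), ("IBM", 100)]

# int() returns 0, so this behaves like defaultdict(lambda: 0)
total_shares = defaultdict(int)
for name, shares in holdings:
    total_shares[name] += shares  # a missing name starts from 0

print(dict(total_shares))  # {'IBM': 150, 'MSFT': 200}

# A lambda is still the way to get any other literal default
status = defaultdict(lambda: "unknown")
print(status["IBM"])  # unknown
```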
