DEV Community

RobinFiveWords
RobinFiveWords

Posted on

Would it be silly to write my own defaultdict?

TL;DR — Yes? Python's defaultdict is exactly the right tool for the job I call upon it to do.

Python

defaultdict is part of Python's standard library. Python is already a high-level language, so it seems arbitrary to say I'll allow myself to use some but not all parts of it as given to me.

I do, however, like when I know everything that something does, and I know it doesn't do anything else. If I'm not mistaken, defaultdict is 300 lines of C code (starting at line 2175 here), and I can't say I understand everything that it does. When I use defaultdict, almost always I am looping through data, aggregating values, and looking to avoid the KeyError the first time a key appears in the data (or having to check whether the key exists each time).

I'm sure millions of people have done this for homework or fun before, but I didn't search for their solutions; I searched for how to make an object "subscriptable", how to add a key to a dict without the subscript syntax (which I'm guessing would infinitely recurse), and how to call the parent class's __init__ method (it had been a while). The following seems to work fine:

class DefaultDict(dict):
    def __init__(self, value_type):
        self.value_type = value_type
        super().__init__()

    def __getitem__(self, key):
        if key not in self:
            self.update({key: self.value_type()})
        return self.get(key)
Enter fullscreen mode Exit fullscreen mode

I probably will never use this, because defaultdict is already the right tool for the job. It throws an exception immediately when the default factory is not callable or None. Someone has already thought through what type the result should have when taking the union of a dict and a defaultdict. It has a nice repr.

Where I find writing my own tool to be most useful is if an existing tool is complex in a way that makes it unclear whether it is the right tool to use. Or if an existing tool solves a bigger problem than I am trying to solve.

Java

I must not have needed a "default HashMap" before, because I just now discovered computeIfAbsent. While I was thinking it through, though, this one-off code seemed straightforward enough:

ArrayList<Integer> list = d.get(key)
if (list == null) {
    list = new ArrayList<Integer>();
    d.put(key, list);
}
Enter fullscreen mode Exit fullscreen mode

The more concise way:

d.computeIfAbsent(key, (x -> new ArrayList<Integer>())).add(value)
Enter fullscreen mode Exit fullscreen mode

I think I'm looking for a way to extend HashMap so that I can write

d.myGet(key).add(value)
Enter fullscreen mode Exit fullscreen mode

and it will run the previous code. This seems somewhat straightforward...

class DefaultMap extends HashMap<Integer, ArrayList<Integer>> {
    ArrayList<Integer> myGet(Integer key) {
        return this.computeIfAbsent(key, (z -> new ArrayList<Integer>()));
    }
}
Enter fullscreen mode Exit fullscreen mode

...except I would need to create a separate class for each set of key-value types. There might be a way to do this with generics, but because (if I understand correctly) classes and functions aren't first-class objects in Java, I'm not sure how that would be done. Here's where I really need to search for an answer.

And of course the top Google result is a link I've visited before, because of course this isn't the first time I've wondered about it. The .class property effectively allows the class of the default value, such as ArrayList, to be passed as a parameter to the constructor. I'm going to have to let this simmer.

Top comments (0)