loading...

De-throning the List: Part π

robinheghan profile image Robin Heggelund Hansen Updated on ・4 min read

In the last post, we looked at how we could alter the implementation of Array to make it a more general purpose than it already is (like, five stars). In this post, we’ll see if we can make it more extensible.

Extensible? What does that mean?

Let’s see how we would implement a function like find for Array. find takes a predicate function and a collection, and returns the first item in the collection for which the predicate returns true, or Nothing if the predicate never returns true.

The only way to implement this for Array is by using foldl or foldr:

find : (a -> Bool) -> Array a -> Maybe a
find pred array =
    Array.foldl
        (\item result ->
            if result == Nothing && pred item then
                Just item
            else
                result
        )
        Nothing
        array
Enter fullscreen mode Exit fullscreen mode

While this implementation does what we want, it isn’t that efficient. When this implementation finds what we’re looking for, it will keep iterating through all the remaining items in the collection. Depending on the size of the array, this can be very slow.

Is List any better? Yes, yes it is:

find : (a -> Bool) -> List a -> Maybe a
find pred list =
    case list of
        [] ->
            Nothing

        x :: xs ->
            if pred x then
                Just x
            else
                find pred xs
Enter fullscreen mode Exit fullscreen mode

This implementation does exactelly what we want. This is a benefit of List being a simple data structure, it’s easy to pattern match on it to get the functionality we want. I explained briefly in my first post why the internal structure of Array is not exposed like this, but I’ll say it again: it’s simply not simple, and it’s easy to make mistakes that can cause hard to find bugs. Now, before you go all «Off with Arrays head», let’s explore how we can get this same extensibility for Array.

Looking at the List version of find, what we need is a way to iterate through the collection but be able to say «Hey! That’s my horse!» at any point and cut the iteration short. We could borrow an idea from Common Lisp and add a new function to the Array API which allows us to signal that we’re done iterating. Such a function, called stoppableFoldl in the example below, would allow us to implement find like this:

find : (a -> Bool) -> Array a -> Maybe a
find pred array =
    Array.stoppableFoldl
        (\item result ->
            if pred item then
                Done (Just item)
            else
                Continue result
        )
        Nothing
        array
Enter fullscreen mode Exit fullscreen mode

That doesn’t look so bad. Let’s see how these implementations stack up against each other, performance wise. The benchmarks below were run on a mid-2012 Macbook Air, using the latest version of Chrome. The benchmark creates a list and an array with 100 elements, and tries to find a value in the middle of the collection. The code can be found here.

List:   402,870 ops/sec
Array:  227,745 ops/sec
% diff: -43.47% 
Enter fullscreen mode Exit fullscreen mode

Auch. It’s seems List is still faster. It’s probably a combination of calling a function for every item in the Array, as well as performing an extra allocation per item (Done/Continue).

"But Robin," you say. "Doesn't this benchmark prove that List is better than Array? Is the reason behind this blog post taking so long to publish that you've been wandering around town soothing your bruised pride with alcohol?"

Yes. I mean no! My argument in this series has been that Array should be the default because it's a more general purpose data structure when compared to the List. This doesn't mean that Array will beat List in all benchmarks, but that it will have decent performance for most operations.

For instance, we can just as well provide a stoppableFoldr function to make it possible to implement findLast. The performance for find and findLast would be about the same. This would not be the case for List, which would have worse performance for findLast.

That’s the point I’m trying to make: List is very good at a few things, Array is fine for most things. Even Array makes use of List in it’s internal implementation (take a look at how Array.filter is implementated). For most applications, the Array is a better default. So maybe Array shouldn’t outright replace List, but simply be the data type that is constructed when literal syntax is used (like [1, 2, 3]). It might even be a good idea to rename List to Stack to signal what it really is good at.

What remains?

There are still things you cannot do with Array which are available with List. In the next part of this series, we’ll compare the API exposed by the Array and List module, and see where we need to bridge the gap, and perhaps even where the gap shouldn’t be bridged.

Discussion

pic
Editor guide