De-throning the List: Part Boron

robinheghan profile image Robin Heggelund Hansen ・3 min read

In the last post of this series, we talked about extending the API for Array to make it as usable as List. In this post, we'll look at how making Array the default collection could reduce overhead, and thus improve performance.

In Elm 0.18, when the compiler sees an expression like this:

list = [1, 2, 3]
Enter fullscreen mode Exit fullscreen mode

It compiles it into the following javascript code:

var _some_namespace_list = {ctor: '::', _0: 1, _1: {ctor: '::', _0: 2, _1: {ctor: '::', _0: 3, _1: {ctor: '[]'}}}};
Enter fullscreen mode Exit fullscreen mode

This is actually pretty nice, though it looks ugly, as it's theoretically the fastest way of creating a List. Unfortunately, due to the way Chrome works it can also crash the browser if the List literal is big enough. This has to do with the object literals being nested, the javascript engine can only support so much nesting before it crashes. You can read more about this in the following issue.

So in Elm 0.19, to avoid this problem, it will instead compile down to this javascript code:

var _some_namespace_list = _elm_core_fromArray([1, 2, 3]);
Enter fullscreen mode Exit fullscreen mode

This works, but it does mean that every time you have a List literal in your program, Elm will convert it from a javascript array at runtime. This extra conversion, could make your app a little slower.

How much slower?

I thought you'd never ask. It's benchmark time!

This benchmark compares calling List.foldl (\a b -> a + b) 0 with [1,2,3,4,5,6,7,8,9] using Elm 0.18's literal expression (nested objects) and 0.19 runtime conversion. By hand-editing the compiled output, we can get a benchmark of how the equivalent code with an Array literal performs. You can read the benchmark code here.

The benchmark was run on the latest version of Chrome (v66), running the latest version of Mac OS X (High Sierra), on a Mid-2012 Macbook Air.

List literal sum: 873,847 ops/sec
List runtime sum: 1,959,202 ops/sec (124.2% faster than list literal)
Array literal sum: 3,329,392 ops/sec (40.61% faster than list runtime)
Enter fullscreen mode Exit fullscreen mode

These results are pretty interesting. You'll see that runtime conversion is actually significantly faster than having a literal representation in Chrome. This is probably tied together with the runtime crashing issue described earlier. If we were to run this benchmark through Firefox and Safari, you'd see that runtime conversion is between 10-20% slower than literal representation.

The more interesting thing is that an Array literal would be faster still. From running this benchmark on different browsers with different sizes of List literals, I think this has to do with the nesting of objects. Appearently, nested object literals are not good for performance. A List is "just" nested objects, while Array has a very flat structure.

What would an Array literal look like?

If Array had been the default collection type, the compiler could've have turned it into the following javascript code:

var _some_namespace_list = {ctor: 'Array', _0: 3, _1: 5, _2: [], _3: [1,2,3]};
Enter fullscreen mode Exit fullscreen mode

This is shorter than the original literal representation for List. It's also less nested. A List with 1024 elements requires 1024 nested object literals when expressed as javascript, an Array would require 3 nested literals.


Although List literals in Chrome is as slow as an asthmatic ant with heavy shopping, and might even crash the Elm application altogether, Array literals are fast and safe.

Literal syntax for List is used heavily in Elm for stuff like Html nodes, so being able to construct such literals quickly is of huge value. By switching default collection type to Array there is a clear performance win to be had.

All my arguments and thoughts regarding making Array the default collection type have now been made. Read this next post for a summary of what I've written so far, and what I plan to do next.


Editor guide