Ken Bellows

Posted on Dec 27, 2018 • Edited on Jan 17, 2019

The JavaScript Iteration Protocols and How They Fit In

#javascript #es2015 #es6 #intermediate

One of the coolest, and IMHO most underrated, features introduced by ECMAScript 2015 (ES2015, aka ES6) was the pair of iteration protocols, which define "iterators" and "iterables" in JavaScript. These protocols give us a native way to create custom sorts of containers, lists, and pseudo-kinda-list-ish things, and when combined with two other features introduced in ES2015, the for...of loop and generator functions (function*), they give us some very nice new powers.

Case Study: Linked Lists

For a concrete example to play with, let's look at how we might implement and loop over a Linked List three different ways:

the old-school, non-iterator way
using the iteration protocols
using a generator

If you need a quick refresher on what a linked list is, and are feeling a bit TL;DR about the Wikipedia article I linked up there, here's the basics: a linked list can be thought of as a list of stuff built using a bunch of separately connected nodes, each of which only knows about its own value and the next thing in the list, with a parent object that knows about the start ("head") and end ("tail") of the list. You add to the list by creating a new node, linking the current tail to it, and updating the parent's tail reference. There are a bunch of variations, like doubly linked lists, and they have a bunch of performance advantages over traditional arrays for certain applications, but I'm not going to get into any of that here, because it gets complicated fast; if you aren't familiar with all this, check out the Wikipedia article, and google around for articles and maybe MOOC courses on "data structures".

Linked Lists the Old-School Way

Here's a sort of naïve implementation of a linked list using an ES6 class, but not using iterators:

class LinkedList {
    constructor() {
        this.head = this.tail = null
    }
    push(val) {
        const next = {val, next: null}
        if (this.head === null) {
            this.head = this.tail = next
        }
        else {
            this.tail.next = next
            this.tail = next
        }
    }
    forEach(fn) {
        let curr = this.head
        while (curr !== null) {
            fn(curr.val)
            curr = curr.next
        }
    }
}

// example
const l = new LinkedList
l.push(10)
l.push(20)
l.push(30)
l.forEach(n => console.log(n))

Okay, Let's break this down.

When the LinkedList is first initialized in the constructor(), it has nothing in it, so its head and tail properties are both set to null.

The push() method adds a new element to the list. Each time push() is called, a new object is created to hold the newly added value, with two properties:

a val property to hold the value passed in
a next property to point to the next node in the list

Note that each node's next property is initially set to null, since a node is always created as the last thing in the list so far.

We declare this new node to be the new tail node of the list in two steps:

set the next property of the list's current tail to the new node
set the tail property of the list to the new node

There's also a little extra step in there to check if head is null to handle the very first call to push(), and I'm sure this class could be refactored to avoid the repeated check, but this is just a toy example, so ignore the inefficiency for now.

Now the important part: the forEach() method. This is where we iterate over the linked list's contents. We can't use a traditional for (let i=0; i<list.length; i++) loop to iterate over the nodes, since we don't have direct (aka "random") access to any nodes except the head and the current tail. Instead, we need to start with the head node and walk down the list one node at a time, using the next property of the current node at each step to find the next node, until we hit a null. Now, I chose to write this as a while loop because I think it's easier to read, but this could actually be written as a for loop instead:

forEach(fn) {
    for (let curr=this.head; curr !== null; curr=curr.next) {
        fn(curr.val)
    }
}

Take your pick, they're equivalent.

Now, this code is not too bad, but any code that wants to use your class will have to use the forEach method instead of a nicer construct like a for...of loop. This could make it less compatible with other data types like Arrays. If you were writing some complex processing code based on Arrays, but realized after a while that you were in a circumstance where you should really be using a linked list, it might be discouraging to discover that you need to go back and rewrite a bunch of code that uses for...of loops in order to switch over, and you may decide you don't have time. This may seem like a silly example, and of course this is an intentionally simplistic toy case, but as a general rule, cross-compatibility is a good thing.

So let's refactor and see how we can take advantage of the iteration protocols to make our class for...of loop-compatible.

The Iteration Protocols

First, though, let's take a beat and talk about what these protocols are. There are two of them: the iterator protocol and the iterable protocol. Both are pretty simple, so we're in luck there.

Iterators

The iterator protocol is the more interesting one. In order for an object to qualify as an "iterator", it only needs one thing: a next() method. Each time next() is called, it must return an object with two properties: value, representing the next value to be iterated over, and done, indicating whether there's another iteration left.

Concretely, on each call, if there's at least one value left to be iterated over, the function should return an object like this:

{ value: 'next value here', done: false }

If there's nothing left to produce, the function should return an object like this:

{ value: undefined, done: true }

I'll show you some example code in a minute. But first we need to talk about...

Iterables

The iterable protocol is even simpler than the iterator protocol. Conceptually, an iterable is any object that can produce an iterator when needed. Technically speaking, an object counts as an iterable if it has a method with a special name (hold on a sec) that, when called, returns an iterator, as defined above.

Now, about that special name. Another underrated feature of ES2015 was the introduction of a new primitive type, symbol. There's plenty to talk about here, but long-story-short, Symbols can be used as globally-unique Object keys to make sure everyone is talking about the same thing, and not two different ideas with the same name. (There's a lot more to talk about with Symbols, and I highly recommend reading the Mozilla Hacks blog's article, ES6 In Depth: Symbols, and also the rest of the ES6 In Depth series, actually.)

The point for us is that there are a handful of built-in, spec-defined Symbols used to implement protocols, such as the iterable protocol, which uses the global key Symbol.iterator to identify the method that returns an iterator. Here's a trivial class that creates an iterable to loop over the args passed to the constructor:

class ArgsIterable {
    constructor(...args) {
        this.list = args
    }
    [Symbol.iterator]() {
        const list = this.list
        let i=-1
        return {
            next() {
                i += 1
                if (i<list.length) {
                    return { value: list[i], done: false }
                }
                else {
                    return { done: true }
                }
            }
        }
    }
}

So how does this work? Let's step through it:

const iterable = new ArgsIterable(1,3,5,7)
const iterator = iterable[Symbol.iterator]()
console.log(iterator.next())
console.log(iterator.next())
console.log(iterator.next())
console.log(iterator.next())
console.log(iterator.next())
console.log(iterator.next())
/* output:
{value: 1, done: false}
{value: 3, done: false}
{value: 5, done: false}
{value: 7, done: false}
{done: true}
{done: true}
*/

The first 4 times iterator.next() is called, we get a value in the array, and we're told that we haven't reached the end yet. Then once we reach the end, we start always sending {done: true}.

The key advantage to this approach is that the for...of loop understands this protocol:

for (const n of new ArgsIterable(1,3,5,7)) {
    console.log(n)
}
/* output:
1
3
5
7
*/

If this seems like a lot of work, you're not wrong, but there's a solution: generators. But we'll get to that in a minute. For now, let's get back to our LinkedList class.

Iterable Linked Lists

Now that we understand how iterators and iterables work, let's turn our class into an iterable.

class LinkedList {
    constructor() {
        this.head = this.tail = null
    }
    push(val) {
        const next = {val, next: null}
        if (this.head === null) {
            this.head = this.tail = next
        }
        else {
            this.tail.next = next
            this.tail = next
        }
    }
    [Symbol.iterator]() {
        let curr = this.head
        return {
            next() {
                if (curr === null) {
                    return { done: true }
                }
                else {
                    const next = { value: curr.val, done: false }
                    curr = curr.next
                    return next
                }
            }
        }
    }
}

// example
const l = new LinkedList
l.push(10)
l.push(20)
l.push(30)
for (const n of l) console.log(n)
/* output:
10
20
30
*/

Not too horrible, right? [Symbol.iterator]() returns an object with a next() method, with a local variable curr to keep track of the current node, just like we had in our forEach() method earlier. Each time next() is called, we check whether curr is null. If so, we let the caller know that we're done; if not, we prepare our response object, move curr one node down the list to prepare for the next iteration, then return our response object. Kind of a less-controlling version of forEach(), where the user can grab the next item in the list whenever they're ready. And if you run the example code at the end there, you'll see that instances of our LinkedList class just work with for...of loops now! How cool is that?

Array spread for free!

If you aren't convinced, let me show you a very nice perk that comes along for free when you implement the iterable protocol: spreading into an Array with the ES2015 spread operator! If you need to use a linked list for your main processing, but want an array with the results, maybe to run some array methods on, you're in luck! Just spread your LinkedList instance into an array:

const list = new LinkedList
list.push(10)
list.push(20)
list.push(30)
list.push(50)
// magic!
const final = [...list].map(n => n*2).filter(n => n%3 === 0)[0]
console.log(final)
// output: 60

This is because the spread operator, just like the for...of loop, relies on the iterable protocol under the hood to generate the contents of the resulting Array.

As I mentioned above, this might still feel like a lot of mental effort and code without that much benefit. But as I also mentioned, there is a solution:

Generators

Another of my favorite underrated ES2015 features, generators are often referred to in tutorials as "pauseable functions". This is a pretty intuitive way to think about them, but I'd adjust slightly: I'd rather call them pauseable iterables. Let's take a look at a simple example, then I'll explain:

function* countTo(n) {
    for (let i=1; i<=n; i++)
        yield i
}

// example
for (const n of countTo(5))
    console.log(n)
/* output:
1
2
3
4
5
*/

As you may have guessed, the key here is the yield keyword. The first time through the for...of loop, the generator function runs from the top until it hits that yield i, at which point it returns the value of i (sorta; bear with me), and "pauses" the function there, hence the "pauseable" descriptor. The next time through the loop, it picks up right where it left off and continues until it hits another yield, when it pauses again. This continues until the function doesn't hit a yield, but instead reaches a return statement or, in our case, the end of the function. But how exactly does it communicate all this with the for...of loop? Doesn't this loop expect an iterable?

If you call countTo(5) directly and look at the result, you'll see something very interesting. Here's what I get when I poke a bit in Chrome's dev tools:

Poking at a generator in Chrome's dev tools

> x = countTo(5)
  countTo {<suspended>}
> x.next
  f next() { [native code] }
> x[Symbol.iterator]
  f [Symbol.iterator]() { [native code] }

The important thing here is that calling a generator doesn't return a value directly: it returns an object that the engine describes as "suspended", meaning that the generator function's code hasn't been run yet. Interestingly, the object has both a next() method and a [Symbol.iterator] method. In other words, it returns an object that is both an iterable and and iterator!

This means that generators can be used both as standalone sequence generators, like the countTo(n) method above, and as a really easy way to make your object iterable!

Linked Lists with Generators!

Let's loop back around once more to our LinkedList class and replace our custom [Symbol.iterator] method with a generator:

class LinkedList {
    constructor() {
        this.head = this.tail = null
    }
    push(val) {
        const next = {val, next: null}
        if (this.head === null) {
            this.head = this.tail = next
        }
        else {
            this.tail.next = next
            this.tail = next
        }
    }
    *[Symbol.iterator]() {
        let curr = this.head
        while (curr !== null) {
            yield curr.val
            curr = curr.next
        }
    }
}

// example
const l = new LinkedList
l.push(10)
l.push(20)
l.push(30)
for (const n of l) console.log(n)
/* output:
10
20
30
*/

Two things about the [Symbol.iterator] method. First, notice that we had to tack an asterisk on the front of it to indicate that it's a generator function. Second, and most importantly, look at the body of the method: does that look familiar? It's almost exactly the same code as the forEach() method from earlier, just swapping out a callback with the yield keyword!

Because a generator returns an object that implements the iterator protocol, generators make it so easy to make your object iterable! You can use all sorts of interesting storage patterns and traversal algorithms, and it doesn't matter: generators make it easy!

One more example: ImageData

For perhaps a more concrete example, I'd like to talk for a minute about the Canvas. I personally love messing around with image manipulation using the HTML5 Canvas element. You can load up an image using the native Image object, then paint it to the canvas, grab its ImageData object, and directly manipulate pixel values. But there's a catch with ImageData: it's raw pixel data as stored by the computer, meaning that instead of being stored as an array of pixels, something like: [{r:255,b:128,g:0,a:255},...], it's a single long, flat array of bytes, like: [255, 128, 0, 255, ...]. This means that to loop over the pixels, you usually need to do something like this:

for (let i=0; i<imgData.length/4; i++) {
    const p = i*4
    const pixel = {
        r: imgData[p],
        g: imgData[p+1],
        b: imgData[p+2],
        a: imgData[p+3]
    }
    processPixel(pixel)
}

This is... okay, but it's annoying to write out repeatedly if you need to do it a bunch, and it's pretty weird as a util function that takes a callback:

function processPixels(imgData, processPixel)
    for (let i=0; i<imgData.length/4; i++) {
        const p = i*4
        const pixel = {
            r: imgData[p],
            g: imgData[p+1],
            b: imgData[p+2],
            a: imgData[p+3]
        }
        processPixel(pixel)
    }
}

Callbacks... gross 😢

Another option is to loop over the ImageData buffer and convert it to an array first, then use a for...of loop over the array to make it more readable, but given how large images are these days, that's a huge waste of memory.

So what if we instead wrote a little generator function to let us more easily loop over the array without wasting a ton of memory? This is a great benefit of generators: they feel like you're just iterating over an array, but in fact, only a single element exists in memory at a time!

function* getPixels(imgData) {
    for (let i=0; i<imgData.length/4; i++) {
        const p = i*4
        const pixel = {
            r: imgData[p],
            g: imgData[p+1],
            b: imgData[p+2],
            a: imgData[p+3]
        }
        yield pixel
    }
}

for (const pixel of getPixels(imgData)) {
    // process pixel
}

Clean and simple!

Conclusion

The thing that impressed me most about the ES2015 spec, more even that the nice new features themselves, is how much thought went into crafting features that worked together in really nice ways to make JavaScript a deeply cohesive language. The class syntax, the iteration protocol, for...of loops, generators, Symbols, and the array spread operator are all features that were added in ES2015, and they all fit together so smoothly. It's a really impressive feat, and it's only gotten better with ES2016-2018. I've been very impressed by the TC39 proposal process and the features that have emerged from it. I hope it stays this way! It's these sorts of features that get me psyched for the future of JavaScript and the web.

DEV Community