JavaScript Iterators and Generators: Asynchronous Iterators


Andrea Simone Costa ・ 9 min read

Introduction

A fresh addition to the JavaScript language (we are talking about ES2018), Asynchronous Iterators, together with the corresponding Asynchronous Generators, landed to solve a subtle but important problem.

We have seen that each iteration step performed with a synchronous iterator returns the {done, value} object, which is called iterator result, where the done field is a boolean flag indicating whether the end of the iteration has been reached. Therefore, they are perfect for synchronous data sources.
What is meant by a synchronous source? It is a data source that, when it receives the request for the next element, is immediately able to determine whether that element will be the last one available.

Pay attention to this detail: the source is synchronous, not the data, which can be synchronous or asynchronous.
Quick demonstration with an array of numbers and Promises:

const array = [
    new Promise(res => setTimeout(res, 1000, 1)),
    new Promise(res => setTimeout(res, 2000, 2)),
    new Promise(res => setTimeout(res, 3000, 3)),
    new Promise(res => setTimeout(res, 4000, 4)),
    5,
    6,
    7,
    8,
]


;(async () => {

    for(const v of array) {
        console.log(await v); // 1 2 3 4 5 6 7 8
    }

})();

The array provides a synchronous iterator, and that is enough because, even if the returned values may be asynchronous, the collection is always able to determine its own state synchronously (at the time of each request). Each time the next method is implicitly called by the for-of loop, the source knows whether the element it is going to return will be the last, so it is able to set the done field immediately.

Note that this is possible only if the collection is completely present in memory. But you will know better than I do that it's common to have to interface with external data sources. They are usually represented by an entity that exposes an asynchronous API based on the concept of events or, thanks to some layers of abstraction, on the concept of streams. Unfortunately, synchronous iterators cannot be used to interface with them, because this type of iterator forces us to determine the end of the iteration synchronously.
Those entities do not hold the data themselves (perhaps a small part of it, but rarely all of it, because that is often not physically possible), so when an element is requested they cannot know whether the next one will be the last.

This is where asynchronous iteration comes into play, for which the resolution of the done flag is asynchronous.
 

Asynchronous Iteration

This type of iteration is based on two new interfaces.
 

The AsyncIterable

The AsyncIterable interface defines what an entity has to implement to be considered an async iterable.
The specification says that the @@asyncIterator method, which returns an object implementing the AsyncIterator interface, is required.
What is @@asyncIterator? It is a specific Symbol, just like @@iterator was, and we can find it on the Symbol constructor: Symbol.asyncIterator.

const AsyncIterable = {
    [Symbol.asyncIterator]() {
        return AsyncIterator;
    }
}

 

The AsyncIterator

The big difference between an AsyncIterator and a sync one is what the three methods (next, return, throw) should return: a Promise. Their purpose has basically remained the same.
Usually, both the next and return methods should return a promise that is going to fulfil with an IteratorResult object. On the contrary, the promise returned by the throw method should be a rejected one, with the value passed as the argument being the rejection reason.

const AsyncIterator = {
    next() {
        return Promise.resolve(IteratorResult);
    },
    return() {
        return Promise.resolve(IteratorResult);
    },
    throw(e) {
        return Promise.reject(e);
    }
}

 

The IteratorResult

There isn't an async counterpart for this interface: the old IteratorResult has everything we need to identify each iteration result. Indeed, we simply wrap it inside a Promise to be able to resolve the done flag asynchronously.
The only thing to keep in mind is a limitation concerning the value field: it should be neither a promise nor a thenable. Allowing that would dangerously resemble a promise (the one returned by the AsyncIterator methods) of a promise (the one inside the value field of the IteratorResult), a concept from which JavaScript has always kept its distance.
On the other hand, always finding a plain, already-settled value inside the fulfilled IteratorResult ensures greater temporal consistency between iterations.
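To make the limitation concrete, here is a minimal sketch of an async iterator that awaits its async work inside next, so the fulfilled IteratorResult always carries a plain, already-settled value (the fetchNumber helper and makeAsyncIterator are hypothetical names, used only for illustration):

```javascript
// hypothetical async data source: resolves i * 2 after a short delay
const fetchNumber = i =>
    new Promise(res => setTimeout(res, 10, i * 2));

function makeAsyncIterator(limit) {
    let i = 0;
    return {
        async next() {
            if (i >= limit) {
                return { done: true, value: undefined };
            }
            // await here, inside 'next': the consumer never
            // receives a Promise as the 'value' field
            const value = await fetchNumber(i++);
            return { done: false, value };
        },
        [Symbol.asyncIterator]() {
            return this;
        }
    };
}
```

Because the awaiting happens before the IteratorResult is built, each fulfilled result is final: the consumer never has to unwrap the value a second time.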
 

Asynchronous Iterators

Let's implement an async iterators factory to iterate over a remote API:

function remotePostsAsyncIteratorsFactory() {
    let i = 1;
    let done = false;

    const asyncIterableIterator = {
        // the next method will always return a Promise
        async next() {

            // do nothing if we went out-of-bounds
            if (done) {
                return Promise.resolve({
                    done: true,
                    value: undefined
                });
            }

            const res = await fetch(`https://jsonplaceholder.typicode.com/posts/${i++}`)
                                .then(r => r.json());

            // the posts source is ended
            if (Object.keys(res).length === 0) {
                done = true;
                return Promise.resolve({
                    done: true,
                    value: undefined
                });
            } else {
                return Promise.resolve({
                    done: false,
                    value: res
                });
            }

        },
        [Symbol.asyncIterator]() {
            return this;
        }
    }

    return asyncIterableIterator;
}

I'm sure there's nothing here you aren't able to understand. The next method will always return a Promise, as the interface requires. The Promise will be fulfilled after the data fetching, thanks to which we are able to know when the iteration is over.
Note that I've added the @@asyncIterator method to the returned iterator. That's because all async iterators should also be async iterables, following the example of their sync counterparts.

Let's use it:

;(async() => {

    const ait = remotePostsAsyncIteratorsFactory();

    await ait.next(); // { done:false, value:{id: 1, ...} }
    await ait.next(); // { done:false, value:{id: 2, ...} }
    await ait.next(); // { done:false, value:{id: 3, ...} }
    // ...
    await ait.next(); // { done:false, value:{id: 100, ...} }
    await ait.next(); // { done:true, value:undefined }

})();

I think the code is sufficiently self-explanatory.
 

The for-await-of loop

The async counterpart of the for-of loop is the for-await-of loop, which helps us a lot to iterate over async sources without having to manually handle each async IteratorResult, nor the async end of the iteration.
It can be used only inside async contexts, like an IIAFE (Immediately Invoked Async Function Expression), and it is able to handle sync sources too. First of all, it will try to call the @@asyncIterator method to get an async iterator to iterate, but it will fall back on the @@iterator method when the source given to it is synchronous.

;(async function IIAFE() {

    for await (const v of source) {
        console.log(v);
    }

})();

 

Let's see some examples to learn how this loop behaves:

    // sync source, sync values
    // each iteration will return '{ value:number|undefined, done:boolean }'
    for await (const v of [1, 2, 3, 4, ...]) {
        console.log(v); // 1 2 3 4 ...
    }

    // sync source, async values
    // each iteration will return '{ value:Promise<number>|undefined, done:boolean }'
    const array = [
        new Promise(res => setTimeout(res, 1000, 1)),
        new Promise(res => setTimeout(res, 2000, 2)),
        new Promise(res => setTimeout(res, 3000, 3)),
        new Promise(res => setTimeout(res, 4000, 4)),
        ...
    ]
    for await (const v of array) {
        console.log(v); // 1 2 3 4 ...
    }


    // async source, sync values
    // each iteration will return 'Promise<{ value:number|undefined, done:boolean }>'
    for await (const v of asyncSource) {
        console.log(v); // 1 2 3 4 ...
    }

    // async source, async values (BAD)
    // each iteration will return 'Promise<{ value:Promise<number|undefined>, done:boolean }>'
    for await (const v of asyncSource) {
        console.log(v); // series of Promises...
    }

Probably one or more results may sound strange to you, so let's try to make things clearer.
 

Async sources

For async sources, the loop will simply await each Promise returned by the implicit calls to the next method. When the Promise is fulfilled, if the done flag is false, the loop will make the value available inside its body, whatever it is, then proceed with the following iteration at the end of the body.
No other operations are performed on the value itself, and this explains why in the third example we see a series of numbers and in the fourth a series of Promises. Another good reason not to use Promises as values in async iterations! If instead the done flag is true, the loop ends.
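The asyncSource used in the third example is never defined in the article; a minimal sketch of what it could look like (purely an assumption, for illustration) is an async iterable whose next always returns a Promise of an IteratorResult carrying plain numbers:

```javascript
// an async iterable yielding the plain numbers 1..4
const asyncSource = {
    [Symbol.asyncIterator]() {
        let i = 0;
        return {
            next() {
                i += 1;
                // the IteratorResult is wrapped in a Promise,
                // but its 'value' field is a plain number
                return i <= 4
                    ? Promise.resolve({ done: false, value: i })
                    : Promise.resolve({ done: true, value: undefined });
            }
        };
    }
};

(async () => {
    for await (const v of asyncSource) {
        console.log(v); // 1 2 3 4
    }
})();
```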
 

Sync sources

The behaviour of the for-await-of loop with sync sources is slightly different from what one might expect. You could think that each IteratorResult object is directly adapted, being inserted into an immediately fulfilled Promise, to eliminate any difference between sync and async iteration results. But if this were the case, the outcome of the second example would be the same as the fourth one.

You are not very far from the truth, but things are slightly different. It's the sync Iterator itself which is adapted, thanks to the CreateAsyncFromSyncIterator abstract operation. What happens is that each iterated value is normalized into a Promise, via Promise.resolve, to then be awaited to produce the IteratorResult.
We can outline what happens behind the scenes, at each iteration, in the following way, which I've derived from Dr. Axel Rauschmayer's one:

// the for-await-of has just called 'adapter.next()' and is 'awaiting' the result

try {
    const syncIteratorResult = syncIterator.next();

    const nextIteratorResultPromise = Promise.resolve(syncIteratorResult.value)
        .then(value => ({ value, done: syncIteratorResult.done }));

    return nextIteratorResultPromise; // <-- this will be 'awaited' by the for-await-of
} catch(e) {
    // the loop is going to throw an exception if something goes wrong during the 'next' method call
    throw e;
}

Another great way to see what happens, which I'm going to borrow from Axel, is the following: Iterable<T> and Iterable<Promise<T>> become AsyncIterable<T>.
 

Node.js Streams

Node.js Readable Streams are a more concrete example of async iterables. That is because they were built to support consumers slower than their producers, so they are able to pause the flow of data whenever necessary. Usually all this goes unnoticed, well hidden by the pipe method:

readableStream.pipe(writableStream);

But we can also take explicit control of the stream, requesting chunks of data only when we want to:

const readableStreamAsyncIter = readableStream[Symbol.asyncIterator]();

await readableStreamAsyncIter.next(); // first chunk
// other async stuff
await readableStreamAsyncIter.next(); // second chunk

Readable Streams cannot implement the synchronous iteration interfaces because they interact asynchronously with external resources, like files. The point is that it's not a single, long interaction: it is spread out over time. That is because streams do not load the whole file, but only chunks of it, which flow to the consumer. Having limited knowledge of the file itself, they are almost never able to resolve the done flag synchronously.
 

The consumer pressure problem

Let's consider a generic async source:

const ait = asyncSource[Symbol.asyncIterator]();

What will happen if we do like this?

ait.next().then(...);
ait.next().then(...);
ait.next().then(...);

Each call to the next method will cause the async source to start an async task to provide a result, but the main problem is that these tasks will run in parallel, not sequentially. That is because the async iterator was moved forward synchronously.

We could say that the consumer is putting too much pressure on the producer. Odds are that the latter is unable to deal with it because:

  1. Each async task could be the last, ending with done:true. All the async tasks started after that one shouldn't do any work, ending as soon as possible with {value:undefined, done:true}. Unfortunately, if the tasks were started concurrently, chances are that at a certain point some of them will be doing completely useless work, wasting resources and probably causing problems, because one of the others has already completed the iteration. And most likely they won't even finish correctly, failing to report the out-of-bounds status.
  2. Leaving aside the end-of-iteration problem, let's focus on the results. What if the async source, to compute the result of each async task, needs the ending value of the previous one? Think, for example, about cursor-based pagination. If tasks can be started concurrently, it's impossible to create well-formed async iterables for these scenarios.

The truth is that we need a way to force the iteration to be sequential, ensuring time consistency between next, throw and return calls, whether sync or async. Doing so, we'll also avoid the unfortunate, conceptually wrong situation where a call to one of those iteration methods finishes before a previous one.

Since we will never be able to prevent a consumer from messing with the iteration's methods, we have to enqueue the calls to them, with their respective arguments, if any. In this way, the async source will be able to properly handle them one after another.
At this point things get quite complicated, but we don't have to worry, because async generators have this feature out-of-the-box.
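To give an idea of the machinery (a hand-rolled sketch of the queueing idea, not what async generators actually do internally; sequentialize is a hypothetical helper name), every call to next can be chained onto the previous one, so the underlying iterator is always moved forward strictly in sequence:

```javascript
function sequentialize(asyncIterator) {
    // the tail of the queue of pending calls
    let queue = Promise.resolve();
    return {
        next(...args) {
            // the real 'next' runs only after the previous call settles
            const result = queue.then(() => asyncIterator.next(...args));
            // swallow errors here only to keep the queue alive;
            // the consumer still receives the rejection via 'result'
            queue = result.catch(() => {});
            return result;
        },
        [Symbol.asyncIterator]() {
            return this;
        }
    };
}
```

Even if a consumer fires several next calls synchronously, the wrapped source now sees them strictly one after another.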

 

Conclusion

That's all you should know about Asynchronous Iterators!

We have learnt why the ability to resolve the done flag asynchronously can be vital in some circumstances. The fresh async iteration interfaces are here to help us reach that goal, and now you know all their main features and best practices.

Then we saw an example of a simple async iterator, how the for-await-of loop behaves and why Node.js Readable Streams support async iteration. We also spent some words on a rarely considered but important problem that is very well solved by the next, and last, big topic: Asynchronous Generators.

I hope to see you there again 🙂 and on twitter!

 

Acknowledgements

I would like to thank Marco Iamonte for the time he spent helping me to identify a lot of grammatical errors.

 


Discussion

 

Oh wow, this is next level! Also TIL about for-await-of; neat. Do you have some examples of where this can be really helpful in real world code? I'm wondering when I should think to maybe reach for this instead of cobbling something else together 😃 Thanks!

 

Thanks, I'm glad you liked the article 😃

Generally, async iteration is well suited for all those circumstances where consumers should have control over the flow of data.
Next time I'll show some more concrete examples thanks to async generators, but for now a good starting point to satisfy your needs is surely Node.js Readable Streams. WHATWG Streams are async iterables too.

 

I was thinking pagination could be a use case candidate for async iteration.

const page = await pagination[Symbol.asyncIterator]().next();
 

Yes of course. I've already briefly mentioned it in the 'The consumer pressure problem' section 😃

 

I think we can also write it down in this succinct way. But correct me if I'm wrong

const pagination = {
  async *[Symbol.asyncIterator]() {
    let i = 0;
    while (i < 10) {
      await Promise.resolve(true);
      i++;
      yield i;
    }
  }
};

 

Yes, but async gens are the topic of the next article eheheheh

 
 

A while back I also wrote a library to manipulate async streams with async iterables. You might find it interesting. It was in the context of this article.
It is a very nice series you wrote here. However, I think you should stress more the fact that closing an iterable is a leaky abstraction (as Reginald "Raganwald" Braithwaite put it). You did a bit in the second article, but I think it deserves more attention. It is usually better to use built-in consumers (such as the for await statement) to avoid problems.

 

It could seem a stupid answer but I strongly believe in well-commented programs, so what is not immediately understandable should be highlighted with a proper comment/explanation.

I am with you, I think that custom implementations of interfaces like these should be kept to a minimum but...there is always an exception.
Like here: github.com/nodejs/node/blob/master...
One of the Node.js streams maintainers told me that an async gen would be too slow and hard to implement and maintain, so they chose to manually implement the async iteration interfaces.

 

I don't think we are talking about the same thing. I think it is totally fine to implement these interfaces yourself: usually when you implement a producer in a "low level" way you know what you are doing (cf Nodejs core maintainers).

I am talking about the code which consumes it, depending on the way you consume it, you might create leaks.

Consider the following producer (from my article I linked above):

const wait = delay => new Promise(resolve => {
    setTimeout(() => resolve(), delay);
});

const counterGen = async function* (limit = 10, delay = 100) {
    let iter = 1;
    try {
        while (true) {
            if (iter > limit) {
                break;
            }
            await wait(delay);
            yield iter;
            iter++;
        }
    } catch (e) {
        console.log('oops something is wrong');
        throw e;
    } finally {
        console.log('I have been released !!!');
    }
};

You could have written it by implementing the interfaces, it does not matter.

Just note it is doing some cleaning ("I have been released !!!"); it could be releasing a file handle or whatever.

Now let's say you want to consume it and sum the 3 first values.

You can do it in a naive way

const sum = async iterator => {
    let i = 0;
    let sum = 0;
    while (i < 3) {
        const next = await iterator.next();
        if (next.done) {
            break;
        }
        sum += next.value;
        i++;
    }

    // VERY IMPORTANT IF YOU DO NOT WANT TO CREATE A LEAK
    // iterator.return();

    return sum;
};

sum(counterGen())
    .then(console.log);

And this code creates a leak if you don't pay attention and do not explicitly call return (you will not see the release message)

Whereas if you decide to go for a "native construct" like the for await statement, you are safe:

const sum = async iterator => {
    let i = 0;
    let sum = 0;
    for await (const v of iterator) {
        if (i >= 3) {
            break;
        }
        sum += v;
        i++;
    }
    return sum;
};

sum(counterGen())
    .then(console.log);

I think tutorials and articles on iterators (and async iterators) do not stress that potential issue enough.

This was my point :)

Uh I've totally misunderstood your previous message!

Yes, you are right, the for-of and the for-await-of should always be preferred 🙂