Posted on

# Solution: Chunked Iterator, Python Riddle

This is a detailed solution to this riddle.

Here it is again: write a function (`chunks`) where the input is an iterator. And the output is an iterator of `n` sized iterators.

The first challenge is that the length of the original iterator is unknown. Let's start with a naive broken solution using `itertools.islice` to create `n` size iterators without caring about the length of the original iterator:

``````import itertools
from typing import TypeVar, Iterator

T = TypeVar('T')

def chunks(iterator: Iterator[T], n: int) -> Iterator[Iterator[T]]:
while True:
yield itertools.islice(iterator, 0, n)  # yield `n` size chunk of the iterator

for chunk in chunks(iter(range(10)), 3):
print(f'A {type(chunk)} type chunk, contains items: {tuple(chunk)}')
``````

The output is:

``````A <class 'itertools.islice'> type chunk, contains items: (0, 1, 2)
A <class 'itertools.islice'> type chunk, contains items: (3, 4, 5)
A <class 'itertools.islice'> type chunk, contains items: (6, 7, 8)
A <class 'itertools.islice'> type chunk, contains items: (9,)
A <class 'itertools.islice'> type chunk, contains items: ()
A <class 'itertools.islice'> type chunk, contains items: ()
.... forever

``````

Using `itertools.islice` we managed to chunk up the original iterator, but we don't know when it is exhausted. And so, `chunks` is a generator function that never ends.

To overcome this problem we need to take one item out of the original iterator. Then, we'll use `itertools.chain` to create a chunk featuring this one item and `n-1` more items.

``````
def chunks(iterator: Iterator[T], n: int) -> Iterator[Iterator[T]]:
for first in iterator: # take one item out (exits loop if `iterator` is empty)
rest_of_chunk = itertools.islice(iterator, 0, n - 1)
yield itertools.chain([first], rest_of_chunk)  # concatenate the first item back

``````

The output is:

``````A <class 'itertools.chain'> type chunk, contains items: (0, 1, 2)
A <class 'itertools.chain'> type chunk, contains items: (3, 4, 5)
A <class 'itertools.chain'> type chunk, contains items: (6, 7, 8)
A <class 'itertools.chain'> type chunk, contains items: (9,)
``````

And That is it!

And in comprehension form (I usually prefer comprehensions, but I must admit that this one is not as readable as the loop form):

``````    return (itertools.chain([first], itertools.islice(iterator, 0, n - 1))
for first in iterator)
``````

### Performance

The runtime overhead of `islice`-ing and `chain`-ing is way smaller than writing your own custom code for `chunks` since both these functions are implemented in `C` (for the `CPython` runtime).

### The Caveat

There is a caveat here: This whole solution assumes that the consumer of `chunks` is consuming the iterators fully and in order. If that is not the case, the order of items in our chunks might not be consistent with the original iterator, due to the laziness of `chunks`.

Example:

``````chunk_iterator = chunks(iter(range(10)), 3)
first_chunk = next(chunk_iterator)
second_chunk = next(chunk_iterator)
print(tuple(second_chunk))  # (1, 2, 3)
print(tuple(first_chunk))  # (0, 4, 5)
``````

Cheers!