DEV Community

Cover image for Iterating through trees using the yield statement
Simon Richard
Simon Richard

Posted on

Iterating through trees using the yield statement

Data is often organized using tree like structures; JSON, XML, and File Directories are good examples of this. In this post, I would like to talk about how the keyword yield (in combination with recursion) can be used to iterate easily through complex1 tree like structures.

1 By complex, I mean multi-typed and messy.

The yield Statement

For those who are not familiar with yield statements in python, here's a quick explanation.

The yield statement allows a function to return a value and then continue running. In practice, this means turning the function into an iterator. For example:

def foo():
    n = 0
    while True:
        yield n
        n += 1

my_iter = foo()
my_iter.next()
> 0
my_iter.next()
> 1
my_iten.next()
> 2
...
Enter fullscreen mode Exit fullscreen mode

We can also use a for loop.

my_iter = foo()
for n in my_iter:
    print(n)
> 0
> 1
> 2
...
Enter fullscreen mode Exit fullscreen mode

Recursion

Recursion occurs when a function is executed inside of itself. This idea is especially helpful in relation to self similar structures, like trees. However, that function needs an exit point in order to actually be useful. Otherwise, it would run forever and return nothing. Let's look at some examples.

This function will run forever.

def foo(x):
    return 1 + foo(x)
Enter fullscreen mode Exit fullscreen mode

This function will not run forever. Notice that it has an exit point, which is used when x >= 5.

def bar(x):
    if x < 5:
        return var(x+1)
    else:
        return x
Enter fullscreen mode Exit fullscreen mode

This function prints all of the items of a nested array.

def print_nested_array(x):
    if isinstance(x, list):
        for item in x:
            print_nested_array(item)
    else:
        print(x)
Enter fullscreen mode Exit fullscreen mode

That one's a little more complicated; let's talk about it.

The first value that's passed through this function is probably going to be an array (since we called the function print_nested_array). In that case, the function will loop through that array and pass each of its items back into the function.

For each item that is not an array, the function will print its string representation to the console. That's our exit point.

If one of those items is also an array, it will go through the same process the initial array went through. Each of its items will be passed back into the function.

Let's test it.

arr = [ [1,2], 3, [[4], 5, 6], 7]
print_nested_array(arr)
Enter fullscreen mode Exit fullscreen mode
1
2
3
4
5
6
7
Enter fullscreen mode Exit fullscreen mode

Using Recursion and yield

The yield statement allows us to pass each of the items in our nested array back to the first instance of our function (the one that was not called using recursion), making it easier for us to access those values later on.

To implement this, we'll only need to do few things to the previous function we used (shown below for convenience).

def print_nested_array(x):
    if isinstance(x, list):
        for item in x:
            print_nested_array(item)
    else:
        print(x)
Enter fullscreen mode Exit fullscreen mode

First, we'll replace the print statement with a yield statement.

Then, since our function now returns an iterator, we'll also have to handle that inside our for loop. In this case, instead of just calling our function again, we'll use another for loop to iterate through the iterator it returns. Each of the items in that loop will also be yielded.

def parse_array(x):
    if isinstance(x, list):
        for item in x:
            for result in parse_array(item):
                yield result
    else:
        yield x
Enter fullscreen mode Exit fullscreen mode

Let's test it.

arr = [
    "foo",
    "bar",
    ["asdf", 34],
    [[], [1, 2]],
    [3, 4, 5, 6, 7],
    True
]

for item in parse_array(arr):
    print(item)
Enter fullscreen mode Exit fullscreen mode
foo
bar
asdf
34
1
2
3
4
5
6
7
True
Enter fullscreen mode Exit fullscreen mode

We could have done that with our previous function. However, we can also cast the iterator the function returns into a list.

list( parse_array(arr) )
Enter fullscreen mode Exit fullscreen mode
["foo", "bar", "asdf", 34, 1, 2, 3, 4, 5, 6, 7, True]
Enter fullscreen mode Exit fullscreen mode

Let's modify the function so we can print a nicely formatted representation of the array.

def print_array(x):
    if isinstance(x, list):
        for item in x:
            for result in print_array(item):
                yield "  "+str(result)
    else:
        yield str(x)

for row in print_array(arr):
    print(row)
Enter fullscreen mode Exit fullscreen mode
  foo
  bar
    asdf
    34
      1
      2
    3
    4
    5
    6
    7
  True
Enter fullscreen mode Exit fullscreen mode

What if our array also had dictionaries in it? What if it was a dictionary? We can add an elif statement to handle that.

def parse_something(x):
    if isinstance(x, list):
        for item in x:
            for result in parse_something(item):
                yield result
    elif isinstance(x, dict):
        for key in x:
            for result in parse_something(x[key]):
                yield result
    else:
        yield x
Enter fullscreen mode Exit fullscreen mode

To print a nicely formatted version of our array/dictionary, we could use this function.

def print_something(x):
    if isinstance(x, list):
        for item in x:
            for result in print_something(item):
                yield "  "+str(result)
    elif isinstance(x, dict):
        for key in x:
            yield ""
            yield key+":"
            for result in print_something(x[key]):
                yield "  "+str(result)
    else:
        yield str(x)
Enter fullscreen mode Exit fullscreen mode

Let's test it.

my_dict = {
    "key1": 1,
    "key2": [1, 2, {"foo": 42}],
    "bar": {
        "a": 1234,
        "b": 5678
    }
}

print_something(my_dict)
Enter fullscreen mode Exit fullscreen mode

key1:
  1

key2:
  1
  2

  foo:
    42

bar:

  a:
    1234

  b:
    5678
Enter fullscreen mode Exit fullscreen mode

Conclusion

The combination of recursion and the yield statement is pretty powerful. I'm definitely going to keep it in the back of my head when I'm programming.

I hope you've also found this post to be helpful (or at least a little interesting).

— jsimonrichard

Repl IT Post

Top comments (0)