DEV Community

Cover image for Serialize and Deserialize Binary Tree
Egregors
Egregors

Posted on • Edited on

Serialize and Deserialize Binary Tree

What?

A binary tree is a tree data structure in which each node has at most two children. It could be (and it is!) pretty useful in many ways.

In this article, we gonna try to serialize and deserialize binary trees using Golang.

Why?

The original purpose is quite prosaic: a bunch of binary trees problems on LeetCode requires some testing. However, the definition of trees “by hand” is so annoying. So, to use the original LeetCode input format, we gonna make our own serialize \ deserialize package.

How?

Let's start with a TreeNode definition. It's going to be a node with Int value, but you may use any type you need here.

type TreeNode struct {
    Val   int
    Left  *TreeNode
    Right *TreeNode
} 
Enter fullscreen mode Exit fullscreen mode

It's basically all we need to start building trees. We'll take a simple tree, just for example:

Image 1-5 tree

To make it in a code, we should create a root node and define all internal nodes recursively:

root := &TreeNode{
    Val: 1,
    Left: &TreeNode{
        Val:   2,
        Left:  nil,
        Right: nil,
    },
    Right: &TreeNode{
        Val: 3,
        Left: &TreeNode{
            Val:   4,
            Left:  nil,
            Right: nil,
        },
        Right: &TreeNode{
            Val:   5,
            Left:  nil,
            Right: nil,
        },
    },
}
Enter fullscreen mode Exit fullscreen mode

Looks a bit messy, and it's super annoying to annotate trees like this one by your hands. It could be one of the many reasons you want to implement serialize \ deserialize for your data.

Serialize

First, we want to find a way to make a useful representation of trees. Serialization generally is a process of encoding some data into byte progression. When we got how to do this, we'll can store trees like a string-data, for example for tests or for transporting trees between programs through the Web.

Level Order Binary Tree Traversal

As far as we want to support LeetCode binary trees format, we should serialize it “line by line”.

Image lobtt

We can do this by using Level Order Binary Tree Traversal which is one of the ways to use Breadth first search algorithm.

The main idea is to go through the tree “line by line” and accumulate the result of each line traversal.

We could do this recursively with a simple queue approach. So, first, let's define a queue.

type NodeQueue []*TreeNode

func (q *NodeQueue) IsEmpty() bool           { return len(*q) == 0 }
func (q *NodeQueue) Push(nodes ...*TreeNode) { *q = append(*q, nodes...) }
func (q *NodeQueue) Pop() *TreeNode {
    if !q.IsEmpty() {
        n := (*q)[0]
        *q = (*q)[1:]
        return n
    }
    return nil
}
Enter fullscreen mode Exit fullscreen mode

We want to collect our nodes level by level. That means we should be able:

  • Push a node in the queue (to make it a bit more convenient, let's allow to push a few nodes at a time);
  • Pop nodes from the queue;
  • know if the queue IsEmpty.

Next, we're going to write a recursive bfs function to collect values of all nodes “line by line”.

const EmptyNodeMark = "null"

func bfs(q *NodeQueue) []string {
    if q.IsEmpty() {
        return nil
    }

    var (
        nextQ        NodeQueue
        level        []string
    )

    for !q.IsEmpty() {
        n := q.Pop()
        if n != nil {
            level = append(level, strconv.Itoa(n.Val))
            nextQ.Push(n.Left, n.Right)
        } else {
            level = append(level, EmptyNodeMark)
        }
    }

    return append(level, bfs(&nextQ)...)
}
Enter fullscreen mode Exit fullscreen mode

As usual, when we are dealing with a recursive function, we start with a terminator. In this case, we would like to interrupt calls if our queue is empty.

Next, we have two main goals for each call:

  • collect values of all nodes in the current queue (current level);
  • build a next queue for the next iteration (next level).

The initial call of bfs will be with a queue that contains just root of the tree. In our example, it's node 1.

So, we just pop a node from the current queue and if it is not empty, we append its value into level slice.

Next, we push children of all nodes from the current queue into nextQ slice. For node 1, it will be nodes 2 and 3.

When the current queue q becomes empty, we can take the next step and repeat the whole process for the nextQ.

Thus, our current queue is [2, 3] and our result accumulator is "1". After the next iteration, the next queue going to be [nil, nil, 4, 5] and the resulting accumulator – ["1", "2", "3"].

And so on and so forth…

Image tree-traversal

But there is a problem, independently of tree shape the last level will always contain only "null" values. In our example, it's [ nil, nil, nil, nil ].

We shouldn't include any unnecessary redundant data in the final result, especially when we are talking about serialization. Let's little rework our bfs.

We want to detect “the last level” and interrupt calls at this place, without including useless parts of data.

const EmptyNodeMark = "null"

func bfs(q *NodeQueue) []string {
    var (
        nextQ        NodeQueue
        level        []string
        isEmptyLevel = true
    )

    for !q.IsEmpty() {
        n := q.Pop()
        if n != nil {
            level = append(level, strconv.Itoa(n.Val))
            nextQ.Push(n.Left, n.Right)
            isEmptyLevel = false
        } else {
            level = append(level, EmptyNodeMark)
        }
    }

    if isEmptyLevel {
        return nil
    }

    return append(level, bfs(&nextQ)...)
}
Enter fullscreen mode Exit fullscreen mode

Ok, looks like this is generally all we need to serialize the tree. Now we can implement Stringer interface for TreeNode.

const (
    EmptyNodeMark = "null"
    Separator     = ","
)

func (t TreeNode) String() string { return t.serialize() }

func (t TreeNode) serialize() string {
    data := bfs(&NodeQueue{&t})

    // remove redundant nulls
    i := len(data) - 1
    for data[i] == EmptyNodeMark {
        i--
    }

    return fmt.Sprintf("[%s]", strings.Join(data[:i+1], Separator))
}
Enter fullscreen mode Exit fullscreen mode

Here in serialize() method, we just concatenate values of all nodes with comma separator.

Deserialize

Since we are done with serialization, we need to do the opposite. Deserializing gonna be a bit more tricky, but the main idea is still the same.

First, we need to prepare our serialized data to build an actual tree. Serialized representation of the tree from the example is "[1,2,3,null,null,4,5]". Let's unwrap /s'up Rust fellows! ;) / brackets and build a string slice.

nodes := strings.Split(
    strings.TrimSuffix(strings.TrimPrefix(data, "["), "]"),
    Separator,
)
Enter fullscreen mode Exit fullscreen mode

For now, we got a string slice as an input, and we need to build a whole tree from it.

As a first step, we create root node by hand, using the head of data slice like a root value. Next, we gonna move through data slice tail and build the tree recursively. Here is still the same BFS algorithm with a queue.

Image tree-build

const EmptyNodeMark = "null"

func bfsBuild(q *NodeQueue, data []string) error {
    if len(data) == 0 || q == nil {
        return nil
    }

    var nextQ NodeQueue

    for !q.IsEmpty() {
        n := q.Pop()
        if n != nil {
            // if the data tail of current level contains only nulls, they could be reduced.
            // that means, if the data becomes empty earlier than level does, there is no more nodes
            if len(data) == 0 {
                return nil
            }
            l, r := data[0], data[1]
            data = data[2:]

            if l != EmptyNodeMark {
                if lVal, lErr := strconv.Atoi(l); lErr == nil {
                    n.Left = &TreeNode{Val: lVal}
                } else {
                    return lErr
                }
            }

            if r != EmptyNodeMark {
                if rVal, rErr := strconv.Atoi(r); rErr == nil {
                    n.Right = &TreeNode{Val: rVal}
                } else {
                    return rErr
                }
            }

            nextQ.Push(n.Left, n.Right)
        }
    }

    return bfsBuild(&nextQ, data)
}
Enter fullscreen mode Exit fullscreen mode

When we fetched the first value from the data slice, the length of this slice must be always even. Because of in a binary tree and all non-nil nodes guaranteed to have two children (including nil-nodes). So each iteration we are creating children for all non-nil nodes in the queue, and extracting for this purpose len(q) * 2 elements from the data slice. Until the data become empty.

That is it! Now we can merge data parsing and tree building functions in TreeNode constructor.

func NewTreeNode(data string) (*TreeNode, error) {
    if len(data) < 2 {
        return nil, errors.New("invalid input")
    }
    nodes := strings.Split(
        strings.TrimSuffix(strings.TrimPrefix(data, "["), "]"),
        Separator,
    )
    rootVal, err := strconv.Atoi(nodes[0])
    if err != nil {
        return nil, err
    }

    root := &TreeNode{Val: rootVal}
    err = bfsBuild(&NodeQueue{root}, nodes[1:])
    if err != nil {
        return nil, err
    }
    return root, nil
}
Enter fullscreen mode Exit fullscreen mode

TreeNode package

Needless to say, you could find all this code in the repo: https://github.com/egregors/TreeNode

And moreover, you can use this package as a dependency to deal with LeetCode binary tree problems, or whatever you want.

Top comments (1)

Collapse
 
getjump profile image
getjump

Amazing post as always! Keep up the good work!