What?
A binary tree is a tree data structure in which each node has at most two children. It could be (and it is!) pretty useful in many ways.
In this article, we gonna try to serialize and deserialize binary trees using Golang.
Why?
The original purpose is quite prosaic: a bunch of binary trees problems on LeetCode requires some testing. However, the definition of trees “by hand” is so annoying. So, to use the original LeetCode input format, we gonna make our own serialize \ deserialize package.
How?
Let's start with a TreeNode
definition. It's going to be a node with Int
value, but you may use any type you need here.
type TreeNode struct {
Val int
Left *TreeNode
Right *TreeNode
}
It's basically all we need to start building trees. We'll take a simple tree, just for example:
To make it in a code, we should create a root
node and define all internal nodes recursively:
root := &TreeNode{
Val: 1,
Left: &TreeNode{
Val: 2,
Left: nil,
Right: nil,
},
Right: &TreeNode{
Val: 3,
Left: &TreeNode{
Val: 4,
Left: nil,
Right: nil,
},
Right: &TreeNode{
Val: 5,
Left: nil,
Right: nil,
},
},
}
Looks a bit messy, and it's super annoying to annotate trees like this one by your hands. It could be one of the many reasons you want to implement serialize \ deserialize for your data.
Serialize
First, we want to find a way to make a useful representation of trees. Serialization generally is a process of encoding some data into byte progression. When we got how to do this, we'll can store trees like a string-data, for example for tests or for transporting trees between programs through the Web.
Level Order Binary Tree Traversal
As far as we want to support LeetCode binary trees format, we should serialize it “line by line”.
We can do this by using Level Order Binary Tree Traversal which is one of the ways to use Breadth first search algorithm.
The main idea is to go through the tree “line by line” and accumulate the result of each line traversal.
We could do this recursively with a simple queue approach. So, first, let's define a queue.
type NodeQueue []*TreeNode
func (q *NodeQueue) IsEmpty() bool { return len(*q) == 0 }
func (q *NodeQueue) Push(nodes ...*TreeNode) { *q = append(*q, nodes...) }
func (q *NodeQueue) Pop() *TreeNode {
if !q.IsEmpty() {
n := (*q)[0]
*q = (*q)[1:]
return n
}
return nil
}
We want to collect our nodes level by level. That means we should be able:
-
Push
a node in the queue (to make it a bit more convenient, let's allow to push a few nodes at a time); -
Pop
nodes from the queue; - know if the queue
IsEmpty
.
Next, we're going to write a recursive bfs
function to collect values of all nodes “line by line”.
const EmptyNodeMark = "null"
func bfs(q *NodeQueue) []string {
if q.IsEmpty() {
return nil
}
var (
nextQ NodeQueue
level []string
)
for !q.IsEmpty() {
n := q.Pop()
if n != nil {
level = append(level, strconv.Itoa(n.Val))
nextQ.Push(n.Left, n.Right)
} else {
level = append(level, EmptyNodeMark)
}
}
return append(level, bfs(&nextQ)...)
}
As usual, when we are dealing with a recursive function, we start with a terminator. In this case, we would like to interrupt calls if our queue is empty.
Next, we have two main goals for each call:
- collect values of all nodes in the current queue (current level);
- build a next queue for the next iteration (next level).
The initial call of bfs
will be with a queue that contains just root
of the tree. In our example, it's node 1.
So, we just pop a node from the current queue and if it is not empty, we append its value into level
slice.
Next, we push children of all nodes from the current queue into nextQ
slice. For node 1, it will be nodes 2 and 3.
When the current queue q
becomes empty, we can take the next step and repeat the whole process for the nextQ
.
Thus, our current queue is [2, 3]
and our result accumulator is "1"
. After the next iteration, the next queue going to be [nil, nil, 4, 5]
and the resulting accumulator – ["1", "2", "3"]
.
And so on and so forth…
But there is a problem, independently of tree shape the last level will always contain only "null"
values. In our example, it's [ nil, nil, nil, nil ]
.
We shouldn't include any unnecessary redundant data in the final result, especially when we are talking about serialization. Let's little rework our bfs
.
We want to detect “the last level” and interrupt calls at this place, without including useless parts of data.
const EmptyNodeMark = "null"
func bfs(q *NodeQueue) []string {
var (
nextQ NodeQueue
level []string
isEmptyLevel = true
)
for !q.IsEmpty() {
n := q.Pop()
if n != nil {
level = append(level, strconv.Itoa(n.Val))
nextQ.Push(n.Left, n.Right)
isEmptyLevel = false
} else {
level = append(level, EmptyNodeMark)
}
}
if isEmptyLevel {
return nil
}
return append(level, bfs(&nextQ)...)
}
Ok, looks like this is generally all we need to serialize the tree. Now we can implement Stringer
interface for TreeNode
.
const (
EmptyNodeMark = "null"
Separator = ","
)
func (t TreeNode) String() string { return t.serialize() }
func (t TreeNode) serialize() string {
data := bfs(&NodeQueue{&t})
// remove redundant nulls
i := len(data) - 1
for data[i] == EmptyNodeMark {
i--
}
return fmt.Sprintf("[%s]", strings.Join(data[:i+1], Separator))
}
Here in serialize()
method, we just concatenate values of all nodes with comma separator.
Deserialize
Since we are done with serialization, we need to do the opposite. Deserializing gonna be a bit more tricky, but the main idea is still the same.
First, we need to prepare our serialized data to build an actual tree. Serialized representation of the tree from the example is "[1,2,3,null,null,4,5]"
. Let's unwrap /s'up Rust fellows! ;) / brackets and build a string slice.
nodes := strings.Split(
strings.TrimSuffix(strings.TrimPrefix(data, "["), "]"),
Separator,
)
For now, we got a string slice as an input, and we need to build a whole tree from it.
As a first step, we create root
node by hand, using the head of data
slice like a root value. Next, we gonna move through data
slice tail and build the tree recursively. Here is still the same BFS algorithm with a queue.
const EmptyNodeMark = "null"
func bfsBuild(q *NodeQueue, data []string) error {
if len(data) == 0 || q == nil {
return nil
}
var nextQ NodeQueue
for !q.IsEmpty() {
n := q.Pop()
if n != nil {
// if the data tail of current level contains only nulls, they could be reduced.
// that means, if the data becomes empty earlier than level does, there is no more nodes
if len(data) == 0 {
return nil
}
l, r := data[0], data[1]
data = data[2:]
if l != EmptyNodeMark {
if lVal, lErr := strconv.Atoi(l); lErr == nil {
n.Left = &TreeNode{Val: lVal}
} else {
return lErr
}
}
if r != EmptyNodeMark {
if rVal, rErr := strconv.Atoi(r); rErr == nil {
n.Right = &TreeNode{Val: rVal}
} else {
return rErr
}
}
nextQ.Push(n.Left, n.Right)
}
}
return bfsBuild(&nextQ, data)
}
When we fetched the first value from the data
slice, the length of this slice must be always even. Because of in a binary tree and all non-nil nodes guaranteed to have two children (including nil-nodes). So each iteration we are creating children for all non-nil nodes in the queue, and extracting for this purpose len(q) * 2
elements from the data
slice. Until the data
become empty.
That is it! Now we can merge data
parsing and tree building functions in TreeNode
constructor.
func NewTreeNode(data string) (*TreeNode, error) {
if len(data) < 2 {
return nil, errors.New("invalid input")
}
nodes := strings.Split(
strings.TrimSuffix(strings.TrimPrefix(data, "["), "]"),
Separator,
)
rootVal, err := strconv.Atoi(nodes[0])
if err != nil {
return nil, err
}
root := &TreeNode{Val: rootVal}
err = bfsBuild(&NodeQueue{root}, nodes[1:])
if err != nil {
return nil, err
}
return root, nil
}
TreeNode package
Needless to say, you could find all this code in the repo: https://github.com/egregors/TreeNode
And moreover, you can use this package as a dependency to deal with LeetCode binary tree problems, or whatever you want.
Top comments (1)
Amazing post as always! Keep up the good work!