DEV Community

Michael V
Michael V

Posted on

How I Befriended Segment Trees

Last week, I talked about one advanced data structure that is useful for handling frequent range queries – Fenwick Tree. However, it's not very versatile as it is mostly used for sum queries, and I personally never used it for any other kind of problem. However, there is another player on the field, which has many more hats. Ladies and gentlemen, meet the Segment Tree, one of the most useful, yet often overlooked data structure.

Why I decided to learn it

So, what was my motivation behind learning Segment Trees? The answer is quite simple, actually. I got tired of solving problems I couldn't crack. You can think of it as trying to build a piece of furniture without the right tools. In this example, the furniture is all the Segment Tree problems I would be trying to solve using other methods, i.e., using the wrong tools.

Every time I'd finish a contest and see someone post a solution saying that it was a typical Segment Tree problem, I'd get a little discouraged, since I still didn't know anything about this data structure. I'd always put off learning it, since it seemed so overly complex. Seriously, I'd just google "Segment Tree," and the first link I'd get would usually be from cp-algorithms. The article on that website looks even more complex than my bachelor's diploma, if I'm being honest with you. Still, I was sure that there had to be another, more beginner-friendly, way of learning this data structure.

How I learned the basics

If you read my previous blog post, you're already familiar with the Competitive Programmer's Handbook and the website where you can practice most of the algorithms and data structures described in this book. These two are my favorite resources when it comes to learning something new in the competitive programming realm. Of course, there are many other useful resources as well, and today I'm going to share with you a step-by-step process I took to learn Segment Trees pretty well.

First of all, I went to the "Segment Tree" section in the handbook I just shared with you. It has very clear explanation and a nice example that illustrates its basic functions sum and add. In my opinion, it's crucial to understand the idea behind each function as well as what is going on in the examples. I admit that some things in the code might not be clear on the first go, but it's okay.

Anyway, after finishing reading the section, I came up with my own Segment Tree implementation in Python:

from math import ceil, log2


class SegmentTree:
    def __init__(self, A: list[int]) -> None:
        self.n = 2 ** ceil(log2(len(A)))
        self.tree = [0] * (self.n * 2)

        for k, x in enumerate(A):
            self.add(k, x)

    def add(self, k: int, x: int) -> None:
        k += self.n

        self.tree[k] += x
        k //= 2

        while k >= 1:
            self.tree[k] = self.tree[k * 2] + self.tree[k * 2 + 1]
            k //= 2

    def sum(self, a: int, b: int) -> int:
        a += self.n
        b += self.n

        s = 0

        while a <= b:
            if a % 2 == 1:
                s += self.tree[a]
                a += 1

            if b % 2 == 0:
                s += self.tree[b]
                b -= 1

            a //= 2
            b //= 2

        return s
Enter fullscreen mode Exit fullscreen mode

As you can see, the only thing I had to write by myself was the __init__ function, which turned out to be pretty simple anyway. Note that ceil(log2(len(A))) is simply a way of getting the first number that is a power of 2, which is greater than or equal to len(A). Here, we also assume that len(A) != 0. Everything else in my code is implemented as per the explanations in the book.

After covering the basics, I went a little further and read the other sections, which included topics on other queries, finding the index of the minimum value, and range updates. I made sure to implement all of them and ran the examples from the book in my code.

After doing some playing around, I quickly noticed something else. The function add can actually be changed to set like so:

class SegmentTree:
    # ...

    def set(self, k: int, x: int) -> None:
        k += self.n

        self.tree[k] = x  # changed from "self.tree[k] += x"
        k //= 2

        while k >= 1:
            self.tree[k] = min(self.tree[k * 2], self.tree[k * 2 + 1])
            k //= 2
Enter fullscreen mode Exit fullscreen mode

This simple change came in handy when I implemented the SegmentTree that supports minimum queries:

from math import inf, ceil, log2


class SegmentTree:
    def __init__(self, A: list[int]) -> None:
        self.n = 2 ** ceil(log2(len(A)))
        self.tree = [inf] * (self.n * 2)

        for k, x in enumerate(A):
            self.set(k, x)

    def set(self, k: int, x: int) -> None:
        k += self.n

        self.tree[k] = x
        k //= 2

        while k >= 1:
            self.tree[k] = min(self.tree[k * 2], self.tree[k * 2 + 1])
            k //= 2

    def min(self, a: int, b: int) -> int:
        a += self.n
        b += self.n

        res = inf

        while a <= b:
            if a % 2 == 1:
                res = min(res, self.tree[a])
                a += 1

            if b % 2 == 0:
                res = min(res, self.tree[b])
                b -= 1

            a //= 2
            b //= 2

        return res
Enter fullscreen mode Exit fullscreen mode

Logically, when we are dealing with min/max queries, we don't want to increase/decrease certain elements, but rather change them altogether, which is why the set function seems more reasonable in this implementation.

Anyway, the point of everything I've been telling you so far is that once you start understanding the fundamentals, you begin to observe some other interesting patterns by yourself. Websites like cp-algorithms, being very compehensive and all, flood you with all these extra patterns and observations right from the get-go, which not only overwhelms you but also steals you of the opportunity to notice something by yourself! Of course those sites may still be very useful, especially if you use them as a reference when you already know the algorithms described there.

How I practiced what I learned

Once I dealt with the theory and came up with my own implementations, which, by the way, I've uploaded to this repository, I decided to find some good problems to practice. Not surprisingly, my first choice was CSES. It has a great collection of problems dedicated specifically to "Range Queries." If you're also on the path of mastering Segment Trees, I suggest that you solve the first 5 problems from that list:

All of them are simple, and you'll definitely feel a boost of confidence after clearing them. To practice more interesting techniques, I suggest you refer to the following problems:

  • Range Update Queries (the name says it all: you'll practice quick range updates)
  • Hotel Queries (definitely the hardest problem on this list, but it'll give you huge satisfaction if you solve it)

You can find my solutions in this repository. Incidentally, I plan to add solutions to many other problems there as well.

With CSES being a super useful site and all, sometimes it's simply not enough. Besides, the community there is pretty non-existent, which makes it harder for you to "get unstuck" if you don't know how to solve a certain problem. For that reason, I almost always try to find some similar problems to practice on LeetCode too. Luckily, there was a very nice problem on Segment Trees in one recent contest as well: Peaks in Array. I'd say that it's even harder than "Hotel Queries" from the list above. However, it definitely leaves you with a better understanding of how to spot a Segment Tree pattern in a competitive programming problem. Here's my solution, by the way. As you can see, I followed the same implementation I came up with after reading the handbook, only adding one extra method and slightly modifying the other functions. I guess once you start solving problems like this one, the process gets more creative, and I really love it.

What about more advanced topics?

At this point, some of you might be thinking, "But there's got to be something else to Segment Trees. They can't be so simple! After all, why does that cp-algorithms website have such a long discussion on them?" And you would be perfectly right! Of course, there are some advanced techniques, like "Lazy Propagation," and others, whose names I wouldn't even risk mentioning. Still, it's better not to rush trying to master everything at once. After all, those advanced techniques are probably what's meant by 20% of the results that come from 80% of efforts, according to the Pareto principle.

So, instead of worrying that you still don't know so much, I believe it's better to focus on what you already do know and practice spotting the patterns for various types of queries, just like we did in the problems "Hotel Queries" and "Peaks in Array." And once you're comfortable doing that, you'll definitely benefit more from learning the advanced techniques.

Perhaps I, too, will tackle those advanced topics one day and share my experiences with you. Still, the programming world is huge, and DSA makes up only a small part of it. I know I'll be learning and improving in many other areas as well. But if I find the motivation to learn those truly advanced concepts, I'll definitely give them a try. And even if I fail to understand all of them, one thing will remain certain: my past self would still be proud of me.

Top comments (0)