As instructed by the first lesson of the book Understanding Algorithms, we must create a binary search that uses the "divide to conquer" method to find a number inside an array in the most performant way possible.
As instructed by the first lesson of the book Understanding Algorithms, we must create a binary search by explaining the context: what is a binary search?
There is a situation that can help you find it: suppose you have a book with 5,000 pages and you want to find a specific page in it. Let's imagine you're looking for page 487.ria that uses the "divide to conquer" method to find a number inside an array in the most performant way possible.
To find page 487 you can use some methods, one more efficient than the other:
Simple Search
The simple search method goes through the book page by page until you find page 487. taking into account that you take 1 second per page, it would take you 487 seconds to find the page.
Binary Search
The binary search method is a way to find the page in the most optimized way possible. with this method you would divide the book of 5000 pages in half, that is, you would open the book at page 2500 and ask yourself the following question: "the page I want is before or after the middle of the book?"
If your answer is "after", you will start searching for the specific page from page 2500 of the book,
If the answer is "before", you start searching for the specific page from 0 to 2499.
See in practice:
When we compare the two execution times of the simple and binary searches, we will notice that with small numbers the simple search can have faster results, but to what extent?
To find out, let's run some tests.
Comparatives
Now, let's use a book of 50,000 pages and the chosen page will be a random one.
Binary Search | Basic Search |
---|---|
Random page in a 500,000 page book
*here we already start to notice a difference
Binary Search | Basic Search |
---|---|
Random page in a 5,000,000 page book
*big time difference
Binary Search | Basic Search |
---|---|
Final thoughts
Binary search is one of the strategies you can adopt when you need to look for something specific in a huge collection of other objects.
Even in everyday situations the use of binary search makes a lot of sense.
But the following observation is also important: if you are a beginner and you are having your first contact with BigO optimizations and performative operations, it is important that you know that binary search is NOT a silver bullet
It takes a while since it makes sense to use binary search instead of a basic search, and most likely your backend projects won't need it, as many frameworks you use already do.
Beware of overengineering.
Thumbnail by Takashi Miyazakion Unsplash.
Top comments (3)
For large lists the difference really is massive. I implemented the linear scan from your post, the binary search from your post, and the branchless binary search from mhdm.dev/posts/sb_lower_bound/ and when searching a list of 500 million for page 490 ish million, the linear search takes 0.85 seconds while the binary searches take 4 and 1 micro(!) seconds respectively.
great! thank you for taking a time to test the code and read the article
Cheers! Forgot to mention btw that I reimplemented in C but I think that's only relevant for the branchless version.