Do you cringe when you hear about logarithmic and exponential algorithms? The good news is, you don't need to be a calculus geek to understand them.
Here is a little test for you:
If an algorithm is said to have a runtime of O(log n) and it takes a second to get the result from 10 data points on your computer, approximately how long will it take to get a result from 1,000,000 data points on the same machine?
This time, another algorithm is said to have a runtime of O(2^n) (exponential time), and it takes a second to calculate a result from 10 data points. Approximately how long will it take to do the same with 20 data points?
The two questions are closely related. The first one deals with log2(n) (in Big-O notation, log n is usually shorthand for log2(n)) and the other deals with 2^n. They are two sides of the same coin.
2^n = P is the same statement as log2(P) = n; it's just the same relationship viewed from two angles. The best way to understand the difference is to work through the answers to the two questions above.
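If it helps to see that inverse relationship in running code, here is a tiny Python check (the numbers are just an example, not from the original post):

```python
import math

n = 10
P = 2 ** n              # 2^10 = 1024
print(P)                # 1024
print(math.log2(P))     # 10.0 -- log2 undoes the exponentiation
```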
Question 1
If an algorithm is said to have a runtime of O(log n) and it takes a second to get the result from 10 data points, how long does it take to get a result from 1,000,000 data points?
log2(10) = 3.3
(because 2^3.3 ≈ 10)
log2(1,000,000) = 19.9
(because 2^19.9 ≈ 1,000,000)
scale_factor = 19.9 / 3.3 = 6.03
If the code takes 1 second to run on 10 data points,
then it will take 6.03 seconds on 1,000,000 data points.
What about 1,000,000,000,000 data points?
log2(1,000,000,000,000) = 39.9
scale_factor = 39.9 / 3.3 = 12.1
It will take about 12 seconds to process a trillion data points.
Twelve seconds is only 12 times longer than what the code takes to process 10 data points, yet it is now processing a trillion! The amount of CPU time grows at a very sluggish rate compared to the amount of data. This is what makes logarithmic algorithms so scalable.
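As a quick sanity check, here is a small Python sketch of the scale-factor arithmetic above (the function name and the idea of extrapolating from a measured baseline are my own illustration, not the original author's):

```python
import math

def estimate_log_runtime(baseline_seconds, baseline_n, new_n):
    """Estimate the runtime of an O(log n) algorithm by scaling a
    measured baseline by log2(new_n) / log2(baseline_n)."""
    scale_factor = math.log2(new_n) / math.log2(baseline_n)
    return baseline_seconds * scale_factor

print(estimate_log_runtime(1, 10, 1_000_000))           # ~6.0 seconds
print(estimate_log_runtime(1, 10, 1_000_000_000_000))   # ~12.0 seconds
```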
Question 2
If another algorithm is said to have a runtime of O(2^n) and it takes a second to calculate a result from 10 data points, how long does it take to do the same with 20 data points?
Pay close attention. This is an "inverse" of the previous log version.
2^10 = 1,024
2^20 = 1,048,576
scale_factor = 1,048,576 / 1,024 = 1,024
If the code takes 1 second to process 10 data points,
then it will take 1,024 seconds to process 20 data points.
What about 30 data points?
2^30 = 1,073,741,824
scale_factor = 1,073,741,824 / 1,024 = 1,048,576
It will take 1,048,576 seconds (about 12 days) to process 30 data points.
That is a huge jump considering the number of data points only increased 3x, from 10 to 30! That is because we are now on the exponential side of the relationship. Exponential runtime is to be avoided at all costs.
Fun stuff: Try calculating how long this will take for 100 data points.
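The same extrapolation can be written for the exponential case. The sketch below (again my own illustration, under the same 1-second-for-10-points assumption) reproduces the numbers above and can be reused for the exercise:

```python
def estimate_exp_runtime(baseline_seconds, baseline_n, new_n):
    """Estimate the runtime of an O(2^n) algorithm by scaling a
    measured baseline by 2^new_n / 2^baseline_n."""
    scale_factor = 2 ** new_n / 2 ** baseline_n
    return baseline_seconds * scale_factor

print(estimate_exp_runtime(1, 10, 20))    # 1024.0 seconds
print(estimate_exp_runtime(1, 10, 30))    # 1048576.0 seconds
# For the exercise above, try: estimate_exp_runtime(1, 10, 100)
```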
The takeaway
Even a supercomputer running an inefficient algorithm cannot beat an efficient, scalable algorithm running on an average laptop.
Top comments (6)
Always remember that the theoretical efficiency of your algorithm may not be the actual runtime efficiency, because your data may not match what you expected when you wrote the algorithm, or memory/CPU/compiler limitations may affect the actual efficiency.
Many people have discovered (the hard way) that an algorithm that looks efficient on paper turns out to be dog slow when actually implemented, due to unexpected problems in the running solution.
E.g. a hashtable is only better than a sequential search when the data hashes reasonably uniquely (based on your hashing algorithm); if the data almost always results in the same hash, there's a good chance a sequential search will be used to find the item within the bucket.
Beware of over-engineering without reasonable testing!
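To make the hashtable point concrete, here is a minimal Python sketch (my own illustration, not from the commenter) of a key type whose hash always collides, so dictionary lookups fall back to comparing entries one by one:

```python
class BadKey:
    """A key whose hash is always the same, so every entry collides."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 1  # pathological: all keys produce the same hash

    def __eq__(self, other):
        return isinstance(other, BadKey) and self.value == other.value

# Lookups in this dict degrade toward O(n): Python must compare the probe
# key against the colliding entries sequentially instead of jumping
# straight to the right slot.
table = {BadKey(i): i for i in range(1_000)}
print(table[BadKey(999)])  # works, but much slower than a well-hashed key
```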
Isn't the runtime of the logarithmic algorithm on 1 trillion data points equal to log2(1,000,000,000,000) / 3.3 due to the scaling factor, resulting in a runtime of about 12 seconds?
The O(2^n) exponential algorithm running on 100 data points would take 1.23794 × 10^27 seconds to run. That is 3.92287 × 10^19 years! The universe is "only" 13.772 × 10^9 years old!
Sweet! Thanks for adding this.
wow!
1.238 Octillion xD