It’s exam day and, as soon as you wake up, your mind starts racing through all manners of algorithms, until it occurs to you that FizzBuzz was something you never attempted. You quickly fill your moka pot with water, spoon some cheap espresso grind into the filter, tighten the kettle and put it on the hob at low heat.
You flop into the only chair in your dorm room and skillfully enter your PIN on the Windows box in front of you. It takes some time for the window manager to compete with the likes of your Claymore miner to allow you to interface with your machine. As soon as the cursor on the screen synchronizes with your jiggling of the mouse (yes, the mouse), you summon Visual Studio, and start a new C++ console project.
You hastily type in a program for the typical FizzBuzz algorithm, demonstrating, to nobody in particular, the touch-typing skills that you have been honing since you were a pup. You sit back, admiring your work for two seconds, and immediately feel like challenging yourself. Leaning forward and typing furiously, you insert a high-resolution timer and change the algorithm to count (instead of displaying) the number of fizzes, buzzes and fizzbuzzes for numbers 1 through 4 billion. You end up with the following program:
#include <iostream>
#include <chrono>
int main()
{
int fizz = 0, buzz = 0, fizzbuzz = 0;
auto startTime = std::chrono::high_resolution_clock::now();
for (unsigned int i = 1; i <= 4000000000; i++) {
if (i % 3 == 0 && i % 5 == 0) {
fizzbuzz++;
}
else if (i % 3 == 0) {
fizz++;
}
else if (i % 5 == 0) {
buzz++;
}
}
auto endTime = std::chrono::high_resolution_clock::now();
auto totalTime = endTime - startTime;
printf("\t fizz : %d, buzz: %d, fizzbuzz: %d, duration %lld milliseconds\n", fizz, buzz, fizzbuzz, (totalTime / std::chrono::milliseconds(1)));
return 0;
}
Then you turn on optimizations, compile it and run it. Before the run completes, you go over to the hob, pick up a semi-clean mug nearby and fill it with the hot black liquid from the moka pot.
You sit back down and, taking a sip from the mug, note that your algorithm took 6 seconds to complete.
Even for a machine as old as your’s (Intel i7-7700K), this feels like a long time to crunch through 4 billion numbers. So, you start optimizing your algorithm.
Being a computer science student, you note (with a mildly pompous sense of pride) that all the time is consumed by the code inside your for-loop. You recognize that for 5% (fizzbuzz) of the 4 billion numbers there are 2 modulus operations being performed, whilst for about 23% (fizz) of the iterations 3 modulus operations were performed and for the remaining iterations, 4 modulus operations were performed (not taking into account the optimizations performed by the compiler). After a flurry of keystrokes, you end up with the following modified for-loop:
for (unsigned int i = 1; i <= 4000000000; i++) {
if (i % 3 == 0) {
if (i % 5 == 0) {
fizzbuzz++;
}
else {
fizz++;
}
}
else if (i % 5 == 0) {
buzz++;
}
}
Compile. Run! Down to 5.55 seconds.
You go through the algorithm and find yourself bemused upon realizing that each cycle of the for-loop performed exactly 2 modulus operations. That’s when you realize that the overheads of managing the for-loop and the logical operations were considerable. You start work on further optimizing your for-loop. Minutes later, you end up with the following:
for (unsigned int i = 1; i <= 4000000000; i++) {
isFizz = false;
if (i % 3 == 0) {
isFizz = true;
fizz++;
}
if (i % 5 == 0) {
if (isFizz) {
fizz--;
fizzbuzz++;
}
else {
buzz++;
}
}
}
After running the new program, you get a result of under 5.2 seconds.
Once again, your program performs only 2 modulus operations in every iteration. You realize that in the previous optimization your program performed 2 logical operations in each iteration. However, in your new optimization, your program performs 3 logical operations through 13% of the cycles and 2 logical operations each throughout the rest of the cycles. You have now established that incursions also have a cost, and you start wondering if you could limit them.
You get cracking on transforming your entire linear program into a multithreaded one. So, relying on native Windows multithreading, you fire off three CreateThread() calls, one for fizz, the other for buzz, and the last for fizzbuzz counting. This way, each thread does only 2 modulus operations with a single logic operation that does not have an incursion. Sometime after your mug is drained, you end up with the following program:
When you compile and run the program, you are thrilled to yield a computation time of 4.6 seconds. Success!
You then look at your watch and realize there was ample time to take the long way through the campus to the exam halls. Smelling your breath, and establishing that it was just south of unbearable, you pick up your bag and head outside. As you lock the door, you chuckle to yourself, “those humans don’t know what’s coming.” Then you run your tentacles through your antennae and slither off to the exam halls.
Top comments (0)