loading...

I Guess I Kinda Get it...: The Prime Generator Story

kaelscion profile image kaelscion ใƒป4 min read

Today, we're going to talk about Primes. If you play Horizon: Zero Dawn and think we're talking about the Alpha Prime, Elizabet Sobek, we're not. Nor are we referring to any leader, past or present, of the Autobots. Instead, we're talking about prime numbers.

More specifically, we're going to talk about the why behind an exercise that many of us did in CS 101 or in whiteboard interviews or, at the very least, have had to explain on occasion to an interviewer: prime number generators.

Now, a prime number generator is known mostly for its use in cryptography systems, mathematical applications, and even data science. But, for most of us, its just an excercise that we do when covering the Num Theory part of university CS degrees.

So why do I decide to talk about them? Well, its because if you follow my content, you know that I like to dig into certain topics and to understand the why, not just the what. And, despite my best efforts, prime numbers have come up in my programming more often than I would like to admit. Plus, as engineers, developers and coders, it is our job to understand as much as we can about the machines we work with every day, and our duty to try and understand, as best we can, what our colleagues do every day. I, personally, believe that even little tidbits like the topic of today's blog post will help us to be better developers, even if we never write them in production ourselves. I also believe that it will help us to empathize with team members, other teams, or even friends on teams at other companies, which will make the world of software engineering better for everyone!

So, getting into it, today we're just going to cover a small, yet meaningful, tidbit on how we write efficient prime generators. I'll be using Python, because I love it and its my go-to language, but you can follow along in whatever language you like.

As we know, a Prime Number is a number that is only divisible by itself and 1. A function that generates all prime numbers between 1 and 1000 would, most often, look like this:

def prime_gen()
    for num in range(1, 1000):
        if num > 1:
            for i in range(2, num):
                if (num % i) == 0:
                    break
                else:
                    yield(num)

On my Macbook pro, the run time to print all elements in this generator is 0.564s. But what if I told you we could decrease that bigtime with only a few keystrokes, and not change the structure of the generator at all?

How? Well, you see this block of code here?

for i in range(2, num):
        if (num % i) == 0:
            break

This block tests to see if any of the numbers between 2 (because 1 is not a prime number and is skipped) and the number we are testing from prime-ness (num) are factors of num. If we find some that are, then num is not a prime number (because Primes are only divisible by themselves and 1), and therefore we can stop testing for prime-ness.

But this block actually tests twice as many numbers as it needs to, slowing the testing process down. Why is that?

To put it plainly, if you haven't found a factor of a number (aka, a number that divides evenly into it) by the time you hit that number's square root, you ain't gonna find one at all.

This is because a number's square root is a smaller number that equals the original number when multiplied by itself. This then means, that half of the factors of a number will be below the square root, and half will be above it. In turn, this means that if we're testing a number for factors and haven't found any that are less than its square root, we won't find any that are above it's square root and can safely conclude that the number is prime!

Now, armed with such knowledge, we can say with a straight face, why we only test for a prime's factors up to the square root of the number in question. It also means, that we can add the following code to our original generator:

for i in  range(2, ceil(sqrt(num)  +  1)):
    if (num % i) ==  0:
        break
    else:
        yield(num)

As a side note, in Python, the ceil() and sqrt() functions are part of the math class, so be sure to import them before trying to use them. Your completed generator, with the new code, should look like this:

from math import ceil, sqrt

def prime_gen():
    for num in  range(2, 1000):
        if num >  1:
            for i in range(2, ceil(sqrt(num)  +  1)):
                if (num % i) ==  0:
                    break
                else:
                    yield(num)

And, on the same equipment as our first test, the run time to print the elements for our revised generator is 0.023s. That is only 4% of the run time of our first generator!

So, I hope this has helped some of those out there to get a better grip on a CS principle that most of us (myself included) had an "I kinda get it..." opinion of at first.

As always, please keep coding! No matter what you do in this industry. No matter where you come from, what your social status is, what your parents did, what neighborhood/country/district you're from, or anything else that people try to convince you means you're not cut out for this gig, code on. The world needs those with every possible approach to any problem facing it. Which means they need the diversity that you can provide. Otherwise, we'll always solve the same problems the same way and get the results we've always gotten! Happy coding everybody!!!

Discussion

pic
Editor guide