Rusu Emanuel

Posted on Jun 10, 2022 • Edited on Jul 28, 2022

Sieve of Eratosthenes

#cpp #programming #computerscience #tutorial

Heeelloooo everyone!🔥😎

As I told you in the previous post, I am going to write about the Sieve of Eratosthenes. Commonly, this algorithm comes together with the primality of a number problem. It is used in contests, websites with algorithmic problems, competitive programming, Olympiads, and maybe at university admission competitions. Sieve of Eratosthenes finds all prime numbers between an interval. It is a time-efficient algorithm because instead of checking all numbers between a given interval, we can mark some numbers as not primes without even checking them. This way, we avoid unnecessary analyses.
On the other hand, it is not a space-efficient algorithm. We have to keep track of numbers marked as not primes and numbers marked as primes if we don't store this information (number, marked/unmarked) we can't implement the Sieve of Eratosthenes idea. Enough introduction. Let's dig in.

Problem to solve: Print all prime numbers between 2 and n.

Idea🤔

We have to analyze numbers from 2 to n. Firstly, we write them down:

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 ... n

Next, we will do something which is called marking. To illustrate this, I will use strikethrough digits. A strikethrough digit means a marked number, so not a prime number; a simple number means an unmarked number, so a prime number.

Then we begin from 2 till n, and for every number, we mark all its multiples till n inclusive.

We are at 2, so we mark 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26 and all numbers 2 * k, with k > 1 and stop when 2 * k > n.

2 3 ̷4 5 ̷6 7 ̷8 9 ̷1̷0 11 ̷1̷2 13 ̷1̷4 15 ̷1̷6 17 ̷1̷8 19 ̷2̷0 21 ̷2̷2 23 ̷2̷4 25 2̷6 ... n

We move at 3 so we mark 6, 9, 12, 15, 18, 21, 24 and all numbers 3 * k, with k > 1 and stop when 3 * k > n.

2 ̷3 ̷4 5 ̷6 7 ̷8 ̷9̷ ̷1̷0 11 ̷1̷2 13 ̷1̷4 ̷1̷5 ̷1̷6 17 ̷1̷8 19 ̷2̷0 ̷2̷1 ̷2̷2 23 ̷2̷4 25 2̷6 ... n

We continue this process by taking every number till n and marking all its multiples till n inclusive.

By the end of this algorithm, the numbers that remained unmarked are only prime numbers.

Let's take an example:

Let's find all prime numbers until 22(n).

List all numbers till n = 22:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

Start from 2 and mark all its multiples smaller or equal than 22.

2 3 ̷4 5 ̷6 7 ̷8 9 ̷1̷0 11 ̷1̷2 13 ̷1̷4 15 ̷1̷6 17 ̷1̷8 19 ̷2̷0 21 ̷2̷2

Move to 3 and mark all its multiples smaller or equal than 22.
2 3 ̷4 5 ̷6 7 ̷8 ̷9̷ ̷1̷0 11 ̷1̷2 13 ̷1̷4 ̷1̷5 ̷1̷6 17 ̷1̷8 19 ̷2̷0 ̷2̷1 ̷2̷2

If we want to move to 4, we can see that it's multiples smaller
than 22 are already marked, similarly with 5, 6, 7, ...,
22.

2 3 ̷4 5 ̷6 7 ̷8 ̷9̷ ̷1̷0 11 ̷1̷2 13 ̷1̷4 ̷1̷5 ̷1̷6 17 ̷1̷8 19 ̷2̷0 ̷2̷1 ̷2̷2

Write down unmarked numbers: 2 3 5 7 11 13 17 19

Done!
Congratulations!🥳 You found all primes numbers using the Sieve of Eratosthenes algorithm.

Maybe you want to see the code but let me explain it before.👨‍🏫

First, I will use a straightforward algorithm and then optimize it as we go deeper and deeper.

As I said in the beginning, we need to write down the numbers from 2 to n. We do that by using a boolean array like this:

#define MAX 1000000001
bool sieve[MAX];

Why do we use such a big number? Because we don't know how big the interval is, we give as much freedom as possible; note that this number may not work for your computer, and you will need to make it smaller.

Then I said that we needed to mark numbers somehow. In Sieve of Eratosthenes, we use a global array, and global variables are automatically initialized with 0. We use that to our advantage instead of a for to set all memory spaces manually with 0.

We are doing marking like this:

0 - prime
1 - not prime

Why? I know that using 1 for primes and 0 for not primes would have been more intuitive but declaring an array in global scope, it got initialized with 0, so it is just a waste to overwrite all memory spaces with 1.

Then we mark in advance 0 and 1 because we know these numbers are not prime.

sieve[0] = sieve[1] = 1;

Now we start to iterate from 2 till n.

for (unsigned long long i = 2; i <= n; ++i)

For every i, we mark all its multiples smaller or equal than n.

for (unsigned long long j = 2; j <= n / i; ++j) sieve[i * j] = 1;

We iterate till n / i because this division gave us the biggest number, which can be multiplied by j, which is smaller or equals to n, so iterating till this number, we can guarantee that we marked all multiples of i smaller than n. sieve[i * j] is the multiple of i.

The last thing we have to do is to iterate the interval and print all unmarked values. We do that by asking if sieve[i] == 0 in a for loop like this:

    for (unsigned long long i = 2; i <= n; ++i)
    {
        if (sieve[i] == 0) cout << i << ' ';
    }

The first version of our algorithm is done:


#include <iostream>
using namespace std;


#define MAX 1000000001
bool sieve[MAX];


void PrintSieveofEratosthenes(unsigned long long n)
{
    for (unsigned long long i = 2; i <= n; ++i)
    {
        if (sieve[i] == 0) cout << i << ' ';
    }
}


void SieveofEratosthenes(unsigned long long n)
{
    sieve[0] = sieve[1] = 1;

    for (unsigned long long i = 2; i <= n; ++i)
    {
        for (unsigned long long j = 2; j <= n / i; ++j) sieve[i * j] = 1;
    }

}


int main()
{
    unsigned long long n; cin >> n;
    SieveofEratosthenes(n);
    PrintSieveofEratosthenes(n);

    return 0;
}

I like to refactor this a bit. I consider that the function call for printing is a bit unnecessary because we can print numbers inside the first loop. We know that if sieve[i] == 0 we found a prime number. Here is another version that I like more:


#include <iostream>
using namespace std;


#define MAX 1000000001
bool sieve[MAX];


void SieveofEratosthenes(unsigned long long n)
{
    sieve[0] = sieve[1] = 1;

    for (unsigned long long i = 2; i <= n; ++i)
    {
        for (unsigned long long j = 2; j <= n / i; ++j) sieve[i * j] = 1;

        if (sieve[i] == 0) cout << i << ' ';
    }
}


int main()
{
    unsigned long long n; cin >> n;
    SieveofEratosthenes(n);

    return 0;
}

Now we get into optimizations. Sieve of Eratosthenes is more than two loops. If you read till now, congratulations!🥳

We know that every even number greater than 2 is not prime. We can mark in advance these numbers by beginning from 4 and iterating from 2 to 2.

for(unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;

Because we marked even numbers before, we don't need to analyse them again so we can start from i = 3 and increment it by 2. This way we take into consideration only odd numbers.

Then we can iterate only till √n using i * i <= n.


    sieve[0] = sieve[1] = 1;
    for (unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;

    for (unsigned long long i = 3; i * i <= n; i += 2)
    {
        for (unsigned long long j = 2; j <= n/i; ++j) sieve[i*j] = 1;
    }

We can optimize how we iterate through multiples of every number. We can begin from i * i because all numbers till i * i have been already marked; and increment j with i.


    sieve[0] = sieve[1] = 1;
    for (unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;

    for (unsigned long long i = 3; i * i <= n; i += 2)
    {
        for (unsigned long long j = i * i; j <= n; j+=i) sieve[j] = 1;
    }

We can iterate with j += 2 * i to ignore odd numbers.


    sieve[0] = sieve[1] = 1;
    for (unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;

    for (unsigned long long i = 3; i * i <= n; i += 2)
    {
        for (unsigned long long j = i * i; j <= n; j += 2 * i) sieve[j] = 1;
    }

Here is the final code:

#include <iostream>
using namespace std;

#define MAX 1000000001
bool sieve[MAX];


void PrintSieveofEratosthenes(unsigned long long n)
{
    for (int i = 2; i <= n; ++i)
    {
        if (sieve[i] == 0) cout << i << ' ';
    }
}


void SieveofEratosthenes(unsigned long long n)
{
    sieve[0] = sieve[1] = 1;

    for (unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;

    for (unsigned long long i = 3; i * i <= n; i += 2)
    {
        for (unsigned long long j = i * i; j <= n; j += 2* i) sieve[j] = 1;
    }
}


int main()
{
    unsigned long long n; cin >> n;
    SieveofEratosthenes(n);
    PrintSieveofEratosthenes(n);

    return 0;
}

This block of code gives time complexity:

for (unsigned long long i = 3; i * i <= n; i += 2)
{
    for (unsigned long long j = i * i; j <= n; j += 2 * i) sieve[j] = 1;
}

The first loop performs √n operations because we iterate till i * i <= n. The idea of the second loop is to mark all multiples of i. So, it will perform n / i steps. It is a known formula that if you want to find out how many divisors of a number till n are, you divide n / your_number. So for every i we will perform n / i steps. (technically, we perform fewer steps because I increased j with 2 * i to avoid iterating through even multiples of i, but ignore that. What I did is not enough to decrease the order of complexity anyway).

For every i till √n we perform n / i operations, where i is prime.

The ecuation is:
√n * (n/3 + n/4 + n/5 + n/6 + n/7 + ...)
= √n * n * (1/3 + 1/4 + 1/5 + 1/6 + 1/7 + ...)
The third factor, according to Euler grows the same as ln(ln(n))
So the time complexity is: O(√n * n * ln(ln(n))).

Observations❗:

Note that I used unsigned long long for variables, and operations like raising to power can easily exceed unsigned long long, so be careful with input numbers.
You can play around with #define MAX 1000000001 and #define MAX_PRIMES 100000001. Try to find which is the maximum limit accepted by your computer.
To make this algorithm space-efficient, you can use a vector container from the STL library like this vector <bool> vector_name. Before that, include vector library like #include <vector>. That container is not a regular one. It is a memory-optimization of template classes like vector <T>, which uses only n/8 bytes of memory. Many modern processors work faster with bytes because they can be accessed directly. vector <T> template class works directly with bits. But these things require knowledge about template classes, which are very powerful in C++.
This algorithm can be optimized even more using bits operations to achieve linear time. But ln(ln(n)) can be approximated to O(1).

And here is the bonus algorithm for testing primality I spoke about in the post before(probably the fastest for checking primality)🏎️:

#include <iostream>
using namespace std;


#define MAX 1000000001
#define MAX_PRIMES 100000001
bool sieve[MAX];
unsigned long long primeNumbers[MAX_PRIMES];


void SieveofEratosthenes(unsigned long long n)
{
    sieve[0] = sieve[1] = 1;

    for (unsigned long long i = 4; i <= n; i += 2) sieve[i] = 1;

    for (unsigned long long i = 3; i * i <= n; i += 2)
    {
        for (unsigned long long j = i * i; j <= n; j += 2 * i) sieve[j] = 1;
    }


    unsigned long long k = 0;
    for (int i = 2; i <= n; ++i)
    {
        if (sieve[i] == 0) primeNumbers[k++] = i;
    }
}


bool IsPrime(unsigned long long Number)
{
    unsigned long long k = 0;

    while (Number > 1)
    {
        unsigned long long power = 0;

        if (Number % primeNumbers[k] == 0)
        {
            Number /= primeNumbers[k];
            ++power;
        }

        if (power) return false;

        ++k;

        if (primeNumbers[k] * primeNumbers[k] > Number) return true;
    }
}


int main()
{
    unsigned long long number; cout << "Enter number: "; cin >> number; cout << '\n';
    SieveofEratosthenes(number);

    IsPrime(number) == 1 ? cout << "Prime\n" : cout << "Not prime\n";

    return 0;
}

Emanuel Rusu
You can visit my GitHub
Or you can find me on [Linkedin]
Rusu Emanuel
Next topic: Binary search trees🌳
See you next time ! 👋

DEV Community

Sieve of Eratosthenes

Idea🤔

Let's take an example:

Maybe you want to see the code but let me explain it before.👨‍🏫

Observations❗:

Top comments (0)

Read next

How to Set Git Username and Password in Git Bash?

Un senior destaca por su forma de pensar, no de codear

How I Improved My Productivity as a Developer in 30 Days 🚀

AI Search Capabilities Hit Roadblock: Transformers Struggle with Large-Scale Path Finding