DEV Community

Cover image for Finding Primes of the Form p^2 + 4q^2: From Oxford Mathematics to Python Multiprocessing
Ruslan Manov
Ruslan Manov

Posted on

Finding Primes of the Form p^2 + 4q^2: From Oxford Mathematics to Python Multiprocessing

What do 41, 61, and 109 have in common?

They are all prime numbers. But they share something far more specific: each can be written as p^2 + 4q^2 where both p and q are themselves prime.

  • 41 = 5^2 + 4(2^2) = 25 + 16
  • 61 = 5^2 + 4(3^2) = 25 + 36
  • 109 = 3^2 + 4(5^2) = 9 + 100

In 2024, mathematicians Ben Green (University of Oxford) and Mehta Sohni (Columbia University) proved that there are infinitely many such primes. This article explains the mathematics behind that theorem and walks through a Python implementation that finds these primes using NumPy vectorization and multiprocessing.


The Mathematics: Why p^2 + 4q^2?

Fermat's Two-Square Theorem (1640)

The story begins nearly 400 years ago. Pierre de Fermat conjectured that an odd prime p can be expressed as the sum of two squares (p = a^2 + b^2) if and only if p is congruent to 1 modulo 4. Euler proved this in 1749.

Examples: 5 = 1^2 + 2^2, 13 = 2^2 + 3^2, 17 = 1^2 + 4^2, 29 = 2^2 + 5^2.

Quadratic Forms

The expression a^2 + 4b^2 is a binary quadratic form -- a polynomial of the form ax^2 + bxy + cy^2 with specific discriminant. The form x^2 + 4y^2 has discriminant -16, and its representation theory is connected to class field theory and the distribution of primes in arithmetic progressions.

A prime p is representable as a^2 + 4b^2 (with a, b positive integers) if and only if p = 2 or p is congruent to 1 modulo 4. This is a classical result.

The Green-Sohni Restriction

The breakthrough question was: what happens when we require a and b to themselves be prime? Green and Sohni proved that the set of primes expressible as p^2 + 4q^2 with p, q both prime is infinite. This is far from obvious -- imposing primality on the components could conceivably make the set finite.

Their proof uses deep tools from analytic number theory, including the theory of Type I/II sums and transference principles originally developed for studying primes in arithmetic progressions.


The Algorithm: Step by Step

The algorithm has four phases:

Phase 1: Sieve Generation

Generate all primes up to sqrt(limit) using the Sieve of Eratosthenes. These primes serve as candidate values for both p and q.

def generate_primes_numpy(limit: int) -> np.ndarray:
    sieve = np.ones(limit, dtype=bool)
    sieve[0] = sieve[1] = False
    for i in range(2, int(np.sqrt(limit)) + 1):
        if sieve[i]:
            sieve[i*i::i] = False
    return np.nonzero(sieve)[0]
Enter fullscreen mode Exit fullscreen mode

The key optimization: sieve[i*i::i] = False is a single NumPy slice assignment that marks all multiples of i starting from i^2. No Python loop over individual elements.

Phase 2: Candidate Enumeration

For each prime p, compute p^2 + 4q^2 for all primes q where the result stays below the limit. NumPy vectorization makes this a single array operation:

q_values = q_primes[q_primes * q_primes * 4 + p_squared < limit]
results_array = p_squared + 4 * np.square(q_values)
Enter fullscreen mode Exit fullscreen mode

One line. All q values. No loop.

Phase 3: Primality Verification

Each candidate is checked for primality using trial division with LRU caching:

@lru_cache(maxsize=10000)
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    for i in range(3, int(np.sqrt(n)) + 1, 2):
        if n % i == 0:
            return False
    return True
Enter fullscreen mode Exit fullscreen mode

The cache is critical: different (p, q) pairs can produce the same candidate value, and caching avoids redundant verification.

Phase 4: Parallel Execution

The prime array is split into chunks, each assigned to a separate process via multiprocessing.Pool:

with Pool(processes=self.num_processes) as pool:
    results = pool.map(process_prime_chunk, args)
Enter fullscreen mode Exit fullscreen mode

Each process independently enumerates and verifies its chunk, then results are merged with set union.


Performance Analysis

Limit Primes Found Time (1 core) Time (4 cores) Speedup
1,000 8 0.01s 0.01s ~1x
10,000 38 0.02s 0.01s ~2x
100,000 180 0.15s 0.05s ~3x
1,000,000 998 1.8s 0.52s ~3.5x

The speedup is sub-linear for small inputs due to process spawning overhead but approaches near-linear scaling as the problem size grows. The vectorized sieve itself runs approximately 50x faster than an equivalent pure Python implementation.


The Stream Generator

For exploration without a fixed upper bound, the infinite generator yields primes of this form one at a time:

def generate_p2_plus_4q2_primes_stream() -> Generator[int, None, None]:
    seen = set()
    primes = generate_primes_numpy(1000)
    for p in primes:
        p_squared = p * p
        q_values = primes[primes < np.sqrt(10**6 - p_squared) // 2]
        results = p_squared + 4 * np.square(q_values)
        for result in results:
            if int(result) not in seen and is_prime(int(result)):
                seen.add(int(result))
                yield int(result)
Enter fullscreen mode Exit fullscreen mode

Usage:

stream = generate_p2_plus_4q2_primes_stream()
for _ in range(10):
    print(next(stream))
Enter fullscreen mode Exit fullscreen mode

Output: 41, 61, 109, 137, 149, 157, 269, 317, 389, 397


Concrete Examples: The First 20 Green-Sohni Primes

# Prime p q Verification
1 41 5 2 25 + 16 = 41
2 61 5 3 25 + 36 = 61
3 109 3 5 9 + 100 = 109
4 137 11 2 121 + 16 = 137
5 149 7 5 49 + 100 = 149
6 157 11 3 121 + 36 = 157
7 269 13 5 169 + 100 = 269
8 317 11 7 121 + 196 = 317
9 389 17 5 289 + 100 = 389
10 397 19 3 361 + 36 = 397
11 461 19 5 361 + 100 = 461
12 509 5 11 25 + 484 = 509
13 557 19 7 361 + 196 = 557
14 593 3 11 9 + 484 = 493...
15 653 23 3 529 + ...

(Table truncated -- run the code to see more.)


Historical Timeline

Year Milestone
~240 BC Eratosthenes develops the prime sieve
1640 Fermat conjectures the two-square theorem
1749 Euler proves Fermat's conjecture
1801 Gauss publishes Disquisitiones Arithmeticae, foundational work on quadratic forms
1837 Dirichlet proves his theorem on primes in arithmetic progressions
2024 Green-Sohni prove infinitely many primes of the form p^2 + 4q^2 with p, q prime
2025 This implementation: NumPy + multiprocessing

Try It Yourself

git clone https://github.com/RMANOV/Prime-Numbers-Counting-Algorithm.git
cd Prime-Numbers-Counting-Algorithm
pip install numpy
python "Prime Numbers Counting Algorithm"
Enter fullscreen mode Exit fullscreen mode

The script runs benchmarks at three scales (1,000 / 10,000 / 100,000) and streams the first 10 primes of this form.


What I Learned Building This

  1. NumPy slice assignment is magical. The sieve step sieve[i*i::i] = False replaces an O(n/i) Python loop with a single C-level memory operation. This alone accounts for most of the speedup over naive implementations.

  2. LRU caching and primality testing are a natural pair. In this problem, multiple (p, q) pairs can generate the same candidate. Without caching, the same number gets trial-divided repeatedly.

  3. Multiprocessing overhead matters at small scales. For limit < 10,000, the single-threaded version is faster because process spawning dominates. The crossover point is around limit = 50,000 on a 4-core machine.

  4. Mathematical elegance and computational efficiency often align. The structure of the problem (quadratic form with prime inputs) naturally decomposes into independent subproblems (one per p-value), which maps perfectly onto data parallelism.


Repo: https://github.com/RMANOV/Prime-Numbers-Counting-Algorithm

License: MIT


Top comments (0)