zuka3

Posted on Mar 26

The Math Behind RSA #3: Implementing RSA from Scratch in Python

#rsa #cryptography #python #security

This article has a more math-focused version with formal proofs on Folio.

In the first two parts of this series, we built up the number theory toolkit: modular arithmetic, Euler's totient function, the extended Euclidean algorithm, and Fermat's little theorem. Now it's time to put all of that to work and implement RSA from scratch in Python.

By the end of this article you will have a working RSA implementation, understand why each mathematical piece is needed, and see a formal proof that decryption actually recovers the original message.

RSA in Three Steps

The entire RSA cryptosystem reduces to three operations:

Key Generation -- produce a public key $(n, e)$ and a private key $(n, d)$ .
Encryption -- given plaintext $m$ , compute ciphertext $c = m^e \bmod n$ .
Decryption -- given ciphertext $c$ , recover $m = c^d \bmod n$ .

The security rests on one assumption: given $n = pq$ , it is computationally infeasible to factor $n$ back into $p$ and $q$ when the primes are large enough.

Recap: Helper Functions from Parts #1 and #2

We need three building blocks that we derived in earlier articles. Here they are as Python functions.

Greatest Common Divisor

def gcd(a, b):
    while b:
        a, b = b, a % b
    return a

Extended Euclidean Algorithm

Returns $(g, x, y)$ such that $ax + by = g = \gcd(a, b)$ .

def extended_gcd(a, b):
    if a == 0:
        return b, 0, 1
    g, x1, y1 = extended_gcd(b % a, a)
    return g, y1 - (b // a) * x1, x1

Modular Inverse

Using the extended GCD, we can find $e^{-1} \bmod \phi(n)$ :

def mod_inverse(e, phi):
    g, x, _ = extended_gcd(e % phi, phi)
    if g != 1:
        raise ValueError("Modular inverse does not exist")
    return x % phi

Miller-Rabin Primality Test

We need a way to find large primes. Miller-Rabin is probabilistic but extremely reliable with enough rounds:

import random

def is_prime(n, rounds=40):
    if n < 2:
        return False
    if n < 4:
        return True
    if n % 2 == 0:
        return False
    # Write n-1 as 2^r * d
    r, d = 0, n - 1
    while d % 2 == 0:
        r += 1
        d //= 2
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def generate_prime(bits):
    while True:
        p = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_prime(p):
            return p

Key Generation

def generate_keypair(bits=1024):
    p = generate_prime(bits // 2)
    q = generate_prime(bits // 2)
    n = p * q
    phi = (p - 1) * (q - 1)

    e = 65537
    assert gcd(e, phi) == 1, "e and phi(n) are not coprime; regenerate primes"

    d = mod_inverse(e, phi)

    return {
        "public": (n, e),
        "private": (n, d),
        "_p": p,
        "_q": q,
    }

Why e = 65537?

The public exponent $e$ must satisfy $1 < e < \phi(n)$ and $\gcd(e, \phi(n)) = 1$ . In theory, any such $e$ works. In practice, nearly every RSA implementation uses $e = 65537 = 2^{16} + 1$ for two reasons:

Speed -- its binary representation 10000000000000001 has only two set bits, making modular exponentiation via square-and-multiply very fast (just 17 squarings and 1 multiplication).
Security -- small exponents like $e = 3$ are vulnerable to certain attacks (e.g. Coppersmith's attack on low public exponents) when messages are short or padding is absent.

Encryption and Decryption

With the keys in hand, encryption and decryption are each a single modular exponentiation:

def encrypt(plaintext_int, public_key):
    n, e = public_key
    if plaintext_int >= n:
        raise ValueError("Message must be less than n")
    return pow(plaintext_int, e, n)

def decrypt(ciphertext_int, private_key):
    n, d = private_key
    return pow(ciphertext_int, d, n)

Python's built-in pow(base, exp, mod) uses fast modular exponentiation internally, so this is efficient even for 2048-bit numbers.

Quick Test

keys = generate_keypair(512)  # small for demo purposes
pub, priv = keys["public"], keys["private"]

message = 42
cipher = encrypt(message, pub)
plain = decrypt(cipher, priv)

assert plain == message
print(f"Original:  {message}")
print(f"Encrypted: {cipher}")
print(f"Decrypted: {plain}")

Proof of Correctness

Why does decrypting an encrypted message give back the original? This is where Euler's theorem from Part #2 pays off.

Claim. For any $m$ with $0 \le m < n$ , we have $m^{ed} \equiv m \pmod{n}$ .

Proof. By construction, $ed \equiv 1 \pmod{\phi(n)}$ , so there exists an integer $k$ such that:

ed = 1 + k\,\phi(n)

Case 1: $\gcd(m, n) = 1$ . Euler's theorem gives us $m^{\phi(n)} \equiv 1 \pmod{n}$ . Therefore:

m^{ed} = m^{1 + k\,\phi(n)} = m \cdot (m^{\phi(n)})^k \equiv m \cdot 1^k = m \pmod{n}

Case 2: $\gcd(m, n) \ne 1$ . Since $n = pq$ with distinct primes $p, q$ , if $\gcd(m, n) \ne 1$ then $p \mid m$ or $q \mid m$ (but not both, since $m < n$ ). Suppose $p \mid m$ . Then $m^{ed} \equiv 0 \equiv m \pmod{p}$ . Meanwhile, $\gcd(m, q) = 1$ , so by Fermat's little theorem, $m^{q-1} \equiv 1 \pmod{q}$ . Since $(q-1) \mid \phi(n)$ , the same argument gives $m^{ed} \equiv m \pmod{q}$ . Combining via the Chinese Remainder Theorem:

m^{ed} \equiv m \pmod{n} \qquad \blacksquare

CRT Optimization: 4x Speedup

In production RSA, the private key holder knows $p$ and $q$ . Instead of computing $c^d \bmod n$ directly (which operates on a number with $b$ bits), we can split the computation using the Chinese Remainder Theorem and work with numbers of $b/2$ bits:

def decrypt_crt(ciphertext, p, q, d):
    dp = d % (p - 1)
    dq = d % (q - 1)
    q_inv = mod_inverse(q, p)

    m1 = pow(ciphertext, dp, p)
    m2 = pow(ciphertext, dq, q)

    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q

Why is this roughly 4x faster? Modular exponentiation with a $b$ -bit modulus and a $b$ -bit exponent takes roughly $O(b^3)$ time. Splitting into two $b/2$ -bit operations gives:

2 \times O!\left(\left(\frac{b}{2}\right)^3\right) = 2 \times \frac{O(b^3)}{8} = \frac{O(b^3)}{4}

That is a 4x speedup -- significant when you are doing thousands of decryptions per second on a server.

Verify CRT produces the same result

keys = generate_keypair(1024)
pub, priv = keys["public"], keys["private"]
p, q = keys["_p"], keys["_q"]

message = 123456789
cipher = encrypt(message, pub)

plain_standard = decrypt(cipher, priv)
plain_crt = decrypt_crt(cipher, p, q, priv[1])

assert plain_standard == plain_crt
print("CRT decryption matches standard decryption.")

String Encryption Demo

Real messages are strings, not single integers. Here is a simple demo that converts a string to an integer and back. (Production systems use proper padding schemes -- see the comparison table below.)

def str_to_int(s):
    return int.from_bytes(s.encode("utf-8"), "big")

def int_to_str(i):
    length = (i.bit_length() + 7) // 8
    return i.to_bytes(length, "big").decode("utf-8")

# Generate keys (2048-bit for realistic string lengths)
keys = generate_keypair(2048)
pub, priv = keys["public"], keys["private"]

message = "Hello, RSA!"
m = str_to_int(message)

cipher = encrypt(m, pub)
decrypted_int = decrypt(cipher, priv)
decrypted_msg = int_to_str(decrypted_int)

print(f"Original:  {message}")
print(f"Encrypted: {cipher}")
print(f"Decrypted: {decrypted_msg}")
assert decrypted_msg == message

Our Implementation vs. Production RSA

Our implementation is mathematically correct but not safe for production use. Here is what real libraries add:

Aspect	Our Implementation	Production RSA (e.g. OpenSSL)
Key size	512-2048 bit demo	2048-4096 bit minimum
Padding	None (textbook RSA)	OAEP (PKCS#1 v2.2)
Prime generation	Probabilistic (Miller-Rabin)	Miller-Rabin + additional checks
Side-channel protection	None	Constant-time operations, blinding
CRT decryption	Optional	Always used, with fault checks
Key storage	Plain Python dict	Hardware security modules (HSMs)
Random source	`random` module	OS-level CSPRNG (`/dev/urandom`)
Signature support	Not implemented	PSS padding (PKCS#1 v2.2)

Critical differences to understand:

Textbook RSA is deterministic: encrypting the same message twice produces the same ciphertext, which leaks information. OAEP padding adds randomness.
Side-channel attacks: our pow() call has data-dependent timing. Production implementations use constant-time modular exponentiation and RSA blinding.
The random module is not cryptographically secure. Use secrets or os.urandom for real applications.

Putting It All Together

All the code above composes into a single file. Concatenate the helper functions (gcd, extended_gcd, mod_inverse, is_prime, generate_prime), the core functions (generate_keypair, encrypt, decrypt, decrypt_crt), and the string utilities (str_to_int, int_to_str). Then run:

if __name__ == "__main__":
    keys = generate_keypair(2048)
    pub, priv = keys["public"], keys["private"]

    msg = "Hello, RSA!"
    m = str_to_int(msg)
    c = encrypt(m, pub)
    assert int_to_str(decrypt(c, priv)) == msg
    assert decrypt_crt(c, keys["_p"], keys["_q"], priv[1]) == m
    print(f"Original:  {msg}")
    print(f"Encrypted: {c}")
    print(f"Decrypted: {int_to_str(decrypt(c, priv))}")
    print("All assertions passed.")

What's Next

We now have a fully working RSA implementation and a proof that it is mathematically correct. But how secure is it really? In Part #4, we will explore:

Practical limitations of RSA: key size growth, performance costs, and why RSA is rarely used to encrypt data directly.
Known attacks: Wiener's attack on small $d$ , Bleichenbacher's padding oracle, and factoring advances.
Elliptic Curve Cryptography (ECC): how curves over finite fields give us the same security with dramatically smaller keys, and why the industry is moving toward ECC and post-quantum alternatives.

Stay tuned -- the finale brings everything full circle.

Note: A version of this article with formal theorem/proof environments is available on Folio.

DEV Community