Dmytro Huz

Posted on Jul 30 • Edited on Dec 1

Building Own Stream Cipher: Part 1 — The Essential Guide to Stream Ciphers for Beginners

#webdev #devops #learning #programming

Preface

TL;DR:

This article breaks down the core building blocks of stream ciphers—the powerful encryption tools that turn secret keys and nonces into secure keystreams using deterministic pseudo-random generators (PRGs) and XOR operations. You’ll learn why nonce discipline matters and see a practical, minimal Python implementation to cement the concepts.

If you want a deeper foundation before diving in, I recommend these earlier articles in the Introduction to Cryptography series, which cover key background topics (though they’re not strictly required):

Basic Blocks of Cryptography — explains randomness, entropy, and fundamental primitives.
Perfect Secrecy — explores one-time pads and theoretical limits of secrecy.
A Beginner’s Guide to Computational Security — introduces complexity assumptions behind modern cryptography.

With or without these, this article will give you a clear, practical grasp of stream ciphers—their anatomy, why they work, and how to build your own.

1 What’s a Stream Cipher & Why Should You Care?

A stream cipher turns a short secret key into a continuous, unpredictable sequence of bytes—called a keystream—and then “mixes” that keystream with your message one byte at a time via a simple XOR. Because it never needs to buffer or pad blocks, a stream cipher can encrypt or decrypt data on the fly with minimal memory and practically zero delay.

Key benefits:

Low latency: Perfect for real-time applications like voice or video calls, online gaming, and live sensor feeds.
Tiny footprint: Requires only a small amount of working memory—ideal for microcontrollers, IoT sensors, or embedded devices.
Simplicity: The core operation is just XOR, which makes implementations compact and fast.
Flexible length: Encrypt any number of bytes without worrying about padding or incomplete blocks.

Where you see stream ciphers today:

TLS 1.3 (ChaCha20-Poly1305): One of the two mandatory encryption choices for secure web traffic in modern browsers and servers.
OpenSSH: Offers ChaCha20 as a high-speed alternative for secure remote logins.
WireGuard VPN: Leverages ChaCha20-Poly1305 to protect packet payloads with minimal code.
GSM Cellular (A5/1, A5/2): Early mobile networks used native stream ciphers to encrypt calls and SMS (some of these were later cracked, underscoring the need for strong designs).
Wi-Fi WEP (RC4): The first Wi-Fi security standard used RC4—a classic stream cipher—but was broken due to poor nonce handling.

In this article you’ll learn exactly how stream ciphers work—breaking them down into three simple pieces—before we dive into real-world examples like RC4 and ChaCha20 in the next installments.

2 A 30 000-ft View: How a Stream Cipher Works End-to-End

Before zooming into its three parts, here’s the big picture of encrypting and decrypting a single message:

Sender and Receiver share a secret key K.
Sender picks a fresh nonce N and runs the cipher:

keystream = PRG(K, N)            # a long pseudo-random byte stream
ciphertext = plaintext ⊕ keystream
on the wire → N ‖ ciphertext

Receiver reads N from the first bytes, recomputes the same keystream:

keystream = PRG(K, N)
plaintext  = ciphertext ⊕ keystream

Because both sides use the same (K, N) to drive the PRG, the XOR “undoes” itself and recovers the original message.

Breaking it into three building blocks

Now that you see the end-to-end flow, let’s unpack the three simple pieces inside that magic box:

Layer	Job	Why it matters
1. PRG	Stretch (key ‖ nonce ‖ counter) into a pseudo-random byte stream.	Supplies unpredictable bytes.
2. XOR combiner	Compute ciphertext_byte = plaintext_byte ⊕ keystream_byte.	Hides your message under the stream.
3. Nonce discipline	Guarantee the pair (key, nonce) is never reused.	Ensures every keystream is fresh.

    Sender side                          Receiver side

    key ─┐                                    ┌─ key
         │                                    │
   nonce ├──► [ PRG ] ──► keystream ──► ⊕ ──► ciphertext
         │                                    ▲
  counter┘                                    │
                                              │
                       ◄───── read nonce ─────┘

PRG (Pseudo-Random Generator)
- Inputs:
  - Key K: the shared secret bits.
  - Nonce N: a fresh, public value per message.
  - Counter: an internal integer to generate as many bytes as needed.
- Output: A long, unpredictable keystream.
XOR Combiner
- Operation: Byte-wise plaintext ⊕ keystream → ciphertext.
- Reversibility: Decryption is the same XOR, making implementation trivial.
Nonce Discipline
- Rule: Never call encrypt with the same (K, N) twice.
- Why: Reusing N under K makes two ciphertexts share the same keystream, enabling the two-time-pad attack (C₁ ⊕ C₂ = P₁ ⊕ P₂).

Stream Cipher Mnemonic:

Stream Cipher Mnemonic:
PRG → XOR → Nonce = Stream Cipher

With that big-picture flow and the three-box model in place, you’re ready to explore each piece in detail—starting next with PRGs.

3 What Is a Pseudo‐Random Generator (PRG) & Why It Matters

A PRG (pseudo‐random generator) deterministically stretches a short input into a long sequence of bytes that looks random. In our stream‐cipher model, that input is:

Key (shared secret)
Nonce (per‐message public value)
Counter (internal block index)

The PRG is the engine that turns (key‖nonce‖counter) into the keystream bytes we’ll later XOR with our data.

Security goal: No efficient attacker can distinguish PRG output from true random bits.

3.1 Toy “Bad” PRG: the Linear Congruential Generator (LCG)

An LCG produces a sequence of integers x0,x1,x2... by the rule

where:

x0 is the seed (initial state).
a is the multiplier.
c is the increment.
m is the modulus.

To emit bytes, we take the low-order 8 bits of each xn. Here’s a Python implementation:

def bad_prg(seed: int,
            length: int = 16,
            a: int = 1103515245,
            c: int = 1,
            m: int = 2**31) -> bytes:
    """
    A simple Linear Congruential Generator (LCG):
      x_{n+1} = (a * x_n + c) % m
    Emits the lowest 8 bits of each x_n.
    """
    x = seed % m
    out = bytearray()
    for _ in range(length):
        x = (a * x + c) % m
        out.append(x & 0xFF)  # low‐order byte
    return bytes(out)

Despite its simplicity, no choice of 𝑎,𝑐,𝑚 yields cryptographically secure output. Let’s see why by examining the least‐significant bit (LSB) of each output byte.

3.1.1 Failure Mode 1 – Predictable parity

Working mod 2, the recurrence reduces to:

Let

Then

where A⋅Pn is just bitwise AND. There are four cases:

A=a mod 2	C=c mod 2	Recurrence	Parity behavior
0	0	Pn+1=0	Always 0
0	1	Pn+1=1	Always 1
1	0	Pn+1=Pn	Constant = seed’s parity
1	1	Pn+1=Pn⊕1	Toggles each step (1→0→1→0… or 0→1→0→1…)

# Demonstrate parity for all four (A,C) cases:
for a in [2, 3]:             # a even→A=0, a odd→A=1
    for c in [0, 1]:         # c even→C=0, c odd→C=1
        print(f"a={a}, c={c} →", end=" ")
        bits = [b&1 for b in bad_prg(seed=7, length=8, a=a, c=c)]
        print(bits)

a=2, c=0 → [0, 0, 0, 0, 0, 0, 0, 0]
a=2, c=1 → [1, 1, 1, 1, 1, 1, 1, 1]
a=3, c=0 → [1, 1, 1, 1, 1, 1, 1, 1]  # seed=7 is odd
a=3, c=1 → [0, 1, 0, 1, 0, 1, 0, 1]

Why this is broken:

In every configuration, the LSB of each output byte is fully determined (constant, copied from the seed, or perfectly periodic)—there is zero entropy in that bit. A cryptographic PRG must make all bits unpredictable.

3.1.2 Failure Mode 2 – Visible Spectral Defects (the RANDU case)

What is RANDU?

RANDU was IBM’s built-in LCG on 1960s System/360 and System/370 mainframes, defined by

with no increment (c=0). Despite its “industrial strength” modulus, RANDU is infamous for producing severe non-random patterns.

Below is a simple experiment showing why any LCG—even one as large as RANDU—fails as a cryptographic PRG. We:

Generate 50 000 bytes in one contiguous stream.
Plot each pair.

import matplotlib.pyplot as plt

# Spectral-defect demo: one contiguous LCG stream from seed=1
stream = bad_prg(
    seed=1,
    length=50_000,
    a=65539,    # RANDU multiplier
    c=0,        # RANDU increment
    m=2**31     # RANDU modulus
)

# Build byteₙ vs. byteₙ₊₁ pairs
pairs = [(stream[i], stream[i+1]) for i in range(len(stream)-1)]

# Plot
xs, ys = zip(*pairs)
plt.scatter(xs, ys, s=1)
plt.title("RANDU LCG: byteₙ vs. byteₙ₊₁ (50 000 samples)")
plt.xlabel("byteₙ")
plt.ylabel("byteₙ₊₁")
plt.savefig('lcg_output.png')

Figure: Instead of a uniform cloud, you see distinct diagonal bands—each stripe is a fingerprint of the LCG’s linear formula.

Why this breaks security:

In a true random generator, points (byte n,byte n+1) fill the 256×256 square evenly. These stripes make RANDU’s output trivially distinguishable from randomness and allow attackers to infer relationships between outputs. No cryptographic PRG should leak such linear structure.

Real-World “Good” PRG: A Deterministic, Keyed DRBG (HMAC-DRBG)

In practice, a stream cipher’s PRG must be both cryptographically strong and deterministic: given the same key and nonce, it must always spit out the exact same keystream so that decryption simply “XORs again” to recover plaintext. A widely used pattern is the HMAC-DRBG (per NIST SP 800-90A).

HMAC-DRBG is a deterministic, keyed pseudorandom‐bit generator built on HMAC (e.g. HMAC-SHA256). Its overall flow is:

State = (K, V)

• K is the HMAC key

• V is the “value” that feeds into each HMAC
Instantiation

• Mix your secret key and public nonce into (K, V) via one or two HMAC-based Update steps.
Generate
• Loop:
- V = HMAC(K, V) - yields one block of pseudorandom bytes
- Append V to your output until you have enough • Optionally run an Update after generation to refresh (K, V).
Reseeding (optional)

• Mix fresh entropy into (K, V) whenever you need forward/backward security—e.g. after heavy use or potential compromise.

“Decoding” a stream cipher in practice

A PRG on its own isn’t “decoded” like an encrypted message—you can’t invert a DRBG to get back the seed. Instead, in a stream cipher, you use the PRG’s output as a keystream and rely on the XOR operation for both encryption and decryption:

Encryption

keystream  = PRG(key, nonce, length)
ciphertext = bytes(p ^ k for p, k in zip(plaintext, keystream))

Decryption

keystream  = PRG(key, nonce, length)     # exactly the same call
plaintext  = bytes(c ^ k for c, k in zip(ciphertext, keystream))

Because the PRG is deterministic, the receiver—knowing the same key, nonce, and length—regenerates the identical keystream. XORing twice cancels out:

Key point: You do not reverse the DRBG itself; you simply rerun it to produce the keystream and then XOR against the ciphertext.

Key take-aways

DRBG are one-way: you can’t “decode” them, only reseed and rerun.
Decryption in a stream cipher is always “XOR again with the same keystream.”
A secure channel depends on both a good PRG (no hidden patterns) and strict nonce discipline (never reuse). Let’s consider the nonce itself in the next section.

4 What Is a Nonce & Why It Matters

A nonce (number used once) is a small, public value that you pair with your secret key every time you encrypt a new message. Its sole job is to guarantee that each keystream you generate is unique—even though the key stays the same.

4.1 Nonce in action: avoiding the two-time pad

If you ever reuse the same (key, nonce), two ciphertexts share the exact same keystream:

C₁ = P₁ ⊕ KS    (using key K, nonce N)
C₂ = P₂ ⊕ KS    (same key K, same nonce N)

=> C₁ ⊕ C₂ = P₁ ⊕ P₂

An attacker who sees C₁ and C₂ can compute P₁ ⊕ P₂, leaking relationships (and often enough structure to recover both plaintexts). This is the classic two-time pad failure.

4.2 Nonce properties

Property	Requirement	Why it works
Unique	Never reuse the same nonce for one key	Guarantees each keystream is fresh
Public	Sent in clear alongside the ciphertext	Receiver needs it to regenerate stream
Any format	Random bytes or simple counter — doesn’t matter, as long as uniqueness holds	Flexibility in implementation

4.3 Common nonce patterns

Pattern	How it works	Pros & Cons
Explicit random nonce	Sender generates a fresh random nonce per message and prefixes it to the ciphertext.	+ Robust to loss/reordering
		– Adds ~12 bytes overhead
Implicit counter nonce	Both sides maintain a synchronized counter; each message uses the next integer as the nonce.	+ Zero on-wire overhead
		– Breaks on packet loss or reordering unless you add recovery
Derived-from-previous	Compute nonceₙ₊₁ = HMAC(key, nonceₙ ‖ ciphertextₙ).	+ Tightly chained

4.4 Client ⇄ Server flow with explicit nonces

┌───────────────────────────────┐      ┌───────────────────────────────┐
│  Client (sender)             │      │  Server (receiver)           │
├───────────────────────────────┤      ├───────────────────────────────┤
│ 1. Generate fresh nonce N     │      │                               │
│ 2. KS = PRG(key, N)           │      │  ◄─ read N from first bytes   │
│ 3. C = P ⊕ KS                  │      │  KS = PRG(key, N)             │
│ 4. Send “N ‖ C”               │─────►│  P = C ⊕ KS                   │
│ 5. Discard N (never reuse)    │      │  Discard N                    │
└───────────────────────────────┘      └───────────────────────────────┘

Resilience: If a packet is dropped or arrives out of order, the receiver still recovers plaintext correctly, because every record carries its own nonce.

4.5 Key take-aways

Never reuse a nonce with the same key – that mistake instantly collapses security.
A nonce need not be secret or random; it just must be unique.
Picking an explicit random nonce per message is the simplest, most robust approach for general networks.

With nonce discipline firmly in place, our stream cipher can safely reuse a single key to encrypt unlimited messages. Next we’ll tie PRG+XOR+nonce together into a tiny Python class and watch it encrypt “HELLO” back and forth.

5 XOR & Building a MiniStreamCipher in Python

The final piece of our three-block model is the XOR combiner. You take each byte of your plaintext, XOR it with the corresponding byte of the keystream, and out pops the ciphertext. To decrypt, you XOR again with the same keystream and recover the original message.

Below is a minimal—but complete—stream-cipher class that ties together:

PRG: stretches (key ‖ nonce ‖ counter) into keystream bytes
XOR combiner: masks/unmasks your data
Nonce discipline: caller must supply a fresh nonce each time

import hmac
import hashlib
import secrets

class MiniStreamCipher:
    """
    A simple, deterministic PRG (HMAC-DRBG) wired into a stream cipher.
    """
    def __init__(self, key: bytes):
        # Shared secret key (e.g., 32 bytes for 256 bits)
        self.key = key

    def _prg(self, nonce: bytes, length: int) -> bytes:
        """
        Expand (key‖nonce) into `length` pseudorandom bytes:
          For counter = 0,1,2…:
            block_i = HMAC_SHA256(key, nonce‖counter)
          Concatenate and truncate to `length`.
        """
        out = bytearray()
        counter = 0
        while len(out) < length:
            ctr = counter.to_bytes(4, 'big')
            block = hmac.new(self.key, nonce + ctr, hashlib.sha256).digest()
            out.extend(block)
            counter += 1
        return bytes(out[:length])

    def encrypt(self, nonce: bytes, plaintext: bytes) -> bytes:
        """XOR `plaintext` with the deterministic keystream."""
        ks = self._prg(nonce, len(plaintext))
        return bytes(p ^ k for p, k in zip(plaintext, ks))

    def decrypt(self, nonce: bytes, ciphertext: bytes) -> bytes:
        """Re-XOR `ciphertext` with the same keystream to recover plaintext."""
        ks = self._prg(nonce, len(ciphertext))
        return bytes(c ^ k for c, k in zip(ciphertext, ks))

5.1 Round-Trip Demo: “HELLO” Back and Forth

# 1) Both sides agree on a shared secret key once (e.g., after a handshake)
key = secrets.token_bytes(32)            # 256-bit key

# 2) Each message uses a fresh nonce
nonce = secrets.token_bytes(12)          # 96-bit nonce

# 3) Create the cipher object
cipher = MiniStreamCipher(key)

# 4) Encrypt your plaintext
plaintext  = b"Attack at dawn!"
ciphertext = cipher.encrypt(nonce, plaintext)
print("Nonce      :", nonce.hex())
print("Plaintext  :", plaintext)
print("Ciphertext :", ciphertext.hex())

# 5) On the receiving end, decrypt with the same key & nonce
recovered = cipher.decrypt(nonce, ciphertext)
print("Recovered  :", recovered)

Sample output

Nonce      : 887a11ff7ecd7c4a249396cf
Plaintext  : b'Attack at dawn!'
Ciphertext : 7e2443171339d6bc9cf5b9b26e8b61
Recovered  : b'Attack at dawn!'

Nonce: travels with your message (prefix it on the wire).
Ciphertext: looks random (hex shown).
Recovered: confirms the XOR “undoes” itself.

Strip away the XOR step, and you’ve “just a PRG.”

Add XOR + strict nonce usage, and you’ve built a full stream cipher.

With this in place, you now have a working stream-cipher channel where:

The PRG box generates unpredictable keystream bytes.
The XOR box mixes/unmixes your data.
The nonce discipline (fresh every time) guarantees each invocation is safe.

Summary

This article unpacked the core of stream ciphers—how they use a deterministic, keyed pseudo-random generator (PRG), XOR, and strict nonce rules to securely encrypt data byte-by-byte. We saw why simple PRGs like LCGs are insecure and introduced a strong, practical PRG based on HMAC-DRBG.

Understanding these basics is crucial because they form the foundation for real-world security protocols like TLS, SSH, and VPNs. In upcoming articles, we’ll apply this knowledge to build and analyze these protocols from the ground up.

DEV Community