DEV Community

Alex Towell
Alex Towell

Posted on • Originally published at metafunctor.com

Random Oracles in Python

A random oracle is a function $\mathcal{O}: {0,1}^* \to {0,1}^\infty$ where each output bit is independently and uniformly random, but the function is consistent: same input, same output, every time (Bellare & Rogaway, 1993).

You cannot build one. But you can build three approximations, each showing a different tradeoff.

Approximation 1: True Randomness, Cached

class OracleDigest:
    """Infinite digest backed by true randomness and a cache."""

    def __init__(self, entropy=None):
        self._entropy = entropy or (lambda: hashlib.sha256(os.urandom(32)).digest())
        self._cache = {}

    def __getitem__(self, index):
        if index not in self._cache:
            self._cache[index] = self._entropy()[0]
        return self._cache[index]
Enter fullscreen mode Exit fullscreen mode

Each new index draws a fresh random byte from the OS entropy pool and caches it. Genuinely random, but the cache grows without bound, cannot be serialized, and dies with the process. Each instance is a different random oracle from the family of all possible oracles.

Approximation 2: Deterministic, Infinite

class LazyDigest:
    """Infinite deterministic digest via counter-mode PRF: h(seed || i)."""

    def __init__(self, seed, hash_fn=hashlib.sha256):
        self._seed = seed
        self._hash_fn = hash_fn

    def __getitem__(self, index):
        h = self._hash_fn()
        h.update(self._seed)
        h.update(index.to_bytes(8, "big"))
        return h.digest()[0]
Enter fullscreen mode Exit fullscreen mode

Counter-mode PRF: $\text{byte}_i = H(\text{seed} | i)[0]$. Same seed gives the same infinite sequence, on any machine, forever. $O(1)$ space, random access to any position. Only as "random" as the seed and hash function.

Note the fixed-width index encoding (to_bytes(8, 'big')). If you used str(index).encode() instead, index 10 and the concatenation of index 1 + index 0 would collide. Real KDFs like HKDF use fixed-width encodings for the same reason.

Approximation 3: Full Oracle

class OracleHash:
    """Approximates {0,1}* -> {0,1}^inf via hash + lazy expansion."""

    def __init__(self, hash_fn=hashlib.sha256):
        self._hash_fn = hash_fn

    def __call__(self, x):
        return LazyDigest(self._hash_fn(x).digest(), self._hash_fn)
Enter fullscreen mode Exit fullscreen mode

This composes the two: hash the input to get a seed, expand with counter-mode PRF. Type: ${0,1}^* \to {0,1}^\infty$.

oracle = OracleHash()
d = oracle(b"hello")
print(bytes(d[i] for i in range(16)).hex())  # 4c9460cf7b89806e9ebc7b94c815d8c0
Enter fullscreen mode Exit fullscreen mode

The Tradeoff

OracleDigest LazyDigest OracleHash
Randomness True (OS entropy) Pseudo (PRF) Pseudo (PRF)
Space $O(k)$ queries $O(1)$ $O(1)$
Serializable No Yes (seed) Yes (hash fn)
Lifetime Process-scoped Permanent Permanent

Why This Matters

Random oracles are the backbone of the random oracle model (ROM), used throughout cryptography to prove security. The Fiat-Shamir transform, which turns interactive proofs into non-interactive signatures, is the most important example: replace the verifier's random challenge with $H(\text{transcript})$.

The gap between a random oracle and a real hash function is real. Canetti, Goldreich & Halevi (2004) showed there exist schemes provably secure in the ROM that break when instantiated with any hash function. In practice, it usually works. Understanding why starts with understanding what you are approximating.

Top comments (0)