DEV Community

Haji Rufai

I Implemented AES-128 from Scratch and Built a Secrets Vault in Python

Most developers import cryptography or pycryptodome and call it a day. I wanted to understand what happens inside the black box — so I implemented AES-128 byte-by-byte from the NIST FIPS 197 specification and built a full secrets manager around it.

VaultLite is the result: a lightweight HashiCorp Vault alternative with zero external dependencies. Everything from the S-Box substitution to the HTTP API runs on Python's standard library.

Why AES from Scratch?

AES (Advanced Encryption Standard) is everywhere. Your browser uses it right now. AWS KMS, HashiCorp Vault, Signal — they all rely on AES. But what actually happens when you encrypt a block of data?

I spent a week with the NIST FIPS 197 spec open in one tab and Python open in the other. Here's what I learned.

The AES-128 Pipeline

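AES operates on a 4×4 matrix of bytes called the "state," filled column by column from the 16-byte plaintext block. After an initial AddRoundKey, the state goes through 10 rounds: nine full rounds of four transformations each, then a final round that skips MixColumns: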

Plaintext (16 bytes)
    ↓
[AddRoundKey] ← round key 0
    ↓
┌─────────────────┐
│ Round 1-9:      │
│  SubBytes       │  ← S-Box substitution (GF(2⁸) inversion)
│  ShiftRows      │  ← Row rotation
│  MixColumns     │  ← Column mixing (Galois Field math)
│  AddRoundKey    │  ← XOR with round key
└─────────────────┘
    ↓
[SubBytes → ShiftRows → AddRoundKey] ← round 10 (no MixColumns)
    ↓
Ciphertext (16 bytes)
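
The diagram names ShiftRows without showing it. As a minimal sketch, assuming the state is kept as a flat list of 16 bytes in the column-major order FIPS 197 uses (index r + 4*c holds row r, column c), the row rotation could look like this. The name shift_rows and the flat layout are illustrative choices here, not necessarily how VaultLite stores its state:

```python
def shift_rows(state):
    """Rotate row r of the 4x4 state left by r positions.

    `state` is a flat list of 16 bytes in column-major order,
    so state[r + 4 * c] is the byte at row r, column c.
    """
    if len(state) != 16:
        raise ValueError("AES state must be exactly 16 bytes")
    out = [0] * 16
    for r in range(4):
        for c in range(4):
            # The byte that ends up in column c came from column (c + r) mod 4.
            out[r + 4 * c] = state[r + 4 * ((c + r) % 4)]
    return out


shifted = shift_rows(list(range(16)))
```

Row 0 is left alone, row 1 moves one position, row 2 two, and row 3 three, so each output column draws one byte from every input column. That is what lets MixColumns spread a change across the whole state in the next round.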

The S-Box: Where the Security Lives

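The S-Box is a 256-entry lookup table. Each byte in the state gets substituted with its corresponding S-Box entry. SubBytes is the only non-linear step in AES: every other transformation is linear (or affine) over GF(2), so without the S-Box the whole cipher would reduce to a system of linear equations that is easy to solve.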

The S-Box is computed by finding the multiplicative inverse in GF(2⁸), then applying an affine transformation. In my implementation:

def _build_sbox():
    sbox = [0] * 256
    sbox[0] = 0x63  # 0 has no inverse; FIPS 197 treats it as 0, and the affine step maps 0 to 0x63

    for i in range(1, 256):
        inv = gf_inverse(i)  # Multiplicative inverse in GF(2^8)
        # Affine transformation over GF(2)
        b = inv
        result = 0
        for bit in range(8):
            val = ((b >> bit) & 1)
            val ^= ((b >> ((bit + 4) % 8)) & 1)
            val ^= ((b >> ((bit + 5) % 8)) & 1)
            val ^= ((b >> ((bit + 6) % 8)) & 1)
            val ^= ((b >> ((bit + 7) % 8)) & 1)
            result |= (val << bit)
        sbox[i] = result ^ 0x63

    return sbox
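
_build_sbox calls a gf_inverse helper that is not shown above. One way to write it, as a sketch that may differ from VaultLite's actual code, is to build a general GF(2⁸) multiply (gf_mul is a name introduced here for illustration) and use the fact that a²⁵⁴ is the inverse of any non-zero a, since the 255 non-zero field elements form a multiplicative group:

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a      # add (XOR) the current multiple of a
        a <<= 1              # multiply a by x
        if a & 0x100:
            a ^= 0x11B       # reduce when the degree reaches 8
        b >>= 1
    return result


def gf_inverse(a):
    """Multiplicative inverse in GF(2^8); 0 is mapped to 0 by convention."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):     # a^254 == a^-1 because a^255 == 1
        result = gf_mul(result, a)
    return result
```

The 254 multiplications are slow but easy to check, and the S-Box is built only once. FIPS 197 gives {57} · {83} = {c1} as a worked example for the multiply, which makes a handy unit test.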

MixColumns: Galois Field Arithmetic

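MixColumns is the transformation that had me staring at my screen the longest. Each column of the state is treated as a polynomial with coefficients in GF(2⁸) and multiplied by a fixed polynomial modulo x⁴ + 1, which works out to a matrix multiplication whose entries are all 1, 2, or 3.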

The key operation is xtime — multiplication by x (i.e., by 2) in GF(2⁸):

def xtime(a):
    result = a << 1
    if result & 0x100:  # If bit 8 is set, reduce
        result ^= 0x11B  # x^8 + x^4 + x^3 + x + 1
    return result & 0xFF
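
To show how xtime is used, here is a sketch of MixColumns applied to a single column. It assumes the column is a list of four bytes; mix_column is an illustrative name, and xtime is repeated so the block runs on its own. Multiplying by 3 is just xtime(a) ^ a, because 3 = 2 + 1 and addition in GF(2⁸) is XOR:

```python
def xtime(a):
    """Multiply by x (i.e. by 2) in GF(2^8)."""
    result = a << 1
    if result & 0x100:
        result ^= 0x11B
    return result & 0xFF


def mix_column(col):
    """Multiply one 4-byte column by the fixed AES matrix
    [2 3 1 1; 1 2 3 1; 1 1 2 3; 3 1 1 2] over GF(2^8)."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3),
        (xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


mixed = mix_column([0xDB, 0x13, 0x53, 0x45])
```

For this commonly quoted test column the result is [0x8E, 0x4D, 0xA1, 0xBC]. A column whose four bytes are all equal comes out unchanged, because each matrix row sums to 1 in GF(2⁸).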

The constant 0x11B encodes the irreducible polynomial x⁸ + x⁴ + x³ + x + 1 that defines AES's finite field. Shifting left can set bit 8; XORing with 0x11B reduces the result modulo that polynomial, so every product fits back into one byte.

Key Expansion: 16 Bytes → 176 Bytes

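The 16-byte encryption key expands into 11 round keys (176 bytes total), stored as 44 four-byte words. Each new word is the XOR of the word four positions back and the previous word. For every fourth word, the previous word first goes through an extra transformation: a byte rotation, an S-Box substitution, and an XOR with a round constant: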

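def key_expansion(key):
    # The first four words are the 16-byte key itself
    words = [key[4*i:4*i+4] for i in range(4)]

    for i in range(4, 44):
        temp = list(words[i-1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]      # RotWord
            temp = [SBOX[b] for b in temp]   # SubWord
            temp[0] ^= RCON[i//4]            # Round constant (RCON[1] is 0x01)
        # New word = word four positions back XOR the (possibly transformed) previous word
        words.append(bytes(a ^ b for a, b in zip(words[i-4], temp)))

    # Group the 44 words into 11 round keys of four words each
    return [words[4*r:4*r+4] for r in range(11)]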
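
key_expansion reads from an RCON table that is not shown. Since the first lookup is RCON[1], the table presumably carries an unused entry at index 0. Under that assumption, the constants can be generated rather than hard-coded: each one is the previous one multiplied by x in GF(2⁸). build_rcon is a hypothetical helper name, and xtime is repeated so the block is self-contained:

```python
def xtime(a):
    """Multiply by x (i.e. by 2) in GF(2^8)."""
    result = a << 1
    if result & 0x100:
        result ^= 0x11B
    return result & 0xFF


def build_rcon(rounds=10):
    """Return round constants with a placeholder at index 0,
    so that rcon[1] .. rcon[rounds] line up with RCON[i // 4]."""
    rcon = [0x00]            # index 0 is never read by key_expansion
    value = 0x01
    for _ in range(rounds):
        rcon.append(value)
        value = xtime(value)  # next constant is the previous one times x
    return rcon


RCON = build_rcon()
```

The first eight constants are plain powers of two. The ninth is 0x1B rather than 0x100, because the doubling wraps around through the field polynomial.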

Beyond the Cipher: Building a Vault

AES gives you a way to encrypt 16 bytes. A secrets manager needs a lot more:

Envelope Encryption

Each secret gets its own randomly generated Data Encryption Key (DEK). The DEK encrypts the data. The master key (KEK) encrypts the DEK. This way, rotating the master key only means re-encrypting the DEKs, not every secret.

Master Key (KEK)
    │
    ▼
┌─────────┐
│ Wrap DEK │──→ Encrypted DEK (stored alongside ciphertext)
└─────────┘
    │
Random DEK
    │
    ▼
┌───────────────────┐
│ AES-CBC(DEK, data)│──→ Ciphertext + IV + HMAC
└───────────────────┘
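
The flow in the diagram can be sketched in a few functions. To keep the example short and standard-library only, it swaps in a toy stream cipher built from HMAC-SHA256 where VaultLite uses AES-CBC with an HMAC tag. toy_encrypt and toy_decrypt are stand-ins with no integrity check and are not meant for real use, and the record layout and function names are assumptions made for illustration:

```python
import hashlib
import hmac
import secrets


def _keystream(key, nonce, length):
    """Toy keystream: HMAC-SHA256 over nonce + counter (placeholder for AES-CBC)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key, data):
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def toy_decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def seal_secret(kek, plaintext):
    """Encrypt data under a fresh DEK, then wrap the DEK under the KEK."""
    dek = secrets.token_bytes(16)
    return {"wrapped_dek": toy_encrypt(kek, dek),
            "ciphertext": toy_encrypt(dek, plaintext)}


def open_secret(kek, record):
    dek = toy_decrypt(kek, record["wrapped_dek"])
    return toy_decrypt(dek, record["ciphertext"])


def rotate_kek(old_kek, new_kek, record):
    """Re-wrap only the DEK; the data ciphertext is left untouched."""
    dek = toy_decrypt(old_kek, record["wrapped_dek"])
    return {"wrapped_dek": toy_encrypt(new_kek, dek),
            "ciphertext": record["ciphertext"]}
```

The point to notice is rotate_kek: it never decrypts the secret itself, so rotation cost depends on the number of DEKs rather than the amount of stored data.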

Seal/Unseal Ceremony

The master key never touches disk. On startup, the vault is sealed — it can't decrypt anything. To unseal, you provide key shares that were generated at initialization via XOR-based secret splitting.

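XOR splitting is all-or-nothing: every share is required, so no single person (or compromised server) can reconstruct the master key alone, and any incomplete set of shares reveals nothing about it.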
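
A minimal sketch of XOR-based splitting, assuming the master key is a bytes object. The names split_secret and combine_shares are illustrative and not taken from VaultLite. All shares but the last are random; the last is the key XORed with all of them, so XORing every share together cancels the randomness and leaves the key:

```python
import secrets


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def split_secret(secret, shares):
    """Split `secret` into `shares` pieces; all of them are needed to recover it."""
    if shares < 2:
        raise ValueError("need at least two shares")
    parts = [secrets.token_bytes(len(secret)) for _ in range(shares - 1)]
    last = secret
    for part in parts:
        last = _xor(last, part)   # fold each random share into the final one
    return parts + [last]


def combine_shares(parts):
    """XOR all shares together to recover the original secret."""
    result = bytes(len(parts[0]))  # all-zero starting value
    for part in parts:
        result = _xor(result, part)
    return result


master_key = secrets.token_bytes(16)
unseal_keys = split_secret(master_key, 3)
recovered = combine_shares(unseal_keys)
```

This scheme has no threshold below the share count, which matches the shares=3, threshold=3 call in the library example later in the post. A k-of-n scheme would need something like Shamir's secret sharing instead.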

Hash-Chained Audit Log

Every vault operation gets logged in an append-only audit trail. Each entry contains the SHA-256 hash of the previous entry — forming a chain like a simplified blockchain. Tamper with any entry and the chain verification fails.

entries = vault.audit_log(token)
# [write] secret/database/prod  → allow
# [read]  secret/api/stripe     → deny (no capability)

integrity = vault.verify_audit_chain(token)
# {"valid": True, "total_entries": 47}
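
The chaining idea fits in a few lines of hashlib. This is a sketch under assumptions: the entry fields, the all-zero genesis value, and the function names are invented for illustration and are not VaultLite's actual log format. Each entry stores the previous entry's hash plus its own hash, computed over that previous hash and its contents:

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _digest(prev_hash, operation, path, decision):
    payload = json.dumps([prev_hash, operation, path, decision]).encode()
    return hashlib.sha256(payload).hexdigest()


def append_entry(log, operation, path, decision):
    """Append an entry that commits to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "operation": operation,
        "path": path,
        "decision": decision,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, operation, path, decision),
    })


def verify_chain(log):
    """Recompute every hash and check each link to the previous entry."""
    expected_prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != expected_prev:
            return False
        recomputed = _digest(entry["prev_hash"], entry["operation"],
                             entry["path"], entry["decision"])
        if entry["hash"] != recomputed:
            return False
        expected_prev = entry["hash"]
    return True


log = []
append_entry(log, "write", "secret/database/prod", "allow")
append_entry(log, "read", "secret/api/stripe", "deny")
chain_ok = verify_chain(log)
```

Editing any field of an earlier entry changes its recomputed hash, which breaks both that entry's own check and the link stored in the next one. An unkeyed chain like this detects accidental mutation and naive edits; an attacker who can rewrite the whole log could recompute every hash, so the latest hash would need to be anchored somewhere they cannot reach.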

The Result

VaultLite ended up at ~6,300 lines of Python with 258 passing tests:

  • Crypto layer: AES-128, CBC mode, PKCS7 padding, HMAC-SHA256, PBKDF2, envelope encryption
  • Auth: Token trees with cascading revocation, app-role for machines
  • Access control: Path-based policies with glob matching
  • Versioning: Full version history, soft-delete/undelete, permanent destroy
  • Leasing: TTL-based leases with renewal and cleanup
  • API: 25+ RESTful endpoints using only http.server
  • Storage: In-memory, file, and SQLite backends

Zero dependencies. Just Python 3.10+.

Try It

git clone https://github.com/hajirufai/vaultlite.git
cd vaultlite
python -m vaultlite demo

Or use it as a library:

from vaultlite.vault import Vault

vault = Vault()
result = vault.initialize(shares=3, threshold=3)
for share in result.unseal_keys:
    vault.unseal(share)

vault.write("secret/db", {"password": "s3cur3"}, result.root_token)
secret = vault.read("secret/db", result.root_token)

The full source is at github.com/hajirufai/vaultlite. The landing page is at hajirufai.github.io/vaultlite.

What I Learned

  1. AES is elegant once you see the math. The S-Box, MixColumns, and key expansion all work together to achieve diffusion (small input changes cascade) and confusion (no simple relationship between key and ciphertext).

  2. Envelope encryption is underrated. Separating the data key from the master key makes rotation and key management dramatically simpler.

  3. Audit chains catch more than you'd think. Even in testing, the hash chain caught bugs where I accidentally mutated entries.

  4. The stdlib is more capable than people think. http.server, hashlib, hmac, sqlite3 — Python ships with enough to build serious security infrastructure.

Building AES from scratch won't make your code more secure (use cryptography in production). But it will make you understand what "128-bit encryption" actually means — and that understanding makes every security decision you make afterward a little bit sharper.
