Dmytro Huz

Posted on Dec 25, 2025 • Originally published at dmytrohuz.substack.com

Building Own Block Cipher: Part 3 - AES

#programming #security #learning #development

In the previous article, we built DES from scratch — not because it’s still relevant, but because it forces you to understand how block ciphers are assembled from specifications.

DES is awkward, bit-heavy, and full of historical baggage.

That’s exactly why it’s a good learning tool.

AES continues that journey — but with very different design goals and very different failure modes.

Where DES teaches structure,

AES teaches discipline.

Before touching code, let’s be honest about one thing:

AES is not interesting because it’s new.

It’s interesting because it’s everywhere.

Right now, AES is used in:

HTTPS (TLS)
Disk encryption (BitLocker, FileVault, LUKS)
Password managers
Cloud storage
Mobile devices
VPNs
Hardware security modules

If data is encrypted in 2025, there’s a very high chance AES is involved.

And yet, despite being studied, standardized, and deployed for more than 20 years, AES is still frequently implemented incorrectly.

Not because AES is broken.

But because people misunderstand what AES actually is.

AES Is Not Broken — Implementations Are

There are no practical cryptanalytic breaks of AES.

What does exist is a long history of failures caused by:

using AES-ECB (still happening)
reusing nonces in CTR or GCM
broken key expansion
incorrect byte/row/column layout
treating AES as “encryption” instead of a primitive

So the goal of this article is not to “learn AES”.

It’s to remove the illusion that AES is magic.

Scope of This Article (Read This First)

This article is an overview of how AES works and how it is implemented, based on a real, working implementation.

👉 The full implementation with detailed comments is here:

https://github.com/DmytroHuzz/build_own_block_cipher/blob/main/aes/aes.py

What you’ll find on GitHub:

full AES-128 implementation
key expansion
block encryption
CTR mode
tests against known vectors

What you’ll find here:

the mental model
the algorithmic structure
why the code is written the way it is

Think of this article as the assembly instructions

and the GitHub file as the assembled furniture.

Reading the AES Specification Like IKEA Instructions

AES is defined in FIPS-197: https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.197.pdf.

The document is:

precise
clean
unforgiving

It is also extremely easy to think you understand while silently misreading it.

So I approached the spec the same way I approach IKEA manuals:

don’t assume understanding
follow the order literally
assemble exactly what is written
only then ask why it works

Before touching individual transformations, we need to understand the whole machine.

Step 0: The Big Picture — What AES Actually Does

Before key expansion, S-boxes, or finite fields, answer this:

What does AES look like as a complete algorithm?

AES in One Sentence

AES transforms a 4×4 byte matrix (the state) through a fixed number of rounds (10 for AES-128), mixing in a different round key each time, until the original plaintext is no longer recognizable.

That’s it.

Everything else is detail.

The Three Moving Parts of AES

AES consists of exactly three conceptual components:

1. The State

16 bytes (128 bits)
arranged as a 4×4 matrix
transformed in place

2. The Round Keys

derived from the original key
one for the initial AddRoundKey, plus one per round (11 in total for AES-128)
injected using XOR

3. The Round Function

a fixed sequence of transformations
applied repeatedly

No branching.

No randomness.

No conditions.

AES is completely deterministic.

AES as a Loop (Not a Black Box)

For AES-128, the algorithm looks like this:

state = plaintext_block_as_state
round_keys = expand_key(key)

state ^= round_keys[0]

for round = 1..9:
    SubBytes(state)
    ShiftRows(state)
    MixColumns(state)
    state ^= round_keys[round]

SubBytes(state)
ShiftRows(state)
state ^= round_keys[10]

ciphertext = state_as_bytes

That’s the entire cipher.

If you understand this loop, every AES implementation becomes readable.

Why AES Starts with AddRoundKey

AES begins with AddRoundKey, before any substitutions or mixing.

This is deliberate.

It ensures:

the key affects the cipher immediately
no “raw plaintext” enters a round
every transformation is key-dependent

This is one of those design choices that looks boring — and turns out to be essential.

Why the Final Round Is Different

The final round does not include MixColumns.

Why?

Because MixColumns exists to:

spread changes across columns
increase diffusion between rounds

After the final round, there is no next round.

Removing MixColumns:

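simplifies decryption, because the inverse cipher keeps the same structure as the forward cipher
keeps symmetry clean
costs no security: MixColumns is linear and keyless, so a final one would only amount to using a different (equivalent) last round key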

AES isn’t symmetric by accident.

It’s symmetric because it was engineered to be implemented correctly.

AES Is Not “Encryption” Yet

At this point we have:

a block algorithm
deterministic output
no randomness

That means:

AES alone does not encrypt messages.

It encrypts one 16-byte block.

This is why modes of operation exist — something we’ll come back to later, especially since the implementation includes CTR mode.

Step 1: Key Expansion (Before Anything Else)

Before AES can encrypt anything, it expands the key.

This is not optional.

This is not a helper.

This is half the cipher.

Why Key Expansion Matters

AES does not reuse the same key every round.

Instead:

the original key is expanded into round keys
each round uses a different key
each round key is derived from the previous one

This helps against:

simple patterns
slide attacks
related-key attacks (only partly: theoretical related-key attacks on AES-192 and AES-256 are known)

What Key Expansion Actually Does

High-level process:

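Split the key into 4-byte words
Every new word is the previous word XOR the word 4 positions back
In AES-128, for every 4th word (the first word of each round key), the previous word is first:
- rotated by one byte (RotWord)
- passed through the S-box byte by byte (SubWord)
- XORed with the round constant (Rcon)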

The implementation mirrors the spec line-by-line — and that’s exactly what you want here.

This is not a place to be clever.

This is a place to be correct.
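As a rough illustration, AES-128 key expansion could be sketched in Python like this. It is not the code from the repository: the names expand_key, build_sbox, and RCON are assumptions made for this sketch, and the S-box is computed instead of pasted in only to keep the example short.

```python
def xtime(a):
    """Multiply by 2 in GF(2^8), reducing modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a

def gmul(a, b):
    """Multiply two bytes in GF(2^8) with shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result

def build_sbox():
    """Derive the S-box: field inverse (x^254), then the affine transformation."""
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):
            inv = gmul(inv, x)  # ends as x^254, which is 0 for x = 0
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]

def expand_key(key):
    """Expand a 16-byte AES-128 key into 11 round keys of 16 bytes each."""
    if len(key) != 16:
        raise ValueError("this sketch only handles AES-128 (16-byte keys)")
    words = [list(key[i:i + 4]) for i in range(0, 16, 4)]
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]        # RotWord: rotate left by one byte
            temp = [SBOX[b] for b in temp]    # SubWord: S-box on each byte
            temp[0] ^= RCON[i // 4 - 1]       # Rcon: only the first byte changes
        words.append([t ^ w for t, w in zip(temp, words[i - 4])])
    return [bytes(sum(words[4 * r:4 * r + 4], [])) for r in range(11)]

round_keys = expand_key(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
print(round_keys[1].hex())
```

The key used here is the example key from FIPS-197 Appendix A.1, so the printed round key can be compared against the table in the spec.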

AddRoundKey: The Most Honest Operation in AES

Now that we have round keys, AES begins with AddRoundKey.

It’s simply:

XOR the state with the round key

Why XOR?

reversible
fast
symmetric
no information loss

AddRoundKey is how key material enters the cipher.

Everything else rearranges or mixes data —

this is where the key actually does something.
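A minimal sketch of AddRoundKey, assuming the state and the round key are both held as 16 bytes in the same order (the function name and the flat-bytes representation are choices made for this example, not necessarily what the repository does):

```python
def add_round_key(state, round_key):
    """XOR 16 state bytes with 16 round-key bytes, position by position."""
    if len(state) != 16 or len(round_key) != 16:
        raise ValueError("state and round key must both be 16 bytes")
    return bytes(s ^ k for s, k in zip(state, round_key))

state = bytes.fromhex("00112233445566778899aabbccddeeff")
round_key = bytes.fromhex("000102030405060708090a0b0c0d0e0f")

mixed = add_round_key(state, round_key)
restored = add_round_key(mixed, round_key)  # applying the same key again undoes it

print(mixed.hex())
print(restored == state)
```

Applying the same round key twice returns the original state, which is why decryption can reuse this exact function.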

The AES State (Where Most Bugs Are Born)

AES operates on 16 bytes, arranged into a 4×4 matrix.

And this is where implementations silently fail.

The state is column-major, not row-major.

[ b0   b4   b8   b12 ]
[ b1   b5   b9   b13 ]
[ b2   b6   b10  b14 ]
[ b3   b7   b11  b15 ]

The block_to_state and state_to_bytes functions in the implementation make this explicit.

If this mapping is wrong:

AES still runs
output still looks random
round-trip tests (encrypt, then decrypt) still pass; only known-answer vectors fail

This is why AES bugs are dangerous.
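One way the column-major mapping could look in Python is sketched below. The function names match the ones mentioned above, but the bodies are an assumption for illustration and may differ from the repository; the state is represented as a list of four rows.

```python
def block_to_state(block):
    """Build a 4x4 state where state[row][col] = block[row + 4 * col]."""
    if len(block) != 16:
        raise ValueError("an AES block is exactly 16 bytes")
    return [[block[row + 4 * col] for col in range(4)] for row in range(4)]

def state_to_bytes(state):
    """Read the state back out column by column."""
    return bytes(state[row][col] for col in range(4) for row in range(4))

block = bytes(range(16))            # b0, b1, ..., b15
state = block_to_state(block)
for row in state:
    print(row)
```

With the bytes 0 to 15 as input, the first row comes out as 0, 4, 8, 12, matching the matrix above. A row-major version would put 0, 1, 2, 3 there instead.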

SubBytes: The Only Non-Linear Step

SubBytes replaces each byte using a fixed lookup table (S-box).

Important fact:

This is the only non-linear operation in AES.

Everything else is linear algebra and XOR.

You can derive the S-box mathematically.

You don’t need to do that to implement AES.

Using the fixed table is correct and intentional.
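For the curious, the derivation is short enough to sketch. In FIPS-197 each S-box entry is the multiplicative inverse of the input byte in GF(2⁸) (with 0 mapped to 0), followed by a fixed affine transformation. The helper names below are made up for this example, and the inverse is computed the slow, obvious way (as x raised to the 254th power) purely for readability.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def gf_inverse(x):
    """x^254 equals x^-1 for non-zero x in GF(2^8); 0 stays 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, x)
    return result

def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def sbox_entry(x):
    """Inverse in the field, then the affine transformation with constant 0x63."""
    inv = gf_inverse(x)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63

SBOX = [sbox_entry(x) for x in range(256)]

print(hex(SBOX[0x00]), hex(SBOX[0x53]))
```

The spec's own worked example maps {53} to {ed}, which makes a handy spot check for a generated or hand-typed table.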

ShiftRows: Small Function, Big Consequences

Each row is rotated left by its row index:

row 0 → 0 bytes (unchanged)
row 1 → 1 byte
row 2 → 2 bytes
row 3 → 3 bytes

This step:

breaks column alignment
ensures diffusion across rounds

If your state layout is wrong, this is where everything collapses.
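A small sketch of ShiftRows, assuming the state is stored as a list of four rows (that representation and the function name are assumptions for this example):

```python
def shift_rows(state):
    """Rotate row r of a 4x4 state (list of rows) left by r positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]

# Column-major layout of the bytes 0..15: column 0 holds 0, 1, 2, 3.
state = [
    [0x00, 0x04, 0x08, 0x0C],
    [0x01, 0x05, 0x09, 0x0D],
    [0x02, 0x06, 0x0A, 0x0E],
    [0x03, 0x07, 0x0B, 0x0F],
]

shifted = shift_rows(state)
for row in shifted:
    print(row)
```

After the shift, the first column holds 0, 5, 10, 15: one byte from each original column. That is the "breaks column alignment" effect in concrete form.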

MixColumns: Algebra Without Fear

MixColumns treats each column independently.

It mixes bytes using fixed coefficients in GF(2⁸).

In practice:

multiplication by 2 → xtime
multiplication by 3 → xtime(x) XOR x
addition → XOR
reduction → XOR with 0x1B when the shifted byte overflows (the fixed polynomial x⁸ + x⁴ + x³ + x + 1)

The implementation strips this down to what actually matters.

No matrices.

No abstractions.

Just mechanics.
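Here is roughly what those mechanics look like for a single column. This is a sketch with assumed names (xtime is the conventional name from the spec; mix_column is made up here), not a copy of the repository code.

```python
def xtime(a):
    """Multiply a byte by 2 in GF(2^8): shift left, reduce with 0x1B on overflow."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B  # clears bit 8 and XORs in 0x1B
    return a

def mul3(a):
    """Multiply by 3: (2 * a) XOR a."""
    return xtime(a) ^ a

def mix_column(col):
    """Apply the fixed MixColumns coefficients (2, 3, 1, 1 and rotations) to one column."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ mul3(a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ mul3(a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ mul3(a3),
        mul3(a0) ^ a1 ^ a2 ^ xtime(a3),
    ]

mixed = mix_column([0xDB, 0x13, 0x53, 0x45])
print([hex(b) for b in mixed])
```

The full MixColumns step just applies mix_column to each of the four columns of the state.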

AES Rounds (Putting It Together)

AES-128:

initial AddRoundKey
9 full rounds:
- SubBytes
- ShiftRows
- MixColumns
- AddRoundKey
final round (no MixColumns)

At this point:

nothing is hidden
nothing is magical
everything is mechanical

That’s exactly what you want in cryptography.
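To show how little is left once the pieces exist, here is a compact, self-contained sketch of AES-128 block encryption. It is an illustration under assumptions, not the repository code: the state is kept as a flat list of 16 bytes in column-major order (index 4 * column + row), all names are chosen for this example, and the S-box is computed at import time to avoid pasting a 256-entry table. It is not constant-time and is not meant for production use.

```python
def xtime(a):
    """Multiply by 2 in GF(2^8), reducing modulo 0x11B."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a

def gmul(a, b):
    """Multiply two bytes in GF(2^8)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result

def build_sbox():
    """S-box entry = affine transformation of the field inverse (x^254)."""
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):
            inv = gmul(inv, x)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]

def expand_key(key):
    """Expand a 16-byte key into 11 round keys (lists of 16 bytes)."""
    words = [list(key[i:i + 4]) for i in range(0, 16, 4)]
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = [SBOX[b] for b in temp[1:] + temp[:1]]  # RotWord, then SubWord
            temp[0] ^= RCON[i // 4 - 1]
        words.append([t ^ w for t, w in zip(temp, words[i - 4])])
    return [sum(words[4 * r:4 * r + 4], []) for r in range(11)]

def add_round_key(s, k):
    return [a ^ b for a, b in zip(s, k)]

def sub_bytes(s):
    return [SBOX[b] for b in s]

def shift_rows(s):
    # New byte at (row r, column c) comes from old column (c + r) % 4.
    return [s[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]

def mix_columns(s):
    out = []
    for c in range(4):
        a0, a1, a2, a3 = s[4 * c:4 * c + 4]
        out += [
            xtime(a0) ^ xtime(a1) ^ a1 ^ a2 ^ a3,
            a0 ^ xtime(a1) ^ xtime(a2) ^ a2 ^ a3,
            a0 ^ a1 ^ xtime(a2) ^ xtime(a3) ^ a3,
            xtime(a0) ^ a0 ^ a1 ^ a2 ^ xtime(a3),
        ]
    return out

def encrypt_block(block, key):
    """Encrypt one 16-byte block with a 16-byte key (AES-128)."""
    if len(block) != 16 or len(key) != 16:
        raise ValueError("block and key must both be 16 bytes")
    round_keys = expand_key(key)
    s = add_round_key(list(block), round_keys[0])
    for r in range(1, 10):
        s = add_round_key(mix_columns(shift_rows(sub_bytes(s))), round_keys[r])
    s = add_round_key(shift_rows(sub_bytes(s)), round_keys[10])  # no MixColumns
    return bytes(s)

key = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
plaintext = bytes.fromhex("00112233445566778899aabbccddeeff")
print(encrypt_block(plaintext, key).hex())
```

The key and plaintext are the AES-128 example from FIPS-197 Appendix C.1, so the output can be checked directly against the ciphertext printed in the spec.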

CTR Mode: Turning AES into Real Encryption

Important clarification:

AES itself is not encryption.

It’s a block cipher.

To encrypt real data, we need a mode of operation.

The implementation includes CTR (Counter) mode, and that choice is deliberate.

How CTR Mode Works

CTR turns AES into a stream cipher:

Combine nonce + counter
Encrypt with AES
XOR with plaintext
Increment counter
Repeat

Key properties:

no padding
encryption == decryption
parallelizable
fast

Critical rule:

Never reuse the same key + nonce pair.

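CTR is secure only if every nonce is unique per key. Reusing one repeats the keystream, and XORing the two ciphertexts then cancels it out, leaving the XOR of the two plaintexts.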
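The CTR steps can be sketched independently of the block cipher underneath. In the example below, ctr_xor takes the block function as a parameter. Everything here is an assumption made for illustration: the 12-byte nonce plus 4-byte big-endian counter layout is one common convention and may not match the repository, and toy_block is a SHA-256 based stand-in, NOT AES, used only so the snippet runs on its own. In real use you would pass an actual AES block-encryption function.

```python
import hashlib

def ctr_xor(encrypt_block, key, nonce, data):
    """Encrypt or decrypt data in CTR mode (the operation is its own inverse)."""
    if len(nonce) != 12:
        raise ValueError("this sketch assumes a 12-byte nonce")
    out = bytearray()
    for offset in range(0, len(data), 16):
        counter = offset // 16
        # Counter block = nonce || 32-bit big-endian block counter.
        # to_bytes raises OverflowError if the counter no longer fits in 4 bytes.
        counter_block = nonce + counter.to_bytes(4, "big")
        keystream = encrypt_block(counter_block, key)
        chunk = data[offset:offset + 16]
        # zip stops at the shorter input, so the last partial block needs no padding.
        out += bytes(p ^ k for p, k in zip(chunk, keystream))
    return bytes(out)

def toy_block(block, key):
    """Stand-in for AES block encryption (hypothetical, for demonstration only)."""
    return hashlib.sha256(key + block).digest()[:16]

key = b"K" * 16
nonce = b"N" * 12
message = b"attack at dawn, regroup at noon"

ciphertext = ctr_xor(toy_block, key, nonce, message)
recovered = ctr_xor(toy_block, key, nonce, ciphertext)

# Nonce reuse: the same keystream encrypts a second message.
other = b"retreat at dusk, hold the bridge"[:len(message)]
ciphertext2 = ctr_xor(toy_block, key, nonce, other)
leak = bytes(a ^ b for a, b in zip(ciphertext, ciphertext2))

print(recovered == message)
print(leak == bytes(a ^ b for a, b in zip(message, other)))
```

The last comparison is the nonce-reuse failure in miniature: the attacker never sees the key, yet the XOR of the two ciphertexts equals the XOR of the two plaintexts.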

What AES Actually Teaches

DES taught structure.

AES teaches discipline.

Most AES failures are not cryptographic.

They are:

layout mistakes
key handling errors
mode misuse
overconfidence

Implementing AES once forces you to:

respect specifications
respect data layout
respect boring correctness

Summary

If you’ve made it this far —

you’re no longer “using AES”.

You’re implementing it.

And that changes how you read every crypto API forever.
