Encoding vs encryption vs hashing, explained simply

#cybersecurity #security #fundamentals

Encoding, encryption, and hashing get confused constantly, and the confusion causes real security bugs (like "encrypting" passwords, or trusting Base64 to hide a secret). They solve three different problems. Here is the clear version.

Encoding: for format, not secrecy

Encoding transforms data into another format so it can be stored or transmitted without being mangled, for example Base64 to carry binary data as text. It is fully reversible by anyone, with no key. Base64 is not security; it is the digital equivalent of writing in a different alphabet.

import base64
base64.b64encode(b"hello")   # b'aGVsbG8='  -> anyone can decode this

Use it for: moving data through channels that expect text (data URLs, JSON, email). Never use it to protect anything secret.
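
To make the "reversible by anyone" point concrete, here is a minimal round trip. The secret value is made up for illustration; the point is that decoding needs no key at all.

```python
import base64

# Hypothetical secret, used only for this demonstration.
secret = b"api-key-12345"

encoded = base64.b64encode(secret)    # bytes -> Base64 text (as bytes)
decoded = base64.b64decode(encoded)   # no key, no password: just decode

print(encoded)
print(decoded == secret)  # True: the original comes straight back
```

If a value can be recovered this way, treat it as plaintext.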

Encryption: for secrecy

Encryption scrambles data with a key so that only someone with the right key can read it. It is reversible, but only if you have the key. This is what protects messages, files, and traffic.

plaintext + key -> ciphertext   (and back, with the key)

There are two families: symmetric (one shared key both encrypts and decrypts; it is fast, so it handles bulk data such as files and the body of a network session) and asymmetric (a public key encrypts, a private key decrypts; it underpins the key exchange and certificates in HTTPS). The golden rule: do not invent your own; use a vetted library.

Use it for: anything that must stay secret but be recovered later.
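
The "plaintext + key -> ciphertext (and back, with the key)" shape can be sketched with a one-time pad, which XORs the message with a random key of the same length. This is a toy to show the role of the key, not something to deploy: it has no integrity protection and the key can never be reused. The helper name `xor_bytes` is an assumption for illustration. Real code should call a vetted library instead.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"meet at noon"
key = secrets.token_bytes(len(plaintext))    # random key, used once

ciphertext = xor_bytes(plaintext, key)       # plaintext + key -> ciphertext
recovered = xor_bytes(ciphertext, key)       # ciphertext + key -> plaintext

# Someone without the key can only guess; a different key yields different bytes.
wrong_key = secrets.token_bytes(len(plaintext))
garbage = xor_bytes(ciphertext, wrong_key)

print(recovered == plaintext)  # True
```

Contrast this with Base64: here the ciphertext alone is useless, and everything depends on keeping the key secret.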

Hashing: for integrity, one way

A hash function turns any input into a fixed-size fingerprint, and it is deliberately not reversible. The same input always gives the same hash; a tiny change gives a totally different one. You cannot compute the original back from the hash; an attacker can only guess inputs and compare the results.

import hashlib
hashlib.sha256(b"hello").hexdigest()   # a fixed 64-char fingerprint

Use it for: verifying a file has not changed, storing passwords (with a slow, salted hash like bcrypt or argon2, never plain SHA-256), and de-duplication.
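
A short sketch of the properties described above (fixed size, deterministic, sensitive to small changes), plus a simple tamper check. The helper names `sha256_hex` and `is_unmodified` are assumptions for illustration.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 fingerprint of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

original = b"hello"
altered = b"hellp"   # a single character changed

digest_a = sha256_hex(original)
digest_b = sha256_hex(altered)

print(len(digest_a), len(digest_b))       # 64 64: fixed size
print(digest_a == sha256_hex(original))   # True: same input, same hash
print(digest_a == digest_b)               # False: tiny change, different hash

def is_unmodified(data: bytes, expected_hex: str) -> bool:
    """Tamper check: recompute the hash and compare it with the published one."""
    return hmac.compare_digest(sha256_hex(data), expected_hex)

print(is_unmodified(original, digest_a))  # True
print(is_unmodified(altered, digest_a))   # False
```

This works for integrity only when the expected hash comes from a source the attacker cannot also change.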

The mistakes to avoid

"We encrypt passwords." No. Passwords should be hashed (slow + salted), never encrypted, because you never need to recover the original, only check a match.
"Base64 hides it." Encoding is not encryption. Base64 protects nothing.
Plain SHA-256 for passwords. Too fast: an attacker with a GPU can try billions of guesses per second, so weak passwords fall quickly. Use a purpose-built password hash.
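
Here is a minimal sketch of "hash, never encrypt" for passwords. bcrypt and argon2 need third-party packages, so this example swaps in PBKDF2-HMAC-SHA256 from Python's standard library, which is also deliberately slow and salted. The function names and the iteration count are assumptions for illustration, not a recommendation for your system.

```python
import hashlib
import hmac
import secrets

# Assumed work factor: raise it as high as your login latency budget allows.
ITERATIONS = 200_000

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). Store both; the password itself is never stored."""
    salt = secrets.token_bytes(16)   # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```

Notice that nothing here ever recovers the original password: login only checks whether a new attempt produces the same digest.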

A simple way to remember: encoding is for format, encryption is for secrecy, hashing is for integrity.

See it by building it

These ideas stick when you implement them: encode and decode Base64 by hand, build a hash-based tamper check, and store a password the right way. The cybersecurity track walks through encoding, hashing, classical and modern crypto, and password cracking, all built and run in your browser. The first project is free.

Top comments (2)

Mdm • Jun 8

Clear, no-nonsense breakdown. In my pentesting work, I still trip over teams that Base64-encode secrets in CI/CD variables or config files and call it "encrypted." It's not. I've also seen plain SHA-256 used for API tokens inside pipeline scripts — one rainbow table away from a full compromise. The "encrypt passwords" mistake is alive and well, even in production.

What I'd add for anyone working in DevSecOps: treat every encoded string as plaintext, and always ask "does this secret need to be reversible?" If yes, use a proper KMS. If no, hash it with bcrypt/argon2. A simple grep for base64.b64encode in your codebase can surface low-hanging fruit.

Curious — do you have a favorite technique or one-liner that helps devs internalize the difference during code review? That's often where the real education happens.

I Want To Learn Programming • Jun 9

Thanks, and yes, the Base64-as-encryption thing genuinely never dies.

My favorite code-review rule of thumb is the reversibility question, asked out loud for every transformed value:

Can I get the original back with no secret? → it's encoding (Base64, hex, URL-encode). It protects nothing.
Can I get it back only with a key? → encryption. Fine for data you must read later. KMS, never a hardcoded key.
Can I never get it back? → hashing. The only correct answer for passwords: bcrypt/argon2, salted.

The thing that actually makes it stick: make the dev reverse it in the PR. Running echo | base64 -d in front of the team is a one-second demo. Once someone watches their "encrypted" secret print in cleartext, they never call it encryption again.

Love the grep idea. I'd pair it with flagging any hashlib.sha256( / md5( that sits near the words password or token in review. And +1 on "does this need to be reversible?" as the very first question: it decides hash-vs-encrypt before a single line of code is written.