Hashing Is Not Encryption: A Practical Guide to SHA-256, MD5, and When to Use Each

A hash function takes input of any size and produces output of a fixed size. The same input always produces the same output. You cannot reverse the output back to the input. That is the entire concept, and getting those three properties straight will save you from a surprising number of mistakes.

I have seen developers confuse hashing with encryption, use MD5 for passwords in 2024, and store unsalted SHA-256 hashes thinking they were secure. Each of these is a different category of wrong. Let me break down what hashing actually does and when each algorithm is appropriate.

How hash functions work

A cryptographic hash function has four key properties. Deterministic: the same input always gives the same output. Fast to compute: you can hash gigabytes of data in seconds. Irreversible: given a hash output, you cannot compute the input (this is called preimage resistance). Collision resistant: it is computationally infeasible to find two different inputs that produce the same output.

The output size is fixed regardless of input. SHA-256 always produces 256 bits (64 hex characters). MD5 always produces 128 bits (32 hex characters). Whether you hash a single character or an entire filesystem, the output length is the same.
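As a quick check of the fixed-size claim, here is a minimal sketch using Node's built-in crypto module. The helper name hexDigest is mine, not part of any library.

```javascript
const { createHash } = require("crypto");

// Hex digest of a string under the named algorithm.
function hexDigest(algorithm, input) {
  return createHash(algorithm).update(input).digest("hex");
}

const short = "a";
const long = "a".repeat(1_000_000); // one million characters

console.log(hexDigest("sha256", short).length); // 64
console.log(hexDigest("sha256", long).length);  // 64
console.log(hexDigest("md5", short).length);    // 32
console.log(hexDigest("md5", long).length);     // 32
```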

MD5("hello")    = 5d41402abc4b2a76b9719d911017c592
SHA-256("hello") = 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

Change one character and the output is completely different:

MD5("Hello")    = 8b1a9953c4611296a827abf8c47804d7
SHA-256("Hello") = 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969

This is called the avalanche effect. A single bit change in the input flips roughly half the bits in the output. There is no way to tell from the outputs that the inputs differed by only one bit.
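You can measure the avalanche effect yourself. The sketch below counts how many bits differ between the two SHA-256 digests above. countDifferingBits is a helper I wrote for illustration, and the exact count you see will be somewhere around 128, not exactly 128.

```javascript
const { createHash } = require("crypto");

// Count the bits that differ between two equal-length buffers.
function countDifferingBits(a, b) {
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    let xor = a[i] ^ b[i]; // set bits mark positions where the bytes differ
    while (xor) {
      count += xor & 1;
      xor >>= 1;
    }
  }
  return count;
}

const lower = createHash("sha256").update("hello").digest();
const upper = createHash("sha256").update("Hello").digest();

// The inputs differ by a single bit: "h" is 0x68 and "H" is 0x48.
console.log(countDifferingBits(Buffer.from("hello"), Buffer.from("Hello"))); // 1

// The digests differ in roughly half of their 256 bits.
console.log(countDifferingBits(lower, upper));
```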

MD5: when it is fine and when it is dangerous

MD5 was designed in 1991 and first broken for collisions in 2004. Researchers can now generate MD5 collisions in seconds on a laptop. This means an attacker can deliberately craft two different files with the same MD5 hash, which is devastating for security applications.

But MD5 is still perfectly fine for non-security uses. Checking whether a file download is corrupted. Generating cache keys. Deduplicating files in a storage system. Creating deterministic identifiers from strings. If an attacker deliberately crafting a collision is not in your threat model, MD5 is fast and widely supported.
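Here is a rough sketch of the deduplication case, assuming the file contents are trusted (nobody is crafting collisions). The names contentKey and deduplicate and the sample files are made up for the example.

```javascript
const { createHash } = require("crypto");

// Content-derived key: identical bytes always map to the same key.
function contentKey(data) {
  return createHash("md5").update(data).digest("hex");
}

// Keep the first file seen for each distinct content.
function deduplicate(files) {
  const seen = new Map();
  for (const file of files) {
    const key = contentKey(file.content);
    if (!seen.has(key)) seen.set(key, file);
  }
  return [...seen.values()];
}

const files = [
  { name: "a.txt", content: "hello" },
  { name: "copy-of-a.txt", content: "hello" },
  { name: "b.txt", content: "Hello" },
];

console.log(deduplicate(files).map((f) => f.name)); // [ 'a.txt', 'b.txt' ]
```

If the files could come from an untrusted source, swapping "md5" for "sha256" in contentKey is a one-word change.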

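Where MD5 is dangerous: digital signatures, certificate verification, or any context where someone might intentionally create a collision. It is also a bad choice for password hashing, though for a different reason: like any fast hash, it is cheap to brute-force.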

SHA-256: the default choice

SHA-256 is part of the SHA-2 family and currently has no known practical attacks against it. It is the hash function used in Bitcoin's proof-of-work, TLS certificates, and most modern security applications.

// Browser
async function sha256(message) {
  const data = new TextEncoder().encode(message);
  const hash = await crypto.subtle.digest("SHA-256", data);
  return Array.from(new Uint8Array(hash))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

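// Node.js
const crypto = require("crypto");
const digest = crypto.createHash("sha256").update("hello").digest("hex");
console.log(digest); // 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824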

If you are choosing a hash function and have no specific constraints, use SHA-256. It is fast enough for most applications, secure enough for general-purpose hashing, and supported everywhere. The one big exception is password storage.

Why you should never hash passwords with SHA-256

This sounds contradictory after what I just said, but hear me out. SHA-256 is designed to be fast. A modern GPU can compute billions of SHA-256 hashes per second. If an attacker steals your database of SHA-256 password hashes, they can brute-force common passwords in minutes.

Password hashing requires a deliberately slow function. bcrypt, scrypt, and Argon2 are designed for this. They include a "work factor" parameter that controls how long each hash takes to compute. scrypt and Argon2 also require a configurable amount of memory, making GPU-based attacks expensive.

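// Node.js with bcrypt (await needs an async function in CommonJS)
const bcrypt = require("bcrypt");
async function demo() {
  const hash = await bcrypt.hash("user_password", 12); // cost factor 12 = 2^12 iterations
  const match = await bcrypt.compare("user_password", hash); // true
  return match;
}
demo().then(console.log); // true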

The salt is built into bcrypt's output. You do not need to manage it separately. Each hash is unique even for identical passwords.
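If you would rather not add a dependency, scrypt ships with Node's crypto module. The sketch below shows what a work factor looks like in practice. slowHash is a hypothetical helper, the salt is fixed only so the timings are comparable, and the cost values are illustrative, not a recommendation. The timings will vary by machine.

```javascript
const { scryptSync } = require("crypto");

// Derive a 64-byte hash with scrypt. `cost` is scrypt's N parameter
// (a power of two): doubling it roughly doubles the time and memory per guess.
function slowHash(password, salt, cost = 2 ** 14) {
  const blockSize = 8;
  return scryptSync(password, salt, 64, {
    N: cost,
    r: blockSize,
    p: 1,
    maxmem: 256 * cost * blockSize, // headroom above the roughly 128 * N * r bytes scrypt needs
  });
}

// Assumption for the demo: a fixed salt. Real code uses a random salt per user.
const salt = Buffer.from("fixed-demo-salt");

for (const cost of [2 ** 12, 2 ** 13, 2 ** 14]) {
  const start = process.hrtime.bigint();
  slowHash("user_password", salt, cost);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`N=${cost}: ${ms.toFixed(1)} ms`);
}
```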

Salting: why identical passwords need different hashes

Without a salt, every user with the password "password123" gets the same hash. An attacker with a precomputed table (such as a rainbow table) can look up common password hashes instantly.

A salt is random data added to the input before hashing. Each user gets a unique salt, so identical passwords produce different hashes. The salt is stored alongside the hash in plain text -- it does not need to be secret. Its purpose is to make precomputed tables useless, not to add secrecy.
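To make that concrete, here is a minimal sketch of salting done by hand with Node's built-in scrypt. The function names and the "salt:hash" storage format are my own choices for the example, and scrypt is left at its default parameters.

```javascript
const { randomBytes, scryptSync, timingSafeEqual } = require("crypto");

// Hash a password with a fresh random salt and store both together.
function hashPassword(password) {
  const salt = randomBytes(16);
  const hash = scryptSync(password, salt, 32);
  return `${salt.toString("hex")}:${hash.toString("hex")}`;
}

// Recompute the hash with the stored salt and compare in constant time.
function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(":");
  const expected = Buffer.from(hashHex, "hex");
  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), expected.length);
  return timingSafeEqual(actual, expected);
}

const alice = hashPassword("password123");
const bob = hashPassword("password123");

console.log(alice === bob);                           // false: same password, different salts
console.log(verifyPassword("password123", alice));    // true
console.log(verifyPassword("wrong-password", alice)); // false
```

The salt sits in plain text next to the hash, and that is fine: an attacker who steals the table still has to attack each user's hash separately.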

Common mistakes

Comparing secret hashes with ==. String comparison in most languages short-circuits on the first differing character, so the time it takes leaks how many leading characters matched. When one side of the comparison is a secret (a MAC, a token, a stored hash), use a constant-time comparison function like Node's crypto.timingSafeEqual() or Python's hmac.compare_digest().
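A small sketch of a constant-time comparison in Node. safeEqualHex is a hypothetical wrapper of mine. Note that crypto.timingSafeEqual() throws if the two buffers have different lengths, so the wrapper checks that first (the length of a digest is not a secret).

```javascript
const { createHash, timingSafeEqual } = require("crypto");

// Compare two hex digests without short-circuiting on the first mismatch.
function safeEqualHex(a, b) {
  const bufA = Buffer.from(a, "hex");
  const bufB = Buffer.from(b, "hex");
  // timingSafeEqual throws on unequal lengths, so reject those up front.
  if (bufA.length !== bufB.length) return false;
  return timingSafeEqual(bufA, bufB);
}

const expected = createHash("sha256").update("hello").digest("hex");

console.log(safeEqualHex(expected, "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824")); // true
console.log(safeEqualHex(expected, "185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969")); // false
console.log(safeEqualHex(expected, "abcd")); // false (different length)
```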

Using hash functions for message authentication. A hash proves that data has not been accidentally corrupted. It does not prove who sent it, because anyone who changes the data can simply recompute the hash. For authentication, use HMAC (Hash-based Message Authentication Code), which combines a hash function with a secret key.
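Here is roughly what HMAC looks like with Node's crypto.createHmac(). The sign and verify helpers, the key, and the messages are invented for the example. In real code the key would come from a secret store, not a string literal.

```javascript
const { createHmac, timingSafeEqual } = require("crypto");

// Compute an HMAC-SHA256 tag over the message with the shared key.
function sign(message, key) {
  return createHmac("sha256", key).update(message).digest();
}

// Recompute the tag and compare in constant time.
function verify(message, tag, key) {
  const expected = sign(message, key);
  return tag.length === expected.length && timingSafeEqual(tag, expected);
}

const key = Buffer.from("shared-secret-for-illustration-only");
const tag = sign("transfer 100 to alice", key);

console.log(verify("transfer 100 to alice", tag, key)); // true
console.log(verify("transfer 900 to alice", tag, key)); // false: message was changed
console.log(verify("transfer 100 to alice", tag, Buffer.from("wrong-key"))); // false
```

Without the key, someone who tampers with the message cannot produce a matching tag, which is exactly what a bare hash fails to guarantee.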

Truncating hashes to save space. If you take only the first 16 hex characters (64 bits) of a SHA-256 hash, you have reduced your collision resistance from 2^128 to 2^32 operations, because finding a collision takes about the square root of the number of possible outputs. That is the difference between "impossible to brute force" and "trivial to brute force."
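You can watch the square-root effect on a scaled-down version. The sketch below looks for two inputs whose SHA-256 digests share a short prefix. findTruncatedCollision and the "message-N" inputs are made up for the demo, and I have kept the prefixes at 4 and 8 hex characters so it finishes quickly. The same idea at 16 characters needs on the order of 2^32 attempts.

```javascript
const { createHash } = require("crypto");

// Search for two different inputs whose SHA-256 digests share the same
// first `hexChars` hex characters.
function findTruncatedCollision(hexChars) {
  const seen = new Map(); // truncated digest -> input that produced it
  for (let i = 0; ; i++) {
    const input = `message-${i}`;
    const prefix = createHash("sha256").update(input).digest("hex").slice(0, hexChars);
    if (seen.has(prefix)) {
      return { first: seen.get(prefix), second: input, prefix, attempts: i + 1 };
    }
    seen.set(prefix, input);
  }
}

// 4 hex chars = 16 bits: expect a hit after a few hundred attempts.
console.log(findTruncatedCollision(4));
// 8 hex chars = 32 bits: expect a hit after something on the order of 2^16 attempts.
console.log(findTruncatedCollision(8));
```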

For quick hash generation during development -- verifying checksums, generating test data, comparing algorithms side by side -- I use a hash generator at zovo.one/free-tools/hash-generator that supports MD5, SHA-1, SHA-256, and SHA-512 with instant output.

Hashing is one of those fundamentals that touches everything: authentication, data integrity, caching, deduplication, digital signatures. Get the basics right and the rest follows.

I'm Michael Lip. I build free developer tools at zovo.one. 350+ tools, all private, all free.