
Mike Knights

Originally published at datatoolkit.net

MD5 is broken - here is what to use instead

MD5 is everywhere. It shows up in legacy codebases and old tutorials, and developers still reach for it without stopping to check whether it is appropriate. In most cases it is not.

Here is a clear breakdown of what hash functions are, why MD5 is broken, and what you should use instead.

What is a hash function?

A hash function takes any input - a word, a file, an entire database dump - and produces a fixed-length string called a hash or digest. The same input always produces the same hash. Change even one character and the output changes completely.

SHA-256("hello")  = 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
SHA-256("hello!") = ce06092fb948d9ffac7d1a376e404b26b7575bcc11ee05a4615fef4fec3a308b

This "tiny change in input, wildly different output" property is called the avalanche effect, and it is fundamental to why hash functions are useful.
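To see the avalanche effect for yourself, here is a minimal sketch using Python's standard hashlib module. The helper names sha256_hex and differing_bits are made up for this example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions where two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


a = sha256_hex("hello")
b = sha256_hex("hello!")
print(a)
print(b)
print(differing_bits(a, b), "of 256 bits differ")
```

For a well-behaved hash you would expect roughly half of the output bits to flip when the input changes, even by a single character.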

Cryptographic hash functions are designed to be one-way: given only a hash, recovering the original input is computationally infeasible unless the input is easy to guess. This makes them useful for verifying data without storing the data itself.

The algorithms compared

Algorithm | Output                   | Status | Use today?
----------|--------------------------|--------|-------------------------------------
MD5       | 128-bit / 32 hex chars   | Broken | Checksums only - never security
SHA-1     | 160-bit / 40 hex chars   | Broken | Legacy only - avoid
SHA-256   | 256-bit / 64 hex chars   | Secure | Yes - general purpose
SHA-512   | 512-bit / 128 hex chars  | Secure | Yes - where extra strength is needed
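If you want to check the output sizes in the table yourself, a short sketch with Python's hashlib does it. The function name digest_sizes is just for illustration, and on some locked-down (FIPS) Python builds the MD5 line may be refused.

```python
import hashlib


def digest_sizes(names=("md5", "sha1", "sha256", "sha512")):
    """Map each algorithm name to (output bits, hex characters)."""
    sizes = {}
    for name in names:
        h = hashlib.new(name, b"hello")
        # digest_size is in bytes; each byte becomes two hex characters.
        sizes[name] = (h.digest_size * 8, len(h.hexdigest()))
    return sizes


for name, (bits, hex_chars) in digest_sizes().items():
    print(f"{name:7} {bits:4} bits  {hex_chars:4} hex chars")
```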

Why is MD5 broken?

A hash function is considered broken when researchers can produce a collision - two different inputs that produce the same hash output - far faster than brute force would allow.

MD5 collisions can be generated in seconds on consumer hardware. Researchers have demonstrated collision attacks that produce two entirely different files with the same MD5 hash. SHA-1 had known theoretical weaknesses for years, and in 2017 the SHAttered attack produced the first public SHA-1 collision.

SHA-256 has no known practical collisions. With a 256-bit output, brute-forcing a collision takes on the order of 2^128 hash evaluations, which is far beyond any computing capacity available today.
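To put the brute-force cost in perspective, here is a rough back-of-the-envelope sketch. It assumes a generic birthday attack, which needs about 2^(n/2) hash evaluations for an n-bit output, and a hypothetical rate of 10^18 hashes per second that is picked purely for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate figure, for comparison only


def birthday_attack_years(output_bits: int, hashes_per_second: float) -> float:
    """Years needed for roughly 2^(n/2) hash evaluations at the given rate."""
    evaluations = 2 ** (output_bits // 2)
    return evaluations / hashes_per_second / SECONDS_PER_YEAR


HYPOTHETICAL_RATE = 1e18  # assumed hashes per second, not a measured number

for name, bits in (("MD5", 128), ("SHA-256", 256)):
    years = birthday_attack_years(bits, HYPOTHETICAL_RATE)
    print(f"{name}: about {years:.3g} years of generic brute force")
```

Keep in mind that this is only the generic bound. The real MD5 attacks exploit structural weaknesses in the algorithm and are far cheaper than brute force, which is exactly why MD5 counts as broken.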

What is MD5 still acceptable for?

MD5 is still fine when:

  • You need a fast, non-cryptographic checksum to detect accidental corruption (not adversarial tampering)
  • You are checking whether a cached resource has changed
  • The context has no security implications whatsoever
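As a sketch of the cache case above, this is roughly what a non-security MD5 fingerprint looks like in Python. The function names are hypothetical, and the usedforsecurity flag needs Python 3.9 or later.

```python
import hashlib


def content_fingerprint(data: bytes) -> str:
    """MD5 fingerprint for change detection only, never for security."""
    # usedforsecurity=False states the intent explicitly and lets the call
    # work on builds that restrict MD5 for security purposes.
    return hashlib.md5(data, usedforsecurity=False).hexdigest()


def has_changed(cached_fingerprint: str, current_data: bytes) -> bool:
    """Return True when the data no longer matches the cached fingerprint."""
    return content_fingerprint(current_data) != cached_fingerprint


cached = content_fingerprint(b"body { color: black; }")
print(has_changed(cached, b"body { color: black; }"))  # False
print(has_changed(cached, b"body { color: navy; }"))   # True
```

This only protects against accidental changes. Anyone who can craft the content deliberately can also craft an MD5 collision.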

MD5 is not acceptable for:

  • Password hashing (use bcrypt or Argon2 - not any raw hash function)
  • File integrity verification where tampering is a concern
  • Digital signatures
  • Any security-sensitive context

What should you use?

For file integrity / checksums: SHA-256. It is fast enough for almost all use cases and is the standard for software distribution (you see it on download pages as the verification hash).
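Here is a minimal sketch of checking a download against a published SHA-256 hash in Python. The function names and the chunk size are choices made for this example, not part of any standard tool.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the file's SHA-256 with the hash from the download page."""
    actual = sha256_of_file(path)
    # Normalise case and whitespace, then compare in constant time.
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

The check is only as trustworthy as the place you got the published hash from, so fetch it over a channel you trust.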

For password storage: Never use a raw hash function. Use bcrypt, Argon2id, or scrypt. These are purpose-built password hashing algorithms that are deliberately slow and include salting. See How Passwords Are Hashed for the full explanation.
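As a rough sketch of what purpose-built password hashing looks like, here is scrypt from Python's standard library (bcrypt and Argon2id need third-party packages). The cost parameters below are assumptions for illustration and should be tuned for your own hardware, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

# Assumed cost settings for illustration only; tune them for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "maxmem": 64 * 1024 * 1024, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key); store both alongside the user record."""
    salt = secrets.token_bytes(16)  # fresh random salt per password
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, expected_key)
```

Because every password gets its own random salt, two users with the same password end up with different stored values.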

For message authentication / API signing: HMAC with SHA-256 (HMAC-SHA256).
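A minimal HMAC-SHA256 sketch with Python's standard hmac module looks like this. The function names, key, and message format are hypothetical; real APIs each define exactly which bytes get signed.

```python
import hashlib
import hmac


def sign(secret_key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 of the message as 64 hex characters."""
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify(secret_key: bytes, message: bytes, signature_hex: str) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(secret_key, message), signature_hex)


key = b"example-shared-secret"          # hypothetical key for the demo
body = b'{"amount": 100, "to": "bob"}'  # hypothetical request body
signature = sign(key, body)
print(verify(key, body, signature))                             # True
print(verify(key, b'{"amount": 900, "to": "bob"}', signature))  # False
```

Using hmac.compare_digest instead of == avoids leaking how many leading characters of a guessed signature were correct.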

For digital signatures: SHA-256 (part of RSA-SHA256 and ECDSA-SHA256).

Generate hashes online

You can generate MD5, SHA-1, SHA-256, and SHA-512 hashes instantly in your browser - nothing is sent to any server - at:

For a broader explanation of how hash functions work, see What Is a Hash Function?

TL;DR

  • MD5 and SHA-1 are cryptographically broken - collisions can be generated deliberately
  • Use SHA-256 for checksums, file integrity, HMAC, and signatures
  • Never use any raw hash for passwords - use bcrypt or Argon2id
  • SHA-256 has no known practical collisions and is the current standard
