DEV Community

Saurabh Sharma

Explain Hashing + salting Like I'm Five

Why are passwords stored this way, and how is it more reliable than simpler encryption?

You can easily guess that I have used them while making backend projects, but I still don't understand them.

Latest comments (19)

Jack Forbes

The goal when creating or changing your password is to make it as unique as possible so that it cannot be easily guessed and hence compromised. Salts pursue the same goal on the site's side: they make the stored hash of your password unique to the site you're visiting, which gives the user password an extra degree of security so that your information isn't readily compromised.

Refer to this link: loginradius.com/blog/start-with-id...

Yasser A • Edited

If you are already using hashing + salt and still don't know why, I would suggest reading this article to get a better understanding of password security. It also explains why a salt alone is sometimes not enough unless you are using a strong hashing function such as bcrypt, scrypt or Argon2:
patrickmn.com/security/storing-pas...

Isaac Lee • Edited

Both terms are inspired by cooking.

A hash, like a hash brown, means something that has been chopped (into bits, heh) and mixed, whereas salt is an ingredient that we add beforehand to make the hash taste better—to make it unique.

Source: my blog :D

Slavius • Edited

There is a general misconception about what a password hashing function should be.

It must be the slowest function that is still reasonable in your context. That's why you pick an algorithm that runs the hashing function thousands of times in a row, consuming each output as its next input.

Pick a fast hashing function (one that is simple or can be accelerated by a CPU instruction set, a GPU, or even an FPGA or ASIC) and your passwords are hardly safer than plaintext. It's just a matter of time.

You aim for security. If a user's login takes 1 second to complete because a computationally intensive hash has to be calculated, that's fine: logins happen only occasionally, and you know that nobody will be able to precompute rainbow tables for your hash algorithm at something like 1 kH/s within the next 10 years.
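
To make the "thousands of times in a row" idea concrete, here is a minimal Python sketch. The function name `stretched_hash`, the choice of SHA-256, the sample password and the round counts are all assumptions for illustration; in real code you would rely on a vetted function such as bcrypt, scrypt or Argon2 rather than a hand-rolled loop.

```python
import hashlib
import time


def stretched_hash(password: bytes, salt: bytes, rounds: int) -> bytes:
    """Hash salt+password once, then feed each digest back in as the next input."""
    digest = hashlib.sha256(salt + password).digest()
    for _ in range(rounds - 1):
        digest = hashlib.sha256(digest).digest()
    return digest


def time_hash(rounds: int) -> float:
    """Return how many seconds one stretched hash takes for the given round count."""
    start = time.perf_counter()
    stretched_hash(b"hunter2", b"example-salt", rounds)
    return time.perf_counter() - start


if __name__ == "__main__":
    # More rounds means more work per guess for an attacker, and per login for you.
    for rounds in (1, 1_000, 100_000):
        print(f"{rounds:>7} rounds: {time_hash(rounds):.4f} s")
```

The cost grows roughly linearly with the number of rounds, which is the knob you turn until a login is "slow enough" on your hardware.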

Edit:
CRC, MD5, SHA: these are all hashing functions that aim for speed, computing a unique hash for a chunk of data. They are often used as integrity hashes. You receive some data (a file, a network packet, etc.) that has a hash included. You can easily and quickly calculate the hash yourself with these functions and compare it to the included hash to verify that the data was not tampered with or corrupted in transit.
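
The integrity check described above might look like this in Python. This is only a sketch: the helper names `checksum` and `is_intact` and the sample payload are made up, and SHA-256 stands in for whichever fast hash the sender actually used.

```python
import hashlib
import hmac


def checksum(data: bytes) -> str:
    """Compute a fast integrity hash (SHA-256 here) as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_hex: str) -> bool:
    """Recompute the hash of the received data and compare it to the one shipped with it."""
    return hmac.compare_digest(checksum(data), expected_hex)


if __name__ == "__main__":
    payload = b"contents of some downloaded file"
    published = checksum(payload)  # what the sender would publish next to the file
    print(is_intact(payload, published))           # unchanged data
    print(is_intact(payload + b"\x00", published))  # one extra byte changes the hash
```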

Adrian B.G.

Have you read the OWASP password pages? Even with minimal tech knowledge they are pretty easy to understand.

Aswath KNM

Refer to this link!!

Harry Gill

Agree, most implementations use bcrypt or scrypt these days

K

Doesn't your second point negate the first?

Andrew Bone • Edited

Here's an example: you're the owner of the data and I'm the nasty hacker.

Someone signs up for your site and uses the password 'password'.
When you save it, you MD5 hash it and get '5f4dcc3b5aa765d61d8327deb882cf99'.
You now have no idea what the password is; a hash can't be reversed.

Let's say I get a dump of all your users and their hashed passwords.
I make a script to test every common password, and this includes 'password'.
Anyone can compute an MD5 hash directly, so I have '5f4dcc3b5aa765d61d8327deb882cf99'.

But you're smarter than I gave you credit for.
When the user created their account you took:

  • the date
  • their forename
  • your website name

and appended them to the password before you hashed it.

The hash has now been salted.
'password_20180816_andrew_dev.to' is the string that now gets hashed.
'9db61ea3e3b86adb63b507cb2a1b2951' is the output.

As I'm scanning through your files looking for '5f4dcc3b5aa765d61d8327deb882cf99' I go right past '9db61ea3e3b86adb63b507cb2a1b2951' and have no idea that's what I was looking for.

Of course, you need to remember the salt in order to convert their password into the hash for checking later.
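
The scenario above can be replayed in a few lines of Python. This is a rough sketch: the word list, the helper names `md5_hex` and `crack`, and the way the salt pieces are joined are assumptions for illustration. MD5 is used only because the example uses it.

```python
import hashlib
from typing import Optional

# A tiny stand-in for the attacker's list of common passwords (hypothetical).
COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein"]


def md5_hex(text: str) -> str:
    """Return the MD5 hash of a string as hex."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# The attacker precomputes hash -> password once for the whole list.
LOOKUP = {md5_hex(candidate): candidate for candidate in COMMON_PASSWORDS}


def crack(stolen_hash: str) -> Optional[str]:
    """Look the stolen hash up in the precomputed table."""
    return LOOKUP.get(stolen_hash)


unsalted = md5_hex("password")
salted = md5_hex("password" + "_20180816_andrew_dev.to")

if __name__ == "__main__":
    print(crack(unsalted))  # found: the plain hash is in the table
    print(crack(salted))    # None: the salted hash matches nothing precomputed
```

The lookup only fails because the attacker's table was built without the salt; an attacker who knows the salt can rebuild the table for that one user.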

Slavius

Points:

A hash itself can be reversed by using precomputed hash tables (aka rainbow tables). It may be easier than you think. There are ways to compute and store such tables with considerable space savings thanks to packing.

The correct sentence should be: a hash function cannot be reversed.

Using a date/time in a salt is a stupid idea. The salt must be a constant known to the algorithm so you can reuse it. If you use a changing variable (like a datetime, unless it's immutable, like your birthdate), then you have to store it somewhere alongside the hash so you can actually compute that hash again and compare it with the password the user provides. It's like using the user's first name as part of the salt: it is too obvious, and it sits right next to the hash in the users table in the database.

The best way is to keep the salt solely in your obfuscated code in memory, and compressed and encrypted on disk. Stealing the database then does not give away much information for guessing the salt.

Brodan

I don't understand rainbow tables fully, can you explain how a good salt doesn't make them pointless?

Slavius

Let's imagine I am able to dump the Users table of your application using an undiscovered SQL injection vulnerability.

I will register as a user of your application and use the password 'ABCD1234'.

Secretly, your application appends '_S3cr3t!' to the plaintext password as a salt and calculates a hash.

I will dump your database, find the hash of my password, and feed it to JohnTheRipper with a mask of 'ABCD1234?????????' and, if that doesn't work, '????????ABCD1234'.
It's just a matter of time (and money, if I want a fast hash rate accelerated by GPUs) until I find that the hash of 'ABCD1234_S3cr3t!' matches.

Then I build rainbow tables of all hashes of '[A-Z][a-z][0-9][special_chars1]{1-10}_S3cr3t!' to crack all hashes in your application.
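
A toy version of the first step of this attack, recovering a site-wide secret suffix from one account whose password the attacker knows, could look like the Python sketch below. Everything here is an assumption for illustration: MD5 as the hash, the helper name `recover_suffix`, the alphabet, and a shortened three-character secret so the brute force finishes quickly. A real attacker would use a dedicated cracking tool instead.

```python
import hashlib
import itertools
import string
from typing import Optional


def md5_hex(text: str) -> str:
    """Return the MD5 hash of a string as hex (fast hash, chosen only for the demo)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def recover_suffix(known_password: str, stolen_hash: str,
                   alphabet: str, max_len: int) -> Optional[str]:
    """Try every suffix up to max_len until hash(known_password + suffix) matches."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            suffix = "".join(combo)
            if md5_hex(known_password + suffix) == stolen_hash:
                return suffix
    return None


if __name__ == "__main__":
    secret = "_S3"  # shortened stand-in for the application's hidden salt
    stolen = md5_hex("ABCD1234" + secret)  # the attacker's own row in the dump
    alphabet = string.ascii_letters + string.digits + "_!"
    print(recover_suffix("ABCD1234", stolen, alphabet, max_len=3))
```

Once the shared suffix is known, every other hash in the table can be attacked as if it were unsalted, which is the point of the second step.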

Sami3160

Then what is the best practice for storing passwords?
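
One commonly recommended pattern, sketched here in Python under stated assumptions, is a random salt per user stored next to the hash, combined with a deliberately slow key-derivation function. PBKDF2 is used only because it ships with the standard library; bcrypt, scrypt or Argon2, mentioned elsewhere in this thread, fill the same role. The function names and the iteration count are illustrative, not a recommendation.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 600_000  # assumed value; tune it for your own hardware


def hash_password(password: str, iterations: int = ITERATIONS) -> Tuple[bytes, bytes, int]:
    """Derive a key from the password with a fresh random salt.

    Returns (salt, key, iterations); all three get stored in the user's row.
    """
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, key, iterations


def verify_password(password: str, salt: bytes, key: bytes, iterations: int) -> bool:
    """Re-derive the key from the login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, key)


if __name__ == "__main__":
    salt, key, rounds = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, key, rounds))
    print(verify_password("wrong guess", salt, key, rounds))
```

Because the salt is random per user, two accounts with the same password get different stored values, and the salt does not need to be secret for that to hold.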

ekanSSS

Encryption is like a jigsaw puzzle: we cannot guess the image while it's in pieces, but it is meant to be put back together. Anyone holding the pieces can (more or less easily) rebuild the image. For passwords this is bad, because the original password can be recovered from the encrypted one.

Hashing is one-way but deterministic: hash the same value twice and you get the same output twice. For passwords, this makes it hard to find the original string because there is no built-in way to reverse it; you need to guess the real password and check whether its hash is the one in your database.

Salting adds a personal touch to every hash. For example, if two users use the same password and you only hash it, both produce the same output (a given input always has the same output). But if you hash a string made of password+login instead of the password alone, the outputs will differ even when two users share the same password.
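
Here is a small Python sketch of that difference. The helper names, the sample logins and the use of SHA-256 are assumptions; the login is used as the salt only because the paragraph above does so, and a random per-user value would be the more usual choice.

```python
import hashlib


def plain_hash(password: str) -> str:
    """Hash the password alone: identical passwords give identical outputs."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_hash(login: str, password: str) -> str:
    """Hash password+login, so the same password differs per user."""
    return hashlib.sha256(f"{password}+{login}".encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Two hypothetical users who picked the same password.
    print(plain_hash("hunter2") == plain_hash("hunter2"))                      # same output
    print(salted_hash("alice", "hunter2") == salted_hash("bob", "hunter2"))    # different outputs
```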

So, in summary:

Encryption => easy to crack: once an attacker finds the encryption type + secret key, all passwords in your database are exposed.

Hashing => harder to crack: the attacker needs to guess a password and compare outputs to find it, so it's one-by-one work.

Hashing + Salting => makes every hash unique and even harder to crack: the attacker needs to separate the password from the salt. It's still one-by-one work, even once one password has been cracked.