A little salt is a good thing

June 23rd, 2010 Bill Shaw, Consultant

Every once in a while, I’ll talk with somebody who’s confused about hashed passwords and the notion of cryptographic salt: what it is, and why you use it.  It’s really a simple idea that greatly enhances password security.  For those who are curious, here’s a 100,000 foot overview that glosses over many of the details but will give you the general idea.

The problem:

Your application uses usernames and passwords for authentication.  This is so a bad guy can’t log into the application and pretend to be someone else.  You have to store those usernames and passwords somewhere.  A common place to store usernames and passwords is in a password file.  A simple password file might look like this:

Username  Password
alice     foo
bob       bar
chuck     baz

Now of course if a bad guy gets his hands on this file, he can log in as any one of these users.  We need to hide these passwords so he can’t use them to log in.  One solution is to store a scrambled form of each password so the bad guy can’t read it.  This is generally done using a one-way hash algorithm, like MD5 or SHA1.  (Strictly speaking this is hashing rather than encryption: there is no key, and nothing is ever decrypted.)  A one-way hash algorithm works like this: it’s fast and easy to compute the hash of a password, but it’s practically impossible to take the hash and work backwards to figure out what the original password was.  So using the password file above as an example, Alice’s password is “foo”.  When we compute the hash for “foo” using the MD5 algorithm, we get “d3b07384d113edec49eaa6238ad5ff00” (this particular digest includes a trailing newline after the password, which is what a command like echo foo | md5sum hashes).  Doing this for all of the passwords in the file, we end up with this:

Username  Hashed Password
alice     d3b07384d113edec49eaa6238ad5ff00
bob       c157a79031e1c40f85931829bc5fc552
chuck     258622b1688250cb619f3c9ccaefb7eb

When Alice logs into the program, she enters her password “foo”.  The program hashes her password and gets the value “d3b07384d113edec49eaa6238ad5ff00”.  It then compares this value with the value in the password file.  If they match, Alice has been successfully authenticated.
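The login check described above can be sketched in a few lines of Python. This is only an illustration, not the code of any particular application: the names md5_hex, password_file and authenticate are made up for the example, and the password file is a plain dictionary.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of text (UTF-8 encoded) as a hex string."""
    # Note: this hashes exactly the characters given, with no trailing newline.
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Hypothetical stand-in for the password file: username -> stored hash.
password_file = {
    name: md5_hex(pw)
    for name, pw in [("alice", "foo"), ("bob", "bar"), ("chuck", "baz")]
}

def authenticate(username: str, password: str) -> bool:
    """Hash the password that was typed in and compare it with the stored hash."""
    stored = password_file.get(username)
    return stored is not None and stored == md5_hex(password)
```

The program never needs the original password again after the file is written; it only ever compares hashes.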

Problem solved?  Not exactly.  Although the bad guy can’t easily figure out the original password from a hash directly, he has another trick up his sleeve.  He creates a big table of likely passwords and their hashes beforehand.  A small part of the table might look like this:

Password  Hash
font      0daf79671e01b6ef22bf498e444fe360
foo       d3b07384d113edec49eaa6238ad5ff00
foot      ccb5648dfed003bce37ffe6441024fc5

When he gets a hold of Alice’s hashed password, he compares it against all of the entries in this table.  Even though there are likely thousands of entries in the table, it’s not too hard for the bad guy to quickly see that Alice’s hashed password matches the entry in his table for “foo”.

In reality, creating a table containing all the possible passwords along with their hashes is more complicated than this.  Usually a simple table like this would be far too large to work with.  The bad guy might instead use a more sophisticated structure called a rainbow table, which trades some lookup time for a much smaller table.  But the general idea is pretty much the same.  Hashes are computed ahead of time.  If the number of possible words in the table is relatively small, it’s possible to match them up with the hashed passwords in the password file.
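Here is a rough sketch of the bad guy's side of things. It builds the simple kind of table described above (a plain dictionary, not a rainbow table), and the word list and variable names are invented for the example.

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Done once, ahead of time: hash every likely password.
likely_passwords = ["font", "foo", "foot", "bar", "baz"]
lookup = {md5_hex(pw): pw for pw in likely_passwords}

# A stolen, unsalted password file: username -> hash.
stolen = {
    "alice": md5_hex("foo"),
    "bob": md5_hex("bar"),
    "chuck": md5_hex("baz"),
}

# Cracking each user is now a single dictionary lookup.
cracked = {user: lookup.get(digest) for user, digest in stolen.items()}
```

The expensive part (hashing the word list) happens before the file is even stolen, and the same table works against any unsalted MD5 password file.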

This is where we come to the notion of salt.  Salt is simply another value we tack on to each password before it’s hashed.  This makes it less likely that the bad guy has a table containing that hash.  For example, say we pick the word “salt”.  To compute the hash, we first combine the password and the salt; e.g. “foo” becomes “foosalt”.  The MD5 hash of “foosalt” is “a9bc4eedd446f8099fa56d05f1748222”.  Applying the same salt to the rest of the passwords in the file gives:

Username  Hash of (Password + Salt)
alice     a9bc4eedd446f8099fa56d05f1748222
bob       c68fbc1701ee181aee62b2c0bf871cd4
chuck     5bba8889940c6cf9bbdcd08ff6aaecd5

When Alice logs into the program, she enters “foo” as usual.  The program knows to take her password and tack on the salt (“foosalt”) and hash that.  The program gets “a9bc4eedd446f8099fa56d05f1748222” as the hash for “foosalt”, sees that it matches Alice’s entry in the password file, and Alice is authenticated.  Take that, bad guy!  It’s not likely you have a table with “foosalt” as one of your entries!
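In code, the only change from the unsalted version is that the salt gets appended before hashing. A minimal sketch, where SALT, salted_md5 and check are names invented for the example:

```python
import hashlib

SALT = "salt"  # the example salt from above, shared by every user

def salted_md5(password: str, salt: str = SALT) -> str:
    """Append the salt to the password, then hash the combined string."""
    return hashlib.md5((password + salt).encode("utf-8")).hexdigest()

# What gets written to the password file for Alice.
stored_for_alice = salted_md5("foo")

def check(password: str) -> bool:
    """At login, salt and hash the typed password the same way and compare."""
    return salted_md5(password) == stored_for_alice
```

A table built from plain “foo” no longer helps, because the stored value is the hash of “foosalt”, not of “foo”.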

Don’t get cocky.  First of all, “salt” wouldn’t be the best choice for a salt value.  Maybe he’s got a table for that one.  A good next step might be to use a random number, a giant random number, as your salt.  You’ve generated your giant random number: it’s “87645871009146197361983746”.  Use that for the salt for all of your passwords.  Your password file would be:

Username  Hash of (Password + Giant Random Salt)
alice     135e92bfada05693e6c1d13390ac3879
bob       dc4a98f2008730fdf505cd5e453df1eb
chuck     90a4107a5bef494de7bb35f313725665

There’s no way he’s got a table using that as the salt.  But wait… Where do you store that salt?  The salt itself can’t be hidden behind a hash the way the passwords are, because the program needs its actual value every time someone logs in.  What happens if the bad guy gets hold of that salt?  It doesn’t matter what the value of the salt is at that point.  He’ll just make a new table using your salt, and he’ll be off to the races again.  His new table will just be:

Password + Salt                 Hash
font87645871009146197361983746  11c5ab388f1304785f4961c11fa82f74
foo87645871009146197361983746   135e92bfada05693e6c1d13390ac3879
foot87645871009146197361983746  99005f12a7e3466447fd49d8edff83be

Once again, this part of his table matches Alice’s password.  What’s the problem here?  It’s only partly that the bad guy knows the secret salt.  The real problem is that the salt is the same for every single password.  Once he knows the salt, he can create one table that can be used to attack your entire password file.
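To make that concrete, here is a small sketch of the attack against a shared salt. The word list and variable names are invented for the example; the salt is the giant random number from above.

```python
import hashlib

SHARED_SALT = "87645871009146197361983746"

def salted_md5(password: str, salt: str) -> str:
    return hashlib.md5((password + salt).encode("utf-8")).hexdigest()

# The stolen file: every entry was hashed with the same salt.
password_file = {
    user: salted_md5(pw, SHARED_SALT)
    for user, pw in [("alice", "foo"), ("bob", "bar"), ("chuck", "baz")]
}

# The bad guy learns the salt and rebuilds his table exactly once.
candidates = ["font", "foo", "foot", "bar", "baz"]
table = {salted_md5(pw, SHARED_SALT): pw for pw in candidates}

# That single table is good against every user in the file.
recovered = {user: table.get(digest) for user, digest in password_file.items()}
```

The size of the salt made no difference here. What let him in was that one table covered everybody.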

The answer: create a new random salt value for every password.  Make it a giant random number; you were on the right track with that one.  The new password file would be something like:

Username  Hash of (Password + Per-User Random Salt) : Salt
alice     2aa1d59707e28fbbd94aad6e97b7952b:8327845250412854089
bob       b0cd9763935a195dc9963aa7e10c2b8c:2730638019321621664
chuck     7f3ca8410d4ef93061bb8ae9afaf0fd9:3046267048637898574

Now, when Alice goes to log in and enters her password “foo”, the program combines “foo” with the salt value “8327845250412854089”, hashes it and gets “2aa1d59707e28fbbd94aad6e97b7952b”.  The program gets the salt value directly from the password file (it’s the part after the colon in the password field).  It sees that the hash it gets matches what’s in the password file and Alice is authenticated.
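A sketch of how a program might write and check entries in this “hash:salt” format follows. The function names make_entry and verify are invented for the example, and the 64-bit salt size is an assumption picked to give numbers about as long as the ones shown above.

```python
import hashlib
import hmac
import secrets

def make_entry(password: str) -> str:
    """Create a fresh random salt and return 'hash:salt' for the password file."""
    salt = str(secrets.randbits(64))  # assumed salt size; new value on every call
    digest = hashlib.md5((password + salt).encode("utf-8")).hexdigest()
    return f"{digest}:{salt}"

def verify(password: str, entry: str) -> bool:
    """Read the salt back out of the entry, re-hash, and compare."""
    digest, salt = entry.split(":", 1)
    candidate = hashlib.md5((password + salt).encode("utf-8")).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(candidate, digest)
```

Two users who both pick “foo” end up with different entries, because each call to make_entry draws its own salt.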

One last thing.  Since the salt is no longer the same for every user, we had to include it in the password file for the program to use when it authenticated Alice.  The salt is just sitting there in the password file, appended to each hash, for any bad guy to see.  Is that a problem?  It turns out, not really.  Now the bad guy can’t create only one table and use that to try to figure out everyone’s password.  Each table he creates can only be used to try to figure out one person’s password (since the salt is different for each person).  We’re beginning to make the problem hard enough for him that he’ll go off and bother someone else, or maybe turn over a new leaf and help the elderly.
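One rough way to see the difference is to count how many hashes the bad guy has to compute. In the sketch below, crack is a made-up helper that builds one table per distinct salt it finds in the file; the salts are the ones from the examples above, and the word list is invented.

```python
import hashlib

def salted_md5(password: str, salt: str) -> str:
    return hashlib.md5((password + salt).encode("utf-8")).hexdigest()

def crack(password_file: dict, candidates: list) -> tuple:
    """password_file maps username -> (digest, salt).
    Returns (recovered passwords, number of hashes the attacker computed)."""
    tables = {}          # one precomputed table per distinct salt
    hashes_computed = 0
    recovered = {}
    for user, (digest, salt) in password_file.items():
        if salt not in tables:
            tables[salt] = {salted_md5(pw, salt): pw for pw in candidates}
            hashes_computed += len(candidates)
        recovered[user] = tables[salt].get(digest)
    return recovered, hashes_computed

candidates = ["font", "foo", "foot", "bar", "baz"]
passwords = {"alice": "foo", "bob": "bar", "chuck": "baz"}

shared_salt = "87645871009146197361983746"
shared = {u: (salted_md5(p, shared_salt), shared_salt) for u, p in passwords.items()}

salts = {"alice": "8327845250412854089",
         "bob": "2730638019321621664",
         "chuck": "3046267048637898574"}
per_user = {u: (salted_md5(p, salts[u]), salts[u]) for u, p in passwords.items()}

_, shared_cost = crack(shared, candidates)      # 5 hashes: one table for everyone
_, per_user_cost = crack(per_user, candidates)  # 15 hashes: one table per user
```

With three users and five guesses the difference is trivial, but the work grows with the number of users times the size of the word list, and none of it can be done before the file is stolen.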

This only scratches the surface of how hashed passwords and salt work together to secure a system, and also only scratches the surface of how bad guys go about attacking those systems.  The good guys can do things like enforce stronger passwords (e.g. “Iw1Fo0+St” instead of “foo”) and hash passwords many times over instead of once (often called key stretching), which makes every guess more expensive for the attacker.  The bad guys can keep looking over your shoulder as you type your password, or steal the post-it note that has your password on it that’s stuck to your monitor.  Security is a multi-layer thing after all.  But using a little salt in your application is one of those things that’s easy to do, and goes a long way.
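As one illustration of hashing many times over, Python's standard library includes PBKDF2, which repeats a keyed hash a configurable number of times. This is a different construction from the MD5 examples above, not something the article itself uses; the function names and the default iteration count are assumptions made for the sketch.

```python
import hashlib
import hmac
import os

def stretch(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a key by running PBKDF2-HMAC-SHA256 for the given number of rounds."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def make_record(password: str, iterations: int = 200_000) -> tuple:
    """Return (salt, iterations, derived key); the iteration count is an assumed value."""
    salt = os.urandom(16)  # per-user random salt, as described above
    return salt, iterations, stretch(password, salt, iterations)

def verify(password: str, record: tuple) -> bool:
    """Repeat the derivation with the stored salt and iteration count, then compare."""
    salt, iterations, expected = record
    return hmac.compare_digest(stretch(password, salt, iterations), expected)
```

Storing the iteration count next to the salt lets you raise it later without locking out existing users. In a real application, a maintained password-hashing library is still the safer choice.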


3 Comments

  • 1. darren fix  |  June 24th, 2010 at 10:51 am

    Thanks for the article. I do something similar in a database app that I’m writing, except instead of using a really long random number for the salt, I use the unique userid as the salt. So I guess that I have two questions:

    1) Does a long random number make a significant improvement to the password security?
    2) If #1 is true, wouldn’t it be better to use a long random string instead?

    I know that your article was a “100,000 foot overview” and that my questions fall outside of the scope of the article, but I’m afraid that my curiosity has gotten the better of me.

    Thanks,
    Darren.

  • 2. Bill Shaw  |  June 24th, 2010 at 11:52 am

    Thanks Darren. Using usernames probably isn’t as good as random numbers (or as you suggest, random strings), since an attacker might already have precomputed hashes for well-known usernames (such as “root”). However, this article was only meant to show how salt works in general. In a real-life implementation, you should use a library that’s already been created to do this stuff, like Jasypt if you’re doing Java.

  • 3. Nick  |  December 23rd, 2010 at 6:33 am

    A beautiful, simple explanation. Thank you.


© 2010-2012 Summa All Rights Reserved