How to effectively salt a password stored as a hash in a database

A salt is an additional sequence of data (bits or characters, however you want to think about it) that is appended to a password before hashing the password and storing the hash in a database.

There are three types of salts that can be added to a password. Assume the password is passwd! and the salt is 12345, here are the three types described in a table.

Salt Type Salted Password SHA256 hash
unsalted passwd! 12f225551a043b6e136b2cf03546b06efb289d29ab42cebfd78ee101d8555304
Prefix 12345passwd! b38c49760d484119f227fab640cb1415d58f765c0727dc9ad7e0a5a66003d041
Infix pass12345wd! a9fa7c693b691c8ceded848877afdb1259f28f194863ebb33c07c4bbfa1ff04c
Postfix passwd!12345 1315abea2576c3b6270712df587359d614ae1a1698d69dd2175459e411f678f5

Of course you can use multiple of these in any combination as well. You can surround the password by using both a prefix and a postfix. You can use all three. You can have multiple infixes, so a password of 12 characters such as MyPassword1! could become MyP12345assw12345ord1!.

Reasons for salting

Salting was created because hashing is imperfect. Hashing has multiple problems. However, I am only going to describe two problems and how salting improves the security in each situation.

Problem 1 – Rainbow tables

A database of hashes for all character sequences up to a certain number of characters can be generated over time. This database is called a rainbow table.

Scenario

Let say you registered at site ABCStore.tld. Some time later, ABCStore.tld reports that their database was compromised.  Your password was not stored in the database, but the hash of your password was. Lets say your password was passwd! and you use it on every site on the internet.

With a rainbow table, your password will be discovered in moments and that password will work on other sites perfectly.

How a Rainbow table works

It works by generating a table of passwords and their matching hashes. As each character is added to a password, the total possible password options increases, making the next number of passwords hard to generate. For an English keyboard, there are 26 upper case letters, A-Z, 26 lower case letters, a-z. There are ten numbers, 0-9. There are 33 special characters, 34 if you include tab.  There are other characters available, but the majority of people who speak English, those are the only characters used in a password. That is 95 possible options for a password (96 if you include a tab).

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
1 2 3 4 5 6 7 8 9 0
  ~ ` ! @ # $ % ^ & * ( ) - _ = + [ { ] } \ | ; : ' " , < . > / ?

So for every character in length the password, the possible values grow exponentially.

Password Length Possible combinations
6 96^6 782,757,789,696
7 96^7 75,144,747,810,816
8 96^8 7,213,895,789,838,336
9 96^9 692,533,995,824,480,256
10 96^10 66,483,263,599,150,104,576
11 96^11 6,382,393,305,518,410,039,296
12 96^12 612,709,757,329,767,363,772,416

However, it is actually easier than this. Most passwords are dictionary words, so a Rainbow table can be created with dictionary words first. Now a large portion of passwords are covered.

Then because most passwords are lowercase characters only, a hacker can have their rainbow table created by first creating only the options using all lowercase letters. Notices the numbers are significantly smaller.

Password Length Possible combinations
6 26^6 308,915,776
7 26^7 8,031,810,176
8 26^8 208,827,064,576
9 26^9 5,429,503,678,976
10 26^10 141,167,095,653,376
11 26^11 36,703,444,86,987,776
12 26^12 95,428,956,661,682,176

Also it is common to only put numbers on the end of a password, so they can then do all lowercase passwords with one number at the end, then two numbers at the end. It is also to common to replace letters with number, E = 3, l or i = 1, S = 5, etc. It is also common to only capitalize the first letter, so that can be done too. So as you can see, by doing far less than generating all passwords, a rainbow table can be created that matches almost all passwords used.

Password Creation Hint: When creating your own password, avoid these trends and use long passwords. If it is allowed, use a sentence with punctuation as a password if you can such as this:

I love my wife, Michelle!

Sentence passwords are easy to remember, easy to type, and are freaking hard to populate into a rainbow table. The above sentence is 25 characters. Each space, the comma, and the ! are special characters, that is 6 special characters. There are two capital letters, one not just at the front.

So once such a rainbow table is created, if your password is easy, your password is just a select statement away from a hacker who compromises a database.

Select * from RainbowTable where hash=?

The hacker now knows your actual password  and can use that on other websites, like your bank account. You scared now? You should be!

How a salt improves security against a Rainbow table

Remember our scenario. How is that scenario is less of a security concern because salting was used? The answer is fairly simple. Because salting changes the string being hashed, the hash is different. So a look-up in a Rainbow table will be less likely to return a result. The probability is high that even if a simple password was used, the rainbow table will not return a result. If a result is returned, the likelihood that the returned string is a collision not your password was increased.

Note: If the salt is known, or can be figured out, obtaining your password is still possible, but it is harder and will take longer.  If you change your passwords often enough, you only need the building of the Rainbow table to take longer than the time you use your password.

Problem 2 – Collisions

Another problem with hashing is that there is a possibility, however remote, that two completely different strings return the same hash.

Scenario

You register using the same username and password at both ABCStore.tld and XYZStore.tld. Two sites don’t use a salt, but they both hash your password with the same hashing algorithm. ABCStore.tld reports that their database was compromised.  Your password was not returned, but a collisions was returned.

Even though it is not your password, the collision string can be used to authenticate at XYZStore.tld also.

How a collision occurs

Because a hash is limited in the amount of characters it generates, but there are an infinite amount of character sequences, it stands to reason that infinite is greater than anything finite, once all hashes are used, the next hash created from a unique sequence will be a duplicate.

(Read this: What’s the shortest pair of strings that causes an MD5 collision?)

Lets make the numbers ridiculously small so you can understand easier. Imagine a hash that changes everything to a single digit number, 0-9. Once all ten hashes are used, the eleventh hash created is guaranteed to be a duplicate.

With SHA-256, the rules still applies, only there are a plenty of unique hashes and collisions are extremely hard to calculate, even for multiple computers over a long period of time (at least today in June, 2012).

How salting improves security against collisions

If we look at our scenario where a hacker was able to determine a collision for the password used, not the password itself, they may not even have the salt string.

Well, if salting were used by both ABCStore.tld and XYZStore.tld, and both used different salts, then the collision that worked for ABCStore.tld doesn’t work from XYZStore.tld because they used different salts, and therefore the passwords aren’t the same. So that means that the hackers have to calculate your actual password, which is harder to calculate.

Even better, the hacker wouldn’t even be able to log into to ABCStore.tld because the login system will append the salt to the entered password, so having a collision doesn’t help, because after entering the password, the collision gets the different salt appended to it and it is no longer the same and is not a collision on another system.

Less-effective salt implementation

Salting works. However, it is often implemented in a way that is less-effective. Remember, “less-effective” is still somewhat effective. Let’s look at some less-effective uses of salting.

Using a single salt for all passwords

Look at the following table of passwords. Can you easily tell which is the salt and which is the password?

123456789passwd!
123456789myPass12
123456789littleJohn
123456789jennifer
123456789secret100

The salt is 12345679. Now because the salt is known, a hacker can now create a rainbow table using the salt.

This is “less effective” because it is still somewhat effective. This method is just not as effective as it could be.

According to scenario 1, it is effective because the standard rainbow table will not return a result. The hacker is now forced to create a new rainbow table. However, it is less effective because the hacker only has to create one more Rainbow table, and they can probably get all passwords with a partial rainbow table.

According to scenario 2, finding a collision will not allow authentication to another site with a different salt or a site without a salt. However, because the hacker knows the salt, they can find a collision from which they can remove the salt, and that leaves them with a collision string that works on ABCStore.com.

Using a static hash resource in the code

If a static hash is in the code, it is possible that this value can obtained from a binary or by compromising the code itself. Storing the hash in the code is effective as if the database is compromised, the hash was not in it.

It is somewhat effective in that the hacker now has to figure out the hash on their own. It is less-effective in that the code could be what was compromised.

Publishing the salt in the database

Since a single salt is “less effective” the author of an authentication system may decide to use a separate salt for each user. However, now he has so many salts, he needs to track them somewhere. The salt becomes a column, usually next to the password column, and usually the column is clearly titled: “Salt”.

According to scenario 1, this is still “effective” in that now a hacker must generate a Rainbow table for each user. Why is it still less effective? Because the salt is known, and only one password per Rainbow table is needed, a complete rainbow table is not needed. As soon as a string is found matching the hash, that Rainbow table can be abandoned. A dictionary designed to try high probability passwords first can significantly limit the time needed to find the average person’s passwords, which are usually simple and common dictionary words or names. Most passwords can be determined in a short amount of time. The time is of course based on resources. Of course, with cloud services, resources are cheap and powerful.

Using an existing user value as a salt

Instead of using a column called “Salt’ and existing known column, such as the username, first name, or last name is used. Of course, you wouldn’t use the actual value in the known column, it is probably too short. You would use a hash of the first name or a cryptographically random character sequence added to the value in the known column.

This is effective in that the hacker has to figure out which column is used as the salt. It is less effective because the hacker only has to do this on the first user and once the column is determined, for the rest of the users it is the same as publishing the salt in the database.

More effective implementations

One of the main failures of salting is the common idea that the salt is not secret. A secret or calculated salt increases security.

Salts should be unique per user and they should be hard and as close to impossible to figure out.  Here are some ways that a hacker may never figure out the salt, and never be able to create a Rainbow table using it. Something that provides the security as close to that of a one-time pad as possible.

Use a variably located calculated salt

There is not one salt, there are one or more salts. The salts don’t exist in the code, nor do the exist in the database. However, it is a combination of both. And the salt is not always a prefix, infix, or postfix, or any combination of the three, however, the number of salts and the location(s) are determined by one or more calculations.

There are seven possible salt combinations, however, because the infix could have multiple locations as well, the salt locations could be many more.

  1. prefix
  2. infix
  3. postfix
  4. surround (prefix and postfix)
  5. prefix and infix
  6. infix and postfix
  7. all (prefix, infix, and postfix)

When there is an infix, it could be multiple infixes every n* characters where n could even be calculated so a 16 character password could have 1-15 infixes.

The number of salts and their locations and where they go could be determined with a calculation as simple as this:

SaltLocations = Firstname % 7

Of course a more complex method could be used as well. Again, the number of infixes could be calculated. Each salt location should then have a separate salt and each salt should be created using a separate calculation. Also, make sure both the code and the database are used for the calculations. Even a simple algorithm makes life harder on the hacker. Go ahead and create a column called salt in the database, however, only store a random character sequence there.

Salt (database part) = First letter of first name + random character sequence + first letter of last name

This can be made even more complex by altering this using a function.

Salt (code part) = function(First two letters of first name + random character sequence + first two letters of last name)

Even if the function in code is a simple shift one bit function, the hacker will not know the function and will have difficulty figuring it out. Even if they figure out one salt, will they be able to figure out the number of salts, what each salt is, and where each salt is used.

This is more effective because the hacker will have to compromise both the database and code to figure out each salt and often the database and the code are stored on separate servers.

Use a variably located calculated salt including information outside the database and the code

No matter how many algorithms you use or how complex they are, if the hacker has the code and the database, they have the salt.

External File

So what if having the code and the database is not enough.  What if the code was obtained and the database was obtained but on inspection of the code, a file on the system, located nowhere near the code, held a value that was part of the hash. Now that file has to be accessed too. This file could be anywhere. On the same server, or on a separate server. It could be accessible by different credentials.

Salt = function(First two letters of first name + random character sequence + first two letters of last name + GetInfoFromFile())

Web Service

What if the calculations were performed on a separate server entirely using a web service and lets assume the web service is secure.

hash = Remote_function(password, First two letters of first name + random character sequence + first two letters of last name + GetInfoFromWebService())

The web service could also determines where to place the hash, in any of the seven ways mentioned, so compromising the local code no longer benefits the hacker.

However, adding a third server is costly and more complex and adds additional need for security. For example, if you don’t add security and the hacker gets the database and code, they could simply call the same web service. Even security such as source IP address of the calling server would be effective.

Now the database, the code, and the third-party server all have to be compromised by the hacker for them to determine your salt and then your password or a collision of your password.

Hashing your hash a calculated amount of times

Ok, so if you do everything above, you can still make life even harder for a hacker by hashing  your hash a number of times.

long RoundCount = CalculateRoundCount();
hash = hash(saltprefix + password)
for (long i = 0; i < RoundCount; i++)
hash = hash(hash)

Since we calculate the RoundCount so that every user in the database was hashed a different amount of times, it is difficult for the attacker to know how many hashes were run and so they have to try them all.

Conclusion

Salting is effective. Even when a salt is implemented in a less effective manner, it is still better than having no salt at all.

Typically the salt is considered “not secret” but I would argue that a hacker who compromises a database, the code, or both, should still have difficulty determining the salt or how the the salt is added to a password as that information allows for an easier creation of a rainbow table.

Implementing salting in these more effective ways can make increase the complexity, making it unlikely a hacker could reverse engineer a password using the hash or find and use a collision.

8 Comments

  1. zunimountaintrading.com

    How to effectively salt a password stored as a hash in a database | Rhyous

  2. rainstormtech.com

    How to effectively salt a password stored as a hash in a database | Rhyous

  3. Zouheir says:

    Zouheir

    How to effectively salt a password stored as a hash in a database | Rhyous

  4. Medquarters Direct

    How to effectively salt a password stored as a hash in a database | Rhyous

  5. tenerife hotels las americas mapa

    How to effectively salt a password stored as a hash in a database | Rhyous

  6. chambit.wo.Tc

    How to effectively salt a password stored as a hash in a database | Rhyous

  7. Joshua says:

    Excellent post. Keep up the great work!

  8. Rhyous says:

    Also, see http://CrackStation.net/hashing-security.htm#salt

Leave a Reply

Powered by sweet Captcha