Lately, I have been asked several times why we hash some of our data and why we salt the hashes. We’re not talking about breakfast here. We’re talking about data security.
Let’s start with some definitions.
hash /haSH/: Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string.
salt /sôlt/: Salt is a random string of data that is combined with the original value before hashing, so that the resulting hash is different.
In other words, a hash works like a one-way encryption of a piece of data: there is no key that turns it back into the original. Hashes are not practical to invert back to the original data, but they are useful because the same original piece of data will create the same hash each and every time. A great example of hash usage is the password that you use to sign into your computer. The password is stored as a hash, and when you enter your password, it is hashed and then compared to the stored hash. If the two match, you are able to sign in.
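To make the sign-in example concrete, here is a minimal sketch in Python. It assumes SHA-256 as the hash function, and the names hash_value and check_password are made up for illustration; this is not how NLS or any particular operating system does it, and real password storage normally uses a deliberately slow, salted algorithm rather than a single fast hash.

```python
import hashlib
import hmac


def hash_value(value: str) -> str:
    """Return the SHA-256 digest of a string as 64 hex characters."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def check_password(attempt: str, stored_hash: str) -> bool:
    """Hash the attempt and compare it to the stored hash."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(hash_value(attempt), stored_hash)


# Only the hash is kept on file, never the password itself.
stored = hash_value("correct horse battery staple")

# The same input always produces the same hash, so a correct attempt matches.
print(check_password("correct horse battery staple", stored))
print(check_password("wrong guess", stored))
```

The first check succeeds because hashing the same text twice gives the same digest; the second fails because a different input gives a different digest.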
In the Nortridge Loan System (NLS), we hash data that requires search functionality, such as a customer’s Social Security Number. We also have a field for an encrypted version of the Social Security Number. This is what you see on the screen when viewing a customer in NLS.
So if you hash a Social Security Number, you can’t invert it. But what you can do is build a table of every possible value alongside its hash, and then look up the stolen hashes in that table. Since the Social Security Number is a short value, nine digits, there are only one billion combinations. With the power of a desktop computer, the table can be put together in a week or less. This makes the hashes easy to exploit should someone get their hands on the data.
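The table attack described above can be sketched in a few lines of Python. To keep it quick to run, this sketch uses four-digit values (10,000 combinations) instead of nine-digit ones, and it assumes SHA-256; build_lookup_table is a hypothetical helper, not part of any library.

```python
import hashlib


def hash_value(value: str) -> str:
    """Return the SHA-256 digest of a string as hex."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def build_lookup_table(digits: int) -> dict:
    """Map the hash of every zero-padded number of this length to the number."""
    table = {}
    for n in range(10 ** digits):
        candidate = f"{n:0{digits}d}"
        table[hash_value(candidate)] = candidate
    return table


# Build the table once: 10,000 entries for four digits.
table = build_lookup_table(4)

# A "leaked" unsalted hash can now be reversed with a single lookup.
leaked_hash = hash_value("0042")
recovered = table.get(leaked_hash)
print(recovered)
```

The hash function was never inverted here. The attacker simply hashed every possible input ahead of time, which is feasible whenever the set of possible inputs is small.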
This is where salting comes in. By salting the hash, we can effectively change the nine-digit number to a larger value. In this specific case, the Social Security Number is salted with nine additional digits of information. This increases the number of possible combinations dramatically. Now instead of having one billion combinations, there are one quintillion (a billion times a billion) possibilities for someone who does not know the salt. To build this table using the same linear method above would take about 19 million years instead of a week.
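Here is a rough sketch of salting, along with the arithmetic behind those numbers. It assumes SHA-256, a salt that is simply placed in front of the value, and a single secret salt that is generated once and reused so that equal values still produce equal hashes for searching. None of these details come from NLS itself, and make_salt and salted_hash are made-up names.

```python
import hashlib
import secrets


def make_salt(digits: int = 9) -> str:
    """Generate a random string of decimal digits to use as a salt."""
    return "".join(secrets.choice("0123456789") for _ in range(digits))


def salted_hash(value: str, salt: str) -> str:
    """Hash the salt and the value together (assumed scheme: salt first)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


salt = make_salt()            # generated once and kept secret
ssn = "123456789"             # dummy value for illustration only

plain = hashlib.sha256(ssn.encode("utf-8")).hexdigest()
salted = salted_hash(ssn, salt)
print(plain == salted)        # the salt changes the resulting hash

# Size of the table an attacker has to build.
unsalted_space = 10 ** 9      # nine digits: one billion
salted_space = 10 ** 18       # eighteen digits: one quintillion

# If one billion entries take a week, the salted table takes a billion weeks.
weeks_needed = salted_space // unsalted_space
years_needed = weeks_needed / 52
print(f"{years_needed:,.0f} years")
```

With the same salt, the same Social Security Number still hashes to the same value, so searching keeps working. What changes is the amount of guessing required of anyone who has the hashes but not the salt.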
This is why we salt our hashes.