Sony PSN Data Breach – Plain Text vs. Hashed Passwords Explained

There has been a rash of data breaches in which the compromised passwords were stored as plain text rather than converted to a one-way hash, as they should be. However, most consumers and even many developers, particularly in startups, don’t know about the best practice of hashing passwords, what it means, or how it can help protect users.

The Sony PlayStation Network hack and data breach is one of the most prominent examples to date, putting over 70 million customers at risk. But sadly, Sony is not alone. DSL Reports, Gawker and Trapster have also learned this lesson the hard way and, in the process, lost the trust of their customers.

It is especially problematic when attackers gain access to databases where passwords are stored in plain text. Recent studies show the majority of us use the same password across multiple sites, so one stolen password can open several accounts. Worse yet, according to SecurityWeek, about 75% of social network usernames and passwords are identical to the ones used for email accounts.

What Is A “Hash”?

A hash is like a digital fingerprint of a chunk of data. It is produced by passing the data through a one-way algorithm that returns a fixed-length value, often called a digest or signature, in place of the original data. A critical property of that signature is that it is, for practical purposes, unique to its input but cannot be turned back into the original data. Another way to think about this is in terms of sausages. A sausage can be identified as pork, but it cannot be turned back into a pig.

The unique and irreversible nature of this process makes hashes ideal for storing your passwords. Although an attacker may compromise a database and reveal your list of password hashes, they can’t determine from the hashes alone what the actual passwords are, and so will not be able to try to log in to other accounts with them.

For example, if I use a popular hashing algorithm called SHA-1 (Secure Hash Algorithm) and run the word “sausage” through it I get a value of:

“0bd7ea460f5fb0fa2d368f737c3ce63e19fdec50”

If I run “sausage” through the same algorithm, I get the same result every time, but if I change the word slightly and run “snausage”, the signature is completely different:

“c419e1d2f0f173b170d85b520db7acb2bb777604”
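To try this yourself, here is a minimal sketch in Python using the standard hashlib module. The helper name sha1_hex is my own, and I am assuming the word is encoded as UTF-8 with no trailing newline; the exact digests depend on those choices, so they may not match the values quoted above.

```python
import hashlib

def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of text as a hex string (UTF-8 encoding assumed)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

first = sha1_hex("sausage")
second = sha1_hex("sausage")
other = sha1_hex("snausage")

print(first)
print(other)
print(first == second)  # the same input always produces the same digest
print(first == other)   # one extra letter produces an unrelated digest
print(len(first))       # a SHA-1 digest is 160 bits, written as 40 hex characters
```

Whatever the input length, the output is always the same size, which is part of why the original data cannot be recovered from it.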

You may see that there is an issue here. Assume, for example, the password the user sets is “password123” which generates a signature of:

“cbfdac6008f9cab4083784cbd1874f76618d2a97”

If a hacker compares this against a precomputed list of hashes of common passwords, the hacker will quickly see that the user’s password is “password123”. So we will need to take things a step further. We are going to add what is called a salt.
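The lookup the hacker performs can be sketched in a few lines of Python. The password list here is a tiny made-up sample for illustration (real lists hold millions of entries), and the names sha1_hex, common_passwords and lookup are my own.

```python
import hashlib

def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of text as a hex string (UTF-8 encoding assumed)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

# Hypothetical sample of common passwords; an attacker would use a far larger list.
common_passwords = ["123456", "password", "password123", "qwerty", "letmein"]

# Hash every candidate once and index the results by digest.
# The same table can be reused against any database of unsalted hashes.
lookup = {sha1_hex(p): p for p in common_passwords}

# Stands in for a hash taken from a stolen database.
leaked_hash = sha1_hex("password123")

# A plain dictionary lookup reverses the "irreversible" hash.
recovered = lookup.get(leaked_hash)  # None if the password is not in the list
print(recovered)
```

Nothing here breaks SHA-1 itself. The attacker simply guesses likely passwords in advance and matches the results.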

A salt is an additional value, here one that only our application knows, that is mixed in with the password before hashing so the result no longer matches the hash anyone else would get for the same password. In our sausage analogy, think of the salt as a proprietary secret blend of spices that we sprinkle in our sausage to make it uniquely ours. For this example I will hash the word “sausage” with a salt of “mysecretsalt” using the SHA-1 algorithm, which gives me:

“1cf4c502ddd89b918c4bfefea76dadd590693b48”

This process will give me a result unique to my application that will be different from the generic “unsalted” one, so the hacker will not be able to look up the password in a list of known unsalted signatures.
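Here is a rough sketch of salting in Python. I am assuming the salt is simply prepended to the password before hashing; that is only one of several ways to combine the two, so the digest it prints may not match the value quoted above. The function names salted_sha1_hex and verify are hypothetical.

```python
import hashlib
import hmac

def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of text as a hex string (UTF-8 encoding assumed)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def salted_sha1_hex(password: str, salt: str) -> str:
    # Assumption: the salt is prepended to the password before hashing.
    return sha1_hex(salt + password)

def verify(candidate: str, stored_hash: str, salt: str) -> bool:
    # Re-hash the login attempt with the same salt and compare the digests.
    # hmac.compare_digest compares in constant time.
    return hmac.compare_digest(salted_sha1_hex(candidate, salt), stored_hash)

SALT = "mysecretsalt"

unsalted = sha1_hex("password123")
salted = salted_sha1_hex("password123", SALT)

# A table built from unsalted hashes of common passwords.
unsalted_lookup = {sha1_hex(p): p for p in ["password", "password123", "qwerty"]}

print(unsalted_lookup.get(unsalted))  # the unsalted hash is found in the table
print(unsalted_lookup.get(salted))    # the salted hash is not
print(verify("password123", salted, SALT))
```

A common variation is to generate a different random salt for every user, for example with Python's secrets.token_hex, and store it next to the hash, so that two users with the same password still end up with different signatures.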

A mentor told me a while back that one important thing to remember about security is that it is never “if your system is compromised.” The attitude instead should be “when the system is compromised,” and you should think ahead about how to mitigate the risk when the data breach occurs. By hashing the passwords used to log in, you help protect your customers from the inconvenience of having to change all of their passwords or, worse, the risk of having their email compromised.