Sony PSN Data Breach – Plain Text vs Hashed Passwords Explained

There has been a rash of data breaches in which the compromised passwords were stored as plain text rather than converted to a one-way hash, as they should be. However, most consumers, and even many developers, particularly in startups, don’t know about the best practice of hashing passwords, what it means, and how it can help protect users.

The recent Sony PlayStation Network hack and data breach is one of the biggest examples to date, putting over 70 million customers at risk. But Sony is not alone: DSL Reports, Gawker and Trapster have also learned this lesson the hard way, losing the trust of their customers.

When an attacker gains access to a database where passwords are stored in plain text, it is particularly problematic because recent studies show that the majority of people use the same password across sites, and that as many as 75 percent of social networking username and password samples collected online were identical to those used for email accounts (Source: SecurityWeek).

What is a “hash”?

A hash is like a digital fingerprint of a chunk of data. It is the result of passing data through a one-way algorithm that returns a digital signature in place of the original data. The signature is, for practical purposes, unique to that data, but it cannot be turned back into the original data. One way to think about this is as if we are making sausage: sausage can be identified as pork, but it cannot be turned back into a pig.

The unique and irreversible nature of this process makes hashes ideal for storing your passwords. Although an attacker may compromise a database and reveal your list of password hashes, they can’t determine from the hashes alone what the actual passwords are, and so will not be able to use them to log into other accounts.

For example, if I use a popular hashing algorithm called SHA-1 (Secure Hash Algorithm 1) and run the word “sausage” through it, I get a value of:

0bd7ea460f5fb0fa2d368f737c3ce63e19fdec50

If I run “sausage” through the same algorithm, I get the same result every time. But if I change the word slightly and run “snausage” through it, the signature is completely different:

c419e1d2f0f173b170d85b520db7acb2bb777604
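
The same experiment can be tried in a few lines of Python using the standard library's hashlib module. This is only a minimal sketch: the helper name sha1_hex is made up for illustration, and it assumes the word is encoded as UTF-8 before hashing, which the example above does not specify.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of text as a 40-character hex string."""
    # Assumption: the text is encoded as UTF-8 before it is hashed.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


first = sha1_hex("sausage")
second = sha1_hex("sausage")
changed = sha1_hex("snausage")

print(first)
print(first == second)   # same input, so the digests match
print(first == changed)  # one extra letter, so the digests differ
```

Hashing the same word twice gives matching signatures, while the slightly changed word gives a signature that shares no obvious pattern with the first.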

You may see that there is an issue here. Suppose, for example, the password the user sets is “password123”, which generates a signature of:

cbfdac6008f9cab4083784cbd1874f76618d2a97

If a hacker runs this through a simple batch process that compares it against the hashes of common passwords, the hacker will be able to see that the user is using the password “password123”.

So we will need to take things a step further and add what is called a “salt”. This is an additional value that is mixed in with the password before hashing, so that the resulting signature depends on a value that only we know. In our sausage analogy, think of the “salt” as our own secret blend of spices that we sprinkle in our sausage to make it uniquely ours. For this example I will hash the word “sausage” with a salt of “mysecretsalt” using the SHA-1 algorithm, which gives me:

1cf4c502ddd89b918c4bfefea76dadd590693b48

This process gives me a value that is unique to my application and different from the generic “unsalted” value, so the hacker will not be able to work out the password by comparing it against known unsalted signatures.
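
To see why the salt helps, here is a rough Python sketch of the attacker's batch lookup and how salting defeats it. Several things here are assumptions for illustration only: the short list of common passwords, the helper names, and especially the choice to prepend the salt to the password, since the example above does not say how the salt and the word were combined.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of text as a hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical list of common passwords an attacker might precompute.
COMMON_PASSWORDS = ["123456", "password", "password123", "qwerty"]
LOOKUP = {sha1_hex(p): p for p in COMMON_PASSWORDS}

# Assumption: the salt is simply prepended to the password before hashing.
SALT = "mysecretsalt"


def salted_sha1_hex(password: str, salt: str = SALT) -> str:
    """Hash the password together with the application's salt."""
    return sha1_hex(salt + password)


unsalted = sha1_hex("password123")
salted = salted_sha1_hex("password123")

print(LOOKUP.get(unsalted))  # the table recovers "password123"
print(LOOKUP.get(salted))    # no match in the table, so this prints None
```

At login, the application would salt and hash the submitted password in the same way and compare the result with the stored signature.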

One important thing a mentor told me a while back about security is that the attitude should never be “if your system is compromised” but rather “when the system is compromised”. From there, think about how you can mitigate the risk when there is a data breach. If you hash the passwords used to log in, you help protect your customers from the inconvenience of having to change all of their passwords, or the risk of having their email compromised, which can lead to even nastier things.