I am catching some heat over the Encryption Basics post from some of my more ardent detractors, who have called me on the carpet for making security and PCI “too simple” or “dumbed down.” As I said in that post, it was in no way meant to be a full dissertation on the topic, and neither is this one. I just want to clarify some things so that people have a reasonable foundation on the topic of hashing.
Hashing comes up in PCI DSS requirement 3.4, which says:
“Render PAN unreadable anywhere it is stored (including on portable digital media, backup media, and in logs) by using any of the following approaches: one-way hashes based on strong cryptography (hash must be of the entire PAN), truncation (hashing cannot be used to replace the truncated segment of PAN), index tokens and pads (pads must be securely stored), strong cryptography with associated key-management processes and procedures.”
Before we get into hashing algorithms, I need to make sure everyone is clear that hashing is a one-way operation. Unlike encryption, you cannot reverse data that has been hashed. So if you are considering hashing, you need to make sure that you will never need to get the original information, such as a primary account number (PAN), back from the hashed value.
Probably the most common one-way hash algorithm is SHA-1. Unfortunately, researchers have shown that SHA-1 has weaknesses in its collision resistance that make it less secure than its successor, the SHA-2 family (SHA-224, SHA-256, SHA-384 and SHA-512). As such, those of you using SHA-1 should be migrating to SHA-2 or another secure hashing algorithm. Those of you considering a solution that relies on SHA-1 should be asking the vendor whether SHA-2 or another secure algorithm can be used instead.
The other most common one-way hashing algorithm is RSA’s Message Digest 5 (MD5) algorithm. Like SHA-1, MD5 is no longer considered acceptable, in its case because practical collision attacks against it have been demonstrated. As a result, users of MD5 should migrate to MD6, SHA-2 or another secure hashing algorithm. And just so you are not confused, MD5 is still commonly used to generate a checksum for confirming the validity of a download, which is an entirely different purpose from the one I am discussing here.
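As a rough illustration of hashing the entire PAN with a member of the SHA-2 family, here is a minimal Python sketch using the standard library's hashlib module. The function name hash_pan and the choice of SHA-256 are my assumptions for the example, and the PAN shown is a widely published test card number, not a real account.

```python
import hashlib


def hash_pan(pan: str) -> str:
    """Return the SHA-256 (SHA-2 family) hex digest of the entire PAN.

    hash_pan is a hypothetical helper name used only for this example.
    """
    return hashlib.sha256(pan.encode("ascii")).hexdigest()


# Widely published test card number, not a real account.
test_pan = "4111111111111111"
digest = hash_pan(test_pan)

# SHA-256 always yields 256 bits, i.e. 64 hexadecimal characters,
# and there is no function that turns the digest back into the PAN.
print(len(digest))
```

Note that the whole PAN goes into the hash, as requirement 3.4 demands, and that changing a single digit of the input produces a completely different digest.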
And to show you that things do not stand still in the cryptographic hashing world, the United States National Institute of Standards and Technology (NIST) is in the process of selecting the winner of their SHA-3 competition. That winner will be announced sometime in late 2012.
While you can use a hash function “as is,” security professionals typically recommend adding a “salt” to complicate the resulting hashed value. A salt is two or more characters, usually non-displayable binary values, appended to the original value (in our example, the PAN) before it is hashed. The salt adds a level of complexity to the resulting hashed value, making it more difficult, if not impossible, to use rainbow tables to break hashed values.
One useful thing about a hashed value is that you can still use it for research, as long as you are not using a rotating salt. With a fixed salt, the same PAN produces the same hashed value every time it is hashed. This can be very important for fraud investigators and anyone else who needs to research transactions conducted with the same PAN. If these people do need the actual PAN, they will have to go to a different system that stores the encrypted PAN or go to their card processor for the PAN.
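To make the idea of a salt concrete, here is a small Python sketch that appends random binary bytes to the PAN before hashing. The helper name salted_hash, the 16-byte salt length and the use of SHA-256 are assumptions on my part, not anything mandated by the PCI DSS.

```python
import hashlib
import os


def salted_hash(pan: str, salt: bytes) -> str:
    """Hash the PAN with the salt appended (hypothetical helper)."""
    return hashlib.sha256(pan.encode("ascii") + salt).hexdigest()


# 16 random, non-displayable bytes; the length is an assumption.
salt = os.urandom(16)

test_pan = "4111111111111111"  # published test card number, not a real account
unsalted = hashlib.sha256(test_pan.encode("ascii")).hexdigest()
salted = salted_hash(test_pan, salt)

# The salted digest differs from the plain one, so a table precomputed
# from unsalted PAN hashes will not contain it.
print(salted != unsalted)
```

The salt itself needs to be protected like any other secret. An attacker who obtains it can simply build a new table that includes it.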
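Here is a hedged sketch of what that kind of research could look like in Python: transactions are grouped by the hashed PAN rather than by the PAN itself. The FIXED_SALT value, the hash_pan helper and the transaction records are all made up for the illustration, and the card numbers are published test numbers.

```python
import hashlib
from collections import defaultdict

# Hypothetical fixed (non-rotating) salt; a real one would be random
# and stored securely, not hard-coded.
FIXED_SALT = bytes.fromhex("00112233445566778899aabbccddeeff")


def hash_pan(pan: str, salt: bytes = FIXED_SALT) -> str:
    """Hash the PAN with the salt appended (hypothetical helper)."""
    return hashlib.sha256(pan.encode("ascii") + salt).hexdigest()


# Made-up transaction records using published test card numbers.
transactions = [
    ("T1", "4111111111111111"),
    ("T2", "5555555555554444"),
    ("T3", "4111111111111111"),
]

# Group transaction IDs by hashed PAN; only the hash is kept as the key.
by_hash = defaultdict(list)
for txn_id, pan in transactions:
    by_hash[hash_pan(pan)].append(txn_id)

# Both transactions made with the first card land under the same key.
print(by_hash[hash_pan("4111111111111111")])
```

If the salt changed between transactions, T1 and T3 would end up under different keys and the investigator could no longer tie them together.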
The PCI DSS issues a warning about using hashing and truncation together. There is a note in requirement 3.4 that states:
“It is a relatively trivial effort for a malicious individual to reconstruct original PAN data if they have access to both the truncated and hashed version of a PAN. Where hashed and truncated versions of the same PAN are present in an entity’s environment, additional controls should be in place to ensure that the hashed and truncated versions cannot be correlated to reconstruct the original PAN.”
This note is most relevant to situations where the truncation keeps the first six digits AND the last four digits. However, it brings up a good point regarding all methods of obscuring information, including encryption. Never, ever store the obscured value along with the truncated value. Always separate the two values, and also implement security on the obscured value so that people cannot readily get the obscured value and the truncated value together without oversight and management approval.
Hopefully now we all have a base-level knowledge of hashing.
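To see why the note calls the effort “relatively trivial,” consider this Python sketch. It assumes a 16-digit PAN, an unsalted SHA-256 hash, and a truncated value that keeps the first six and last four digits, which leaves only six unknown digits, or one million candidates. The function names are hypothetical and the PAN is a published test number.

```python
import hashlib
from typing import Optional


def sha256_hex(value: str) -> str:
    """Unsalted SHA-256 hex digest (assumed hashing scheme)."""
    return hashlib.sha256(value.encode("ascii")).hexdigest()


def recover_pan(first6: str, last4: str, target_hash: str) -> Optional[str]:
    """Try every possible value of the six missing middle digits."""
    for middle in range(10**6):
        candidate = f"{first6}{middle:06d}{last4}"
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


# What an attacker would hold: the truncated PAN and the hashed PAN.
test_pan = "4111111111111111"  # published test card number
truncated_first6, truncated_last4 = test_pan[:6], test_pan[-4:]
stored_hash = sha256_hex(test_pan)

recovered = recover_pan(truncated_first6, truncated_last4, stored_hash)
print(recovered == test_pan)
```

A million hash computations is trivial work on ordinary hardware, and filtering candidates with the Luhn check digit would cut the work by roughly a factor of ten. A secret salt raises the bar, but only for as long as the salt stays secret.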