Hashing Hashed

Bruce Schneier reports that a reputable team in China appears to have found significant collision problems with the SHA-1 algorithm. Naturally, this comes just as I revised my old article on PHP Cryptography to add a footnote to the MD5 section saying that SHA-1 is now the preferred hashing algorithm for security purposes. Such is life.

The truth is that while collisions may be bad news for verifying the absolute, pristine uniqueness of data via a signature, many of the ways that SHA-1 and even MD5 are used in the PHP world are unlikely to be seriously affected by these vulnerabilities. So SHA-1 is not bulletproof. Does that really mean that anyone short of a postgraduate student in mathematics is going to be able to devise a different set of data that produces the same hash, let alone a different set of data that actually does something malicious in the context of the running program? Give me some statistics on that likelihood; my sense is that the server housing the code is much more likely to be struck by lightning.
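For a rough sense of scale, here is a small Python sketch of the generic brute-force costs for an ideal hash of a given output size. These are textbook bounds, not figures from the still-unpublished paper, and the function name `generic_work` is made up for illustration. Finding any two inputs that collide costs about 2^(n/2) tries by the birthday bound, while finding a second input that matches one specific, existing input costs about 2^n.

```python
import hashlib


def generic_work(bits):
    """Approximate brute-force tries against an ideal hash with `bits` of output.

    Illustrative helper (not from any library): the collision figure is the
    birthday bound, the second-preimage figure is an exhaustive search.
    """
    return {
        "collision": 2 ** (bits // 2),
        "second_preimage": 2 ** bits,
    }


if __name__ == "__main__":
    for name in ("md5", "sha1"):
        bits = hashlib.new(name).digest_size * 8
        work = generic_work(bits)
        print(
            f"{name}: {bits}-bit output, "
            f"collision ~2^{work['collision'].bit_length() - 1} tries, "
            f"second preimage ~2^{work['second_preimage'].bit_length() - 1} tries"
        )
```

A reported weakness in collision resistance lowers the first figure. Swapping out data that somebody else has already hashed is governed by the second one.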

And then, what about using SHA-1 and MD5 signatures in combination, as many sites do to verify the integrity of downloads? How likely is it that someone will sneak in a malicious binary whose hashes collide with both signatures? How much more likely is it that any data that satisfies both hashes (if it exists) is simply garbage that won’t execute?
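As a minimal sketch of that double check, the snippet below hashes a downloaded file with both algorithms in one pass and accepts it only if both published digests match. It is written in Python for illustration; in PHP the built-in md5_file() and sha1_file() functions would do the hashing. The names `file_digests` and `verify_download` are hypothetical, as is the idea that the site publishes its digests as hex strings.

```python
import hashlib


def file_digests(path, chunk_size=65536):
    """Return (md5_hex, sha1_hex) for the file at `path`, read in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as fh:
        # Feed each chunk to both hashes so the file is read only once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def verify_download(path, expected_md5, expected_sha1):
    """Accept the file only if BOTH digests match the published values."""
    actual_md5, actual_sha1 = file_digests(path)
    return (
        actual_md5 == expected_md5.strip().lower()
        and actual_sha1 == expected_sha1.strip().lower()
    )
```

A forged file would have to match the MD5 digest and the SHA-1 digest at the same time, which is the scenario the questions above are asking about.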

I could be wrong or crazy here, but I think it is very important to put security alerts in context. Saying SHA-1 is “broken” doesn’t help programmers decide what steps, if any, they need to take with existing applications that use SHA-1. I’m sure it is a valid statement in mathematics and cryptography, but what should we coders do with the information? Too soon to tell, I’m afraid: the paper is not out yet.