hash collision Q&A

Cryptography Research has received many inquiries about the hash collision attacks that were recently announced at the CRYPTO 2004 conference. This document attempts to address these questions.


Q: What hash functions are now broken?
A: Collisions were announced in SHA-0, MD4, MD5, HAVAL-128, and RIPEMD. Antoine Joux presented the collision in SHA-0. The other collisions were found by the Chinese researcher Xiaoyun Wang with co-authors Dengguo Feng, Xuejia Lai, and Hongbo Yu. (See http://eprint.iacr.org/2004/199.pdf.)

Q: What is a collision attack and a preimage attack?
A: A preimage attack would enable someone to find an input message that causes a hash function to produce a particular output. In contrast, a collision attack finds two messages with the same hash, but the attacker can't pick what the hash will be. The attacks announced at CRYPTO 2004 are collision attacks, not preimage attacks.

Q: What is the connection between digital signatures and hash functions?
A: All major digital signature signing techniques (including DSA and RSA) involve first hashing the data then signing the hash. Raw message data is not signed because of both performance and security reasons.

Q: How might an attacker exploit a collision attack?
A: To exploit a collision attack, an adversary would typically begin by constructing two messages with the same hash where one message appears legitimate or innocuous. For example, suppose the attacker (Charlie) discovers that the message "I, Bob, agree to pay Charlie $ 5000.00 on 4/12/2005." has the same hash as "I, Bob, agree to pay Charlie $18542841.54 on 9/27/2012." Charlie could then try to get Bob (the victim) to digitally sign the first message (e.g., by purchasing $5000 of goods). Charlie would then claim that Bob actually signed the second message, and "prove" this assertion by showing that Bob's signature matches the second message.

Q: What are the implications of collision attacks for code signing systems?
A: Collisions can be a problem for systems that involve signed code. In particular, a collision attack can enable adversaries to construct an innocuous program and a malicious program with the same hash. For example, a trusted compiler/verifier might accept and sign the innocuous program, which could then be substituted for the malicious one. Collision attacks do not allow tampering with arbitrary programs; this would require a preimage attack. (Note: Java accepts MD5 hashes in signatures on JAR files, e.g. see http://www.hmug.org/man/1/jarsigner.html.)

Q: What are the implications for certificate authorities, such as those issuing SSL web server certificates containing MD5 hashes?
A: Collision attacks do not enable tampering with existing certificates. There is, however, a concern that an adversary might be able to construct a valid certificate request that had a corresponding hash collision with a certificate conferring greater or different powers. For example, a devastating attack would be one that enabled adversaries to obtain a legitimate server certificate with a collision to one containing a wildcard for the domain name and an expiration date far in the future. The use of unpredictable serial numbers early in the certificate data structure may prevent such attacks, but further research is required. From a cryptographic perspective, the best solution to this problem is to transition away from MD5, but this is difficult since many CAs and software programs currently support MD5.

Q: Are all hash functions broken?
A: No. The new attacks affect specific hash functions which happen to share a related class of vulnerabilities. In particular, these attacks are all based on the neutral bit technique of Biham and Chen (see http://eprint.iacr.org/2004/146.ps). There is no evidence suggesting that strong hash functions cannot be constructed.

Q: Is SHA-1 broken?
A: No. Eli Biham described attacks that work against simplified versions of SHA-1, but there is no suggestion that any known attack technique can be extended to break the full SHA-1. (The attacks presented against SHA-0 are also effective against a 36-round reduced variant of SHA-1, but the standard version of SHA-1 uses a full 80 rounds and has not been compromised.) Although there has been speculation that SHA-1 will fall soon, extending the current results to the full SHA-1 appears to be an extremely difficult problem and we do not anticipate such an attack in the immediate future. Nevertheless, the new results certainly do merit a full re-evaluation of all hash functions.

Q: What is the best known attack against SHA-1?
A: At this point, the best known collision attack against SHA-1 is still a brute-force search. This requires computing on the order of 2^80 (or 1,208,925,819,614,629,174,706,176) arbitrary hashes until two messages happen to hash to the same value.

Q: Do these attacks break HMAC using MD5?
A: No. Because of the way hash functions are used in the HMAC construction, the techniques used in these recent attacks do not apply.

Q: Do these attacks allow somebody to break tools that use MD5 to check for malicious binaries?
A: Not usually, as this would require a preimage attack. It would, however, be possible for someone to construct an innocuous program and a malicious program with the same hash. If this adversary could get the innocuous version on the "good" list (e.g. by having a trusted authority sign the hash value), the malicious program would also be accepted.

Q: What is the difference between SHA-0 and SHA-1? Is SHA-0 widely used?
A: SHA-0 was initially proposed in FIPS 180 (May 1993) as hashing standard by the U.S. government, but was replaced by SHA-1 in FIPS 180-1 (April 1995). SHA-1 adds an additional circular shift operation that appears to have been specifically intended to address the weaknesses found in SHA-0. SHA-0 is not widely used and should not be used in new systems.

Q: Do systems designed by Cryptography Research use any of the hash functions that were broken?
A: No. We normally use AES, triple DES, RSA, and SHA-1 unless there is a reason to use different algorithms. (See the following question for information about SSL.)

Q: Is SSL 3.0/TLS affected by these results?
A: The SSL 3.0 protocol (which was co-authored by Cryptography Research President & Chief Scientist Paul Kocher) uses MD5 and SHA-1 in a redundant fashion in the handshake protocol and also supports MD5 HMAC. Neither use is affected by these attacks. While there is also some concern that signing authorities could be affected (see the question above on certificate authorities), certificate formats and procedures are beyond the scope of the SSL/TLS protocol.

Q: Can the problem be solved by updating hash function implementations to detect the messages that produce collisions?
A: No. The attack methods are general and enable the construction of additional collisions.