There’s a lot to remember about hashes, so I’m bringing the definitions into one place as a reference.
Term | Definition |
Avalanche effect | Small changes in the input lead to big changes in the output: flipping one bit of the input should flip about 50% of the output bits on average. |
Deterministic | An input will always give the same output |
Preimage Resistance | Hard to work out the inputs from the hash |
Second Preimage Resistance | If you have a (specific) hash, it should be hard to find another input that will give the same hash |
Collision Resistance | Hard to find two inputs that create the same hash (any hash, we don’t care what it is) |
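To see the first two rows of the table in action, here is a minimal sketch using Python's `hashlib` with SHA-256. The choice of SHA-256, the sample message and the helper name `bit_difference` are my own assumptions for illustration.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count how many bits differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg1 = b"hello world"
# Flip the lowest bit of the first byte: 'h' (0x68) becomes 'i' (0x69).
msg2 = bytes([msg1[0] ^ 0x01]) + msg1[1:]

h1 = hashlib.sha256(msg1).digest()
h2 = hashlib.sha256(msg2).digest()

# Deterministic: hashing the same input again gives the same output.
assert hashlib.sha256(msg1).digest() == h1

# Avalanche effect: a one-bit input change should flip roughly half of the
# 256 output bits.
changed = bit_difference(h1, h2)
print(f"{changed} of 256 output bits changed")
```

The exact count varies from input to input, but it should land somewhere near 128.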
Common Hash Attacks
Dictionary attack
A dictionary attack is a type of brute force attack that uses a predefined list of likely passwords, such as common words, phrases or commonly used passwords. The list can be tailored if the attacker knows something about the target, for instance using children’s names or birthdays, pets’ names, street name, company name etc.
Rainbow/Lookup Tables
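A lookup table is a precomputed list of hashes and their respective plaintexts. If an attacker has a password hash, they can simply look it up and find the actual password. A rainbow table does the same job but saves space by storing chains of hashes rather than every hash and plaintext pair.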
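Here is a rough sketch of the simpler of the two, a plain lookup table, again assuming unsalted SHA-256 and a made-up password list. It is not a true rainbow table, which would need chains and a reduction function.

```python
import hashlib

def build_lookup_table(passwords: list[str]) -> dict[str, str]:
    """Precompute hash -> plaintext once, so every later lookup is instant."""
    return {hashlib.sha256(p.encode()).hexdigest(): p for p in passwords}

# Hypothetical list of common passwords.
table = build_lookup_table(["password", "letmein", "qwerty", "dragon"])

# Hypothetical stolen hash (computed here so the example is runnable).
stolen_hash = hashlib.sha256(b"letmein").hexdigest()

print(table.get(stolen_hash))  # letmein
```

The work is done up front, so the same table can be reused against every hash the attacker gets hold of.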
Birthday Attack
A birthday attack can be carried out by an adversary who has access to both a legitimate document and an illegitimate replacement. They keep modifying and hashing each document until they find a version of the legitimate one and a version of the illegitimate one that create the same hash.
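A real hash is far too long to collide on a laptop, so this sketch cheats: it assumes a toy 16-bit hash (the first two bytes of SHA-256) and treats trailing spaces as the harmless modification. The documents and function names are hypothetical.

```python
import hashlib
from itertools import count

def short_hash(data: bytes) -> bytes:
    """Toy 16-bit hash: the first two bytes of SHA-256 (an assumption made
    so that collisions are cheap to find)."""
    return hashlib.sha256(data).digest()[:2]

def find_matching_pair(legit: bytes, fake: bytes, variants: int = 4096):
    """Find a legit variant and a fake variant with the same short hash."""
    # Hash many harmless variants of the legitimate document.
    seen = {}
    for i in range(variants):
        candidate = legit + b" " * i
        seen.setdefault(short_hash(candidate), candidate)
    # Keep modifying the fake document until it hits one of those hashes.
    for j in count():
        candidate = fake + b" " * j
        match = seen.get(short_hash(candidate))
        if match is not None:
            return match, candidate

good, bad = find_matching_pair(b"Pay Alice 10", b"Pay Mallory 10000")
print(short_hash(good) == short_hash(bad))  # True
```

Because both documents are being varied, far fewer attempts are needed than when hunting for one specific hash, which is the point of the birthday attack.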
Chosen-plaintext Attack
A chosen-plaintext attack is when an attacker can influence the input and also view the output. An example, back in WWII: the English know there are spies living amongst them who read the newspaper, encrypt messages and send them back to Germany. The English are intercepting these messages and are slowly building up a code book, but have some words missing, say the word ‘king’. The English print a small snippet about the King, using words whose codewords they already know, knowing it will be encoded and sent to Germany. It is then a matter of elimination to find the ciphertext for ‘king’. As with all deciphering, the more messages intercepted the better for verifying correct translation.
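The elimination step in that story can be sketched in a few lines. Everything here is invented for illustration: the codebook, the codewords and the planted sentence.

```python
# Hypothetical secret codebook, known only to the spies.
SECRET_CODEBOOK = {"the": "QX", "visits": "LM", "london": "ZT", "king": "RB"}

def spy_encode(text: str) -> list[str]:
    """The spy encodes whatever appears in the newspaper, word by word."""
    return [SECRET_CODEBOOK[word] for word in text.lower().split()]

# What the English have worked out so far: everything except 'king'.
known = {"the": "QX", "visits": "LM", "london": "ZT"}

# The chosen plaintext: a snippet planted in the newspaper.
planted = "The King visits London"
intercepted = spy_encode(planted)

# Eliminate every codeword that is already understood.
known_codewords = set(known.values())
unknown = [code for code in intercepted if code not in known_codewords]

print(unknown)  # ['RB']
```

Only one codeword is left over, so it must stand for ‘king’.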
D. Learn about HMACs (and revise stream ciphers, MACs, and length extension attacks)
I enjoyed this video; it gave a good overview of stream ciphers, where the message is XOR’d with a ‘keystream’. A keystream is a pseudorandom string of bits that the receiver can also generate (from the shared key) for decryption. Because each bit of ciphertext lines up with one bit of plaintext, a man in the middle can flip bits in the ciphertext and the same bits will flip in the decrypted message (we’ll learn how to prevent this).
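Here is a small sketch of both the XOR and the bit flipping. The keystream generator is a toy of my own (SHA-256 of the key plus a counter), not a real stream cipher, and the key and message are made up.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of key + counter, repeated until long enough.
    An assumption for illustration only, not a vetted cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def crypt(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream. The same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

key = b"shared secret"
plaintext = b"PAY 100 TO BOB"
ciphertext = crypt(key, plaintext)
assert crypt(key, ciphertext) == plaintext

# Man in the middle: knows the message layout but not the key.
# XORing the ciphertext byte with ('1' XOR '9') turns the '1' into a '9'.
tampered = bytearray(ciphertext)
tampered[4] ^= ord("1") ^ ord("9")

print(crypt(key, bytes(tampered)))  # b'PAY 900 TO BOB'
```

The attacker never learns the key or the keystream, yet the receiver decrypts a perfectly readable, altered message.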
On to block encryption.
Fingerprinting
So how do we create integrity in a message? A naive approach is to append a hash to a message, where the hash provides a fingerprint of the message.
Hash
Fingerprint: h(m), sent as m|h(m), where:
- m = message
- h = hash function
- | = append
But of course an attacker could just replace the message and append a hash for that new message and the receiver would have no way of checking.
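A quick sketch of why the bare hash fails, assuming SHA-256 as the hash function. The function names and messages are hypothetical.

```python
import hashlib

def fingerprint(message: bytes) -> bytes:
    return hashlib.sha256(message).digest()

def send(message: bytes) -> bytes:
    """Append the fingerprint to the message: m | h(m)."""
    return message + fingerprint(message)

def verify(packet: bytes) -> bool:
    """Split off the last 32 bytes and check them against the message."""
    message, tag = packet[:-32], packet[-32:]
    return fingerprint(message) == tag

genuine = send(b"transfer 10 to Bob")

# The attacker needs no secret at all: they just run the same function.
forged = send(b"transfer 9999 to Mallory")

print(verify(genuine), verify(forged))  # True True
```

Both packets pass, so the hash only protects against accidental corruption, not against someone deliberately replacing the message.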
MAC
So MACs (Message Authentication Codes) were developed. A MAC relies on a shared secret between the sender and the legitimate receiver. The message still has a fingerprint appended, but the fingerprint is more complex.
Fingerprint: h(k|m), where:
- m = message
- k = key
- h = hash function
- | = append
This is better, because an attacker without the key can no longer compute a fingerprint for a replacement message from scratch. However, it is vulnerable to a length extension attack because of the Merkle-Damgård construction underlying hash functions such as MD5, SHA-1 and SHA-256.
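A minimal sketch of a keyed fingerprint, here with the key placed before the message (the arrangement that length extension targets). SHA-256, the key and the function names are assumptions for illustration; this is the naive construction, not something to use in practice.

```python
import hashlib
import hmac

def prefix_mac(key: bytes, message: bytes) -> bytes:
    """Naive keyed fingerprint: hash of key followed by message."""
    return hashlib.sha256(key + message).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    # compare_digest compares in constant time to avoid leaking timing info.
    return hmac.compare_digest(prefix_mac(key, message), tag)

key = b"shared-secret"           # hypothetical key known to both parties
message = b"transfer 10 to Bob"
tag = prefix_mac(key, message)

# The attacker swaps the message but, without the key, can only guess a tag.
forged_message = b"transfer 9999 to Mallory"
forged_tag = hashlib.sha256(forged_message).digest()

print(verify(key, message, tag))                # True
print(verify(key, forged_message, forged_tag))  # False
```

Replacing the whole message no longer works, but extending the original message still can.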
HMAC
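Enter HMAC (Hash-based Message Authentication Code). It works in a similar way, adding a fingerprint to the message, but the fingerprint is more complex again. Like the MAC it has a shared secret, the key, but two subkeys, k1 and k2, are derived from the key (by XORing it with two different fixed padding constants), and the hash is applied twice.
Fingerprint: h(k2|h(k1|m)), where:
- m = message
- k1 = subkey 1 (inner)
- k2 = subkey 2 (outer)
- h = hash function
- | = append
HMAC is immune to length extension: because the inner hash is hidden inside a second keyed hash, an attacker cannot extend it into a valid fingerprint for a longer message.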
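To check my understanding of the two subkeys, here is a sketch that builds HMAC-SHA256 by hand and compares it with Python's built-in `hmac` module. The function name `manual_hmac_sha256`, the key and the message are my own; the constants 0x36 and 0x5c are the standard inner and outer pads.

```python
import hashlib
import hmac

def manual_hmac_sha256(key: bytes, message: bytes) -> bytes:
    block_size = 64  # SHA-256 processes 64-byte blocks
    # Keys longer than a block are hashed first, then padded with zeros.
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()
    key = key.ljust(block_size, b"\x00")
    k1 = bytes(b ^ 0x36 for b in key)  # inner subkey
    k2 = bytes(b ^ 0x5C for b in key)  # outer subkey
    inner = hashlib.sha256(k1 + message).digest()
    return hashlib.sha256(k2 + inner).digest()

key = b"shared-secret"           # hypothetical key
message = b"transfer 10 to Bob"

by_hand = manual_hmac_sha256(key, message)
built_in = hmac.new(key, message, hashlib.sha256).digest()

print(by_hand == built_in)  # True
```

In real code the built-in module is the one to use; the hand-rolled version is only there to show where k1 and k2 come from.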
E. Length Extension Attack
A length extension attack is possible against hash functions that process the message block by block, feeding the result of each block into the processing of the next (eg Merkle-Damgård constructions such as MD5, SHA-1 and SHA-256). The final hash is simply the internal state after the last block, so an attacker who knows the hash of k|m and the length of k|m can use that hash as a starting state, process extra blocks of their own choosing, and all of a sudden the message has been added to. The MAC will be valid, even though the attacker never knew the key, and the receiver has no way of knowing. Unless, of course, the parties have an agreed number of blocks in their message.
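To make this concrete without reimplementing SHA-256, here is a sketch on a toy Merkle-Damgård hash of my own invention: 16-byte blocks, a simple length padding, and SHA-256 borrowed as the per-block compression step. The key, message and names are hypothetical, and a real attack has to reproduce the real hash's padding and internal state instead.

```python
import hashlib

BLOCK = 16          # toy block size in bytes (assumption)
IV = b"\x00" * 32   # toy starting state (assumption)

def pad(total_length: int) -> bytes:
    """Padding for a message of total_length bytes: 0x80, zeros, length."""
    zeros = -(total_length + 1 + 8) % BLOCK
    return b"\x80" + b"\x00" * zeros + total_length.to_bytes(8, "big")

def toy_hash(data: bytes, state: bytes = IV, already_hashed: int = 0) -> bytes:
    """Toy Merkle-Damgard hash: each block is mixed into the running state."""
    data += pad(already_hashed + len(data))
    for i in range(0, len(data), BLOCK):
        state = hashlib.sha256(state + data[i:i + BLOCK]).digest()
    return state

# Sender: MAC = h(k|m). The key is known only to sender and receiver.
key = b"secret-key"
message = b"amount=10&to=bob"
mac = toy_hash(key + message)

# Attacker: knows message, mac and the key length (or guesses it), not the key.
original_length = len(key) + len(message)
glue = pad(original_length)          # the padding the sender's hash added
extra = b"&to=mallory"
forged_message = message + glue + extra
# Resume hashing from the published MAC instead of from the IV.
forged_mac = toy_hash(extra, state=mac,
                      already_hashed=original_length + len(glue))

# Receiver: recomputes h(k|forged_message) and finds that it matches.
print(toy_hash(key + forged_message) == forged_mac)  # True
```

The forged message contains the padding bytes in the middle, which is the tell-tale sign of this attack, but the MAC check itself passes.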