[0:13] Hashes are useful for more than just passwords. Remember, hashes protect passwords by letting us store a value that can be recreated from the correct password, while making it hard to find an input that gives a specific hash value. Hashes are used in other situations too. They are useful in data structures such as hash maps or Bloom filters, for example. Now, let's look at a situation in which we want to prove who wrote something. We will consider the scenario where Alice, who we will refer to as (A), wants to send messages, which we’ll call (M), to Bob, who we will refer to as (B).

[0:46] Here are their requirements:

1. It should be difficult for a third party to create messages that look like they were made by our sender.
2. We assume encrypted communication, so keeping a message secret isn't the problem. However, we can't guarantee there is no man-in-the-middle doctoring the messages.
3. (A) and (B) have some shared secret, a code or word that only they know. We’ll call this (S).
4. We have a hash function, (H), that can be used to hash arbitrary inputs and produces a cryptographically sound hash.

If we send a message, such as "Sell 100 shares", from Alice to Bob, there will be potential problems, even over an encrypted channel. Firstly, how does Bob know that Alice was the real sender?

[1:30] Anyone could have initiated the encrypted communication and sent the message. So, can we prove the authenticity of the message? Secondly, if Bob acts on the message and Alice claims not to have sent it, can Bob prove that they did? In other words, do we have any way to enforce non-repudiation? Sending the message on its own clearly doesn’t solve the problem. We could send the message along with the secret, but this works only the first time, and only if we know there is no man-in-the-middle, because as soon as the secret is sent, a third party can view it and then use it themselves.

[2:02] We could send the hash of the secret, but then a third party can reuse that hash to send other messages. Instead, we send the hash of the secret and the message together, along with the plain text of the message. This code is then used by Bob to verify the sender of the message. The code is easy for Alice to create and, since the secret itself is never sent, it can't be discovered without a brute-force search of all possible secrets. Bob can check the authenticity of the message by recreating the hashed portion, and can be confident that the message came from Alice because nobody without the secret should be able to create the same hash. This is the basis of hash-based message authentication codes, or HMAC.
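The scheme described here can be sketched in a few lines of Python. This is only an illustration of the simple "hash of the secret and the message" idea from the video, not a production design: SHA-256 stands in for (H), and the names `make_code` and `verify` are hypothetical helpers chosen for this example.

```python
import hashlib
import hmac  # used here only for its constant-time comparison helper


def make_code(secret: bytes, message: bytes) -> str:
    """Alice's side: return H(S || M) as a hex string."""
    return hashlib.sha256(secret + message).hexdigest()


def verify(secret: bytes, message: bytes, code: str) -> bool:
    """Bob's side: recreate the hashed portion and compare it with the code received."""
    expected = make_code(secret, message)
    # compare_digest avoids leaking, through timing, how many characters matched
    return hmac.compare_digest(expected, code)


secret = b"avocado"            # shared secret (S), never transmitted
message = b"sell 100 shares"   # plain-text message (M)

code = make_code(secret, message)  # Alice sends (message, code)

print(verify(secret, message, code))             # genuine message
print(verify(secret, b"sell 900 shares", code))  # doctored message
```

A doctored message fails the check because the attacker cannot recompute the code without knowing the secret.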

[2:41] There is a potential problem with our HMAC algorithm. Let's say the secret, (S), is 'avocado' and the message, (M), is 'sell 100 shares'. What we send is the hash of S concatenated with M, followed by M itself. It's true that an attacker would be unable to extract the secret and use it to send their own message, but what if they sent the same message again, multiple times? They could cause Bob to sell thousands of shares instead of the 100 requested. This is known as a 'replay attack'.
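A small simulation makes the replay problem concrete. It is a sketch under assumptions: SHA-256 plays the role of (H), and `NaiveReceiver` is a hypothetical stand-in for Bob that acts on any message whose code checks out.

```python
import hashlib
import hmac

SECRET = b"avocado"  # shared secret (S) from the example


def make_code(secret: bytes, message: bytes) -> str:
    """Return H(S || M) as a hex string."""
    return hashlib.sha256(secret + message).hexdigest()


class NaiveReceiver:
    """Bob, checking only that the code matches the message."""

    def __init__(self, secret: bytes) -> None:
        self.secret = secret
        self.shares_sold = 0

    def receive(self, message: bytes, code: str) -> bool:
        expected = make_code(self.secret, message)
        if not hmac.compare_digest(expected, code):
            return False
        if message == b"sell 100 shares":
            self.shares_sold += 100
        return True


# Alice sends one genuine message; the attacker records it on the wire.
captured = (b"sell 100 shares", make_code(SECRET, b"sell 100 shares"))

bob = NaiveReceiver(SECRET)
for _ in range(10):          # the attacker replays the same packet ten times
    bob.receive(*captured)

print(bob.shares_sold)  # 1000
```

The attacker never learns the secret, yet every replayed copy passes verification because nothing in the code distinguishes one sending from the next.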

[3:10] So, we have a new requirement: the same message sent multiple times should have a different hash each time. This sounds difficult, but it can be solved if Alice and Bob keep a count of messages sent and received. Let's call this (C). Instead of sending the hash of the secret and the message, we can send the hash of the secret, the counter and the message. Alice and Bob each need to keep track of their own counter, and each time they send or receive a message, they increment it. This means that the code that goes with the message can be reproduced by Bob, but can't be reused, even for the same message, since it is constructed with a new number each time.
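The counter idea might look like the following sketch. The video does not say how S, C and M are combined, so the fixed-width 8-byte counter encoding is an assumption made here to keep the concatenation unambiguous; `Sender` and `Receiver` are hypothetical classes for Alice and Bob, and SHA-256 again stands in for (H).

```python
import hashlib
import hmac


def make_code(secret: bytes, counter: int, message: bytes) -> str:
    """Return H(S || C || M), with C encoded as 8 big-endian bytes (an assumption)."""
    return hashlib.sha256(secret + counter.to_bytes(8, "big") + message).hexdigest()


class Sender:
    """Alice: increments her counter after every message sent."""

    def __init__(self, secret: bytes) -> None:
        self.secret = secret
        self.counter = 0

    def send(self, message: bytes) -> tuple[bytes, str]:
        code = make_code(self.secret, self.counter, message)
        self.counter += 1
        return message, code


class Receiver:
    """Bob: accepts a message only if the code matches his expected counter."""

    def __init__(self, secret: bytes) -> None:
        self.secret = secret
        self.counter = 0

    def receive(self, message: bytes, code: str) -> bool:
        expected = make_code(self.secret, self.counter, message)
        if hmac.compare_digest(expected, code):
            self.counter += 1
            return True
        return False


alice, bob = Sender(b"avocado"), Receiver(b"avocado")
packet = alice.send(b"sell 100 shares")

print(bob.receive(*packet))  # first delivery is accepted
print(bob.receive(*packet))  # a replay of the same packet is rejected
```

Note that the counter itself is not transmitted in this sketch; Bob reproduces the code from his own copy of the count.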

[3:46] Neither the number (C) nor the secret (S) can be gleaned from the code itself, because the one-way hash function effectively hides them. There is, however, a potential for messages to be lost. Imagine Alice sends a message with C = 12, but Bob doesn’t receive it due to network problems, interception, full buffers or any other reason. Alice is going to send their next message with C = 13, but Bob will try to check the code by recreating it with C = 12, and the two codes won't match. Does this mean an attacker can break the system by clobbering just one message? To overcome this, the receiver, Bob, should first try the current expected value for the counter.

[4:25] If that fails, then he should try the next one, and so on, up to a maximum value of, say, the expected counter value plus 20. If Bob doesn’t find a match, the message is rejected. If Bob succeeds, he accepts the code and sets his counter to one higher than the successful value, and now both sender and receiver are synchronised again.
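The resynchronisation rule can be sketched as a receiver that searches a window of counter values. As before, this is an illustration under assumptions: SHA-256 for (H), an 8-byte counter encoding, and a hypothetical `WindowedReceiver` class; the window of 20 is the example figure from the video.

```python
import hashlib
import hmac

WINDOW = 20  # how far ahead of the expected counter Bob is willing to look


def make_code(secret: bytes, counter: int, message: bytes) -> str:
    """Return H(S || C || M), with C encoded as 8 big-endian bytes (an assumption)."""
    return hashlib.sha256(secret + counter.to_bytes(8, "big") + message).hexdigest()


class WindowedReceiver:
    """Bob: tries the expected counter, then each later value up to expected + window."""

    def __init__(self, secret: bytes, counter: int = 0, window: int = WINDOW) -> None:
        self.secret = secret
        self.counter = counter
        self.window = window

    def receive(self, message: bytes, code: str) -> bool:
        for candidate in range(self.counter, self.counter + self.window + 1):
            expected = make_code(self.secret, candidate, message)
            if hmac.compare_digest(expected, code):
                # Resynchronise: the next expected value is one past the match.
                self.counter = candidate + 1
                return True
        return False  # no match inside the window, so reject


secret = b"avocado"
message = b"sell 100 shares"

bob = WindowedReceiver(secret, counter=12)   # Bob never saw the message with C = 12
ok = bob.receive(message, make_code(secret, 13, message))

print(ok, bob.counter)  # True 14
```

Because the search only moves forward, a code built with an earlier counter value (including a replayed one) can never match again.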

Hashes for authenticating messages

Now that we have looked at hashes in more detail, we will investigate other ways they can be used: to add proof of origin to messages, or to allow two parties to convince each other that they both know the same secret. This should work without eavesdroppers learning what the secret is, and without giving the secret away if the other party was lying.

A hash is a one-way function and we have already seen how we can use the equivalence of two hash values as a pretty good proxy for equivalence of the values that were hashed. Now we will see how we can apply this to ideas of ‘shared secrets’ that can be verified between two parties without actually giving away the secret.

Your task

Conduct research and discuss hash-based message authentication codes used in real products and services. Are there any uses that surprise you?

If you’d like to get technical, try reading about how such protocols are constructed to defend against a ‘length extension’ attack. It is interesting to see that simply changing the order in which things are computed can change the security of a protocol even when the underlying components (in this case, the hashing algorithm) remain the same. In other words: having a cryptographically secure hash doesn’t mean its use will always lead to a cryptographically secure protocol.
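As a starting point for that reading, the sketch below contrasts the simple "hash of secret then message" ordering with the standardised HMAC construction, which hashes twice with the key mixed into both an inner and an outer pass. The function `hmac_sha256_manual` is a hypothetical helper written for this illustration; it is checked against Python's standard `hmac` module, which is what real code should use.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes input in 64-byte blocks


def hmac_sha256_manual(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256 written out step by step, for illustration only."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()       # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")         # then padded to the block size
    inner_key = bytes(b ^ 0x36 for b in key)     # key XOR inner pad
    outer_key = bytes(b ^ 0x5C for b in key)     # key XOR outer pad
    inner = hashlib.sha256(inner_key + message).digest()
    return hashlib.sha256(outer_key + inner).digest()


secret = b"avocado"
message = b"sell 100 shares"

naive = hashlib.sha256(secret + message).digest()            # H(S || M)
standard = hmac.new(secret, message, hashlib.sha256).digest()

print(hmac_sha256_manual(secret, message) == standard)  # True
print(naive == standard)                                # False
```

With the naive ordering, the published code is the hash function's final state over the secret followed by the message, which is what a length extension attack builds on for hashes such as SHA-256. In the nested construction, that inner result is hashed again under the key and is never revealed.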

This video is from the free online course:

An Introduction to Cryptography

Coventry University