# Deciphering using randomness

A slight detour to see how substitution ciphers can be broken using a technique called Markov chain Monte Carlo.

Suppose that we have a tool which gives us a measure of how “likely” it is that a given piece of text is written in English: let’s call this the text’s plausibility score. We can use this, along with some ideas from probability, to write a computer algorithm to decipher substitution codes.

Starting from the ciphertext, what our algorithm is going to do is the following:

1. Pick two letters at random, and swap all of their occurrences in the current piece of text. (For example, suppose that the computer chooses letters K and M: it then replaces each K in the current piece of text with M, and vice versa.)
2. Decide whether or not this swap might help us to find something that looks more “like English”. Either accept this swap (and keep the new version of the text), or reject it (and keep the previous version of the text).

The algorithm then iterates those two steps. If we do this carefully, then we might hope that after a lot of iterations the resulting text will have a high plausibility score, and hence be a good approximation to the true plaintext.

The difficult part of this plan is, of course, working out how to do Step 2 above: given the plausibility scores of the previous and new versions of the text, how do we decide whether or not we want to accept or reject the new version? It turns out that the following is a mathematically sensible rule:

1. If the new version’s score is higher than the previous one: this looks like a good move and so we accept the swap.
2. If the new version’s score is lower than the previous one: this doesn’t look promising, and so we might think that we’d be better off rejecting this swap and sticking with the previous version of the text. However, we don’t want to be too hasty! Even though the swap that we’re considering isn’t itself very plausible, perhaps it’s the case that doing this swap now will actually help us to find something even more plausible when we perform some other random swaps in later iterations. So in this case, the right thing to do is to sometimes accept the swap, and sometimes reject.

The precise details of how to make the accept/reject decision are beyond the scope of this course. The idea behind why the algorithm works can be understood as follows. Picture the set of all possible permutations of the English alphabet as a hilly landscape; peaks in this landscape correspond to permutations which, when we apply them to the ciphertext, produce something with a high plausibility score; valleys represent permutations which give us output that looks nothing like English. Our algorithm effectively “walks around” on this landscape, hoping to find a high peak. It considers taking a step in a random direction: if that would move it uphill then it accepts, and steps in that direction; if it would move it downhill, then it sometimes accepts it, in the hopes that this helps it to find a peak later on.

This prevents the algorithm from getting stuck in a local maximum. Consider the sketch below, which represents a very simple surface. If the algorithm is currently at the top of the small peak, then moving in any direction will take it downhill: if it always rejects such a move, then it will never be able to find its way to the much bigger peak on the other side of the valley.

### Seeing the algorithm in practice

A student in the Department of Mathematics at the University of York implemented this algorithm as part of his undergraduate final year project. You can see it in action in this video:

Here are some images from the video to help explain what’s going on. Soon after the algorithm has started, the screen looks like this:

The first column shows the number of iterations; the program prints a new line of information after every 100 iterations. Next comes the plausibility score of the version of the text that it has reached after that number of iterations. Then comes a “percentage correct” column; since the computer actually knows the original plaintext, it can see what percentage of letters in the current version are correct. (In reality we wouldn’t know this, of course, but it’s helpful here to get an idea of how well the algorithm is working!) At the end of each line the screen shows the first 80 or so letters of the current version of the text; as it performs more and more iterations, we hope that this starts to look more and more like something that’s recognisable as English!

Note that in this version we’re working with more than 26 letters in the alphabet! We’re actually taking the alphabet to include 26 lower case letters, 26 upper case letters, the digits 0–9, plus all of the standard punctuation marks. (That is, when the ciphertext was created, all of those characters were jumbled up.) The total number of possible permutations is therefore more like 80 factorial – that’s roughly 7 followed by 118 zeros!

The next image shows the algorithm output after a few more iterations: you can see that after 2,800 iterations the percentage correct jumps from pretty much 0% to 17%. At this point the algorithm has just (randomly) managed to get the spaces in the correct place: you can see that the writing at the bottom of the right-most column now looks like it could be made up of words with spaces between them, even though the “words” so far don’t look like English!

Here’s a shot of the final output: the plausibility score is very high, and the algorithm has arrived at a version of the text in which 97% of the characters agree with the true plaintext. The output now definitely looks like English, and in fact you might recognise it as an excerpt from Roald Dahl’s Charlie and the Chocolate Factory. Notice that this took about 22,300 iterations to complete: much quicker than checking all possible permutations!

If you’d like to examine the ciphertext and the final output of the algorithm in more detail, they can both be downloaded from the bottom of this page.

#### Going further

The algorithm that we’ve introduced here is an example of something called Markov chain Monte Carlo. You can easily find out plenty more about this topic by doing a simple web search. The substitution cipher problem was studied by the author of this page when he was an undergraduate: you can read more about what he did with the algorithm, as well as learn about the way in which we can arrive at a plausibility score, in his dissertation.