# Modular Exponentiation

The key to Shor’s algorithm is the insight that the function is periodic, and that we can extract that period using the quantum Fourier transform. But this requires us to calculate in superposition. Can we do that efficiently?

## The State

The state we need is the superposition of for all of the possible values of .

For reasons related to cleanly finding the period, we need the number of qubits in to be twice the number of bits in the number . If we want to find the factors of a 1,024-bit number (substantially larger than the 768-bit number that is the best solved classically), we should use 2,048 qubits to represent .

This means that the *exponent* – the power we are raising to – is as
much as (all ones in a 2,048-bit number). You
might guess that the numbers grow very quickly, and indeed they do.
Let’s pick . The “small” number ,
and that’s still just the beginning; to factor a 1,024-bit number, we
would need to calculate the enormous number , which
has more than *digits* in it!

Fortunately, *we don’t have to keep all of the digits*, since we are
working with modulo arithmetic. (Good thing, too, since writing down
such a number would take more than all of the atoms in the Universe,
even if we could write one decimal digit in each atom.) Rather than
try to calculate the whole thing, then take the modulo, instead we do
all of multiplications modulo as we go along, and the math works
out just fine.

So, to write down our state, if is a 1,024-bit number, we need 2,048 qubits to hold and 1,024 to hold . We will also need various temporary variables during the computation, so we might need a total of five or six thousand qubits in the fastest, most straightforward form of the computation.

## The Algorithmic Idea

If you are asked to calculate , how will you do it? The most straightforward way is to keep multipling by three: which gives us . Repeat that a full 1,024 times. But that is extremely tedious even for exponents as small as a thousand, and instead we need to do this for exponents that are many, many digits long when we write them out. We would be stuck with the same problem of the exponential growth in computation that we’re trying to avoid. Above we solved the space problem by only keeping the modulo results; now we need to solve the time problem.

There is a simple fix: repeated squaring. Instead of multiplying by 3 each time, square the result of the last computation:

and so on. This will require repetitions for bits.

## The Arithmetic

The modular exponentiation clearly requires modulo multiplications. The most straightforward way to multiply is just the way you learned in school: compute all of the partial products, one digit at a time, then sum them. Thus, we create partial products of qubits each, and summing them will take additions. Each of those additions will require gates, using the CNOT and CCNOT gates we discussed earlier in this course.

But it’s worse than that: performing the modulo operations requires a lot of work to decide if we have grown bigger than , and if so then work to subtract from the register. This multiplies our work by a factor of five or so, depending on the details of the algorithm we choose to implement. However, it is still operations.

Our complete modular exponention, then, requires

So our execution time for this phase of the algorithm is to factor an -bit number, which is polynomial rather than exponential.

## Comments

Space (number of qubits), time and fidelity (probability of no errors occurring) are our biggest worries when trying to build a quantum computer to run a particular application.

Various optimizations have reduced the number of qubits from the or more that we mentioned above to about qubit to factor an -bit number, but they are complex and generally slow. At any rate, thousands of extremely high-quality qubits are necessary before Shor’s algorithm solve problems that we can’t solve classically. This is a significant engineering challenge, as we will see later in the course.

The performance depends on the *circuit depth*: how many gates in a
row do we have to execute? Sometimes, we can reduce the amount of
time necessary by executing some gates in parallel (at the same
time). This depends on both the algorithm we are running, and the
structure of the quantum computer itself, most importantly, how easy
it is to move data from one side of the quantum computer to the
other.

The required fidelity depends on the number of qubits in our system and that circuit depth. To figure out how good our quantum computer has to be, multiply the number of qubits times the number of time steps, and we are allowed less than one error, on average, in that many gate operations. Let’s say that we need qubits, and we require about time steps, so we can have less than one error in about gates. That’s pretty high precision! (This computation isn’t exact, but gives you the idea.) Other algorithms require fewer qubits and fewer timesteps, and may be more tolerant of errors; finding those boundaries is an important area of research.

© Keio University