# Linear Systems Algorithms

Machine learning involves *classifying* data in order to make
decisions. “Efficient” classical algorithms for finding full
solutions exist, but the sheer volume of data means that computation
times are high. Assuming the availability of certain supporting
technologies or constraints on the data, quantum algorithms can answer
certain questions about the data *exponentially faster* than the best
known classical algorithms.

## Matrix Operations

In high school (or junior high school) algebra, you learned that a simple equation can represent a line, and how to solve pairs of equations, like

\(3x + 4y = 7\)

\(4x + 3y = 7\)

to find the answer \(x = 1\), \(y = 1\), indicating the place where
the two lines intersect. With a larger set of equations and a larger
set of variables, we write them as a *matrix*, and combine them in
equations with vectors. A common operation is to try to find \(x\) in
the equation \(Ax = b\), where \(A\) is a matrix that represents the
left hand side of a set of equations, \(b\) is a vector representing
the right hand side of the equations, and \(x\) is a vector of
variables for which we are trying to find values.
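For a small system like the pair of equations above, a classical computer finds \(x\) directly. A minimal sketch in Python using NumPy’s `numpy.linalg.solve`:

```python
import numpy as np

# Coefficients of the two example equations:
#   3x + 4y = 7
#   4x + 3y = 7
A = np.array([[3.0, 4.0],
              [4.0, 3.0]])
b = np.array([7.0, 7.0])

# Solve Ax = b directly (more numerically stable than
# forming the inverse A^{-1} explicitly).
x = np.linalg.solve(A, b)
print(x)  # [1. 1.]
```

For a billion equations, this direct approach becomes expensive, which is where the quantum algorithms below enter the picture.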

The mathematical field of *linear algebra* concerns itself with
matrices and vectors, and these techniques figure prominently in
artificial intelligence, especially the field of *machine learning*,
in which computers are programmed to learn from data.

## Machine learning

Machine learning involves creating a *classifier* that will tell us
whether a datum belongs to a particular class. We can make a
distinction between the strategy used to create the classifier, and
the details of the problem being solved.

Classifiers *learn* how to do their work. They use *training data*
to find the best set of parameters for the problem at hand; the
training data is then discarded before the classifier is used for
production work. *Supervised learning* uses data that someone else
has already classified (e.g., “this is malware, that isn’t”) and
learns what characteristics of the data define a class. *Unsupervised
learning* instead looks at the data and figures out how many different
groups, or clusters, there are in the data. *Reinforcement learning*
is more iterative, with the program learning which actions earn
rewards in real-world situations.

There are many mathematical classification techniques. \(k\)-nearest neighbor, support vector machines (SVM), and \(k\)-means clustering are relatively straightforward, and all can be partially solved or supported using a quantum algorithm known as HHL.
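As a point of reference for what these classifiers do, here is a minimal *classical* \(k\)-nearest-neighbor sketch (the toy dataset and function name are our own illustration, not part of HHL):

```python
import numpy as np

def knn_classify(query, points, labels, k=3):
    """Classical k-nearest neighbor: label a query point by majority
    vote among the k training points closest to it (Euclidean distance)."""
    dists = np.linalg.norm(points - query, axis=1)
    nearest = np.argsort(dists)[:k]
    values, counts = np.unique(labels[nearest], return_counts=True)
    return values[np.argmax(counts)]

# Toy training set: two clusters, labeled 0 and 1.
points = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
labels = np.array([0, 0, 0, 1, 1, 1])

print(knn_classify(np.array([0.15, 0.15]), points, labels))  # 0
print(knn_classify(np.array([1.05, 0.95]), points, labels))  # 1
```

The expensive step for large datasets is the distance computation over all training points, which is one of the operations a quantum algorithm can accelerate.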

## The HHL algorithm

In 2009, Aram Harrow, Avinatan Hassidim and Seth Lloyd found an
algorithm (called, naturally enough, HHL) that helps us prepare a
state \(|x\rangle \propto A^{-1}|b\rangle\), where \(A^{-1}\) is the
*inverse* of the matrix \(A\) (proportional, rather than equal,
because a quantum state must be normalized). At first glance, it
might appear that it will “solve for \(x\)”, as we are often told to
do in math problems, but there is a catch involving the data
representation.

In these linear algebra problems, the natural data representation is
a large vector. We might have, for example, a billion data items,
which we can write as a vector with a billion entries. Simply writing
down this vector requires a billion time steps. HHL, instead, encodes
each data element in the amplitude of a single *basis state* of a
quantum register, using a superposition of all one billion elements.
Because \(2^{30} > 10^9\), we can store all of our data in 30 qubits,
total, instead of a billion separate memory cells. Of course, the
register has to be normalized.

Then we can calculate on all billion values at the same time, in superposition. In the above equation, using HHL, we start with our data in the superposition of \(|b\rangle\), and end with the superposition of all results in \(|x\rangle\). The catch, as in all quantum algorithms, is that we can’t extract all of the data values. Instead, once again, we design our algorithm to create interference in a pattern that tells us something about the original data distribution. For example, we can learn that the 1,117th entry in the vector \(x\) is the largest data element. Or, treating the whole data set as a vector in an \(n\)-dimensional space (in this case, a \(2^{30}\)-dimensional space), we can find the distance between \(|x\rangle\) and some other vector \(|z\rangle\).
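One such extractable quantity is the overlap between two states, which a quantum *swap test* can estimate without reading out every amplitude. A classical sketch of the quantity being estimated (small vectors stand in for the full states):

```python
import numpy as np

def overlap(x, z):
    """Squared inner product |<z|x>|^2 between two normalized states.
    This is the quantity a quantum swap test estimates, and it
    determines the distance between the two states."""
    x = x / np.linalg.norm(x)
    z = z / np.linalg.norm(z)
    return np.abs(np.dot(z, x)) ** 2

x = np.array([1.0, 2.0, 2.0])
z = np.array([1.0, 0.0, 0.0])
print(overlap(x, z))  # 1/9 = 0.111...
```

Estimating this single number is far cheaper than reading out all \(2^{30}\) amplitudes of \(|x\rangle\), which is exactly the trade-off HHL-based algorithms exploit.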

If we use HHL properly for certain tasks, it is exponentially faster
than the *best known* classical algorithm for the same task. However,
there are caveats to achieving that performance, and researchers have
not yet been able to rule out that efficient classical algorithms
exist for the same tasks when all of the constraints are taken into
account.

## QRAM

An important caveat is the preparation of the vector \(|b\rangle\)
containing our original data. If it is a superposition of data values
we have stored in a classical computer, then it can take a billion
time steps to create a superposition of all billion values, and we
have discarded our quantum algorithm’s performance advantage. This is
the basis of our assertion in the first week that quantum computers in
general aren’t good at *big data* problems.

However, Giovannetti, Lloyd and Maccone proposed a piece of hardware
that, if developed, could be used to prepare \(|b\rangle\) quickly: a
*quantum random access memory*, or QRAM. In a normal (classical)
memory (RAM), we give the memory an address, and it returns the value
stored at that address. A QRAM is designed to hold classical data but
allow quantum queries. If we can give it an address *in
superposition*, and it returns us the data also *in superposition*,
then we can create our superposition for \(|b\rangle\) using only a
few Hadamard gates and a single query to the QRAM. Of course, the
creation of the data itself requires \(O(N)\) time for \(N\) data
items, but then we can run many programs repeatedly against the same
data, amortizing that cost, as opposed to a purely quantum memory,
where the superposition is destroyed on every run and we must recreate
it.
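A toy classical simulation of a single QRAM query in superposition (the state representation here is our own illustration, not a hardware design):

```python
import numpy as np

# Classical data held in the QRAM: 4 one-bit values at addresses 0..3.
data = [1, 0, 1, 1]
n_addr = 2  # address qubits: 2**2 = 4 addresses

# Uniform superposition over all addresses, as Hadamard gates on
# the address register would produce.
addr_amps = np.full(2**n_addr, 1 / np.sqrt(2**n_addr))

# One QRAM query in superposition: each address |i> picks up its
# stored value, |i> -> |i>|d_i>. We represent the joint state as a
# dict mapping (address, value) to its amplitude.
joint = {(i, data[i]): addr_amps[i] for i in range(2**n_addr)}

for (addr, val), amp in sorted(joint.items()):
    print(f"|{addr}>|{val}>  amplitude {amp:.3f}")
```

The point of the hardware proposal is that this single query replaces the \(O(N)\) sequential loads a classical memory would require each time we rebuild \(|b\rangle\).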

## Other algorithms

HHL planted the seed, but more practical algorithms have come from follow-on and related work. Lloyd, Mohseni and Rebentrost created algorithms for supervised and unsupervised machine learning. Ambainis extended HHL to be more practical for complex datasets. HHL’s running time depends in part on the precision you need in the answer; Childs, Kothari and Somma showed how to reduce this dependence dramatically.

Within the field of quantum algorithms, quantum machine learning (and, more broadly, quantum artificial intelligence, including topics such as quantum neural networks) is perhaps the fastest-moving area. Given the importance of the problems being attacked, if the dependence on large data sets can be resolved, quantum AI has the potential to drive the broad adoption of quantum computers.

© Keio University