3.19

# Linear Systems Algorithms

Machine learning involves classifying data, in order to make decisions. “Efficient” classical algorithms for finding full solutions exist, but the sheer volume of data means that computation times are high. Assuming the availability of certain supporting technologies or constraints on the data, quantum algorithms can answer certain questions about the data exponentially faster than the best known classical algorithms.

## Matrix Operations

In high school (or junior high school) algebra, you learned that a simple equation can represent a line, and how to solve pairs of equations, like

$3x + 4y = 7$
$4x + 3y = 7$

to find the answer $x = 1$, $y = 1$, indicating the place where the two lines intersect. With a larger set of equations and a larger set of variables, we write them as a matrix, and combine them in equations with vectors. A common operation is to try to find $x$ in the equation $Ax = b$, where $A$ is a matrix that represents the left hand side of a set of equations, $b$ is a vector representing the right hand side of the equations, and $x$ is a vector of variables for which we are trying to find values.

The mathematical field of linear algebra concerns itself with matrices and vectors, and these techniques figure prominently in artificial intelligence, especially the field of machine learning, in which computers are programmed to learn from data.

## Machine learning

Machine learning involves creating a classifier that will tell us whether a datum belongs to a particular class. We can make a distinction between the strategy used to create the classifier, and the details of the problem being solved.

Classifiers learn how to do their work. They use training data to find the best set of parameters for the problem at hand, then the training data is discarded before the classifier is used for production work. Supervised learning uses data that someone else has already classified (e.g., “this is malware, that isn’t”) and learns what characteristics of the data define a class. Unsupervised learning instead looks at the data and figures out how many different groups, or clusters, there are in the data. Reinforcement learning is more iterative, with the program learning how to be rewarded in real-world situations.

There are many mathematical classification techniques. $k$-nearest neighbor, support vector machines (SVM), and $k$-means clustering are relatively straightforward, and all can be partially solved or supported using a quantum algorithm known as HHL.

## The HHL algorithm

In 2009, Aram Harrow, Avinatan Hassidim and Seth Lloyd found an algorithm (called, naturally enough, HHL) that helps us prepare a state $|x\rangle = A^{-1}|b\rangle$, where $A^{-1}$ is the inverse of the matrix $A$. At first glance, it might appear that it will “solve for $x$”, as we are often told to do in math problems, but there is a catch involving the data representation.

In these linear algebra problems, the natural data representation is to write down the data as a large vector. We might have, for example, a billion data items, which we can write as a vector with a billion entries. Writing down this vector naturally requires a billion time steps. HHL, instead, encodes each of those data elements in the amplitude of a single quantum value in a register, using a superposition of all one billion elements. Because $2^{30} > 10^9$, we can store all of our data in 30 qubits, total, instead of a billion separate memories. Of course, the register has to be normalized.

Then we can calculate on all billion values at the same time, in superposition. In the above equation, using HHL, we start with our data in the superposition of $|b\rangle$, and end with the superposition of all results in $|x\rangle$. The catch, as in all quantum algorithms, is that we can’t extract all of the data values. Instead, once again, we design our algorithm to create interference in a pattern that tells us something about the original data distribution. For example, we can learn that the 1,117th entry in the vector $x$ is the largest data element. Or, treating the whole data set as a vector in an $n$-dimensional space (in this case, a 30-dimensional space), we can find the distance between $|x\rangle$ and some other vector $|z\rangle$.

If we use HHL properly for certain tasks, it is exponentionally faster than the best known classical algorithm for the same task. However, there are caveats to achieving that performance, and researchers have not yet been able to rule out that efficient classical algorithms exist for the same tasks when all of the constraints are taken into account.

## QRAM

An important caveat is the preparation of the vector $|b\rangle$ containing our original data. If it is a superposition of data values we have stored in a classical computer, then it can take a billion time steps to create a superposition of all billion values, and we have discarded our quantum algorithm’s performance advantage. This is the basis of our assertion in the first week that quantum computers in general aren’t good at big data problems.

However, Giovannetti, Lloyd and Maccone proposed piece of hardware that, if developed, could be used to prepare $|b\rangle$ quickly: a quantum random access memory, or QRAM. In a normal (classical) memory (RAM), we give the memory an address, and it returns the value stored at that address. A QRAM is designed to hold classical data but allow quantum queries. If we can give it an address in superposition, and it returns us the data also in superposition, then we can create our superposition for $|b\rangle$ using only a few Hadamard gates and a single query to the QRAM. Of course, the creation of the data itself requires $O(N)$ time for $N$ data items, but then we can run many programs repeatedly against the same data, amortizing that cost, as opposed to a purely quantum memory, where the superposition is destroyed on every run and we must recreate it.

## Other algorithms

HHL planted the seed, but the more practical algorithms have been in follow-on work and in other work. Lloyd, Mohseni and Rebentrost created algorithms for supervised and unsupervised machine learning. Ambainis extended HHL to be more practical with complex datasets. HHL’s running time depends in part on the precision you need in the answer; Childs, Kothari and Somma showed how to reduce this dependence dramatically.

Within the field of quantum algorithms, quantum machine learning and more broadly quantum artificial intelligence, including such topics as quantum neural networks, is perhaps the fastest-moving area. Given the importance of the problems being attacked, if the dependence on large data sets can be resolved, quantum AI has the potential to drive the broad adoption of quantum computers.