# Finite rank representations of a Gaussian Process

In this video Marcel Lüthi will give a explanation of the Karhunen-Loève expansion, using the example of a Gaussian Process model of 2D hand shapes.
6.7
Welcome back.
10
In this video, we’re going to explore an alternative finite dimensional representation of a Gaussian Process. This is a bit an advanced topic in the sense that it will involve mathematical concepts, which are advanced and we cannot explain them in detail here in this video. But it should be very easy for you to get the basic ideas, such that you can use this concept in practise.
40.7
We have already seen that Gaussian Processes give us both a nice theoretical model to work with, but also a practical implementation, thanks to the marginalisation property. However, there can still be a problem, and the problem can happen when we want to work with the really high resolution meshes. For example, the meshes you use in this Scalismo Lab exercises, which are part of the Basel Face Model, they contain around 50,000 vertices. What this means is that if you want to discretise Gaussian Process on all the vertices, we would end up with a covariance matrix that is 50,000 times 3 times 50,000 times 3 dimensional. So the 3 comes because the vertices are defined in 3D.
97.3
Now, this is clearly too high dimensional to keep in memory of computer. Therefore, we’re now going to show you another trick, how we can work with such high dimensional data.
113.7
The basic ingredient of that is what is called the Karhunen-Loève expansion. What this expansion says is that, if we’re given a Gaussian Process u, then we can write that as a sum of its mean function plus an infinite sum over some basis functions phi multiplied by some factor lambda and a coefficient alpha. And this coefficient is where the probability comes in, because these alphas are chosen as a standard normal distribution.
156.3
And it is known that if we do that it’s just the same Gaussian Process that we get back. Now, these phis and the lambdas there actually eigenfunctions and associated eigenvalues of some linear operator that we have defined here. But there is no need at all to know details about that. Let’s just look at these things graphically that you understand what this all means. So what you see here is you see a plot of the first three eigenfunctions multiplied by the respective eigenvalues. We see that these eigenfunctions, they’re also just deformation fields, exactly as what we want to model with the Gaussian Process.
210.5
Now, what this expansion, what this theorem says is that we can add up the mean of the Gaussian Process, which is a deformation field, plus the first eigenfunction weighted buy some coefficient alpha plus the second eigenfunction weighted by another coefficient, and what we get back is a new deformation of our hand, a new deformation field. If we then choose different coefficients, then we get yet another deformation field. And all possible deformation fields can be written like this, so all deformation fields that can be represented by the Gaussian Process. And now this theorem tells us that if we choose the alphas randomly from a standard normal distribution, then what we actually get back is, of course, also a random quantity.
272.1
We have random deformation fields, and these are distributed according to the original Gaussian Process. Now, if we look at this expansion, what we can see is that, of course, there’s variance in the process here. And this variance comes from the component here and the component here so from all the components that involve these random variables alpha. If we would now choose, let’s say, only one eigenfunction and would set all the others to 0, we would, of course, get less variance. And we could ask the question, how much variance is actually represented in each component?
318.4
And it turns out there’s a simple answer to that namely, the amount of variance that we have in one component is just the eigenvalue lambda i. So this is the interpretation of the eigenvalue. And if I want to know what is the total variation in the process, I can just sum up all the eigenvalues here, which I plotted here to show you how they are in magnitude. And what you see is that we have actually most variance represented by this first component, by the first eigenvalue. And the components when we go along, like the letter components in the series, they don’t show much variance anymore.
366.4
And so we can use that observation, and we could think that the process would actually not change much if I would just discard these last components. And this is really the idea behind the low rank approximation that we’re now doing. So instead of representing this sum here as an infinite sum, we restrict it to the first r components. And if the r is chosen such that all the eigenvalues are small after that r, so the tail sum is small then we get a really good approximation of the full process. And this is a nice representation, because what I have now, it is a finite and parametric representation.
420
So I don’t have an infinite number of parameters anymore as I would have in general before for a Gaussian Process, but it’s just a finite number. And we know that any deformation that I want to represent is actually completely determined by these coefficients alpha.
439.7
So what that, for example, means: if I’m interested in the distribution of the deformation field, I can also just look at the probability distribution of the alphas, which just follow a standard normal distribution. So it’s very, very easy to actually compute probabilities using these alphas.
467.3
Now, one thing that we need to discuss in this whole expansion is whether we can actually compute this eigenfunction phi and the eigenvalues lambda. And it turns out that we can compute those. And if we use Gaussian Process models that are learned from examples, as we’ve done like last week, then they’re actually very efficient methods for doing that. But there are also numerical methods that work for any general Gaussian Process. And all of these methods have the property that we never compute this big, full covariance matrix that we cannot keep in memory in our computer. I’m coming back to our face example.
516.3
So if the rank r that we chose is kind of low that means in the order of 100 or 1,000 components then it’s actually possible to represent shapes to have millions of points. Now, this concept of the Karhunen-Loève expansion is not only important for computational reasons, but it’s also important for shape modelling. You will explore this aspect of it in the next article.

In this video we will discuss an alternative representation of Gaussian Processes using the the Karhunen-Loève expansion (KL). We will provide a visual explanation of this expansion and discuss how we can obtain a finite-rank approximation of the Gaussian Process.

Thanks to this approximation we can obtain algorithms that can deal with shapes that are represented using millions of points. Furthermore, the representation gives us a simple and efficient means for computing the probability of any shape in the family.