Skip to 0 minutes and 1 secondSo now we're going to look at an example exercise where we perform cross-validation. And we're actually going to be performing cross-validation to work out an optimal value for a hyperparameter. So what are hyperparameters? Well, hyperparameters are not parameters of the model. They're parameters of the algorithm that generates the model. Examples we have seen so far include the order of a polynomial regression model and the number of hidden nodes in a neural network. Since we've worked both with polynomial regression and with neural networks in these examples, we've actually been using hyperparameters. The question then is, obviously, how can we work out a good value for a hyperparameter?

Skip to 0 minutes and 47 secondsHow can we work out what order polynomial regression models should be, or how many hidden nodes a neural network should have? Well, the simplest way to do it is to build models with different values of the hyperparameter, and then perform model selection on the resulting models. So we're going to do this. In this example, we'll do it with polynomial regression. We'll build a series of polynomial regression models of different orders. And then we will evaluate the performance of the models generated with different order polynomial regression on validation data to see which one performed best. OK, so this example is in Cross Validation Ex1.

Skip to 1 minute and 40 secondsLike always, we prepare the data.

Skip to 1 minute and 45 secondsWe've just got a synthetic y versus x. We're going to do a two-way split. Now this will give us training and test data. But of course, in this case, because we're doing cross-validation, the training data will actually participate in the validation itself. How is that going to work? Well, we remember cross-validation essentially uses the training data and splits it up into subsets and proceeds to build a series of models using every subset bar one, and evaluating the performance of the model built on those particular subsets on the remaining hold out subset. And it would do this holding out each possible subset. Of course, you've gone over that in the article. So we don't need to labour the point.

Skip to 2 minutes and 45 secondsLet's see how this works in this case. What we're going to do is we're going to build polynomial regression models of order three, four, five, and six. And we'll use tenfold cross-validation to evaluate their performance. I've built this function here, this sapply function that will do that.

Skip to 3 minutes and 10 secondsWhat it's going to do is it's going to split the training data up into 10 different subsets. And inside the loop or the implicit loop in the sapply function, it will proceed to build models for each set of subsets by a particular one. And based on those models, see how they perform in evaluating the holdout subset. You guys, of course, can attempt to replicate this to get some practise.

Skip to 3 minutes and 47 secondsThe result, once again, this is total squared error rather than mean squared error. This time it's total squared error. And here we see the total squared error results for the four different models. Third order, up over 60 million. Fourth order, little over 50 million-- 50.18 million. Fourth order, 50.20-- sorry, fifth order, 50.20. And sixth order, 50.20. So the best model, the best ordered polynomial regression model, was clearly fourth order.

Skip to 4 minutes and 31 secondsWe wouldn't normally actually look at the results. We'd automate it, allow the computer to examine the total squared error results and find the best model, which we do here. And now we're going to create a model of that order using the whole training set and see how that performs on the test data. What I'm going to want to do is compare how it performs on the test data to how it performed on the validation data. So I'm just going to change the total squared error into a mean squared error.

Skip to 5 minutes and 11 secondsAnd then also generate the mean squared error on the test data.

Skip to 5 minutes and 22 secondsNow that it's done, now that we have both the model's performance on the test data and on the cross validated data, it would be possible to perform a statistical test to make sure that the model is not doing significantly worse on the test data than it was on the validation data. And in doing this, if we can confirm that that's not happening, then this can allow us to be confident that we weren't merely selecting that model based on luck in how well it performed on that particular validation data. Any rate, let's see how our chosen model performs.

Skip to 6 minutes and 10 secondsAnd I've actually also here generated, worked out the standard deviation of the residuals on the test data. And I'm going to use that to create some confidence intervals.

Skip to 6 minutes and 25 secondsAnd there we are. Confidence intervals, exactly like we talked about in the article. I'm just doing the prediction, the regression curve, plus or minus two standard deviations, where the standard deviations were generated by the test data. Standard deviation of the residuals were found from the test data. So the black line is our regression curve. The red lines are our confidence intervals at two standard deviations.

# Cross Validation Exercise 1

The first exercise on cross validation. The associated code is in the *Cross Validation Ex1.R* file. Interested students are encouraged to replicate what we go through in the video themselves in R, but note that this is an optional activity intended for those who want practical experience in R and machine learning.

In this video exercise, we perform cross-validation to determine a good value for a hyperparameter in the training algorithm. The example looks at determining the optimal order of a polynomial regression model on a synthetic data set.

We divide the data into training/validation and test subsets, and then perform cross-validation using the training/validation data. We build a set of polynomial regression models of different orders and evaluate their performance via cross-validation. We use these results to determine the best order of polynomial regression model for this problem, build a model of this order from the combined training/validation data, and obtain an unbiased estimate of this model's expected performance on new data using the test data.

In addition, we look at how we can create confidence intervals around the regression curve for our chosen model, and discuss how statistical hypothesis tests on the chosen model's performance on the validation and test data can be used as an additional safeguard to ensure that our model-selection process led to a reasonable result.

Note that the *stats* R package is used in this exercise. It is distributed with base R, so it should already be available on your system. In general, you can install additional packages using the *install.packages* function in R.

© Dr Michael Ashcroft