So now we're going to look at the first of two holdout validation exercises. You can find it in Holdout Validation Ex1.R. It gives an example of how we proceed when working with holdout validation. We start off, as in all these examples, by loading the data.

Now, in this case, we're working with a data set that has two columns: Ozone and Wind. Some of the values in the Ozone column are NA; they're missing. So we're just going to remove the rows that have a missing value.
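As a rough sketch of this cleaning step, and assuming the exercise uses the built-in `airquality` data from the *datasets* package (the video names only the Ozone and Wind columns, so the data set is a guess), removing the incomplete rows might look like this:

```r
# Assumption: the exercise data is R's built-in `airquality` data set.
library(datasets)

# Keep only the two columns discussed in the video.
ozone_data <- airquality[, c("Ozone", "Wind")]

# Count the missing values in each column before cleaning.
colSums(is.na(ozone_data))

# Drop every row that contains an NA in any column.
ozone_data <- na.omit(ozone_data)

# Number of complete rows that remain.
nrow(ozone_data)
```

`na.omit` removes a row if any of its columns is missing, which is the behaviour described here because only these two columns are kept.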

And then we're going to split the data into training and test. In this case we're doing a two-way split; in the second holdout validation exercise we'll do a three-way split. Here we randomly split our data set into training and test, and we create four models from the training data: an ordinary least squares model, a Poisson regression model, a third-order polynomial regression model, and a neural network model with three hidden nodes. So here we go.
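A minimal sketch of the split and the four fits is shown below. It is not the code from the exercise file: the `airquality` data set, the 70/30 split proportion, the seed, and all variable names are assumptions made for illustration.

```r
library(datasets)
library(nnet)

set.seed(1)  # fixed only so that this sketch is repeatable

# Assumption: the built-in `airquality` data, with incomplete rows removed.
ozone_data <- na.omit(airquality[, c("Ozone", "Wind")])

# Two-way split: a random 70% of the rows for training, the rest for testing.
# The 70/30 proportion is an assumption, not taken from the video.
train_idx <- sample(nrow(ozone_data), size = floor(0.7 * nrow(ozone_data)))
train <- ozone_data[train_idx, ]
test  <- ozone_data[-train_idx, ]

# Fit the four model types named in the video, all on the training data only.
models <- list(
  ols     = lm(Ozone ~ Wind, data = train),
  poisson = glm(Ozone ~ Wind, family = poisson, data = train),
  poly3   = lm(Ozone ~ poly(Wind, 3), data = train),
  nnet3   = nnet(Ozone ~ Wind, data = train, size = 3,
                 linout = TRUE, trace = FALSE)
)
```

Setting `linout = TRUE` asks `nnet` for a linear output unit, which is the usual choice for regression; `size = 3` gives the three hidden nodes.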

So those four models have been created, all trained on the training data, of course. And now we're going to see how well these four models perform on the test data, using mean squared error as our loss function.

There we go.

OK, so we'll just work out which model is best, and we'll output that to the console. There we go. The best model was Poisson regression. And now that we have our best model, let's plot the data. OK, so we've plotted our four models. The best one was the Poisson regression, in red. We saw that it was the best by looking at the mean squared error of all the models on the test data. And here we see the different plots.
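The evaluation and selection step can be sketched as follows. This is an illustration under assumptions (the `airquality` data, a 70/30 split, and the helper names `mse` and `predict_response` are all invented here), and which model wins depends on the random split, so no particular result is implied.

```r
library(datasets)
library(nnet)

set.seed(2)  # fixed only so that this sketch is repeatable

# Assumptions: `airquality` data and a 70/30 random split.
ozone_data <- na.omit(airquality[, c("Ozone", "Wind")])
train_idx <- sample(nrow(ozone_data), size = floor(0.7 * nrow(ozone_data)))
train <- ozone_data[train_idx, ]
test  <- ozone_data[-train_idx, ]

models <- list(
  ols     = lm(Ozone ~ Wind, data = train),
  poisson = glm(Ozone ~ Wind, family = poisson, data = train),
  poly3   = lm(Ozone ~ poly(Wind, 3), data = train),
  nnet3   = nnet(Ozone ~ Wind, data = train, size = 3,
                 linout = TRUE, trace = FALSE)
)

# Mean squared error between observed and predicted values.
mse <- function(actual, predicted) mean((actual - predicted)^2)

# Hypothetical helper: predictions on the scale of the response.
# A Poisson GLM predicts on the log scale unless type = "response" is given.
predict_response <- function(model, newdata) {
  if (inherits(model, "glm")) {
    as.numeric(predict(model, newdata = newdata, type = "response"))
  } else {
    as.numeric(predict(model, newdata = newdata))
  }
}

# Test-set MSE for each model, then the name of the model with the lowest MSE.
test_mse <- sapply(models, function(m) mse(test$Ozone, predict_response(m, test)))
best <- names(which.min(test_mse))

print(test_mse)
cat("Best model on the test data:", best, "\n")
```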

Now, we're doing a two-way split. That means we're really only interested in working out which model is best. We don't care about getting an unbiased estimate of that model's performance on new data. We're happy enough just being confident that we've selected the best model in an unbiased fashion. Now, of course, I went through this by running its components one at a time. Rather than doing that, you can run the whole function at once.

And here we go again. We see Poisson regression again. We've got slightly different models. The neural network in particular has changed quite a lot. The models are different because they're getting different data each time.

The neural network often struggles because there's simply not enough data to train a neural network model.

Now, it's important to talk about the randomness that's involved here, since we're getting different regression curves each time. There are two important components to this randomness. The first is that we're getting different splits of the data. When we split our data set into training and test, we randomly assign different cases to training and test each time. So each time we run the function, we get a different set of training data, and the models are trained on a slightly different training set. That's one element of the randomness we see.
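To make this first source of randomness concrete, the sketch below fits the same ordinary least squares model on two different random training sets and compares the coefficients. The `airquality` data, the 70/30 split, and the function name `fit_on_random_split` are assumptions made for illustration.

```r
library(datasets)

# Assumption: the built-in `airquality` data, with incomplete rows removed.
ozone_data <- na.omit(airquality[, c("Ozone", "Wind")])

# Hypothetical helper: draw a random training set and return the OLS coefficients.
fit_on_random_split <- function(seed) {
  set.seed(seed)  # a different seed gives a different random split
  train_idx <- sample(nrow(ozone_data), size = floor(0.7 * nrow(ozone_data)))
  coef(lm(Ozone ~ Wind, data = ozone_data[train_idx, ]))
}

coefs_a <- fit_on_random_split(1)
coefs_b <- fit_on_random_split(2)

# Same model type, same full data set, but different training rows,
# so the fitted intercept and slope differ between the two runs.
rbind(split_1 = coefs_a, split_2 = coefs_b)
```

OLS itself is deterministic, so any difference between the two rows comes purely from which cases landed in the training set.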

The second element of the randomness occurs only for the neural network, and it comes from the fact that the neural network is initialized with random weights. These two components represent very different types of randomness in the modelling function. The first is indicative of the randomness of the sample data that we're using. The second, the random initial weights in the neural network, is indicative of the randomness involved in a non-deterministic modelling algorithm. If you're interested, we actually ran it 1,000 times. We saw that, overall, polynomial regression tended to be the best, followed by Poisson, followed by the neural network, with ordinary least squares very seldom the best.
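The second source of randomness can be isolated by holding the training data fixed and changing only the random seed before fitting the network. This is a sketch under assumptions (the `airquality` data and the helper name `fit_nnet` are not from the exercise file):

```r
library(datasets)
library(nnet)

# Assumption: the built-in `airquality` data, with incomplete rows removed.
ozone_data <- na.omit(airquality[, c("Ozone", "Wind")])

# Hypothetical helper: fit the same network on the same data,
# varying only the seed that controls the random initial weights.
fit_nnet <- function(seed) {
  set.seed(seed)
  nnet(Ozone ~ Wind, data = ozone_data, size = 3,
       linout = TRUE, trace = FALSE)
}

net_a <- fit_nnet(1)
net_b <- fit_nnet(2)

# The fitted weights differ even though the data are identical.
rbind(seed_1 = net_a$wts, seed_2 = net_b$wts)

# Largest disagreement between the two networks' predictions.
max(abs(predict(net_a, ozone_data) - predict(net_b, ozone_data)))
```

Fixing the seed, as `fit_nnet` does, is a common way to make a non-deterministic fit repeatable while debugging.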

If we had more data, of course, there would be far less variance involved, and we would be far more likely to see a single model performing better uniformly.
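The repeated-run experiment mentioned in the video can be approximated on a small scale. The sketch below repeats the whole split, fit, and select procedure 50 times (rather than 1,000, to keep it quick) and tabulates how often each model type wins. The `airquality` data, the 70/30 split, and the function name `run_once` are assumptions, and the counts you get need not match those reported in the video.

```r
library(datasets)
library(nnet)

# Assumption: the built-in `airquality` data, with incomplete rows removed.
ozone_data <- na.omit(airquality[, c("Ozone", "Wind")])

# Hypothetical helper: one full holdout run, returning the winning model's name.
run_once <- function() {
  train_idx <- sample(nrow(ozone_data), size = floor(0.7 * nrow(ozone_data)))
  train <- ozone_data[train_idx, ]
  test  <- ozone_data[-train_idx, ]

  # Predictions on the test data from each of the four model types.
  preds <- list(
    ols     = predict(lm(Ozone ~ Wind, data = train), test),
    poisson = predict(glm(Ozone ~ Wind, family = poisson, data = train),
                      test, type = "response"),
    poly3   = predict(lm(Ozone ~ poly(Wind, 3), data = train), test),
    nnet3   = as.numeric(predict(
      nnet(Ozone ~ Wind, data = train, size = 3,
           linout = TRUE, trace = FALSE), test))
  )

  # Test-set mean squared error for each model; the lowest one wins.
  errors <- sapply(preds, function(p) mean((test$Ozone - p)^2))
  names(which.min(errors))
}

set.seed(42)  # fixed only so that this sketch is repeatable
winners <- replicate(50, run_once())

# How often each model type was selected as best.
table(winners)
```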

# Holdout Validation Exercise 1

This is the first video-exercise on holdout validation. The associated code is in the *Holdout Validation Ex1.R* file. Interested students are encouraged to replicate in R what we go through in the video, but note that this is an optional activity intended for those who want practical experience in R and machine learning.

In this video-exercise, we work on a regression of Ozone values based on Wind values, where we have to remove missing values from the data. We divide the data into training and validation subsets in order to perform two-way split holdout validation.

We generate four different models from the training data: an OLS model, a Poisson regression model, a third-order polynomial regression model and a basic neural network model. We then select our best model based on the performance of these models on the validation data, using MSE.

Finally, we discuss the results obtained, with particular attention to the randomness involved in holdout validation and in neural network modelling algorithms.

Note that the *datasets*, *utils*, and *nnet* R packages are used in this exercise. You will need to have them installed on your system. You can install packages using the *install.packages* function in R.
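One way to check for these packages before running the exercise is sketched below. The *datasets* and *utils* packages ship with R itself, so in practice only *nnet* might need installing; the variable names here are illustrative.

```r
# Packages named in the exercise description.
required <- c("datasets", "utils", "nnet")

# Find which of them cannot be loaded on this system.
is_available <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!is_available]

# Install only what is missing (needs an internet connection and a CRAN mirror).
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```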

© Dr Michael Ashcroft