OK, so here we are on the second hold-out validation exercise. In this case, we're going to be doing a three-way split: we're going to divide our dataset into training, validation, and test data. We're going to use the validation data to work out which model is the best model, and we'll use the test data to get an unbiased estimate of that model's expected performance on new data. Now, before you run this example, you're going to need to source a function at the bottom of the file that will get us our data set. So I'll just highlight it and run. Now, this data is actually taken from the internet, at stat.ufl.edu.
And it's brain weight data. The first column is gender (1 for male, 2 for female). The second column is age range, discretized, again into two values: 1 is between 20 and 46, 2 is 46 plus. Then we have head size in centimetres squared and brain weight in grammes. So once you've run that function, you can prepare the data using get brain weight. Once again, I'm just going to put it into a data frame called data set. And as normal, we're going to randomly split up this data, this time into three subsets.
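A random three-way split of this kind might be sketched in R as follows. This is only an illustration, not the code from the exercise file: the data frame is a synthetic stand-in, the column names are hypothetical, and the 60/20/20 proportions are an assumption, since the video does not state the split sizes.

```r
# Synthetic stand-in for the brain weight data (NOT the real data set).
# Column names and all values below are hypothetical.
set.seed(42)  # fixed seed so the random split is reproducible
n <- 200
dataset <- data.frame(
  gender    = sample(1:2, n, replace = TRUE),  # 1 = male, 2 = female
  age_range = sample(1:2, n, replace = TRUE),  # 1 = 20 to 46, 2 = 46 plus
  head_size = round(runif(n, 2800, 4700))
)
dataset$brain_weight <- round(300 + 0.26 * dataset$head_size + rnorm(n, sd = 70))

# Shuffle the row indices once, then cut them into three disjoint blocks.
shuffled <- sample(nrow(dataset))
n_train  <- round(0.6 * n)  # assumed proportion
n_valid  <- round(0.2 * n)  # assumed proportion

train_idx <- shuffled[1:n_train]
valid_idx <- shuffled[(n_train + 1):(n_train + n_valid)]
test_idx  <- shuffled[(n_train + n_valid + 1):n]

train_data <- dataset[train_idx, ]
valid_data <- dataset[valid_idx, ]
test_data  <- dataset[test_idx, ]

# Report the size of each subset.
print(c(train = nrow(train_data), valid = nrow(valid_data), test = nrow(test_data)))
```

Shuffling the indices once and slicing them guarantees that no row ends up in more than one subset.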
Now that we have the data set split into training, validation, and test data, what we're going to do is create a bunch of regression models. Once again, we'll create an ordinary least squares model, a Poisson regression model, and a polynomial regression model. We're going to do a polynomial expansion of degree 2, but only on the head size feature, because that's the only real-valued feature that we have. The other two are discrete. And we'll do a neural network with four hidden nodes. So here we'll create the models, training them on the training data.
And then we will evaluate the mean squared error for these models on the validation data. And we'll work out which model is best.
And now we'll output to the console which model was best.
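The fit-then-compare step could look roughly like the sketch below. It is an assumption-laden illustration rather than the exercise's own code: the data are synthetic, the column names and the `mse` helper are hypothetical, and the neural network is left out because it needs an additional package. Only base R functions (`lm`, `glm`, `poly`, `predict`) are used.

```r
# Synthetic stand-in data; column names and coefficients are hypothetical.
set.seed(7)
n <- 240
dataset <- data.frame(
  gender    = sample(1:2, n, replace = TRUE),
  age_range = sample(1:2, n, replace = TRUE),
  head_size = round(runif(n, 2800, 4700))
)
dataset$brain_weight <- round(
  350 + 0.26 * dataset$head_size - 25 * dataset$gender -
    20 * dataset$age_range + rnorm(n, sd = 70)
)

# Random split: 144 training rows, 48 validation rows, the rest held back for testing.
idx <- sample(n)
train_data <- dataset[idx[1:144], ]
valid_data <- dataset[idx[145:192], ]

# Fit every candidate on the training data only.
models <- list(
  ols        = lm(brain_weight ~ gender + age_range + head_size, data = train_data),
  poisson    = glm(brain_weight ~ gender + age_range + head_size,
                   family = poisson, data = train_data),
  polynomial = lm(brain_weight ~ gender + age_range + poly(head_size, 2),
                  data = train_data)
)

# Mean squared error of a fitted model on a given data frame.
mse <- function(model, data) {
  predicted <- predict(model, newdata = data, type = "response")
  mean((data$brain_weight - predicted)^2)
}

# Compare the candidates on the validation data and pick the smallest MSE.
valid_mse <- sapply(models, mse, data = valid_data)
best_name <- names(which.min(valid_mse))

print(valid_mse)
cat("Best model on validation data:", best_name, "\n")
```

Which candidate wins here depends entirely on the synthetic data and the seed, so the result should not be read as reproducing the outcome in the video.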
And we see that the best model was polynomial regression. Now of course, because I'm just highlighting and running, rather than running the whole function, this output looks a bit messy, because we have code in between the outputs. But the best model was polynomial regression. Now, because we've done a three-way split on the data, we can now evaluate the mean squared error of our final selected model, this polynomial regression model, on the test data. And we can find the test mean squared error of this model. OK, that gives us an idea. Now the best model was polynomial regression.
And it's giving us different regression curves for each class-- male 20 to 46, male 46 plus, female 20 to 46, female 46 plus. And we're getting regression curves for each of these distinct classes, for head size versus brain weight.
There we are. So this is an example of how you split your data three ways to get training, validation, and test data, so that you can not only select which model is best without bias, but also get an unbiased estimate of how well that chosen model will perform on new data.
Holdout Validation Exercise 2
This is the second video-exercise on holdout validation. The associated code is in the Holdout Validation Ex2.R file. Interested students are encouraged to replicate what we go through in the video themselves in R, but note that this is an optional activity intended for those who want practical experience in R and machine learning.
In this video-exercise, we work on a regression of an individual’s brain weight based on other characteristics. We divide the data into training, validation and test subsets in order to perform a three-way split hold-out validation.
We generate four different models from the training data: an OLS model, a Poisson regression model, a polynomial regression model and a basic neural network model. We then select our best model based on the MSE of these models on the validation data, before using the test data to get an unbiased estimate of the expected performance of our final model on new data.
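The final step, estimating performance on the untouched test subset, might be sketched as below. This is a minimal illustration under stated assumptions: the data are synthetic, the column names are hypothetical, and the polynomial model is simply assumed to be the one already chosen on the validation data.

```r
# Synthetic stand-in data; column names and values are hypothetical.
set.seed(1)
n <- 200
head_size <- runif(n, 2800, 4700)
dataset <- data.frame(
  head_size    = head_size,
  brain_weight = 300 + 0.26 * head_size + rnorm(n, sd = 70)
)

# Assumed 60/20/20 split; the validation rows (121 to 160) are not needed here.
idx <- sample(n)
train_data <- dataset[idx[1:120], ]
test_data  <- dataset[idx[161:200], ]

# The model assumed to have been selected on the validation data.
chosen_model <- lm(brain_weight ~ poly(head_size, 2), data = train_data)

# The test rows played no part in fitting or selection, so this MSE is not
# flattered by the selection step.
test_predictions <- predict(chosen_model, newdata = test_data)
test_mse <- mean((test_data$brain_weight - test_predictions)^2)
cat("Test MSE of the selected model:", test_mse, "\n")
```

The validation MSE of the winning model tends to be optimistic, because that model was picked precisely for scoring well on those rows; the test MSE does not have this problem.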
Note that the utils R package is used in this exercise. You will need to have it available on your system; it normally ships with a standard R installation. You can install packages using the install.packages function in R.
© Dr Michael Ashcroft