
This content is taken from the University of Waikato's online course, Advanced Data Mining with Weka.

## The University of Waikato

Skip to 0 minutes and 11 seconds Hello, and welcome back to New Zealand for another few minutes of Advanced Data Mining with Weka. We’re going to continue our exploration of the time series forecasting package. In the last lesson I showed you some graphs, which I actually made with Excel for the purposes of presentation, but the time series forecasting package can make such graphs itself, and we’re going to show you how to look at the output of the package. I think you should restart the Explorer, just to reinitialize all of the options in the time series forecasting stuff, and load airline.arff. I’ve done that. I’m going to go to Forecast

Skip to 0 minutes and 49 seconds and click Start here, and we get this output, which we haven’t looked at before: “Train future predictions” it’s called; and you can see this is a graph actually of passenger numbers, and if you look very carefully, you can see that these are square data points, and the very last one is a round data point. That’s the predicted passenger number. We’re only predicting one time unit here, but we can change that. Let’s go up to the interface and change the number of time units to forecast to, say, 12, and try again. Now you can see that we’ve got these 12 predicted points and a dashed line. So we’re forecasting ahead, from the end of the training data.

Skip to 1 minute and 35 seconds Let’s go to the Lag creation panel, and remember we removed the leading instances with unknown lag values. That will remove the first 12 instances, and we can do that again. Actually, it doesn’t affect the graph. We still get the same graph, but we know that the first 12 instances are not being used to create the model. Coming back to the slide, think about the timeline like this. Here’s the dataset, that top line, and underneath we’ve got the dashed line with the leading instances, 12 of them, and then the training data for future predictions, and then the future predictions leading ahead after the end of the dataset. All right. Now let’s do some evaluation here.
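The effect of removing leading instances can be sketched outside Weka. The following is a minimal illustration in plain Python, not the package's own implementation: the helper names `make_lagged_rows` and `drop_leading_unknowns` are hypothetical, and the series is made-up data rather than the airline dataset. With a maximum lag of 12, the first 12 rows have at least one unknown lag value and are dropped.

```python
def make_lagged_rows(series, max_lag):
    """Pair each value with its max_lag predecessors; None marks an unknown lag."""
    rows = []
    for t, target in enumerate(series):
        # Lag k of the value at time t is the value at time t - k, if it exists.
        lags = [series[t - k] if t - k >= 0 else None for k in range(1, max_lag + 1)]
        rows.append((lags, target))
    return rows


def drop_leading_unknowns(rows):
    """Keep only rows whose lag values are all known (unknowns occur only at the start)."""
    return [row for row in rows if None not in row[0]]


# Made-up series of 36 values (an assumption for illustration, not airline.arff).
series = [float(v) for v in range(100, 136)]
rows = make_lagged_rows(series, max_lag=12)
usable = drop_leading_unknowns(rows)
print(len(rows), len(usable))
```

Only the rows in `usable` would be available for building a model, which matches the idea that the first 12 instances appear in the graph but play no part in training.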

Skip to 2 minutes and 19 seconds We’re going to evaluate on the training data and on 24 held-out instances. I’m going to go to the Evaluation panel and evaluate on the training data and 24 held-out instances, two years’ worth. Run that. Now I get the “train future predictions” output here, which ends at the end of the training data and then shows us 12 future predictions from that point. Coming back to the slide, we’ve got the dataset. We’ve got the training data now, which is all of the dataset except for the last 24 instances, and the future predictions from the training data are the dashed line there.

Skip to 3 minutes and 7 seconds Then if we look at the other output here – going back to Weka – “Test future predictions”, you can see now that we’ve got the test data here and future predictions from the end of the test data, this dashed line with the round points. Coming back to the slide, we’ve got the whole dataset, then we’ve got the training data, and then we’ve got the test data and future predictions from the end of the test data, that is, after the end of the dataset. Now it would be nice to see the one-step-ahead estimates for the test data. There are a lot of graphing options here.
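The timeline on the slide can be written down as index ranges. This is only a sketch of the bookkeeping described above, under the assumption that the dataset has 144 instances, 12 leading instances are dropped, 24 are held out, and 12 steps are forecast; the function name `timeline` and its parameters are hypothetical, not part of Weka.

```python
def timeline(n_instances, n_lead, n_holdout, n_future):
    """Return index ranges (0-based, end-exclusive) for each part of the timeline."""
    train_end = n_instances - n_holdout
    return {
        "leading": range(0, n_lead),                   # unknown lag values, not used for the model
        "training": range(n_lead, train_end),          # data the model is built from
        "train_future": range(train_end, train_end + n_future),  # forecasts past the training data
        "test": range(train_end, n_instances),         # held-out instances
        "test_future": range(n_instances, n_instances + n_future),  # forecasts past the dataset
    }


# Assumed sizes: 144 instances, lag of 12, 24 held out, 12 steps forecast.
segments = timeline(n_instances=144, n_lead=12, n_holdout=24, n_future=12)
for name, indexes in segments.items():
    print(name, indexes.start, indexes.stop)
```

Note that in this layout the "train future predictions" cover the first part of the held-out period, whereas the "test future predictions" start after the end of the dataset.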

Skip to 3 minutes and 48 seconds First of all, I’m going to turn off the evaluation on training, because that’s going to give us too much data to look at. Let’s just look at evaluating on the test data. I’m not going to graph the future predictions at all. Now if I run this, I get no graphical output. There’s nothing. Let’s turn on “Graph the predictions at step 1” and run it. Now you can see here the test predictions for the target. You can see in blue the predicted passenger numbers and in red the actual passenger numbers. So we can see there the discrepancy on the test data between the one-step-ahead predictions and the actual data itself. We’re going to do a little bit more on this panel.

Skip to 4 minutes and 40 seconds We’re going to graph the predictions at step 12, that is, 12-step-ahead predictions, and then we’re going to compare 1-step-ahead, 6-steps-ahead, and 12-steps-ahead predictions. Let’s go back here. I’m going to graph the predictions at step 12. Now, I of course get worse predictions, because we’re predicting 12 steps ahead. You’d expect that to get worse. There’s a consistent error, where they undershoot the actual data values because, of course, with multi-step-ahead predictions, with any step-ahead predictions, once you make an error on the first prediction, then that error continues to propagate through the future predictions. Let’s graph the target. We’ve only got one possible target here.
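To make the error propagation concrete, here is a minimal sketch in plain Python. It is not how the Weka package is implemented: the series is synthetic (2% growth per step), the "model" is a deliberately imperfect rule that assumes 1% growth, and the names `forecast_k_steps`, `mae_at_horizon` and `predict_next` are hypothetical. Each prediction is fed back in as the next lag value, so the small one-step error compounds as the horizon grows and the forecasts undershoot.

```python
def forecast_k_steps(history, k, predict_next):
    """Recursive forecast: feed each prediction back in as the most recent value."""
    window = list(history)
    for _ in range(k):
        window.append(predict_next(window))
    return window[-1]


def mae_at_horizon(series, n_test, k, predict_next):
    """Mean absolute error of k-step-ahead predictions over the last n_test values."""
    first_test = len(series) - n_test
    errors = []
    for target in range(first_test, len(series)):
        origin = target - k  # last index the forecaster is allowed to see
        prediction = forecast_k_steps(series[: origin + 1], k, predict_next)
        errors.append(abs(prediction - series[target]))
    return sum(errors) / len(errors)


# Synthetic series growing 2% per step (an assumption, not the airline data).
series = [100.0]
for _ in range(59):
    series.append(series[-1] * 1.02)

# Deliberately imperfect one-step model: it assumes only 1% growth.
predict_next = lambda window: window[-1] * 1.01

errors = {k: mae_at_horizon(series, n_test=24, k=k, predict_next=predict_next)
          for k in (1, 6, 12)}
for k, err in errors.items():
    print(f"{k}-step-ahead MAE: {err:.2f}")
```

The error at step 12 comes out larger than at step 6, which in turn is larger than at step 1, mirroring the comparison of 1-, 6- and 12-steps-ahead predictions on the held-out data.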

Skip to 5 minutes and 31 seconds If we had other attributes, we could graph them, but we’re just going to graph passenger_numbers at step 12, and actually that’s going to give the same result.

Skip to 5 minutes and 41 seconds I’ve got two graphs here: the one we had before, and the new one, which looks exactly the same. However, you can do better things here. I’m going to turn the old one off just to stop too much confusion, and I’m going to graph – we can put in a comma-separated list of numbers here – so I’m going to graph 1-step-ahead, 6-steps-ahead, and 12-steps-ahead predictions. Now you can see them in different colors. You can see the difference between 1-step-ahead predictions, the most accurate, that’s the blue line; 6-steps-ahead predictions, the green line, which is considerably worse; and 12-steps-ahead predictions, which is a

Skip to 6 minutes and 29 seconds bit worse still: the yellow line. You can compare predictions at different points ahead. I’m just going to improve these predictions to finish off. I’m going to go to my base learner and change it from linear regression to SMO. Let’s have a look at that. You can see those predictions are quite a bit better than they were with linear regression. Let’s go and change – we’re using this large model with a large number of attributes here – I’m going to reduce the number of attributes. I’m going to just use a lag of 12, and then I’m going not to include powers of time. I’m not going to include products of time and lag variables.

Skip to 7 minutes and 14 seconds I’m going here, and I’m going to customize this by not including any of these periodic attributes. If I run this again, well, I’ve got a much simpler model here. This is the model based on just the date and the lag by 12. Now if I look at those graphs that I saw before, well, you can’t see them, because they’re all on top of each other. It’s plotting the red and the blue last, and the green and the yellow are hidden underneath the 1-step-ahead predictions. I’ve shown you several different options for visualizing time series predictions. We

Skip to 7 minutes and 57 seconds talked about the need to distinguish different parts of the timeline: the initialization part with the leading instances, which contain unknown values for the lag variables; extrapolation past the end of the dataset into future predictions; the full training data; the test data, if evaluation is specified; and the training data with the test data held out; and we extrapolate past the end of that for so-called “future predictions” based on the training data. We showed how you can look at different numbers of steps ahead when making predictions. You can read more about this in a document about the time series analysis and forecasting package with Weka, referenced there at the bottom.

# Looking at forecasts

Weka’s time series forecasting package includes options for visualizing predictions for any number of steps ahead, as well as performance on the training data. You can also hold out the last few instances of the dataset for testing and visualize performance on these. Errors accumulate on multi-step future predictions, and you can assess the effect of this by looking at 1-step-ahead – or indeed any-number-of-steps-ahead – predictions on held-out test data.