This content is taken from the University of Waikato's online course, Advanced Data Mining with Weka.

Skip to 0 minutes and 11 seconds MOA can be used in three different ways: through the graphical user interface, the command line, or the Java API. Let’s start with classification evaluation. In the batch setting,

Skip to 0 minutes and 24 seconds we have two different types of evaluation: holdout, when we have different data for testing and training, or 10-fold cross-validation, when we are using the same data for testing and training.
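As a rough illustration of the batch case, here is a minimal Python sketch of how 10-fold cross-validation reuses the same data: every instance is tested exactly once and used for training in the other nine folds. The function name `k_fold_indices` and the interleaved way of assigning instances to folds are assumptions made for this example; this is not how Weka or MOA implement it.

```python
def k_fold_indices(n_instances, k=10):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_instances))
    for fold in range(k):
        # Every k-th instance, offset by the fold number, is held out for testing.
        test = indices[fold::k]
        test_set = set(test)
        # All remaining instances are used for training in this fold.
        train = [i for i in indices if i not in test_set]
        yield train, test


for train, test in k_fold_indices(20, k=10):
    print(len(train), "training instances, test on", test)
```

With plain holdout, by contrast, there would be a single fixed split and the test instances would never be used for training.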

Skip to 0 minutes and 35 seconds In the incremental setting, we also have two types: holdout evaluation and prequential evaluation. Let’s look at these two types of evaluation. In holdout evaluation, we train our model one instance at a time and, periodically, we evaluate it on a separate set of test instances. In prequential evaluation, we use the same data for testing and training: we test and train on every instance of the stream. Every time a new instance arrives, first we test and then we train. Let’s look at the MOA interface.
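The two incremental schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `MajorityClass` and `toy_stream` are hypothetical stand-ins for a real learner and stream generator (they are not MOA classes), and the stream sizes are scaled down.

```python
import random
from collections import Counter


class MajorityClass:
    """Toy incremental learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return 0  # arbitrary default before any training
        return self.counts.most_common(1)[0][0]

    def train(self, x, y):
        self.counts[y] += 1


def toy_stream(n, seed=1):
    """Hypothetical stream: one numeric attribute, label 1 when it exceeds 0.3."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.random()
        yield x, (1 if x > 0.3 else 0)


def prequential(learner, stream):
    """Test-then-train: each instance is used for testing first, then training."""
    correct = total = 0
    for x, y in stream:
        correct += learner.predict(x) == y  # first we test
        learner.train(x, y)                 # then we train
        total += 1
    return correct / total


def periodic_holdout(learner, stream, test_set, every):
    """Train one instance at a time; every `every` instances, test on held-out data."""
    results = []
    for seen, (x, y) in enumerate(stream, start=1):
        learner.train(x, y)
        if seen % every == 0:
            hits = sum(learner.predict(tx) == ty for tx, ty in test_set)
            results.append((seen, hits / len(test_set)))
    return results


preq_accuracy = prequential(MajorityClass(), toy_stream(1000))
held_out = list(toy_stream(100, seed=2))  # separate instances, used only for testing
holdout_curve = periodic_holdout(MajorityClass(), toy_stream(1000), held_out, every=250)
print(preq_accuracy, holdout_curve)
```

Note the difference in what is reported: the prequential loop produces one running accuracy over the whole stream, while the holdout loop produces a series of accuracies measured on the same separate test set at regular intervals.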

Skip to 1 minute and 20 seconds First, we’re going to download the software, so let’s go to the MOA webpage [moa.cms.waikato.ac.nz]. From the MOA webpage, we go to Downloads, and from there we download the latest release.

Skip to 1 minute and 38 seconds OK. Once MOA is downloaded, we can run it from the bin folder, using moa.bat on Windows and moa.sh otherwise. Let’s run it. What we see is that we have several tabs: one for classification, others for regression, clustering, outliers, and concept drift. Let’s start with classification. Let’s run a task. We’re going to run an evaluation task. Let’s start with a holdout evaluation, with EvaluatePeriodicHeldOutTest. We need to specify the learner; in this case it’s going to be the HoeffdingTree. What’s the stream? In this case, we’re going to select the HyperplaneGenerator.

Skip to 2 minutes and 29 seconds OK, and then how many instances do we want to use for testing? In this case we say that we want to use 1000 instances for testing, and we want to train on 1,000,000 instances. And we want to see the results every 10,000 instances. OK. That’s the definition of the task. We see that it’s specified here, EvaluatePeriodicHeldOutTest, and then we run it. We see that here we have all the results. And here we see a plot of these results, where we also have the different measures, like accuracy and kappa. OK, now let’s run a prequential evaluation. Again, we change the task. We’re going to change to EvaluatePrequential. We’re going to define again what the learner is; in this case, it’s going to be the HoeffdingTree.
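To make the two measures mentioned here concrete, the following Python sketch computes accuracy and a kappa statistic from lists of true and predicted labels. It assumes that the kappa shown is Cohen's kappa, (p0 - pc) / (1 - pc), where p0 is the observed accuracy and pc is the agreement expected by chance; the function name `accuracy_and_kappa` is made up for this example.

```python
from collections import Counter


def accuracy_and_kappa(true_labels, predicted_labels):
    """Return (accuracy, Cohen's kappa) for two equal-length label lists."""
    n = len(true_labels)
    # p0: fraction of instances where the prediction matches the true label.
    p0 = sum(t == p for t, p in zip(true_labels, predicted_labels)) / n
    # pc: agreement expected if predictions were made at random while keeping
    # the same class frequencies as the true labels and the predictions.
    true_freq = Counter(true_labels)
    pred_freq = Counter(predicted_labels)
    pc = sum(true_freq[c] * pred_freq[c] for c in true_freq) / (n * n)
    kappa = (p0 - pc) / (1 - pc) if pc < 1 else 0.0
    return p0, kappa


# A classifier that always predicts the majority class looks accurate,
# but kappa shows it is no better than chance.
acc, kappa = accuracy_and_kappa([1] * 9 + [0], [1] * 10)
print(acc, kappa)
```

This is why kappa is a useful companion to accuracy on streams with unbalanced classes.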

Skip to 3 minutes and 31 seconds OK. Then the stream: we’re going to select the HyperplaneGenerator. OK, and then we’re going to train on 1,000,000 instances, and we’re going to look at the results every 10,000 instances. Now we run the task. Here we see the results, and here we see the evolution of these measures, and now the nice thing is that we can compare both. If we look at this, we see that one appears in red and the other appears in blue. We can take a look at that, and we can also zoom in to look at it in more detail. Another way to use MOA is using the command line.

Skip to 4 minutes and 37 seconds We can reuse the task text that we had in the graphical user interface when we were selecting the task that we wanted to run. We can take the same text and put it in the command line. What we are doing then is executing the task using moa.DoTask. Then we only need to specify the task, the learner we want to use, the stream we want to use, and how many instances we want to use. In this lesson, we have seen how to use the MOA interface. We know that there are three different ways: the graphical user interface, the command line, and the Java API.
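As a hedged sketch of what such a command line might look like, the Python helper below assembles an argument list for moa.DoTask. Several things here are assumptions rather than anything shown in the video: the jar names `moa.jar` and `sizeofag.jar`, the `trees.` and `generators.` prefixes, and the option letters `-l` (learner), `-s` (stream), `-n` (test instances), `-i` (training instances) and `-f` (sample frequency), which follow the form given in the MOA manual. The task text displayed in the graphical interface of your own release is the reliable source, so check against it. The helper `moa_command` is hypothetical.

```python
import shlex


def moa_command(task, learner, stream, train_instances, sample_frequency,
                test_instances=None, moa_jar="moa.jar", agent_jar="sizeofag.jar"):
    """Build an argument list that runs one MOA task through moa.DoTask."""
    task_text = f"{task} -l {learner} -s {stream}"
    if test_instances is not None:
        task_text += f" -n {test_instances}"  # only the holdout task needs a test size
    task_text += f" -i {train_instances} -f {sample_frequency}"
    # The whole task description is passed to moa.DoTask as a single argument.
    return ["java", "-cp", moa_jar, f"-javaagent:{agent_jar}", "moa.DoTask", task_text]


prequential_cmd = moa_command(
    "EvaluatePrequential", "trees.HoeffdingTree",
    "generators.HyperplaneGenerator", 1_000_000, 10_000)
holdout_cmd = moa_command(
    "EvaluatePeriodicHeldOutTest", "trees.HoeffdingTree",
    "generators.HyperplaneGenerator", 1_000_000, 10_000, test_instances=1000)

# Print shell-ready versions; pass the lists to subprocess.run to execute them.
print(shlex.join(prequential_cmd))
print(shlex.join(holdout_cmd))
```

Building the command as a list keeps the task text together as one argument, which matters because it contains spaces.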

Skip to 5 minutes and 22 seconds Also, we have seen the two types of evaluation for incremental learning: holdout evaluation and prequential evaluation.

The MOA interface

We download MOA and run it. Incremental data stream mining calls for evaluation methods that differ from those used in batch operation. One possibility is to interleave training and testing by periodically testing on data held out from the input stream (called “periodic” holdout evaluation); another is to test the current classifier on each new instance in the stream before using that instance for training (called “prequential” evaluation). MOA, which can be invoked through its interactive interface or from the command line, includes many data stream generators. Here we run the HoeffdingTree algorithm on data from the HyperplaneGenerator, and evaluate it both periodically and prequentially.
