Skip main navigation
We use cookies to give you a better experience, if that’s ok you can close this message and carry on browsing. For more info read our cookies policy.
We use cookies to give you a better experience. Carry on browsing if you're happy with this, or read our cookies policy for more information.

Prepare for the quiz

Before starting the quiz we suggest you reproduce what Ian did in the video, using the LED24 data in the Command Line Interface.

Make a test file with 100,000 instances. Precede all filenames below with an appropriate directory specification, surrounded by quotation marks if necessary.

java weka.datagenerators.classifiers.classification.LED24 -n 100000 
        -o test.arff

(The Command Line Interface gives no output; you need to look for the file to see if it has worked.)

Make a training file with 10,000,000 instances:

 java weka.datagenerators.classifiers.classification.LED24 -n 10000000 
        -o train.arff

Apply NaiveBayesUpdateable:

 java weka.classifiers.bayes.NaiveBayesUpdateable -t train.arff 
        -T test.arff -v

(This takes about 30 secs on my computer. The “-v” suppresses evaluation on the training file, which the Command Line Interface does by default.)

Verify that Weka runs out of memory if cross-validation is attempted:

 java weka.classifiers.bayes.NaiveBayesUpdateable -t train.arff

Weka will become unresponsive, although you will probably not get an error message. (Unfortunately it is difficult to trap and report out-of-memory errors in a Java program.)

If you feel brave, repeat the exercise with a 100,000,000-instance training file.

Share this article:

This article is from the free online course:

More Data Mining with Weka

The University of Waikato

Contact FutureLearn for Support