This article answers some basic questions about using Weka in practice. I’ve trained a classifier. How can I use it to classify …
In the preceding video I talk about changing the random number seed in the Weka Explorer and getting a different result. Were you mystified? An explanation follows. Here’s the issue. …
Thanks for taking this course. We hope you’ve enjoyed it. We’ve introduced you to practical data mining using the Weka workbench. We explained the basic principles of several popular algorithms …
There’s no magic in data mining! In fact, perhaps Weka makes things too easy. It is important to understand, and evaluate, what you’re doing, not just click around looking for …
Data mining is a powerful technology, and I urge you to be ethical in its use. Data is sensitive stuff and should be treated with care. Personal data is particularly …
Be skeptical, and wary of overfitting. Always use fresh data for evaluation. Datasets often have missing values, which can mean different things – and different classifiers treat them in different …
If your vision of data mining is to get some data, apply Weka, get a cool result, and everyone’s happy – think again! Before you even begin to apply a …
You’ve learned lots in this course about machine learning and its use in data mining. Most importantly, you’ve learned that there’s no magic in data mining, just a bunch of …
Sometimes committees make better decisions than individuals. An ensemble of different classification methods can be applied to the same problem and vote on the classification of test instances. Bagging, randomization, …
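To make the committee idea concrete, here is a minimal sketch of majority voting in plain Python. It is not Weka code: the three threshold rules, the "yes"/"no" labels, and the names `majority_vote` and `ensemble_classify` are all invented for illustration.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label that the most committee members voted for."""
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]

# Three hypothetical classifiers, each a simple threshold rule on one number.
# With three voters and two labels, a tie cannot occur.
classifiers = [
    lambda x: "yes" if x > 3 else "no",
    lambda x: "yes" if x > 5 else "no",
    lambda x: "yes" if x > 7 else "no",
]

def ensemble_classify(x):
    """Ask every classifier for a prediction, then take the majority."""
    votes = [clf(x) for clf in classifiers]
    return majority_vote(votes)

print(ensemble_classify(6))  # two of the three rules vote "yes"
print(ensemble_classify(4))  # two of the three rules vote "no"
```

The individual members disagree on inputs such as 4 and 6, but the vote still settles on a single answer.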
In essence, support vector machines drive a straight line between two classes, right down the middle of the channel – which you can see using Weka’s boundary visualizer. If the …
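The "middle of the channel" picture can be sketched in one dimension, where the channel is just the gap between the two classes. This is a deliberately simplified illustration under the assumption that the classes are separable, not the algorithm Weka uses; the function name `max_margin_boundary` and the sample values are made up.

```python
def max_margin_boundary(class_a, class_b):
    """Return (boundary, margin) for two separable sets of 1-D points.

    The boundary sits midway between the closest points of the two
    classes, which play the role of the support vectors.
    """
    # Work out which class lies to the left of the other.
    if max(class_a) < min(class_b):
        lower, upper = class_a, class_b
    else:
        lower, upper = class_b, class_a
    if max(lower) >= min(upper):
        raise ValueError("classes overlap; no separating point exists")

    left_support = max(lower)    # rightmost point of the left class
    right_support = min(upper)   # leftmost point of the right class
    boundary = (left_support + right_support) / 2
    margin = (right_support - left_support) / 2
    return boundary, margin

boundary, margin = max_margin_boundary([1, 2, 3], [7, 8, 9])
print(boundary, margin)
```

Only the two points nearest the gap affect the result; moving any of the others leaves the boundary unchanged.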
Many classification methods produce probabilities rather than black-or-white classifications. Naive Bayes is an obvious example, but other methods do too. The numbers between 0 and 1 produced by linear regression …
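As a small illustration of the difference, the sketch below starts from a probability distribution over classes and only then collapses it to a single label. The class names and the numbers are hypothetical, and choosing the most probable class is just one common way to make the final decision.

```python
def classify_from_probabilities(distribution):
    """Return the class with the highest estimated probability."""
    return max(distribution, key=distribution.get)

# Hypothetical output of a probabilistic classifier for one instance.
distribution = {"tested_negative": 0.35, "tested_positive": 0.65}

label = classify_from_probabilities(distribution)
confidence = distribution[label]
print(label, confidence)
```

Keeping the distribution around, rather than only the label, shows how confident the classifier was in its decision.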
Linear regression can be used for classification too. On the diabetes data, use the NominalToBinary filter to convert the two classes, which are nominal, to the numeric values 0 and …
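The idea can be sketched without Weka: code the two classes as numbers, fit a least-squares line, and turn the numeric prediction back into a class with a threshold. The example below assumes the classes are coded 0 and 1 and that the cut-off is 0.5; the single attribute, the six data points, and the function names are invented for illustration and have nothing to do with the diabetes data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one attribute: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up training data: one numeric attribute, class coded as 0 or 1.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]
slope, intercept = fit_line(xs, ys)

def classify(x, threshold=0.5):
    """Predict class 1 when the regression output reaches the threshold (assumed 0.5)."""
    return 1 if slope * x + intercept >= threshold else 0

print(classify(2.0), classify(7.0))
```

The regression output itself is not restricted to the range 0 to 1, so it should not be read as a probability without further treatment.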
Classification involves a nominal class value, whereas regression involves a numeric class. Linear regression is a classical statistical method that computes the coefficients or “weights” of a linear expression, and …