How do I evaluate a classifier's performance?
This week is all about evaluation.
Last week you downloaded Weka and looked around the Explorer and a few datasets. You used the J48 classifier. You used a filter to remove attributes and instances. You visualized some data and some classification errors. Along the way you encountered a few datasets: the weather data (both nominal and numeric versions), the glass data, and the iris data.
As you have seen, data mining involves building a classifier for a dataset. (Classification is the most common problem, though it’s not the only one.) Given an example or “instance”, the classifier will predict its class. But how good is it? Predicting the class of the instances that were used to train the classifier is pretty trivial: you could just store them in a database. But we want to be able to predict the class of new instances, ones that haven’t come up before.
You want to estimate the classifier's performance on new, unseen instances. But how can you do this if you don't know what the instances are? It's a conundrum. But there are good solutions, and we explore them this week.
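To see concretely why accuracy on the training data is misleading, here is a minimal sketch in plain Python (not Weka). The synthetic data and the simple nearest-neighbour "memorizer" are invented for illustration: because it stores every training instance, it scores 100% on the data it was trained on, yet does noticeably worse on instances held out from training.

```python
import random

random.seed(42)

def make_data(n):
    """Synthetic 2-D points: class is 1 if x + y > 1, with 20% label noise."""
    data = []
    for _ in range(n):
        x, y = random.random(), random.random()
        label = 1 if x + y > 1 else 0
        if random.random() < 0.2:      # noisy labels make memorization misleading
            label = 1 - label
        data.append(((x, y), label))
    return data

def predict(train, point):
    """1-nearest-neighbour: return the label of the closest training point."""
    nearest = min(train, key=lambda item: (item[0][0] - point[0]) ** 2
                                        + (item[0][1] - point[1]) ** 2)
    return nearest[1]

def accuracy(train, dataset):
    correct = sum(1 for point, label in dataset if predict(train, point) == label)
    return correct / len(dataset)

data = make_data(200)
train, test = data[:150], data[150:]   # hold out 50 instances the classifier never sees

train_acc = accuracy(train, train)     # always 1.0: each point's nearest neighbour is itself
test_acc = accuracy(train, test)       # lower: noise was memorized, not learned
print(f"training accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The training accuracy is perfect by construction, which tells you nothing about how the classifier will do on fresh data; only the held-out test accuracy is an honest estimate.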
In the first activity you’re going to experience what it’s like to actually be a classifier yourself, by constructing a decision tree interactively. In subsequent activities we’ll look at evaluation, including training and testing, baseline accuracy, and cross-validation.
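Two of those ideas can be sketched together in a few lines of plain Python (again, not Weka): a majority-class baseline, in the spirit of Weka's ZeroR, evaluated with 10-fold cross-validation. The toy label list is invented for illustration; note that in practice you would shuffle (or stratify) the data before splitting it into folds, which this deliberately bare sketch skips.

```python
# Toy imbalanced dataset: 70 instances of class 1, 30 of class 0.
labels = [1] * 70 + [0] * 30

def majority(train_labels):
    """Baseline classifier: always predict the most common class in the training data."""
    return max(set(train_labels), key=train_labels.count)

k = 10
fold_size = len(labels) // k
accs = []
for i in range(k):
    # Fold i is the test set; everything else is the training set.
    test = labels[i * fold_size:(i + 1) * fold_size]
    train = labels[:i * fold_size] + labels[(i + 1) * fold_size:]
    pred = majority(train)
    accs.append(sum(1 for y in test if y == pred) / len(test))

mean_acc = sum(accs) / k
print(f"cross-validated baseline accuracy: {mean_acc:.2f}")   # 0.70 here
```

The baseline lands at 0.70, the proportion of the majority class: any real classifier has to beat that number before it can claim to have learned anything.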
At the end of the week you will know how to evaluate the performance of a classifier on new, unseen instances. And you will understand how easy it is to fool yourself into thinking that your system is doing better than it really is.