Training and testing

How can you evaluate how well a classifier does? Ian Witten explains why training-set performance is misleading and demonstrates some alternatives.

How can you evaluate how well a classifier does? Performance on the training set is misleading. It’s like asking a child to memorize 1+1=2 and 1+2=3 and then testing them on exactly the same questions, when what you really want is for them to answer new questions like 2+3=?. The goal is to generalize from the training data to obtain a classifier that works well on data it has never seen. To evaluate how well this has been achieved, the classifier must be tested on an independent test set. If you only have one dataset, set aside part of it for testing and use the rest for training.
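In Weka this holdout evaluation is available directly in the Explorer as the "Percentage split" test option, but for readers working with Weka's Java API, here is a minimal sketch of the same idea. The dataset file, the 66/34 split ratio, and the choice of J48 as the classifier are illustrative assumptions, not prescriptions from the article.

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class TrainTestSplit {
    public static void main(String[] args) throws Exception {
        // Load a dataset (path is illustrative; weather.nominal.arff
        // ships with the Weka distribution's data directory)
        Instances data = DataSource.read("weather.nominal.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Shuffle, then hold out roughly one third for testing
        data.randomize(new Random(42));
        int trainSize = (int) Math.round(data.numInstances() * 0.66);
        int testSize = data.numInstances() - trainSize;
        Instances train = new Instances(data, 0, trainSize);
        Instances test = new Instances(data, trainSize, testSize);

        // Train a J48 decision tree on the training portion only
        J48 tree = new J48();
        tree.buildClassifier(train);

        // Evaluate on the independent, held-out test set
        Evaluation eval = new Evaluation(train);
        eval.evaluateModel(tree, test);
        System.out.println(eval.toSummaryString());
    }
}
```

Because the classifier never sees the held-out instances during training, the printed summary estimates performance on genuinely new data rather than how well the model has memorized the training set.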

This article is from the free online course Data Mining with Weka.
