
Baseline accuracy

Ian Witten runs several classifiers on one dataset and compares their results with a simple baseline. On another dataset, they do much worse than the baseline!

The diabetes dataset has several attributes and a class that is either tested_negative or tested_positive (for diabetes). With Percentage split evaluation (66% training set, 34% test set), J48 yields 76% correctly classified instances. You can try other classifiers such as NaiveBayes (77%), IBk (73%), and PART (74%). These results can be compared with a simple classifier called a “baseline”; the ZeroR baseline, which always predicts the most frequent class in the training data, yields 65%. Here the more sophisticated classifiers all beat the baseline. But in other situations the baseline does as well as, and sometimes much better than, more sophisticated classifiers. Beware!
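To make the baseline idea concrete, here is a minimal Python sketch of a ZeroR-style classifier. It is not Weka's implementation, and the function names are hypothetical. The class counts (500 tested_negative, 268 tested_positive) are the figures commonly reported for this dataset and are an assumption, since the article does not state them. The sketch also scores the baseline on the full data rather than on a 66%/34% split, so its result is only roughly comparable to the 65% quoted above.

```python
from collections import Counter


def zero_r_fit(labels):
    """Return the most frequent class label among the training labels."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predicted_label, labels):
    """Fraction of instances whose true label equals the constant prediction."""
    return sum(1 for y in labels if y == predicted_label) / len(labels)


# Assumed class counts for the diabetes data (not given in the article).
labels = ["tested_negative"] * 500 + ["tested_positive"] * 268

majority = zero_r_fit(labels)
baseline_acc = accuracy(majority, labels)

# Prints: tested_negative 65.1
print(majority, round(baseline_acc * 100, 1))
```

Any classifier worth using on this data should beat that figure; one that only matches it has learned nothing beyond the class distribution.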

This article is from the free online

Data Mining with Weka

Created by
FutureLearn - Learning For Life
