Skip main navigation

Pitfalls and pratfalls

Be skeptical, and wary of overfitting, warns Ian Witten. There’s no “best learner”: all methods have biases. Data mining is an experimental science!

Be skeptical, and wary of overfitting. Always use fresh data for evaluation. Datasets often have missing values, which can mean different things – and different classifiers treat them in different ways. Finally, there’s no free lunch, no “best learner”. In order to generalize, any learner must embody some knowledge or assumptions beyond the data it’s given. All methods have biases. Data mining is an experimental science!

This article is from the free online

Data Mining with Weka

Created by
FutureLearn - Learning For Life

Reach your personal and professional goals

Unlock access to hundreds of expert online courses and degrees from top universities and educators to gain accredited qualifications and professional CV-building certificates.

Join over 18 million learners to launch, switch or build upon your career, all at your own pace, across a wide range of topic areas.

Start Learning now