
Avenues for further investigation

Ian Witten discusses directions for further investigation of the soil sample problem.

You can perform much more experimentation in search of a good model!

For example, we have not examined the effects of parameter changes in either the classifiers or the preprocessing techniques (except for the Savitzky-Golay window size).

One problem faced in all application development is knowing when a result is good enough to be useful in practice. In our experience, the correlation coefficient needs to reach 0.95–0.99 for this problem. Our best result in this activity is 0.87, which is still a long way off.

Another important factor that we have not explored is the effect of outliers in regression problems. Filtering out outlier instances can make a huge difference to performance.
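To see why a few outliers matter so much for the correlation coefficient, consider the following minimal sketch. It is plain Python rather than Weka, and it is not taken from the course: the data values are made up, the function names `pearson` and `drop_residual_outliers` are hypothetical, and the rule used (drop instances whose residual lies more than three median absolute deviations from the median residual) is just one assumed way of flagging outliers.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def drop_residual_outliers(actual, predicted, k=3.0):
    """Keep only instances whose residual is within k MADs of the median residual.

    The cutoff k is an assumption made for this illustration.
    """
    residuals = [a - p for a, p in zip(actual, predicted)]
    med = statistics.median(residuals)
    mad = statistics.median([abs(r - med) for r in residuals])
    if mad == 0:
        # No spread to measure against, so nothing is flagged.
        return list(actual), list(predicted)
    kept = [
        (a, p)
        for a, p, r in zip(actual, predicted, residuals)
        if abs(r - med) <= k * mad
    ]
    return [a for a, _ in kept], [p for _, p in kept]


# Made-up values: predictions track the actual values closely,
# except for one grossly wrong instance (actual 9.0, predicted 1.0).
actual = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
predicted = [1.2, 1.9, 3.1, 3.8, 5.2, 6.1, 6.9, 8.2, 1.0, 9.9]

r_all = pearson(actual, predicted)
kept_actual, kept_predicted = drop_residual_outliers(actual, predicted)
r_filtered = pearson(kept_actual, kept_predicted)

print(f"all instances:     r = {r_all:.3f}")
print(f"outliers filtered: r = {r_filtered:.3f} ({len(kept_actual)} instances kept)")
```

In this toy data a single bad instance out of ten is enough to pull the correlation coefficient well below the range quoted above, and removing it restores it. The cutoff is itself a parameter worth experimenting with, since too strict a rule also discards genuine instances.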

This article is from the free online course Advanced Data Mining with Weka.

Created by
FutureLearn - Learning For Life
