Avenues for further investigation

Ian Witten discusses directions for further investigation of the soil sample problem.

You can perform much more experimentation in search of a good model!

For example, we have not examined the effects of parameter changes in either the classifiers or the preprocessing techniques (except for the Savitzky-Golay window size).
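One way to organize this kind of experimentation is a simple parameter sweep. The sketch below is a toy illustration, not the course's procedure. It assumes a moving-average smoother as a stand-in for a preprocessing step, a synthetic noisy sine wave as data, and a hypothetical `evaluate` function that scores each setting by its correlation with the noise-free signal. In a real experiment, `evaluate` would instead run your chosen classifier and preprocessing on the soil data and return the cross-validated correlation coefficient.

```python
import itertools
import math
import random


def moving_average(values, window):
    """Centered moving average; a simple stand-in for a smoothing filter."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def grid_search(grid, evaluate):
    """Score every parameter combination; return the best one and all results."""
    names = list(grid)
    results = []
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        results.append((params, evaluate(**params)))
    best_params, best_score = max(results, key=lambda item: item[1])
    return best_params, best_score, results


# Synthetic data (an assumption for illustration): a sine wave plus noise.
rng = random.Random(0)
clean = [math.sin(i / 10) for i in range(200)]
noisy = [c + rng.gauss(0, 0.3) for c in clean]


def evaluate(window, passes):
    """Hypothetical scoring function: smooth, then correlate with the clean signal."""
    smoothed = noisy
    for _ in range(passes):
        smoothed = moving_average(smoothed, window)
    return pearson(smoothed, clean)


grid = {"window": [1, 5, 11, 21], "passes": [1, 2]}
best_params, best_score, results = grid_search(grid, evaluate)
print(best_params, round(best_score, 3))
```

Recording every (parameters, score) pair, rather than only the winner, makes it easier to see how sensitive the result is to each parameter.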

One problem faced in all application development is knowing when a result is good enough to be useful in practice. In our experience, the correlation coefficient needs to reach 0.95–0.99 for a model to be useful on this problem. Our best result in this activity is 0.87, which is still a long way off.

Another important factor that we have not explored is the effect of outliers in regression problems. Filtering out outlier instances can make a huge difference to performance.
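To see why outliers matter so much, consider how a single extreme instance affects the correlation coefficient. The following is a minimal sketch on made-up numbers, not the soil data. The 1.5 × interquartile-range rule and the function names `pearson`, `quartiles`, and `iqr_keep_mask` are assumptions chosen for illustration, not the method used in the course.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def quartiles(values):
    """First and third quartiles, using linear interpolation between ranks."""
    ordered = sorted(values)

    def quantile(p):
        pos = p * (len(ordered) - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    return quantile(0.25), quantile(0.75)


def iqr_keep_mask(values, factor=1.5):
    """True for values inside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, q3 = quartiles(values)
    spread = q3 - q1
    low, high = q1 - factor * spread, q3 + factor * spread
    return [low <= v <= high for v in values]


# Made-up actual and predicted values; the last instance is an outlier.
actual = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 40.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.9, 8.1, 4.5]

r_all = pearson(predicted, actual)

# Drop instances whose actual value falls outside the IQR fences.
mask = iqr_keep_mask(actual)
kept_actual = [a for a, keep in zip(actual, mask) if keep]
kept_predicted = [p for p, keep in zip(predicted, mask) if keep]
r_filtered = pearson(kept_predicted, kept_actual)

print(f"with outlier: {r_all:.3f}, without: {r_filtered:.3f}")
```

In this toy data, one badly predicted extreme value is enough to pull the correlation far below what the remaining eight instances would give. Whether removing such instances is legitimate depends on the application, so treat any filtering rule as a modelling decision to be justified.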

This article is from the free online course Advanced Data Mining with Weka.

Created by
FutureLearn - Learning For Life
