This content is taken from The University of Waikato's online course, More Data Mining with Weka.

What if there's no "class" attribute?

This course so far – like its predecessor, Data Mining with Weka – has focused solely on tasks whose aim is to predict the value of a particular attribute called the “class”.

When the class is nominal, this is called classification; when it’s numeric, it’s called regression (why? – don’t ask, it’s kinda crazy). Both tasks are sometimes called “supervised” learning, the idea being that there is a supervisor (or teacher) who dictates what the correct class should be.

But what can you do with a dataset that has no class value? One idea is to seek associations between any of the attributes, or between any set of attributes. Associations are invariably expressed as rules, and this is called “association rule mining”. Another idea is to see if the instances fall into natural groups, a task known as “clustering”. In both cases it’s pretty hard to evaluate the result in objective ways – unlike classification, where the gold standard is to predict the class correctly on fresh data.

This week we’ll examine both of these tasks. By the end you will be able to apply association rule mining to a dataset and seek interesting associations. For any rule you’ll be able to calculate the key parameters of support and confidence. And you’ll have experienced some of the limitations of association rule mining and how difficult it can be to find interesting patterns in data.
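To make support and confidence concrete, here is a minimal Python sketch rather than anything Weka itself runs. It assumes each instance is written as a set of "attribute=value" strings and that support is reported as a fraction of all instances (it can equally be reported as a raw count). The tiny dataset and the function name `support_and_confidence` are made up for illustration.

```python
def support_and_confidence(instances, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent => consequent.

    Assumptions for this sketch: each instance is a set of "attribute=value"
    strings, and support is the fraction of instances matching the whole rule.
    """
    antecedent = set(antecedent)
    whole_rule = antecedent | set(consequent)

    # Instances that satisfy the left-hand side of the rule.
    n_antecedent = sum(1 for inst in instances if antecedent <= inst)
    # Instances that satisfy both sides of the rule.
    n_rule = sum(1 for inst in instances if whole_rule <= inst)

    support = n_rule / len(instances)
    confidence = n_rule / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical toy data, invented for this example.
instances = [
    {"outlook=sunny", "windy=false", "play=no"},
    {"outlook=sunny", "windy=true", "play=no"},
    {"outlook=overcast", "windy=false", "play=yes"},
    {"outlook=rainy", "windy=false", "play=yes"},
    {"outlook=rainy", "windy=true", "play=no"},
]

support, confidence = support_and_confidence(
    instances, {"windy=false"}, {"play=yes"}
)
print(f"support={support:.2f} confidence={confidence:.2f}")
# prints "support=0.40 confidence=0.67"
```

Here three instances have windy=false and two of those also have play=yes, so the rule covers 2 of the 5 instances (support) and is right 2 times out of 3 when it applies (confidence).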

You’ll also be experienced in using different clustering methods, and will have learned to be sceptical if the results look too good! And you’ll be able to evaluate clusterings using the classification-by-clustering method.
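One simple way to read the idea of evaluating a clustering against known classes can be sketched as follows. This is an assumption made for illustration, not necessarily the exact procedure Weka uses: each cluster is labelled with the most common class among its members, and the score is the fraction of instances whose class matches their cluster's label. The function name `majority_class_accuracy` and the example data are hypothetical.

```python
from collections import Counter, defaultdict


def majority_class_accuracy(cluster_ids, class_labels):
    """Score a clustering by labelling each cluster with its majority class.

    Assumption for this sketch: the class labels are held back during
    clustering and used only afterwards, for evaluation.
    """
    # Count how often each class occurs inside each cluster.
    counts_by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        counts_by_cluster[cluster][label] += 1

    # Instances that agree with their cluster's majority class count as correct.
    correct = sum(
        counts.most_common(1)[0][1] for counts in counts_by_cluster.values()
    )
    return correct / len(class_labels)


# Hypothetical cluster assignments and true classes for six instances.
clusters = [0, 0, 0, 1, 1, 1]
classes = ["a", "a", "b", "b", "b", "a"]
print(f"{majority_class_accuracy(clusters, classes):.2f}")
# prints "0.67"
```

Note that this score can only improve as the number of clusters grows (with one cluster per instance it reaches 1.0), which is one reason to be sceptical of results that look too good.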
