Review of terms

Ian Witten reviews the terms introduced in the preceding video

Let’s review some of the key terms we’ll be using.

A dataset is a set of instances

In Weka, it’s stored in what’s called an ARFF file. This is just a text file: after a short header that names the dataset and declares its attributes, each line represents one instance. In the next video (“The Glass data”) we’ll see what an ARFF file looks like. For example, weather.nominal.arff is an ARFF file; so is weather.numeric.arff.

An instance is a single example

In Weka, each data line of an ARFF file represents a different instance. For example, the weather data contains 14 instances, which represent 14 different days. The instances are the rows of the ARFF file.

An attribute is a characteristic of an instance

… or you might call it a “feature” of the instance. For example, the weather data has 5 attributes: outlook, temperature, humidity, windy, and play. For every instance there is a value for each attribute – for the first instance (day) in the weather data it’s outlook = sunny, temperature = hot, humidity = high, windy = FALSE, play = no. The attributes are the columns of the ARFF file.
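To make the rows-and-columns picture concrete, here is a minimal sketch in Python (not Weka itself). The ARFF-style text is a hand-written, abbreviated excerpt: the attribute names and the first row come from the description above, while the relation name, the remaining rows, and the helper `parse_arff` are assumptions made for illustration. The helper only copes with plain, unquoted, comma-separated values.

```python
# Abbreviated ARFF-style text: a header that declares the attributes,
# then one instance per line after the @data marker.
ARFF_TEXT = """\
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature {hot, mild, cool}
@attribute humidity {high, normal}
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
"""


def parse_arff(text):
    """Return (attribute_names, instances) from simple ARFF-style text.

    Hypothetical helper: handles only unquoted, comma-separated values.
    """
    names, instances, in_data = [], [], False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if in_data:
            values = [v.strip() for v in line.split(",")]
            instances.append(dict(zip(names, values)))  # one row = one instance
        elif line.lower().startswith("@attribute"):
            names.append(line.split()[1])  # second token is the attribute name
        elif line.lower().startswith("@data"):
            in_data = True
    return names, instances


attribute_names, instances = parse_arff(ARFF_TEXT)
print(attribute_names)  # the columns
print(instances[0])     # the first row
```

Each dictionary is one instance (a row), and its keys are the attributes (the columns).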

In Weka, attributes can be nominal or numeric. The value of a nominal attribute is represented by a word: sunny, overcast, and rainy for the outlook attribute; yes and no for the play attribute. As you might expect, the value of a numeric attribute is a number: 85, 72, 55.34, whatever.
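The difference between the two attribute types can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Weka’s own code: the `ATTRIBUTES` table and the `parse_value` helper are hypothetical names, and temperature is treated as numeric here purely to show both kinds side by side.

```python
# A nominal attribute lists the words it may take;
# a numeric attribute accepts any number.
ATTRIBUTES = {
    "outlook": ["sunny", "overcast", "rainy"],  # nominal
    "temperature": "numeric",                   # numeric
    "play": ["yes", "no"],                      # nominal
}


def parse_value(attribute, raw):
    """Convert raw text to a value, checking it against the attribute's type."""
    kind = ATTRIBUTES[attribute]
    if kind == "numeric":
        return float(raw)  # raises ValueError if raw is not a number
    if raw not in kind:
        raise ValueError(f"{raw!r} is not a legal value for {attribute}")
    return raw


print(parse_value("outlook", "sunny"))
print(parse_value("temperature", "85"))
```

A nominal value outside the declared list (say, "foggy" for outlook) is rejected, whereas any number is acceptable for a numeric attribute.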

The class of an instance is what you’re trying to predict

It’s one of the attributes. In the weather data the goal is to predict the value of the attribute play (yes or no) from the values of the other attributes outlook, temperature, humidity, and windy. Given these attribute values for the weather, should you play the game?
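As a small sketch of what “one of the attributes” means in practice, the Python below separates the class value from the other attribute values for the first weather instance described earlier. The function name `split_class` is an assumption introduced for this example.

```python
CLASS_ATTRIBUTE = "play"

# The first instance (day) of the weather data.
instance = {
    "outlook": "sunny",
    "temperature": "hot",
    "humidity": "high",
    "windy": "FALSE",
    "play": "no",
}


def split_class(instance, class_attribute=CLASS_ATTRIBUTE):
    """Separate an instance into its input attributes and its class value."""
    inputs = {
        name: value
        for name, value in instance.items()
        if name != class_attribute
    }
    return inputs, instance[class_attribute]


inputs, label = split_class(instance)
print(inputs)  # what the prediction is based on
print(label)   # what we are trying to predict
```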

The goal is to determine the class of new instances

The goal is to create a classifier – something that can determine the classes of instances. A classifier is a model – like some kind of a formula – that allows the class attribute to be determined from the other attributes. The classifier is produced automatically from a “training” dataset. “New” instances are ones that aren’t in the training set.
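The idea of producing a classifier automatically from training data can be sketched with a deliberately simple learner. The Python below is not one of Weka’s algorithms: it is a toy rule that looks at a single attribute and predicts the most common class seen for each of its values. The five-row training set is a small illustrative sample in the style of the weather data, and `train_one_attribute_rule` is a hypothetical name.

```python
from collections import Counter, defaultdict

# A small illustrative training set: each dictionary is one instance.
TRAINING_SET = [
    {"outlook": "sunny", "temperature": "hot", "humidity": "high", "windy": "FALSE", "play": "no"},
    {"outlook": "sunny", "temperature": "hot", "humidity": "high", "windy": "TRUE", "play": "no"},
    {"outlook": "overcast", "temperature": "hot", "humidity": "high", "windy": "FALSE", "play": "yes"},
    {"outlook": "rainy", "temperature": "mild", "humidity": "high", "windy": "FALSE", "play": "yes"},
    {"outlook": "rainy", "temperature": "cool", "humidity": "normal", "windy": "FALSE", "play": "yes"},
]


def train_one_attribute_rule(instances, attribute, class_attribute="play"):
    """Build a classifier that predicts the majority class per attribute value."""
    counts = defaultdict(Counter)
    for inst in instances:
        counts[inst[attribute]][inst[class_attribute]] += 1
    # For each value of the chosen attribute, remember its most common class.
    rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    # Fall back to the overall majority class for values never seen in training.
    default = Counter(i[class_attribute] for i in instances).most_common(1)[0][0]

    def classify(instance):
        return rule.get(instance[attribute], default)

    return classify


classifier = train_one_attribute_rule(TRAINING_SET, "outlook")

# A "new" instance: it has no class value and is not in the training set.
new_instance = {"outlook": "rainy", "temperature": "mild", "humidity": "normal", "windy": "FALSE"}
print(classifier(new_instance))
```

The returned `classify` function plays the role of the model: it maps the non-class attributes of an instance to a predicted class.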

This content is taken from The University of Waikato online course

Data Mining with Weka
