
How can you discretize numeric attributes?

Ian Witten introduces this week's first Big Question

Converting numeric attributes to nominal is called “discretization”.

But wait! Why would you want to do this? Well, for one thing, some machine learning methods only work on nominal attributes. For another, a model like a decision tree that branches on nominal values like very_big, big, medium, small, very_small may be easier to understand than one that uses numbers. Also, it may actually be better to determine split-points using global information from the whole dataset rather than individually for each branch of the tree.

Supervised discretization takes class information into account when determining the split-points, which might produce better split-points. But it raises a subtle but crucial issue about using class information when creating a model: what should you do when faced with test data that is (of course) completely unlabelled? This important issue transcends discretization, but it’s easier to grasp in the specific context of discretization.
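One way to see the issue is with a minimal sketch in plain Python (not Weka). It assumes a single numeric attribute, a single split-point, and an entropy-based criterion. The function names, the "low"/"high" labels and the toy data are all hypothetical. The split-point is chosen using the class labels of the training data only, then applied unchanged to test values whose labels are never consulted.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_split(values, labels):
    """Return the cut point that minimises the weighted class entropy of the two sides."""
    pairs = sorted(zip(values, labels))
    best_cut, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut is possible between two equal values
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_cut, best_score = cut, score
    return best_cut

def apply_split(values, cut):
    """Map numeric values to nominal ones using a cut point that is already fixed."""
    return ["low" if v <= cut else "high" for v in values]

# Hypothetical toy data: class labels exist for training instances only.
train_values = [64, 65, 68, 69, 70, 71, 72, 75]
train_labels = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
test_values = [66, 74]

cut = best_split(train_values, train_labels)   # uses training labels
test_nominal = apply_split(test_values, cut)   # uses no test labels
print(cut, test_nominal)
```

The point of the design is that `apply_split` never sees a class label: anything learned from labels has to come from the training data.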

At the end of this week you will be able to explain various discretization strategies: equal width and equal frequency; unsupervised and supervised. You will be able to discretize in a way that preserves the ordering information inherent in numeric attributes, even though the resulting nominal attributes have no intrinsic ordering. You will appreciate why pre-discretization might be better than building the same discretization method into a classifier—and why it might work the other way round! And you will be able to use Weka’s FilteredClassifier to fairly evaluate a classification method that involves supervised discretization.
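The two unsupervised strategies named above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Weka's implementation: the function names and toy data are hypothetical, the number of bins is assumed to be no larger than the number of values, and ties are not treated specially. The last function shows one common way to keep ordering information, encoding bin i of k as k-1 binary flags. The text above does not say which encoding it has in mind, so take that part as an assumption.

```python
def equal_width_cuts(values, bins):
    """Cut points that divide the range [min, max] into intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + width * i for i in range(1, bins)]

def equal_frequency_cuts(values, bins):
    """Cut points that put roughly the same number of values in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, bins):
        k = i * n // bins
        cuts.append((ordered[k - 1] + ordered[k]) / 2)  # midpoint between neighbours
    return cuts

def bin_index(value, cuts):
    """Index of the bin a value falls in: the number of cut points it exceeds."""
    return sum(value > c for c in cuts)

def ordered_binary(index, bins):
    """Encode a bin index as bins-1 flags; flag j answers 'is the bin above j?'."""
    return [1 if index > j else 0 for j in range(bins - 1)]

# Hypothetical toy attribute with one outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 100]

width_cuts = equal_width_cuts(values, 3)
freq_cuts = equal_frequency_cuts(values, 3)

print(width_cuts, [bin_index(v, width_cuts) for v in values])
print(freq_cuts, [bin_index(v, freq_cuts) for v in values])
print([ordered_binary(i, 3) for i in range(3)])
```

With this data the outlier stretches the equal-width bins so that almost every value lands in the first one, whereas equal-frequency binning spreads the values evenly across the three bins.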

This article is from the free online course More Data Mining with Weka.

Created by
FutureLearn - Learning For Life
