What happens when different errors have different costs?
In this course (and its predecessor, Data Mining with Weka), we’ve been obsessed with classification accuracy – the percentage of correct predictions on test data – as the measure of success.
But is it really the right thing to measure?
We touched on this when looking at document classification in Week 2, where we learned how to use “threshold curves” to show different tradeoffs between error types. But it’s not just document classification. In reality, when you look at the larger picture surrounding any deployment of machine learning, different errors almost always have different costs. Even with the weather: if today I undertake a certain activity because suitable weather is predicted, and that prediction is incorrect, I might suffer, even die. This (though thankfully not the dying part) happened to me just the other day, when I was sailing offshore in predicted 20-knot winds and was hit by an unforecast 50-knot gust. (It was exciting, and dangerous.) On the other hand, if unsuitable weather had been incorrectly predicted, I might just have spent the day playing music instead of sailing, in which case little would have been lost.
At the end of this week you will be able to take different error costs into account when doing machine learning. Of course, we won’t be able to help you ascertain what the costs are. But if you know the different costs of incorrect predictions, you’ll be able to take them into account when measuring performance. You’ll also know how the output of an ordinary classifier can be post-processed to take account of different error costs, and, as an alternative, how to build a classifier that takes account of the costs internally.
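To make the first two of these ideas concrete, here is a minimal sketch in plain Python (not Weka). It assumes a made-up cost matrix for the sailing example, in which sailing into bad weather costs ten times as much as needlessly staying ashore. The class names, cost values, probabilities and function names are all illustrative assumptions, not figures or methods from the course.

```python
# Illustrative sketch only: the cost values and probabilities are invented.
CLASSES = ["suitable", "unsuitable"]

# COST[actual][predicted]: the cost of predicting `predicted` when the
# weather turns out to be `actual`. Correct predictions cost nothing.
COST = {
    "suitable":   {"suitable": 0.0,  "unsuitable": 1.0},   # missed a day's sailing
    "unsuitable": {"suitable": 10.0, "unsuitable": 0.0},   # caught out at sea
}

def accuracy(actual, predicted):
    """Fraction of predictions that match the actual class."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def total_cost(actual, predicted):
    """Sum of the costs of all predictions, looked up in the cost matrix."""
    return sum(COST[a][p] for a, p in zip(actual, predicted))

def most_likely_class(probs):
    """What an ordinary classifier predicts: the most probable class."""
    return max(CLASSES, key=lambda c: probs[c])

def min_expected_cost_class(probs):
    """Post-process class probabilities: pick the prediction whose
    expected cost, averaged over the possible actual classes, is lowest."""
    def expected_cost(pred):
        return sum(probs[a] * COST[a][pred] for a in CLASSES)
    return min(CLASSES, key=expected_cost)

# Two sets of predictions with the same accuracy but different costs.
actual      = ["suitable", "unsuitable", "suitable", "unsuitable"]
predicted_a = ["suitable", "suitable",   "suitable", "unsuitable"]
predicted_b = ["unsuitable", "unsuitable", "suitable", "unsuitable"]

print(accuracy(actual, predicted_a), total_cost(actual, predicted_a))
print(accuracy(actual, predicted_b), total_cost(actual, predicted_b))

# A forecast that is 80% confident the weather is suitable.
probs = {"suitable": 0.8, "unsuitable": 0.2}
print(most_likely_class(probs))
print(min_expected_cost_class(probs))
```

Both sets of predictions are 75% accurate, yet under these assumed costs the first is ten times as expensive as the second. Likewise, with an 80% chance of suitable weather the most likely class is "suitable", but the expected cost of going out (0.2 × 10 = 2.0) exceeds that of staying in (0.8 × 1 = 0.8), so the cost-aware choice is "unsuitable".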