Diego Diaz

More Data Mining with Weka

The University of Waikato

This course introduced advanced data mining skills: methods for turning raw data into useful information. Participants learned how to use advanced facilities of the Weka data mining workbench to solve large-scale data mining problems across a variety of domains.

5 weeks, 4 hours per week

Ian H. Witten

Professor of Computer Science

The University of Waikato

Learning outcomes

  • Compare the performance of different mining methods on a wide range of datasets
  • Demonstrate how to set up learning tasks as a knowledge flow
  • Solve data mining problems on huge datasets
  • Apply equal-width and equal-frequency binning for discretizing numeric attributes
  • Identify the advantages of supervised vs unsupervised discretization
  • Evaluate different trade-offs between error rates in 2-class classification
  • Classify documents using various techniques
  • Debate the correspondence between decision trees and decision rules
  • Explain how association rules can be generated and used
  • Discuss techniques for representing, generating, and evaluating clusters
  • Perform attribute selection by wrapping a classifier inside a cross-validation loop
  • Describe different techniques for searching through subsets of attributes
  • Develop effective sets of attributes for text classification problems
  • Explain cost-sensitive evaluation, cost-sensitive classification, and cost-sensitive learning
  • Design and evaluate multi-layer neural networks
  • Assess the volume of training data needed for mining tasks
  • Calculate optimal parameter values for a given learning system

Syllabus


  • Running large-scale data mining experiments
  • Constructing and executing knowledge flows
  • Processing very large datasets
  • Analyzing collections of textual documents
  • Mining association rules
  • Preprocessing data using a range of filters
  • Automatic methods of attribute selection
  • Clustering data
  • Taking account of different decision costs
  • Producing learning curves
  • Optimizing learning parameters in data mining

Issued on 26th March 2018

