Online course

More Data Mining with Weka

Learn more about practical data mining, including how to deal with large data sets. Use advanced techniques to mine your own data.

This course is part of the Practical Data Mining program, which will enable you to become a data mining expert through three short courses.

Become an experienced data miner

This course introduces advanced data mining skills, following on from Data Mining with Weka. You’ll process a dataset with 10 million instances, mine a 250,000-word text dataset, and analyze a supermarket dataset representing 5000 shopping baskets. You’ll learn about filters for preprocessing data, attribute selection, classification, clustering, association rules, and cost-sensitive evaluation. You’ll also meet learning curves and learn to optimize learning parameters automatically. Weka originated at the University of Waikato in New Zealand, and Ian Witten has authored a leading book on data mining.

Hi! I'm Ian Witten from the beautiful University of Waikato in New Zealand, and I'd like to tell you about our new online course More Data Mining with Weka. It's an advanced version of Data Mining with Weka, and if you liked that, you'll love the new course. It's the same format, the same software, the same learning by doing. The aim is the same, as well, to enable you to use advanced techniques of data mining to process your own data and understand what you're doing.

You don't need to have actually completed the old course in order to embark on the new one, but we won't be covering things again, so you will need to know something about data mining and the Weka machine learning workbench. The course has short, 5-10 minute video lessons. Slides and captions are available, as well. As before, Weka will be a laboratory for you to learn the practice and the principles of advanced data mining. Each lesson is followed by a carefully designed activity that reinforces what you learned in the lesson. You're going to do most of your learning actually doing the activities. You won't learn by listening to me talking or watching me do things, you'll learn by doing stuff yourself.

More Data Mining with Weka, coming soon to a computer near you! Hope to see you there!

What topics will you cover?

  • Running large-scale data mining experiments
  • Constructing and executing knowledge flows
  • Processing very large datasets
  • Analyzing collections of textual documents
  • Mining association rules
  • Preprocessing data using a range of filters
  • Automatic methods of attribute selection
  • Clustering data
  • Taking account of different decision costs
  • Producing learning curves
  • Optimizing learning parameters in data mining

What will you achieve?

By the end of the course, you'll be able to...

  • Compare the performance of different mining methods on a wide range of datasets
  • Demonstrate how to set up learning tasks as a knowledge flow
  • Solve data mining problems on huge datasets
  • Apply equal-width and equal-frequency binning for discretizing numeric attributes
  • Identify the advantages of supervised vs unsupervised discretization
  • Evaluate different trade-offs between error rates in 2-class classification
  • Classify documents using various techniques
  • Debate the correspondence between decision trees and decision rules
  • Explain how association rules can be generated and used
  • Discuss techniques for representing, generating, and evaluating clusters
  • Perform attribute selection by wrapping a classifier inside a cross-validation loop
  • Describe different techniques for searching through subsets of attributes
  • Develop effective sets of attributes for text classification problems
  • Explain cost-sensitive evaluation, cost-sensitive classification, and cost-sensitive learning
  • Design and evaluate multi-layer neural networks
  • Assess the volume of training data needed for mining tasks
  • Calculate optimal parameter values for a given learning system
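
The course itself involves no programming, and Weka applies discretization through its filters. Purely as an illustration of the difference between the equal-width and equal-frequency binning named in the list above, here is a minimal Python sketch. The function names and the sample data are hypothetical and are not part of Weka or the course materials. The equal-frequency version assigns bins by rank, so tied values may be split across bins, which a real implementation would likely handle differently.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k bins of equal width over [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into the first bin.
        return [0] * len(values)
    width = (hi - lo) / k
    # The maximum value would index bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts."""
    n = len(values)
    # Indices of the values in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        # Map rank 0..n-1 evenly onto bins 0..k-1.
        bins[i] = rank * k // n
    return bins


# Hypothetical sample with one outlier.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
width_result = equal_width_bins(sample, 3)
frequency_result = equal_frequency_bins(sample, 3)
print(width_result)
print(frequency_result)
```

With an outlier such as 100 in the sample, equal-width binning pushes almost every value into the first bin, while equal-frequency binning spreads the values evenly across the bins.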

Who is the course for?

This course is aimed at anyone who deals in data. It follows on from Data Mining with Weka, and you should have completed that first (or have otherwise acquired a rudimentary knowledge of Weka). As with the previous course, it involves no computer programming, although you need some experience with using computers for everyday tasks. High-school maths is more than enough; some elementary statistics concepts (means and variances) are assumed.

What software or tools do you need?

Before the course starts, download the free Weka software. It runs on any computer, under Windows, Linux, or Mac. It has been downloaded millions of times and is being used all around the world.

(Note: Depending on your computer and system version, you may need admin access to install Weka.)

Who will you learn with?

Ian Witten

I grew up in Ireland, studied at Cambridge, and taught computer science at the Universities of Essex in England and Calgary in Canada before moving to paradise (aka New Zealand) 25 years ago.

Who developed the course?

Sitting among the top 3% of universities world-wide, The University of Waikato prepares students to think critically and to show initiative in their learning.
