More Data Mining with Weka

Enhance your skills in practical data mining as you get to grips with using large data sets and advanced data mining techniques.

12,999 enrolled on this course

Learn how to process, analyse, and model large data sets

On this course, led by the University of Waikato where Weka originated, you’ll be introduced to advanced data mining techniques and skills.

Following on from the first course, Data Mining with Weka, you’ll learn to process a dataset with 10 million instances and mine a 250,000-word text dataset.

You’ll analyse a supermarket dataset representing 5000 shopping baskets and learn about filters for preprocessing data, selecting attributes, classification, clustering, association rules, and cost-sensitive evaluation.

You’ll also explore learning curves and how to automatically optimize learning parameters.

Hi! I’m Ian Witten from the beautiful University of Waikato in New Zealand, and I’d like to tell you about our new online course More Data Mining with Weka. It’s an advanced version of Data Mining with Weka, and if you liked that, you’ll love the new course. It’s the same format, the same software, the same learning by doing. The aim is the same as well: to enable you to use advanced techniques of data mining to process your own data and understand what you’re doing.

You don’t need to have actually completed the old course in order to embark on the new one, but we won’t be covering things again, so you will need to know something about data mining and the Weka machine learning workbench. The course has short, 5-10 minute video lessons. Slides and captions are available as well. As before, Weka will be a laboratory for you to learn the practice and the principles of advanced data mining. Each lesson is followed by a carefully designed activity that reinforces what you learned in the lesson. You’re going to do most of your learning actually doing the activities. You won’t learn by listening to me talking or watching me do things; you’ll learn by doing stuff yourself.

More Data Mining with Weka, coming soon to a computer near you! Hope to see you there!


  • Week 1

    Exploring Weka's interfaces, and working with big data

    • Hello again

      This practical course on more advanced data mining follows on from Data Mining with Weka. You'll become an expert Weka user, and pick up many new techniques and principles of data mining along the way.

    • What are Weka's other interfaces for?

      Each week we’ll focus on a couple of “Big Questions” relating to data mining. This is the first Big Question for this week.

    • Exploring the Experimenter

      You can use the Experimenter to find the performance of classification algorithms on datasets, or to determine whether one classifier performs better (or runs faster) than another. In the Explorer, such things can be tedious.

    • Comparing classifiers

      The Experimenter can be used to compare classifiers. The "null hypothesis" is that they perform the same. To show that one is better than the other, we must *reject* this hypothesis at a given level of statistical significance.

    • The Knowledge Flow interface

      The Knowledge Flow interface is an alternative to the Explorer. You can lay out filters, classifiers, and evaluators on a 2D canvas ... and connect them up in different ways. Data and classification models flow through the diagram!

    • Using the Command Line

      You can do everything the Explorer does (and more) from the command line. One advantage is that you get more control over memory usage. To access the definitive source of Weka documentation you need to learn to use JavaDoc.

    • Can Weka process big data?

      This week's second Big Question!

    • Working with big data

      The Explorer can handle pretty big datasets, but it has limits. However, the Command Line Interface does not: it works incrementally whenever it can. Some classifiers can handle arbitrarily large datasets.
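
The idea behind "Comparing classifiers" in Week 1 (rejecting the null hypothesis that two classifiers perform the same) can be illustrated with a plain paired t-test. This is a minimal Python sketch, not Weka code: the accuracy figures are hypothetical, the function name is made up for illustration, and Weka's Experimenter may apply a corrected variant of the test rather than this textbook form.

```python
import math


def paired_t_statistic(scores_a, scores_b):
    """Return the paired t statistic for two equal-length lists of scores."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if variance == 0:
        raise ValueError("differences are constant; t statistic is undefined")
    return mean / math.sqrt(variance / n)


# Hypothetical accuracies of two classifiers on the same 5 folds.
acc_a = [0.80, 0.82, 0.78, 0.81, 0.79]
acc_b = [0.75, 0.78, 0.74, 0.77, 0.76]

t = paired_t_statistic(acc_a, acc_b)

# Two-sided critical value of Student's t for 4 degrees of freedom at the 5% level.
CRITICAL_T_4DF = 2.776
print(f"t = {t:.3f}, reject null hypothesis: {abs(t) > CRITICAL_T_4DF}")
```

If the absolute t value exceeds the critical value, the difference is significant at the chosen level; otherwise the null hypothesis stands.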

  • Week 2

    Discretization and text classification

    • How can you discretize numeric attributes?

      This week's first Big Question!

    • Discretizing numeric attributes

      There are two basic methods for converting numeric attributes to nominal: equal-width binning and equal-frequency binning. A third method is able to preserve the ordering information inherent in numeric values.

    • Supervised discretization

      "Supervised" discretization takes the class into account when setting discretization boundaries. But when testing, the boundaries must be determined from the *training* set. You can do this with Weka's FilteredClassifier.

    • Discretization in J48

      Some classifiers (e.g. C4.5/J48) incorporate discretization internally, as they go along. But pre-discretization may outperform internal discretization. Whether it does or not is an experimental question!

    • How do you classify documents?

      This week's second Big Question!

    • Document classification

      In Weka, documents are represented as "string" attributes, and the StringToWordVector filter creates one attribute for each word. But the overall classification accuracy isn't necessarily what we really care about.

    • Evaluating 2-class classification

      Threshold curves show different tradeoffs between error types. "Receiver Operating Characteristic" (ROC) curves are a particular type of threshold curve, and the area under the ROC curve measures a classifier's overall quality.

    • Multinomial Naive Bayes

      Multinomial Naive Bayes is a classification method designed for text, which is generally better, and a lot faster, than plain Naive Bayes. In addition, the StringToWordVector filter has many useful options.

    • How are you getting on?

      We're well into the course now. Let's just take stock.
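
The two unsupervised methods named under "Discretizing numeric attributes" in Week 2 can be sketched in a few lines of Python. This is an illustrative sketch only, not Weka's Discretize filter: the function names and sample values are assumptions, and the equal-frequency version splits purely by rank, so tied values may land in different bins.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(values)  # all values identical: a single bin
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign values to n_bins bins holding (roughly) equal numbers of values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


# Hypothetical skewed attribute: most values are small, two are large.
values = [1, 2, 3, 4, 100, 200]
print(equal_width_bins(values, 3))      # bins by value range
print(equal_frequency_bins(values, 3))  # bins by rank
```

On skewed data like this, equal-width binning crowds most values into the first bin, while equal-frequency binning spreads them evenly across bins.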

  • Week 3

    Classification rules, association rules, and clustering

    • Is it better to generate rules or trees?

      This week's first Big Question!

    • Decision trees and rules

      Any decision tree has an equivalent set of rules ... and for any set of rules there's an equivalent decision tree. But the complexities may be very different – particularly if the rules are to be executed in a predetermined order.

    • Generating decision rules

      "PART" makes good rules by repeatedly creating partial decision trees. Incremental reduced-error pruning is a standard pruning technique. "Ripper" follows this by complex optimization to make very small rule sets.

    • What if there's no "class" attribute?

      This week's second Big Question!

    • Association rules

      Instead of predicting a "class", association rules describe relations between any of the attributes. Support and Confidence are basic measures of a rule. "Apriori" is the standard association-rule-learning algorithm.

    • Learning association rules

      Apriori's strategy is to specify a minimum Confidence, and iteratively reduce Support until enough rules are found. It generates high-support "item sets" and turns them into rules.

    • Representing clusters

      With clustering, there is no "class" attribute. Instead of predicting the class, we try to divide the instances into natural groups, or "clusters". There are many different representations for clusters.

    • Evaluating clusters

      It's hard to evaluate clustering, except perhaps by visualization. Different clustering algorithms use different metrics for optimization. If the dataset has a "class" attribute, you can do "classes to clusters" evaluation.
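
Support and Confidence, the two basic measures mentioned under "Association rules" in Week 3, can be computed directly from a list of baskets. The sketch below is plain Python with made-up baskets and function names. It takes support to be the number of baskets containing every item in a rule, and confidence to be that count divided by the number of baskets containing the antecedent; it does not implement the Apriori search itself.

```python
def support(baskets, items):
    """Count the baskets that contain every item in `items`."""
    wanted = set(items)
    return sum(1 for basket in baskets if wanted <= set(basket))


def confidence(baskets, antecedent, consequent):
    """Fraction of baskets containing the antecedent that also contain the consequent."""
    covered = support(baskets, antecedent)
    if covered == 0:
        return 0.0
    both = support(baskets, set(antecedent) | set(consequent))
    return both / covered


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(f"bread => milk: support {rule_support}, confidence {rule_confidence:.2f}")
```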

  • Week 4

    Selecting attributes and counting the cost

    • How about selecting key attributes before applying a classifier?

      This week's first Big Question!

    • "Wrapper" attribute selection

      Fewer attributes often yield better performance! The "wrapper" method of attribute selection involves both an attribute evaluator and a search method. A classifier, wrapped inside a cross-validation loop, is used for evaluation.

    • The Attribute Selected Classifier

      Experimenting with a dataset to select attributes and applying a classifier to the result is cheating! – even when evaluating by cross-validation. The AttributeSelectedClassifier selects attributes based on the training set only.

    • Scheme-independent selection

      The "wrapper" method for evaluating attributes is slow. "Scheme-independent" methods that do not depend on a particular classifier can be faster. However, searching is still involved whenever you evaluate subsets of attributes.

    • Attribute selection using ranking

      Evaluating attributes individually is much faster than evaluating subsets. Single-attribute methods, often based on particular machine learning methods, can eliminate irrelevant attributes – but not redundant ones.

    • What happens when different errors have different costs?

      This week's second Big Question!

    • Counting the cost

      If different errors have different costs, the "classification rate" is inappropriate. Cost-sensitive evaluation takes account of cost when measuring performance. Cost-sensitive classification takes account of cost during learning.

    • Cost-sensitive classification

      A classifier can be made cost-sensitive by re-calculating internal probability thresholds to adjust its output; alternatively the classifier itself can be reimplemented to take account of the cost matrix.
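
The distinction drawn in Week 4 between cost-sensitive evaluation and cost-sensitive classification can be made concrete with a small Python sketch. It assumes a hypothetical 2-class cost matrix indexed as `cost[actual][predicted]` and made-up class probabilities. Adjusting the output means predicting the class with the lowest expected cost rather than the most probable class. This is not Weka's CostSensitiveClassifier, only the underlying arithmetic.

```python
def min_expected_cost_class(probs, cost):
    """Return the predicted class with the lowest expected cost.

    probs[a] is the estimated probability of actual class a;
    cost[a][p] is the cost of predicting p when the actual class is a.
    """
    n = len(probs)
    expected = [sum(probs[a] * cost[a][p] for a in range(n)) for p in range(n)]
    return min(range(n), key=lambda p: expected[p])


def total_cost(actual, predicted, cost):
    """Cost-sensitive evaluation: sum the cost of every prediction."""
    return sum(cost[a][p] for a, p in zip(actual, predicted))


# Hypothetical costs: missing class 1 is five times worse than a false alarm.
COST = [[0, 1],
        [5, 0]]

# Hypothetical probability estimates for three instances, and their true classes.
probabilities = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]]
actual = [0, 1, 1]

most_probable = [max(range(2), key=lambda c: p[c]) for p in probabilities]
cost_aware = [min_expected_cost_class(p, COST) for p in probabilities]

print("most probable:", most_probable, "cost", total_cost(actual, most_probable, COST))
print("cost aware:   ", cost_aware, "cost", total_cost(actual, cost_aware, COST))
```

With these costs the second instance flips from class 0 to class 1, because a 30% chance of a cost-5 error outweighs a 70% chance of a cost-1 error.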

  • Week 5

    Neural networks, learning curves, and performance optimization

    • What are "neural networks" and how can I use them?

      This week's first Big Question!

    • Simple neural networks

      The "Perceptron" is the simplest form of neural network. The basic perceptron implements a linear decision boundary ... though modern improvements allow more complex boundaries.

    • Multilayer perceptrons

      Multilayer perceptrons are networks of perceptrons, with an input layer, an output layer, and (perhaps many) "hidden layers". They can implement arbitrary decision boundaries, but have practical limitations.

    • How much training data do I need? And how do I optimize all those parameters?

      This week's second Big Question! (two questions, really)

    • Learning curves

      Find out how much data you need by plotting a learning curve using the "resample" filter (which allows sampling with or without replacement). You can avoid sampling the test set by using the FilteredClassifier.

    • Performance optimization

      Weka has several "wrapper" metalearners that optimize parameters for best performance: CVParameterSelection, GridSearch, and ThresholdSelector. You should avoid optimizing parameters manually: you're bound to overfit!

    • ARFF and XRFF

      The ARFF format can encode sparse data, weighted instances, and relational attributes. Some Weka filters and classifiers take advantage of sparsity to reduce space and increase speed. There's an XML version of ARFF, called XRFF.

    • There's no magic in data mining

      There's no magic in data mining – no universal "best" method. It's an experimental science. You've learned a lot – but there's plenty more! Data mining is a powerful technology: please use it wisely.

    • Farewell

      It's time to say goodbye again.
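
The basic perceptron described under "Simple neural networks" in Week 5 learns a linear decision boundary by nudging its weights whenever it misclassifies an instance. The Python sketch below shows that update rule on a tiny, linearly separable toy problem (logical OR with labels -1 and +1). The function names, data, and epoch limit are assumptions made for illustration; it is not how Weka implements its perceptron-based classifiers.

```python
def train_perceptron(examples, max_epochs=100):
    """Train a perceptron on (features, label) pairs with labels in {-1, +1}."""
    n_features = len(examples[0][0])
    weights = [0.0] * (n_features + 1)  # index 0 holds the bias weight
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            x = [1.0] + list(features)  # constant 1.0 input for the bias
            activation = sum(w * xi for w, xi in zip(weights, x))
            if label * activation <= 0:  # wrong side of (or on) the boundary
                weights = [w + label * xi for w, xi in zip(weights, x)]
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: converged
            break
    return weights


def predict(weights, features):
    """Classify by which side of the linear boundary the instance falls on."""
    x = [1.0] + list(features)
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else -1


# Toy linearly separable data: logical OR.
OR_DATA = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights = train_perceptron(OR_DATA)
predictions = [predict(weights, features) for features, _ in OR_DATA]
print(predictions)
```

A single perceptron like this cannot separate data such as XOR, which is one motivation for the multilayer perceptrons covered in the same week.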

Learning on this course

On every step of the course you can meet other learners, share your ideas and join in with active discussions in the comments.

What will you achieve?

By the end of the course, you’ll be able to...

  • Compare the performance of different mining methods on a wide range of datasets
  • Demonstrate how to set up learning tasks as a knowledge flow
  • Solve data mining problems on huge datasets
  • Apply equal-width and equal-frequency binning for discretizing numeric attributes
  • Identify the advantages of supervised vs unsupervised discretization
  • Evaluate different trade-offs between error rates in 2-class classification
  • Classify documents using various techniques
  • Debate the correspondence between decision trees and decision rules
  • Explain how association rules can be generated and used
  • Discuss techniques for representing, generating, and evaluating clusters
  • Perform attribute selection by wrapping a classifier inside a cross-validation loop
  • Describe different techniques for searching through subsets of attributes
  • Develop effective sets of attributes for text classification problems
  • Explain cost-sensitive evaluation, cost-sensitive classification, and cost-sensitive learning
  • Design and evaluate multi-layer neural networks
  • Assess the volume of training data needed for mining tasks
  • Calculate optimal parameter values for a given learning system

Who is the course for?

This course is aimed at anyone who deals in data professionally or is interested in furthering their professional or academic skills in data science.

This course follows on from Data Mining with Weka and it’s recommended that you complete that course first unless you already have a rudimentary knowledge of Weka.

As with the previous course, it involves no computer programming, although you need some experience with using computers for everyday tasks.

High school maths is more than enough; some elementary statistics concepts (means and variances) are assumed.

What software or tools do you need?

Before the course starts, download the free Weka software. It runs on any computer, under Windows, Linux, or Mac. It has been downloaded millions of times and is being used all around the world.

(Note: Depending on your computer and system version, you may need admin access to install Weka.)

Who will you learn with?

I grew up in Ireland, studied at Cambridge, and taught computer science at the Universities of Essex in England and Calgary in Canada before moving to paradise (aka New Zealand) 25 years ago.

Who developed the course?

The University of Waikato

Sitting among the top 3% of universities world-wide, The University of Waikato prepares students to think critically and to show initiative in their learning.

  • Location

    Waikato, New Zealand
  • World ranking

    Top 380 (Source: QS World University Rankings 2021)

What's included?

This is a premium course. These courses are designed for professionals from specific industries looking to learn with a smaller group of like-minded individuals.

  • Unlimited access to this course
  • Includes any articles, videos, peer reviews and quizzes
  • Tests to validate your learning
  • Certificate of Achievement to prove your success when you're eligible
  • Download and print your Certificate of Achievement anytime

Learning on FutureLearn

Your learning, your rules

  • Courses are split into weeks, activities, and steps to help you keep track of your learning
  • Learn through a mix of bite-sized videos, long- and short-form articles, audio, and practical activities
  • Stay motivated by using the Progress page to keep track of your step completion and assessment scores

Join a global classroom

  • Experience the power of social learning, and get inspired by an international network of learners
  • Share ideas with your peers and course educators on every step of the course
  • Join the conversation by reading, @ing, liking, bookmarking, and replying to comments from others

Map your progress

  • As you work through the course, use notifications and the Progress page to guide your learning
  • Whenever you’re ready, mark each step as complete; you’re in control
  • Complete 90% of course steps and all of the assessments to earn your certificate

You can use the hashtag #FLmoredatamining to talk about this course on social media.