About this course
This course introduces you to practical data mining using the Weka workbench. We explain the basic principles of several popular algorithms and how to use them in practical applications. The aim of the course is to dispel the mystery that surrounds data mining. After completing it you will be able to mine your own data – and understand what it is that you are doing!
Teachers open the door. You enter by yourself. (Chinese proverb)
The course is structured over five weeks:
- Week 1: A little bit of everything
- Week 2: Evaluation
- Week 3: Simple classifiers
- Week 4: More classifiers
- Week 5: Putting it all together
Each week focuses on a “Big Question.” For example, Week 1’s Big Question is: What’s it like to do data mining? The week covers a handful of activities that together address the question. Each activity comprises:
- A 5-10 minute video
- A quiz. But no ordinary quiz! To answer the questions you have to undertake some practical data mining task. You don’t learn by watching someone talk; you learn by actually doing things! The quizzes give you an opportunity to do a lot of data mining.
I hear and I forget. I see and I remember. I do and I understand. (Confucius)
- Mid-class test at the end of Week 2
- Post-class test at the end of Week 5
This week …
In Week 1 you will get started with data mining. You will install Weka, explore its interface, explore some data sets, build a classifier, interpret the output, use filters, and visualize data sets. At the end of the week you will know what it’s like to do data mining!
Support for language learners
If English is not your first language – and even if it is! – you might be interested in trying F-Lingo, an experimental system designed to help you learn about selected words, phrases, and concepts used in the course. At present it only works in the Chrome browser. To try it out, download F-Lingo from the Chrome store and install it. Restart your browser and visit any page of the course; the rest happens automatically.
F-Lingo was developed by Jemma Konig as part of her PhD project, which I am supervising. Using it will help her gather experimental usage data for her research. If you want to see what F-Lingo does without installing it, this 3-minute video illustrates its facilities.
- Video editing, Peter Oliver and Louise Hutt
- Captions, Jennifer Whisler
- Music: Mozart’s Divertimento No. 2, Allegro, performed by Woodside Clarinets: Paul King, Sarah Shieff and Ian Witten
- Share what you are learning, including difficulties, problems, and solutions, with others in the class in a weekly discussion focused on that week’s Big Question
- Other discussions from time to time
- Transcripts are supplied for all videos
- Slides for all videos can be downloaded as a PDF file
You will download and install the free Weka software during Week 1. It runs on Windows, Linux, and Mac computers. It has been downloaded millions of times and is used all around the world.
(Note: Depending on your computer and system version, you may need admin access to install Weka.)
You need no programming experience for this course. No math is required either, though some high-school statistical concepts are used (means and variances, and possibly confidence intervals).
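If you would like a quick refresher on those statistical concepts, here is an optional sketch in Python. It is not part of the course materials and you do not need to run it. The accuracy figures are made up for illustration, the function name `summarize` is hypothetical, and the confidence interval uses the simple normal approximation (z = 1.96 for roughly 95%), which is only a rough guide for a sample this small.

```python
import math
import statistics


def summarize(values, z=1.96):
    """Return the mean, the sample variance, and an approximate
    confidence interval for the mean (normal approximation)."""
    n = len(values)
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # sample variance: divides by n - 1
    half_width = z * math.sqrt(variance / n)  # z times the standard error
    return mean, variance, (mean - half_width, mean + half_width)


# Hypothetical accuracy figures (in percent) from five repeated experiments.
accuracies = [90.0, 92.0, 94.0, 96.0, 98.0]

mean, variance, (low, high) = summarize(accuracies)
print(f"mean = {mean:.1f}")
print(f"variance = {variance:.1f}")
print(f"approximate 95% interval = ({low:.2f}, {high:.2f})")
```

The mean tells you the typical value, the variance tells you how much the values spread around it, and the interval gives a plausible range for the true average given that spread.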