Ian Witten

I grew up in Ireland, studied at Cambridge, and taught computer science at the Universities of Essex in England and Calgary in Canada before moving to paradise (aka New Zealand) 25 years ago.

Location New Zealand

Activity

Ian Witten replied to Dinda Veska

The glass data

08 SEP 2021

Yes, Weka can read CSV files; see Q.3 of the quiz that follows (Step 1.17).
Ian Witten replied to Agustiah Agustiah

Cross-validation results

16 AUG 2021

Darn. Guess you're missing the best part of the course :-)
Ian Witten replied to Agustiah Agustiah

Installing Weka

11 JUL 2021

yes
Ian Witten replied to Seren Evans

Installing Weka: preview

16 JUN 2021

See here, Step 1.9: https://www.futurelearn.com/courses/data-mining-with-weka/9/steps/796485
Ian Witten replied to Tristen Fielding

Cross-validation results

21 APR 2021

You are correct. This weird result is only true for this particular dataset.
Ian Witten replied to n'nancocquot chrystelle kouassi

Welcome! Please introduce yourself

13 APR 2021

Great! But please note, this is an advanced course. If you haven't done it already, you might be better signing up for the introductory course Data Mining with Weka (https://www.futurelearn.com/courses/data-mining-with-weka/) instead.
Ian Witten replied to [Learner left FutureLearn]

Reflect on this week's Big Question

05 APR 2021

What exactly is your question? If this https://www.futurelearn.com/comments/58046078 is it then the answer is "no".
Ian Witten replied to Luiz Jacob

How are you getting on?

01 APR 2021

My understanding is that once the course opens, all five weeks become available immediately when you join the course. Is this not the case for you? (Since I am an instructor rather than a student, it's possible the interface works differently for me.)

Otherwise, I don't understand your question.
Ian Witten replied to Angel Perez

How would you apply this in real life?

19 FEB 2021

I just tried it and can confirm that the quiz answer is correct. And you are looking in the correct place. Try restarting Weka and doing the experiment again.
Ian Witten replied to [Learner left FutureLearn]

Using a filter

17 FEB 2021

!ʇhgir s’tahT
Ian Witten replied to Angel Perez

Building a classifier

17 FEB 2021

I believe I have fixed this now.
Ian Witten replied to Peter Rossler

Pitfalls and pratfalls

10 FEB 2021

The plant is, in fact, real (as is my hair). In New Zealand we don't need artificial plants.
Ian Witten replied to Stephanie Williams

Reflect on this week's Big Question

22 JAN 2021

Reminds me (for some reason) of "rubber duck debugging", a method of debugging code whose name refers to a story about a programmer who carried around a rubber duck and debugged their code by forcing themselves to explain it, line-by-line, to the duck. (Don't tell your husband.)
Ian Witten replied to Robert Gillespie

Analyzing functional MRI Neuroimaging data

16 DEC 2020

@RobertGillespie I just checked this, and I believe the answer is correct as it stands.
Ian Witten replied to Robert Gillespie

Reflect on this week’s Big Question

16 DEC 2020

@RobertGillespie You are correct, it's an error. I've fixed it. Thanks for pointing this out.
Ian Witten replied to Clarke Bacharach

Repeated training and testing

09 DEC 2020

It's a little clunky, but the quickest way is to select another classifier (or filter) and then re-select the one you want.
Ian Witten replied to Nahuel Bargas

The data mining challenge: An expert speaks

18 NOV 2020

Thanks for pointing this out. I've made them available on our own (Waikato University) computer, and changed the links appropriately.
Ian Witten replied to Anne C

The data mining process

27 OCT 2020

Click the pink link in the sentence "Download the regression_outliers.csv dataset and open it with Weka." in the quiz instructions. (And note the two points that follow in those instructions.)
Ian Witten replied to Robert Gillespie

Reflect on this week's Big Questions

14 OCT 2020

@RobertGillespie I checked, and the given answer is correct. I think you might not be constraining XMeans to 2 clusters only. Its default is maxNumClusters=4, minNumClusters=2; and you should change maxNumClusters to 2 – otherwise the result is as you describe.
Ian Witten replied to Robert Gillespie

Performance of the multilayer perceptron

14 OCT 2020

Yes, there's plenty of scope for more experimentation. Give it a go!
Ian Witten replied to paul martin

Farewell

07 OCT 2020

The follow-up course is running right now, More Data Mining with Weka (https://www.futurelearn.com/courses/more-data-mining-with-weka).
Ian Witten replied to Teresa Franco

Comparing classifiers

07 OCT 2020

@TeresaFranco Thanks for pointing this out; I've fixed it (stupid cut-and-paste error).
Ian Witten replied to Amanda Bluett

How well did you do?

23 AUG 2020

I'm not using Catalina myself, but I've talked to others who do and they report no problems regarding Weka.

You mention "lots of little error boxes": Weka doesn't normally pop up lots of boxes; it would be nice to know what's in them :-)

If you've worked with earlier versions of Weka previously, it may be worth removing the "wekafiles" folder in your...
Ian Witten replied to Richard Storey

Reflect on this week's Big Question

09 JUL 2020

Sorry you feel like that; hope it doesn't put you off the course. Did you manage to complete the quiz anyway?
Ian Witten replied to Tatiparti Padma

What about real-life classification methods?

07 JUL 2020

Please re-read step 3.15 Using Weka in practice: some questions (https://www.futurelearn.com/courses/data-mining-with-weka/7/steps/658023)
Ian Witten replied to Manish Pandey

More weather

08 JUN 2020

If only life were so easy ...
Ian Witten replied to Lorna Johnson

Classification by regression

03 JUN 2020

The aim of the quizzes is not so much to test your knowledge as to help with your learning. Sounds like it's working in your case :-) By the way, it's fine to look at the answers!
Ian Witten replied to Nicolas Brookes

Be a classifier!

02 JUN 2020

@NicolasBrookes I'm sorry you're giving up. But don't blame the Mac – Weka works perfectly well on a Mac; I use one all the time.
Ian Witten replied to Ojas Khandelwal

What could possibly go wrong?

28 MAY 2020

I have no idea what the problem is, and you don't mention what the error is – though I'm not sure it would help me to know.

Tens of thousands of people have installed the user classifier without reporting any problems, so I would guess it's some kind of network issue.

I'd recommend you try again when network loading is light. And if you can't install it,...
Ian Witten replied to Robert Gillespie

Invoking Weka from Python

28 MAY 2020

@MarkGlover: "Python 2.7 is obsolete ..." -- thanks for reminding me! I'll make a note below the video, and remove references to Python 2.7.
Ian Witten replied to Nicolas Brookes

Be a classifier!

27 MAY 2020

Do you know where Weka has been installed? All ARFF files should be in a folder called "data" within the weka-3-8-4 folder.
Ian Witten replied to ismael salam

What could possibly go wrong?

27 MAY 2020

Please try this: locate the folder called "wekafiles", which should be in your home directory, and remove it and all its contents. (This is where Weka puts the packages.)

Good for Google Translate :-)
Ian Witten replied to Cathal King

Farewell

26 MAY 2020

I have no idea, unfortunately. That's a FutureLearn question; I'm just the educator :-)

But congratulations anyway.
Ian Witten replied to Gamal Akabani

How would you apply this in real life?

26 MAY 2020

Precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances, while Recall (also known as sensitivity) is the fraction of the total amount of relevant instances that were actually retrieved.

Both are shown in the Weka classifier output, e.g. (for ZeroR on the Iris data):

=== Detailed Accuracy...
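
As a rough illustration of the two definitions above, here is a minimal Python sketch that computes precision and recall for one class from lists of actual and predicted labels. It is not Weka code: the function name `precision_recall` and the toy labels are made up for this example, and returning `None` for an undefined ratio is just one possible convention.

```python
def precision_recall(actual, predicted, positive):
    """Return (precision, recall) for the class `positive`.

    Hypothetical helper for illustration only; not part of Weka.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    # A ratio with a zero denominator is undefined; report it as None.
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


# Toy data (made up): 3 actual "yes" instances, 3 predicted "yes".
actual = ["yes", "yes", "no", "no", "yes"]
predicted = ["yes", "no", "no", "yes", "yes"]
print(precision_recall(actual, predicted, "yes"))
```

A classifier that never predicts the class of interest retrieves nothing, so its precision for that class is undefined while its recall is zero.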
Ian Witten replied to Ken di

Problems with probabilities?

26 MAY 2020

The NaiveBayesMultinomial classifier, which (as you will learn about in Week 2 of the follow-up course More Data Mining with Weka) is used for text mining, is based on the multinomial model, which is a generalization of the binomial model.
Ian Witten replied to Brent U

Pitfalls and pratfalls

26 MAY 2020

Yes, it's not so important provided you get the idea.

See my response below (at https://www.futurelearn.com/comments/43942319) for what I think are the 8 outliers.
Ian Witten replied to Vivian Wagumba

Pitfalls and pratfalls

26 MAY 2020

@VivianWagumba Look at the Visualization tab and select the first of the four plots, which is X: year vs Y: phone calls. If you double-click a data point you get instance information, including the instance number. Instances 15, 16, 17, 18, 19 and 20 are clear outliers. Less obvious are instances 21 and (particularly) 14. To see that these are outliers,...
Ian Witten replied to Hawwau Moruf

How are you getting on?

26 MAY 2020

The difference is whether you're mining "data" (typically in spreadsheet-like tabular format) or text (typically a plain text file). Text mining is covered in Week 2 of the follow-up course, More Data Mining with Weka.
Ian Witten replied to Dragana Dj. Jeremic

Using Weka in practice: some questions

26 MAY 2020

The manual is in a file called WekaManual.pdf that appears in the weka-3-8-4 folder when you download Weka. On my Mac that's /Applications/weka-3-8-4; on Windows I guess weka-3-8-4 is in the Program Files folder.
Ian Witten replied to Katie Terrell

Installing Weka

25 MAY 2020

@KatieTerrell: rename the file weather.arff.txt to weather.arff.
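
If you prefer to do the rename from a script, a minimal Python sketch might look like this. The helper name `strip_txt_suffix` is made up for this example, and it assumes the file is somewhere you have permission to rename it.

```python
from pathlib import Path


def strip_txt_suffix(path):
    """Rename e.g. weather.arff.txt to weather.arff and return the new path.

    Hypothetical helper for illustration only.
    """
    path = Path(path)
    if path.suffix != ".txt":
        return path  # nothing to strip
    target = path.with_suffix("")  # drops only the final ".txt"
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    path.rename(target)
    return target


# Example call (adjust the path to wherever the file was saved):
# strip_txt_suffix("weather.arff.txt")
```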
Ian Witten replied to Coby Beck

Linear regression

21 MAY 2020

You will soon (Step 4.8) learn about Logistic Regression, which does (something like?) what I think you are describing. It often works well, but I know of no rules of thumb. That's why evaluation was stressed in Week 2.
Ian Witten replied to Jorge Pita

Using Naive Bayes and JRip

21 MAY 2020

Unfortunately this won't work at the moment. Please see my response to a question about the upcoming quiz on Cross-validating classifiers with Spark: https://www.futurelearn.com/comments/43342086
Ian Witten replied to Jorge Pita

Map tasks and Reduce tasks

21 MAY 2020

It's not your Mac; unfortunately it doesn't work on anything at the moment. Please see my response to a question about the upcoming quiz on Cross-validating classifiers with Spark: https://www.futurelearn.com/comments/43342086
Ian Witten replied to Ron W

Random Forest performance

20 MAY 2020

OK, here's the scoop. It turns out that the Spark 1.x libraries used in distributedWekaSpark are incompatible with recent versions of Java (> 1.8), which Weka has been updated to use. Spark 3.0 should resolve these issues, but it's only at the preview stage at the moment.

You can overcome this by installing a Java 1.8 runtime environment for Quiz 4.10 and...
Ian Witten replied to Ron W

Random Forest performance

19 MAY 2020

Looks like a bug to me. I'll check with Mark Hall.
Ian Witten replied to Robert Gillespie

Cost-sensitive classification

19 MAY 2020

You need to define a cost matrix using the costMatrix field of the CostSensitiveClassifier configuration panel.

The error message appears because by default Weka tries to load the cost matrix from a file (called breast-cancer.cost in this case).
Ian Witten replied to shine destine

Yikes! The math! It’s too much!

19 MAY 2020

"Multiresponse linear regression" and "pairwise linear regression" are different ways of using linear regression for the classification problem. (For a 2-class problem, there is no difference between the two.)

They are explained near the start of the Step 4.6 video "Classification by regression" (from 0:44 min:sec).

Multiresponse linear regression works...
Ian Witten replied to Lorna Johnson

Building a classifier

19 MAY 2020

> I'm part way through the quiz.

I guess you mean Step 1.19, Using J48. (It's annoying that the FutureLearn interface doesn't allow comments on quizzes – which in this course is where you most need them! – but I'm trying to establish the convention that queries are posted on the step following the quiz, not the one preceding it.)

> I've opened the labor.arff...
Ian Witten replied to shine destine

Classification by regression

19 MAY 2020

The other attributes have already been used to obtain the best possible predicted number; now what we are doing is finding the best split-point to distinguish the two classes. If that number is all that will be used to make the final binary decision, OneR will produce the best split-point.

It's clean and simple. Perhaps a more complicated scheme might make...
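
To make the idea of "finding the best split-point" on a single predicted number concrete, here is a deliberately simplified Python sketch. It is not Weka's OneR implementation (which discretizes a numeric attribute into intervals using a minimum bucket size); it simply tries every midpoint between adjacent values and keeps the one with the fewest errors. The function name `best_split_point` is made up, and the sketch assumes that classes are coded 0/1 and that larger numbers indicate class 1.

```python
def best_split_point(values, labels):
    """Return (threshold, errors) for the rule: predict 1 if value > threshold.

    Simplified illustration only; labels must be 0 or 1.
    """
    pairs = sorted(zip(values, labels))
    best = (None, len(pairs) + 1)
    # Candidate thresholds are midpoints between adjacent distinct values.
    for (v1, _), (v2, _) in zip(pairs, pairs[1:]):
        if v1 == v2:
            continue
        threshold = (v1 + v2) / 2
        errors = sum((v > threshold) != bool(c) for v, c in pairs)
        if errors < best[1]:  # keep the first threshold with fewest errors
            best = (threshold, errors)
    return best


# Toy predicted numbers and true classes (made up for this example).
print(best_split_point([1, 3, 4, 6, 8, 9], [0, 0, 1, 0, 1, 1]))
```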
Ian Witten replied to Stephen Howells

What else is there to know?

19 MAY 2020

There is no way of doing this within Weka (as far as I know). However, I'm sure others have faced this problem. You should join the Weka email list and ask your question there.
Ian Witten replied to Oluwasefunmi Bamidele

The weather data

19 MAY 2020

@HawwauMoruf This will become clear as you work through the week.
Ian Witten replied to Hulya Akil

The MOA interface

19 MAY 2020

This problem is fixed in the latest version of the massiveOnlineAnalysis package, 2020.05.1.
Ian Witten replied to Mark Glover

The MOA interface

19 MAY 2020

There is now a new version of the massiveOnlineAnalysis package, 2020.05.1, which works OK for me. Try it.
Ian Witten replied to Mark Glover

The MOA interface

15 MAY 2020

Yes, I know. It doesn't work on my Mac but apparently it does work for the MOA guy who created the fix. He's looking into it.
Ian Witten replied to Hulya Akil

Using R to plot data

14 MAY 2020

Check out Robert's comment (below, in the same comment stream): https://www.futurelearn.com/comments/42437289
Ian Witten replied to Robert Gillespie

Using R to run a classifier

13 MAY 2020

That's the dreaded "smart quotes" problem. If you look closely at the quotes around polygon you'll see that they're not regular quotation marks. Just edit them in the R command line, typing in the quote marks, and the command will work.

The problem arose because you (wisely!) copied from the question text, and FutureLearn displays all quotation marks as...
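
If you have many commands to fix, the replacement can be automated. Here is a minimal Python sketch that maps typographic quotes back to plain ASCII ones before you paste a command into R. The function name `straighten_quotes` and the sample command are made up for this example.

```python
# Typographic ("smart") quotes mapped to the plain ASCII quotes R expects.
SMART_QUOTES = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def straighten_quotes(command):
    """Return `command` with smart quotes replaced by straight ones."""
    return command.translate(SMART_QUOTES)


# A made-up command fragment with smart quotes around polygon.
print(straighten_quotes("f(\u201cpolygon\u201d)"))
```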
Ian Witten replied to İbrahim Atakan Kubilay

The MOA interface

10 MAY 2020

No!
Ian Witten replied to Andy van Emmerik

Classification by regression

09 MAY 2020

Filters are called "supervised" if they use the actual class value of training instances in any way; otherwise they are "unsupervised". Almost all filters you will use are unsupervised.

However, the addClassification filter is supervised. Why? That's a good question! As I am using it here it doesn't look at the actual class values (so it should be...
Ian Witten replied to Jeff H

Reflect on this week's Big Question

09 MAY 2020

If a CSV file contains strings, quotation marks or newlines (maybe other characters too) within strings can cause this problem. Have a good look at line 2 of your file.
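
To see why such characters matter, here is a small Python sketch using the standard `csv` module on a made-up file body. It only shows how a CSV-aware parser and a naive line-by-line reader disagree about where a record ends; it says nothing about how Weka's own CSV loader behaves.

```python
import csv
import io

# Made-up CSV text: the second record has a quoted field containing a comma,
# a doubled quotation mark and a newline.
text = 'id,comment\n1,"He said ""hi"",\nthen left"\n'


def parse_records(csv_text):
    """Parse CSV text into a list of records using Python's csv module."""
    return list(csv.reader(io.StringIO(csv_text, newline="")))


records = parse_records(text)
print(len(records))            # records seen by a CSV-aware parser
print(len(text.splitlines()))  # physical lines seen by a naive reader
```

The two counts differ here because the quoted field spans two physical lines, which is the kind of mismatch worth looking for near the line named in an error message.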
Ian Witten replied to Francisco Ortiz Aldana

How are you getting on?

09 MAY 2020

The NoChange classifier has been accidentally omitted from the LITE version of MOA. Change to the STANDARD version using the little menu at the top right of the interface, and then you will find it.
Ian Witten replied to Ron W

The MOA interface

09 MAY 2020

Please see my response to Hulya, https://www.futurelearn.com/comments/42215131
Ian Witten replied to Hulya Akil

The MOA interface

09 MAY 2020

Please see my response to Hulya, https://www.futurelearn.com/comments/42215131
Ian Witten replied to Mark Glover

The MOA interface

09 MAY 2020

Please see my response to Hulya, https://www.futurelearn.com/comments/42215131
Ian Witten replied to Hulya Akil

The MOA interface

09 MAY 2020

I have just discovered that an incompatibility has arisen between the versions of Java used in Weka (which was recently updated to use the latest version) and the massiveOnlineAnalysis package (which was not). If you are using the latest version of Weka, you are unable to select MOA’s data generators and classifiers from within Weka.

I apologise for not...
Ian Witten replied to Ron W

How are you getting on?

09 MAY 2020

@RonW: I have just discovered that an incompatibility has arisen between the versions of Java used in Weka (which was recently updated to use the latest version) and the massiveOnlineAnalysis package (which was not). This explains why, if you are using the latest version of Weka and of this package, you are unable to select MOA's data generators and...
Ian Witten replied to Ron W

How are you getting on?

08 MAY 2020

I think your choice of folder name might betray your age.
Ian Witten replied to Hulya Akil

Weka's MOA package

08 MAY 2020

Yes, that's OK. Actually, the separate installation of MOA.jar is unnecessary, but it does no harm. I have asked for that piece of text to be removed.
Ian Witten replied to Auntiewhnor Kpolie

Welcome! Please introduce yourself

07 MAY 2020

OK, thanks for letting me know. And sorry again for jumping to conclusions :-)
Ian Witten replied to Robert Gillespie

Signal peptide prediction

07 MAY 2020

The NoChange classifier has been accidentally omitted from the LITE version of MOA. Change to the STANDARD version using the little menu at the top right of the interface, and then you will find it.
Ian Witten replied to Auntiewhnor Kpolie

Avenues for further investigation

07 MAY 2020

The file org_c_n.arff is large: 8.6 MB (about 11,000 lines).
Ian Witten replied to shine destine

Overfitting

07 MAY 2020

It depends who you're talking to.

1. Some people use the term "validation data" for what I call the test data.

2. Sometimes the test data is used to help select between competing final models, in which case a "validation dataset" is held back to be used to make an unbiased estimate of the final model's performance – so you have training data, test data,...
Ian Witten replied to Brent U

Support vector machines

07 MAY 2020

The classifier that you use in these lessons is SMO, and that is installed in your system.

LibSVM is an external library that you would have to load explicitly; but you don't need it now. As I say in the video, the SMO algorithm only works with 2-class datasets, whereas the methods in LibSVM are more comprehensive.

Yes, my video screenshots were taken...
Ian Witten replied to Auntiewhnor Kpolie

Welcome! Please introduce yourself

07 MAY 2020

I apologize! I thought you hadn't because when I click on your name I can see a list of the FutureLearn courses you have done, and those two are not on it. Is the list inaccurate, or did you do the courses under a different name? I sent the same message to several others on the same basis, so I would like to know.
Ian Witten replied to George Hannan

Problems with probabilities?

06 MAY 2020

@AndyvanEmmerik Having selected the Copy filter (or any other filter or classifier), double-click it to bring up its configuration panel. The More button is near the top on the right-hand side.
Ian Witten replied to Hulya Akil

Analyzing infrared data from soil samples

05 MAY 2020

Well, it could use Date-Remapped in the model, but it doesn't. Linear regression doesn't necessarily use all available attributes, because omitting some may produce a better model.
Ian Witten replied to Hyeon Jin Cho

Support vector machines

05 MAY 2020

Maybe :-). Look under "functions".
Ian Witten replied to Gautam Bhut

How are you getting on?

05 MAY 2020

The dropdown list is scrollable.
Ian Witten replied to Walew Yeboah

Logistic regression

05 MAY 2020

As Q.3 of the Step 4.7 Quiz says, click Output predictions in the More options menu and output the predictions as PlainText.
Ian Witten replied to Andrea Rossi

Reflect on this week's Big Question

05 MAY 2020

Kaggle (https://www.kaggle.com/) has 19,000 public datasets, and also offers many competitions, past and present (at https://www.kaggle.com/competitions), some with attractive prizes!
Ian Witten replied to Coby Beck

Welcome! Please introduce yourself

05 MAY 2020

Ah yes, I know Bragg Creek, and nearby Moose Mountain with the ice cave, and Rock of Gibraltar in the Sheep River area. Great times!
Ian Witten replied to Stephen Howells

Using Weka in practice: some questions

05 MAY 2020

You can keep data files on the Web and open them by clicking "Open URL" in the Explorer's Preprocess panel.
Ian Witten replied to Rabina Parvin

Welcome! Please introduce yourself

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to सुजन गेलाल

Welcome! Please introduce yourself

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Eya Gallardo

Welcome! Please introduce yourself

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Amjad A. Mohammed

Welcome! Please introduce yourself

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Cem Çetiz

Welcome! Please introduce yourself

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Ibrahim Abdullahi

Welcome! Please introduce yourself

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Auntiewhnor Kpolie

Welcome! Please introduce yourself

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Uma Z

Welcome! Please introduce yourself

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Lanie Richmond

Welcome! Please introduce yourself

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Utazi Suzan

Welcome! Please introduce yourself

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Dick Pitt

Welcome! Please introduce yourself

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Joseph Wijeyagoonewardena

About this course

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to رضا القريشي

About this course

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Satender Melandiya

What will you learn?

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to Rouane Abdelselam

What will you learn?

05 MAY 2020

This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to İbrahim Atakan Kubilay

What will you learn?

05 MAY 2020

@OlarewajuBabatope This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to İbrahim Atakan Kubilay

What will you learn?

05 MAY 2020

@AmreenKureshi This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).
Ian Witten replied to swapan mitra

The data mining process

05 MAY 2020

@MaryLynch The PrincipalComponents filter performs a principal components analysis and transformation of the data.
Dimensionality reduction is accomplished by choosing enough eigenvectors to account for some percentage of the variance in the original data -- default 0.95 (95%).
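
The "choosing enough eigenvectors" step can be sketched in a few lines of Python. This is an illustration of the idea only, not Weka's implementation: the function name `components_needed`, the parameter name `variance_covered` and the eigenvalues are all made up for this example.

```python
def components_needed(eigenvalues, variance_covered=0.95):
    """Smallest number of leading eigenvectors whose eigenvalues account
    for at least `variance_covered` of the total variance.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= variance_covered:
            return count
    return len(ordered)


# Made-up eigenvalues summing to 10: the first three cover 90%,
# the first four cover 97%.
print(components_needed([4.0, 3.0, 2.0, 0.7, 0.3]))
```

With the default of 0.95 the sketch keeps four of the five components for these values; lowering the threshold keeps fewer.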
Ian Witten replied to Guus Löhlefink

What will you learn?

05 MAY 2020

@JyotiJalaj This course is an advanced one. I recommend you start with the introductory course "Data Mining with Weka" (https://www.futurelearn.com/courses/data-mining-with-weka/).

