This video demonstrates an R package called ggplot2 that provides extensive plotting capabilities, which can be accessed from Weka. Detailed instructions are given in the accompanying download (these slides do …
R is a powerful statistical programming system that contains data mining tools for classification, regression, and plotting data, some of them very advanced. Eibe Frank shows how to access these …
Ian Witten demonstrates LibLINEAR, which contains fast algorithms for linear classification; and LibSVM, which produces non-linear SVMs. Both implement support vector machines – which are already available in Weka as …
Tony Smith introduces signal peptide prediction, an application of data mining to a problem in bioinformatics. A sequence of amino acids that makes up a protein begins with an initial …
Twitter is a vast, continuous, prolific, real-time data stream. Sentiment analysis is the task of classifying tweets as positive or negative according to the feelings they express. Emoticons constitute …
Change is everywhere, and it is a distinguishing feature of data stream mining. Bernhard Pfahringer explains that one way of dealing with change is to use an adaptive windowing method …
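To make the idea of an adaptive window concrete, here is a deliberately simplified sketch in Python. It is not the method presented in the lesson and not MOA's implementation: the function name `adaptive_window`, the fixed `threshold`, and the "compare the two halves" test are all assumptions chosen for illustration. The window grows while the data looks stable and drops its older half when the older and newer halves disagree.

```python
def adaptive_window(values, threshold, min_size=4):
    """Simplified adaptive window (illustrative only, hypothetical names).

    After each new value, split the window into an older and a newer half.
    If their means differ by more than `threshold`, assume the stream has
    changed and keep only the newer half.
    Returns the final window and the window size after each step.
    """
    window = []
    sizes = []
    for v in values:
        window.append(v)
        if len(window) >= min_size:
            half = len(window) // 2
            old, new = window[:half], window[half:]
            old_mean = sum(old) / len(old)
            new_mean = sum(new) / len(new)
            if abs(old_mean - new_mean) > threshold:
                window = new  # forget the data from before the change
        sizes.append(len(window))
    return window, sizes


# A stream whose value jumps from 0 to 1 halfway through.
stream = [0] * 6 + [1] * 6
final_window, sizes = adaptive_window(stream, threshold=0.4)
print(final_window)
print(sizes)
```

The window shrinks shortly after the jump and then grows again once only post-change values remain. A real adaptive windowing method would replace the fixed threshold with a statistical test.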
We download MOA and run it. Incremental data stream mining calls for different evaluation methods from batch operation. One possibility is to interleave training and testing by periodically holding out …
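As a rough illustration of interleaving training and testing on a stream, the Python sketch below alternates between training on a block of instances and testing on the next block. It does not use MOA's evaluators. The names `MajorityClass` and `periodic_holdout`, the block sizes, and the choice not to train on held-out instances are assumptions made for this example.

```python
from collections import Counter


class MajorityClass:
    """Toy incremental learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


def periodic_holdout(stream, learner, train_size, test_size):
    """Train on `train_size` instances, then test on the next `test_size`.

    Repeats for the whole stream and returns one accuracy per test block.
    Held-out instances are used for testing only (an assumption of this sketch).
    """
    accuracies = []
    correct = tested = 0
    for i, (x, y) in enumerate(stream):
        position = i % (train_size + test_size)
        if position < train_size:
            learner.learn(x, y)
        else:
            correct += int(learner.predict(x) == y)
            tested += 1
            if tested == test_size:
                accuracies.append(correct / test_size)
                correct = tested = 0
    return accuracies


# Tiny hand-made stream of (features, label) pairs; features are unused here.
stream = [(None, label) for label in ["a", "a", "a", "a", "b"]]
print(periodic_holdout(stream, MajorityClass(), train_size=3, test_size=2))
```

Because accuracy is reported per block, a drop in later blocks can indicate that the stream has changed, which a single batch-style figure would hide.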
MOA is open source software that is specifically designed for mining data streams. It can handle evolving data streams – ones generated by mechanisms that change, or drift, over time. …
Albert Bifet introduces data stream mining. It requires incremental operation rather than the batch mode used so far. Weka includes many different incremental methods. Updating decision trees presents an interesting …
Some feel that data miners focus too much on new methods and tiny improvements in accuracy, instead of on applications that will make a real difference in practice. Geoff Holmes …
There are many parameters and options for deriving time-dependent attributes, such as which attribute holds the timestamp and what the periodicity of the data is. Periodicity affects the lagged variables …
Weka’s time series forecasting package includes options for visualizing predictions for any number of steps ahead, along with performance on the training data. As well as visualizing future predictions, …
Dealing manually with time series is a pain, as we learned in the last lesson. Weka’s time series forecasting package automatically produces lagged variables, plus many others – perhaps too …
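For readers unfamiliar with lagged variables, the following Python sketch shows the basic transformation by hand. It is not the Weka package's code: the function name `make_lagged` and the row layout are assumptions for illustration. Each row pairs the previous `n_lags` values of a series with the value to be predicted.

```python
def make_lagged(series, n_lags):
    """Turn a series into rows of (lag1, lag2, ..., lag_n, target).

    lag1 is the value one step before the target, lag2 two steps before, etc.
    The first `n_lags` values have no complete history and yield no row.
    """
    rows = []
    for t in range(n_lags, len(series)):
        lags = [series[t - k] for k in range(1, n_lags + 1)]
        rows.append(tuple(lags) + (series[t],))
    return rows


for row in make_lagged([1, 2, 3, 4, 5], n_lags=2):
    print(row)
```

Once the series is in this form, any ordinary regression method can be trained to predict the last column from the lag columns.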
This video welcomes you to the course, which – unlike earlier courses in this series – is given by the entire data mining team at the University of Waikato in …
Download the latest version of MOA and run it. It’s a Java program, like Weka. If you can run Weka, you can run MOA! Note: The interface differs very slightly …