Skip to 0 minutes and 11 seconds Hi, I’m Pamela Douglas from the UCLA Semel Institute. We’re going to talk about a very interesting application of using Weka for classifying functional MRI data. Classifying these data can be very challenging for a number of reasons. First of all, these data are very high dimensional. A structural MRI scan can consist of approximately 100,000 voxels, and an FMRI scan records signals from these voxels over time, resulting in 4-dimensional data. The number of possible features and attributes that can be derived from these data is very large, and one recent event that highlights why this can be problematic was the ADHD200 Global Machine Learning Competition.
Skip to 0 minutes and 51 seconds The goal of this competition was to predict a subject’s diagnosis as either “Typically developing (TD)” or ADHD using a combination of demographic and structural and functional neuroimaging features. A number of sites from around the world collaborated to provide data for this competition. This resulted in approximately 800 subjects’ worth of data in the training set, as well as 200 subjects’ worth of data in the test set, where the diagnosis was unknown to the participants in the competition. My team participated in this competition, and our first goal was to figure out how to derive meaningful information from structural MRI data. The first thing we did was calculate Freesurfer metrics for automated brain parcellation.
Skip to 1 minute and 33 seconds This resulted in 9 different attributes, like brain volume, from each of 68 different cortical regions, as well as 3 different measures from each of 45 subcortical and non-cortical brain regions. Collectively, this resulted in more than 700 structural brain attributes using just the SMRI data. Our next step was to determine how to extract features from the resting state functional MRI data. The first thing we did was calculate resting state functional connectivity matrices, or pairwise time series correlations between different brain regions. We then calculated the number of independent components required to describe 99% of the data variance. We also calculated power spectra, regional homogeneity, and a number of different graph-theoretic metrics, like functional modular organization.
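To make the idea of a functional connectivity matrix concrete, here is a minimal sketch in plain Python. It is not the team's pipeline: the three "regions" and their time series are made-up toy values, and the helper names (`pearson`, `connectivity_matrix`, `upper_triangle`) are hypothetical. It only illustrates how pairwise time series correlations become a feature vector.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def connectivity_matrix(series):
    """Symmetric matrix of pairwise correlations between regional time series."""
    n = len(series)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(series[i], series[j])
            matrix[i][j] = matrix[j][i] = r
    return matrix

def upper_triangle(matrix):
    """Flatten the unique off-diagonal entries into a feature vector."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

# Toy time series for three hypothetical regions (made-up numbers, not real data).
regions = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],  # rises and falls together with the first region
    [5.0, 4.0, 3.0, 2.0, 1.0],   # moves opposite to the first region
]
fc = connectivity_matrix(regions)
features = upper_triangle(fc)
```

With R regions there are R(R-1)/2 unique pairs, so the number of connectivity features grows quadratically with the number of regions, which is one reason attribute counts get large so quickly.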
Skip to 2 minutes and 18 seconds Overall, this resulted in more than 100,000 functional neuroimaging attributes. My team, as it turns out, placed third in this competition using a voted perceptron, as implemented in Weka, but the overall results of this competition were very unsatisfying. As it turned out, the winning team used only demographic features and a very small number of attributes overall. Classification using these features alone outperformed all the other teams that used demographic features in combination with MRI data. This result really highlights the importance of using feature selection, either as a separate step or as part of a regularization scheme, since the inclusion of irrelevant and redundant features can vastly degrade the performance of a classifier.
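For readers who have not met the voted perceptron before, the following is a minimal sketch of the basic linear algorithm in plain Python. It is an illustration under stated assumptions, not Weka's implementation and not the team's competition code: the function names, the toy 2-D data, and the choice of 10 epochs are all hypothetical. Each weight vector that survives for a while earns one vote per example it classifies correctly, and prediction is a vote-weighted majority over all stored weight vectors.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def sign(value):
    # Ties (value == 0) are broken toward -1 in this sketch.
    return 1 if value > 0 else -1

def train_voted_perceptron(samples, labels, epochs=10):
    """Return a list of (weights, bias, votes) triples. Labels must be +1 or -1."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    votes = 0
    history = []
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            if y * (dot(weights, x) + bias) > 0:
                # Correct prediction: the current weight vector earns a vote.
                votes += 1
            else:
                # Mistake: store the old vector with its votes, then update.
                if votes > 0:
                    history.append((weights, bias, votes))
                weights = [wi + y * xi for wi, xi in zip(weights, x)]
                bias += y
                votes = 1
    history.append((weights, bias, votes))
    return history

def predict_voted(history, x):
    """Vote-weighted majority over every stored weight vector."""
    total = sum(votes * sign(dot(w, x) + b) for w, b, votes in history)
    return sign(total)

# Toy, linearly separable data (made-up numbers for illustration only).
samples = [[2.0, 1.0], [3.0, 2.0], [2.5, 3.0],
           [-1.0, -2.0], [-2.0, -1.0], [-3.0, -2.5]]
labels = [1, 1, 1, -1, -1, -1]

model = train_voted_perceptron(samples, labels)
predictions = [predict_voted(model, x) for x in samples]
```

The voting step is what distinguishes this from the ordinary perceptron, which would keep only the final weight vector.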
Skip to 3 minutes and 5 seconds Another reason Weka can be useful for classifying MRI data is that there are a number of algorithms that are readily available for testing that have already been vetted by the machine learning community. In a previous lesson, we learned about the “No Free Lunch Theorem”. As a brief reminder, each classifier has its own inductive bias, and there is no way to know a priori which classifier will perform best on a given dataset. Therefore, it’s often a good idea to test out a few different classifiers and use model selection to determine your best option. In the exercise that follows, you’ll be able to test out a few different classifiers using the classic Haxby et al. dataset.
Skip to 3 minutes and 40 seconds In this 2001 study, functional MRI data was collected while subjects viewed images from eight different object categories. You’ll also get to test out a few different methods for feature selection, as well as parameter tuning using nested cross-validation. In summary, functional MRI data is high dimensional, so feature selection and regularization are highly recommended. Weka can be very useful for classifying these data, since Weka has the capability of handling large datasets and combining across multiple feature categories, like nominal and numeric data, as well as handling missing data. Testing out a variety of models and classifiers can be very helpful.
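The idea behind parameter tuning with nested cross-validation can be sketched in a few lines of plain Python. This is a toy illustration only, not how Weka does it and not the exercise itself: the k-nearest-neighbour classifier, the helper names, the candidate values of k, the fold counts, and the two made-up clusters are all assumptions chosen to keep the example small. The inner loop picks the parameter using only the outer training portion, and the outer loop scores that choice on data the tuning never saw.

```python
def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    order = sorted(range(len(train_x)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)))
    nearest = [train_y[i] for i in order[:k]]
    return max(sorted(set(nearest)), key=nearest.count)

def accuracy(train_x, train_y, test_x, test_y, k):
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y))
    return hits / len(test_x)

def folds(indices, n_folds):
    """Yield (train, test) index lists, assigning every n_folds-th index to the test fold."""
    for f in range(n_folds):
        test = indices[f::n_folds]
        train = [i for i in indices if i not in test]
        yield train, test

def take(data, indices):
    return [data[i] for i in indices]

def nested_cv(xs, ys, candidate_ks, outer_folds=3, inner_folds=2):
    outer_scores = []
    for outer_train, outer_test in folds(list(range(len(xs))), outer_folds):
        # Inner loop: choose k using the outer-training portion only.
        def inner_score(k):
            scores = [accuracy(take(xs, tr), take(ys, tr), take(xs, te), take(ys, te), k)
                      for tr, te in folds(outer_train, inner_folds)]
            return sum(scores) / len(scores)
        best_k = max(candidate_ks, key=inner_score)
        # Outer loop: evaluate the chosen k on data the tuning never saw.
        outer_scores.append(accuracy(take(xs, outer_train), take(ys, outer_train),
                                     take(xs, outer_test), take(ys, outer_test), best_k))
    return outer_scores

# Two well-separated toy clusters (made-up numbers, not imaging data).
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.4], [0.4, 0.1],
      [5.0, 5.1], [5.2, 5.0], [5.1, 5.3], [5.3, 5.2], [5.2, 5.4], [5.4, 5.1]]
ys = ["a"] * 6 + ["b"] * 6

scores = nested_cv(xs, ys, candidate_ks=(1, 3))
mean_score = sum(scores) / len(scores)
```

Because the parameter is never tuned on the outer test fold, the averaged outer score is a less optimistic estimate than one obtained by tuning and testing on the same folds.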
Skip to 4 minutes and 24 seconds Lastly, the Weka group has kindly now added a “brain button” to their software, so you can now load MRI data files in NIfTI format directly into Weka for classification, without needing to convert them to the Attribute-Relation File Format (ARFF). I hope that you’ll enjoy testing out Weka for classification on brain imaging data as much as I have. Thanks so much!
Analyzing functional MRI Neuroimaging data
Pamela Douglas from UCLA introduces the problem of classifying functional MRI data. An FMRI scan records signals over time from approximately 100,000 voxels covering the brain, which creates a huge 4-dimensional dataset. The goal of the ADHD200 machine learning competition was to predict a subject’s diagnosis as either “Typically developing (TD)” or “Attention deficit hyperactivity disorder (ADHD)” using data from about 1000 subjects (roughly 800 for training and 200 for testing) that includes demographic, structural, and functional neuroimaging features. Pamela’s team calculated more than 100,000 functional neuroimaging attributes from the raw data, and placed 3rd using a voted perceptron learning algorithm. Ironically, the winning team ignored the neuroimaging features and used demographic data only! The video also introduces Haxby’s classic FMRI dataset, collected while subjects viewed images from 8 object categories. You will use this dataset in the Quiz that follows.
© University of Waikato, New Zealand. CC Creative Commons Attribution 4.0 International License.