
Machine learning implementation

15.2
Let’s move to the second session today. I will show you how you can implement some machine learning algorithms. There are two kinds of machine learning implementation that I would like to introduce in this course. The first one I suggest is Weka, a workbench for machine learning. It does not need any prior knowledge of programming, and you can even create some machine learning algorithms by yourself. Very simple. The second one is Python. If you have some background knowledge of programming, you can use Python, which is one of the most common programming languages for machine learning. And in…
68.6
for machine learning, I will mostly show you, step by step, how to build machine learning algorithms using Weka. For Python, when we move to deep learning, I will use Python tools to implement deep learning. Now, the data format. If you want to work with machine learning, you need to understand which data formats can be used by machine learning algorithms. In machine learning, especially in bioinformatics, there are three data formats that are the most popular. The first one is CSV.
110.9
The next one is ARFF, and the last one is LIBSVM. This one is an example of CSV data for a classification problem. You can see that there is a column for the label. The label can be 1 or 0, or maybe 2, 3, and so on. If you use only 1 and 0, it is binary classification; with more labels, it is a multi-class classification problem. So the first column is the label, and the other columns hold the values, meaning all of the feature values in the data. You can use this data to classify each row, where each row is a patient or an example in your data set.
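The CSV layout described above (label in the first column, feature values in the remaining columns) can be sketched in Python as follows. This is only a minimal illustration using the standard library; the column names `f1` to `f3`, the helper `load_labeled_csv`, and the values themselves are made up for the example and do not come from the course data.

```python
import csv
import io

# Hypothetical toy data: the first column is the class label,
# the remaining columns are feature values for one example (row).
CSV_TEXT = """label,f1,f2,f3
1,0.5,1.2,3.3
0,0.1,0.7,2.9
1,0.9,1.5,3.8
"""


def load_labeled_csv(text):
    """Split CSV text into a header, integer labels, and float feature rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # first line holds the column names
    labels, features = [], []
    for row in reader:
        if not row:  # ignore blank lines
            continue
        labels.append(int(row[0]))                      # first column: label
        features.append([float(v) for v in row[1:]])    # other columns: values
    return header, labels, features


header, labels, features = load_labeled_csv(CSV_TEXT)
print("labels:", labels)
print("features:", features)
```

Keeping labels and feature values in separate lists mirrors how most machine learning tools expect the data: one structure for the inputs and one for the classes.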
162.6
So this is CSV data, and you will use it a lot in machine learning. The second one is ARFF. Why do I explain the ARFF data format here? Because I will also show you how to use Weka to implement machine learning algorithms, and this is the format that you can use in Weka. The difference from the CSV format is in the header. You can see that the header contains information about the data: in the header, you need to specify the name of the data set as well as the attributes of the data.
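To make the ARFF header concrete, here is a small sketch that writes ARFF text from a few rows. It assumes numeric features and a nominal class attribute placed last; the function name `to_arff`, the relation name `toy`, and the values are hypothetical and chosen only for illustration.

```python
def to_arff(relation, feature_names, class_values, rows):
    """Build ARFF text: a header (relation and attributes) followed by the data.

    Each row is a pair (label, [feature values]); all features are assumed numeric.
    """
    lines = [f"@relation {relation}", ""]
    for name in feature_names:
        lines.append(f"@attribute {name} numeric")
    # Nominal class attribute listing the allowed labels, e.g. {0,1}
    lines.append("@attribute class {" + ",".join(str(c) for c in class_values) + "}")
    lines += ["", "@data"]
    for label, values in rows:
        # Values follow the attribute order, so the class comes last here
        lines.append(",".join(str(v) for v in values) + f",{label}")
    return "\n".join(lines) + "\n"


# Hypothetical toy rows: (label, feature values)
rows = [(1, [0.5, 1.2]), (0, [0.1, 0.7])]
arff_text = to_arff("toy", ["f1", "f2"], [0, 1], rows)
print(arff_text)
```

The part before `@data` is the header the transcript refers to: it names the data set and declares each attribute, while the lines after `@data` look much like CSV rows.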
207.2
Finally, at the end of the file are all of the values, along with their classes. So this is the ARFF data format, and you can use it in Weka. There is another data format that many people use in bioinformatics, which we call LIBSVM. As with ARFF, you need to separate which part is the label and which part is the values. Okay. So far, we have implemented the machine learning algorithms using Weka. However, if you want to use machine learning algorithms with a programming language, you can use Python to implement some of the algorithms, and the…
261
library that we will use to implement machine learning algorithms is Scikit-learn. This library provides a range of supervised and unsupervised learning algorithms through an interface in Python. The library is focused on modeling data, and a lot of models are already integrated into Scikit-learn. Here is the history of Scikit-learn. It was initially developed by David Cournapeau as a Google Summer of Code project in 2007. Later, Matthieu Brucher joined the project and started to use it as part of his thesis work. In 2010, INRIA got involved, and the first public release was published in late January 2010.
315.3
Now the Scikit-learn project has more than 30 active contributors and has had paid sponsorship from INRIA, Google, Tinyclues, and the Python Software Foundation. This library is now very famous in Python, and a lot of people in data science and machine learning use Scikit-learn to implement different AI models, including deep learning. And here is the home page of Scikit-learn. If you go to the Scikit-learn home page, you can see the interface, where they explain some of the tasks the library can perform, such as classification, regression, clustering, dimensionality reduction, model selection, preprocessing, and so on.
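As a rough sketch of what classification with Scikit-learn can look like, the example below reads a few rows in the LIBSVM format mentioned earlier (each line is a label followed by `index:value` pairs) and fits a classifier. The toy values are invented for illustration, and `LogisticRegression` is just one possible choice of model, not the one used in the course.

```python
import io

from sklearn.datasets import load_svmlight_file
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data in LIBSVM format: "<label> <index>:<value> ..." per line.
LIBSVM_TEXT = b"""1 1:0.2 2:1.0 3:0.1
1 1:0.3 2:0.9 3:0.2
1 1:0.1 2:1.1 3:0.3
0 1:2.0 2:0.1 3:1.9
0 1:2.2 2:0.2 3:2.1
0 1:1.9 2:0.3 3:2.0
"""

# load_svmlight_file expects a path or a binary file-like object and returns
# a sparse feature matrix X together with an array of labels y.
X, y = load_svmlight_file(io.BytesIO(LIBSVM_TEXT))

# Fit a simple classifier; most Scikit-learn models share this fit/predict interface.
model = LogisticRegression()
model.fit(X, y)

predictions = model.predict(X)
train_accuracy = model.score(X, y)  # accuracy on the same rows used for training
print("Predicted labels:", predictions)
print("Training accuracy:", train_accuracy)
```

Accuracy on the training rows only shows that the pipeline runs; a real evaluation would hold out separate test data.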
369.5
If you want to install Scikit-learn, you use the command pip install scikit-learn. If you want to check the installation, you can use commands such as pip show and pip freeze. In Python, if you want to use Scikit-learn, you just use “import sklearn”, and after that you can show the version. So if you are familiar with the Python programming language, it is very easy to import a library and use it.
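The installation check described above can also be done from inside Python. The sketch below uses the standard-library `importlib.metadata` module so that it runs whether or not Scikit-learn is installed; the helper name `installed_version` is hypothetical. When the library is installed, `import sklearn` followed by `print(sklearn.__version__)` shows the same information.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# "scikit-learn" is the name used by pip; the import name in code is "sklearn".
sklearn_version = installed_version("scikit-learn")
if sklearn_version is None:
    print("scikit-learn is not installed; try: pip install scikit-learn")
else:
    print("scikit-learn version:", sklearn_version)
```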

Dr. Khanh next introduces how to implement machine learning algorithms. He starts with two tools for machine learning: Weka and Python.

Please check the link from the official website:

The Weka website provides machine learning courses, and we suggest you take a look at them first. If you are especially interested, you could check this course on FutureLearn: Data Mining with Weka from The University of Waikato.

Next, Dr. Khanh will introduce machine learning in Python with the Scikit-learn project. Scikit-learn provides a range of supervised and unsupervised learning algorithms via a consistent interface in Python. The library is focused on modeling data. You can find the installation tutorial here.

This article is from the free online

Artificial Intelligence in Bioinformatics

