
AI in protein function prediction

Here is an example of AI applied to bioinformatics, as you can see here. We call this problem variant calling. On the left-hand side, you can see what we call a single nucleotide variation. A single nucleotide variation means that there is a change at one position in our DNA sequence. Like here, the nucleotide A changes to the nucleotide C, so we call this a "variant". The next one is insertion or deletion. If there are one or two more nucleotides in your DNA sequence, we call it an insertion. If some of the nucleotides are deleted from the sequence, we call it a deletion.
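The three variant types described above can be told apart by comparing a reference allele with an alternate allele. The following is a minimal sketch, assuming the two alleles are already given as strings; the function name `classify_variant` and the example alleles are made up for illustration, and this is not how an AI-based variant caller works internally.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Label a variant by comparing the reference and alternate alleles."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "no variant"
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"        # one nucleotide replaced by another, e.g. A -> C
    if len(alt) > len(ref):
        return "insertion"  # extra nucleotides were added
    if len(alt) < len(ref):
        return "deletion"   # nucleotides were removed
    return "other"          # same length, more than one base changed


# Hypothetical example alleles, for illustration only.
examples = [("A", "C"), ("A", "AGT"), ("ATG", "A")]
for ref, alt in examples:
    print(ref, "->", alt, ":", classify_variant(ref, alt))
```

The hard part in real variant calling is deciding from noisy sequencing reads whether a variant is present at all, which is where the AI models come in.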
So how can you know the positions, and which ones are single nucleotide variations and which are insertions or deletions? You can apply AI to that problem, and we call this AI in variant calling. A single nucleotide variation matters because it can change the amino acid that is added to the protein chain. With traditional methods, humans can also do this: they can try to use some motifs to identify the positions of insertions, deletions, and changes. However, currently, people use AI to learn the information from the DNA sequence and generate the result.
I just want to show an article on how they apply AI in variant calling. This one is called DeepVariant, and they use deep neural networks to build a variant caller for SNPs and small indels. Another aspect of using AI in bioinformatics is what we call AI in protein function prediction. As you know, proteins are large and complex molecules that play many critical roles in the body, and they can be described according to their large range of functions in the body. Therefore, predicting the functions of proteins is an important problem that attracts a lot of researchers from the biological field and also from the computer science field. The idea of applying AI in protein function prediction is described in this figure.
As you see here, in previous years, many people used traditional techniques to identify the functions of proteins, using some wet-lab techniques like this. So if you have a protein with an unknown function and you want to understand its function, you need to perform many wet-lab experiments to identify it. The accuracy is high in this case. However, currently, AI techniques can help you identify protein functions with promising accuracy as well. The only difference, as you can see in the figure below, is that you try to replace the wet-lab techniques with AI models.
If you have an unknown protein and it goes into your AI model, the model can learn and generate the result, which is the set of functions the protein is predicted to have. And here are some popular articles that talk about how you can apply AI in protein function prediction. On the right-hand side is a review paper that shows a lot of use cases for predicting protein function from sequence and also from structure. On the left-hand side, you can see that they apply deep neural networks to improve protein function prediction from sequence.
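Before a protein sequence can go into an AI model, it has to be turned into numbers. The following is a minimal sketch of one simple way to do that, assuming amino-acid composition as the features. This is an illustrative choice and not the encoding used in the articles mentioned; the function name `composition` and the example sequence are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in the sequence."""
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


# Hypothetical short sequence, for illustration only.
features = composition("MKTAYIAKQR")
print(len(features))  # one number per amino acid, ready to feed a model
```

A fixed-length vector like this can be passed to any classifier. Deep neural networks usually work on richer encodings that keep the order of the residues.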
And if we move to gene expression profiles, how can you apply AI? In this case, an example is that AI can help you classify tumors using gene expression profiles. The example here uses fast.ai to classify tumors using TCGA, which is a very famous public resource for gene expression profiles, and the data they use in this study is RNA sequencing data. You can even use different kinds of data here. They mention that they get about 93 percent accuracy, which is acceptable for tumor classification. And here are more examples.
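To make the idea of classifying tumors from gene expression profiles concrete, here is a minimal sketch using a nearest-centroid classifier. This is a much simpler technique than the fast.ai deep learning model in the study, and it is swapped in only for illustration. All expression values, the three-gene profile, and the labels `type_A` and `type_B` are made up; they are not TCGA data.

```python
from statistics import mean


def centroids(samples, labels):
    """Average the expression profiles of each tumor class."""
    grouped = {}
    for profile, label in zip(samples, labels):
        grouped.setdefault(label, []).append(profile)
    return {
        label: [mean(column) for column in zip(*profiles)]
        for label, profiles in grouped.items()
    }


def predict(class_centroids, profile):
    """Return the class whose centroid is closest to the profile."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(
        class_centroids,
        key=lambda label: squared_distance(class_centroids[label], profile),
    )


# Synthetic expression values for three hypothetical genes.
train_x = [[9.0, 1.0, 2.0], [8.5, 1.5, 2.5], [1.0, 7.5, 2.0], [1.5, 8.0, 3.0]]
train_y = ["type_A", "type_A", "type_B", "type_B"]

model = centroids(train_x, train_y)
print(predict(model, [8.0, 2.0, 2.0]))  # prints "type_A"
```

Real expression profiles contain thousands of genes per sample, which is one reason deep learning models are used for this task.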
I will show more examples of applying AI in bioinformatics. For the first one, you can see here: computational modeling of DNA and RNA targets of regulatory proteins is improved by a deep learning approach. The input is a DNA sequence. They then use AI models like deep neural networks, generate combinations, and get the affinity here. And here is another example: predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. This one is very important because they can correctly predict DNA- and RNA-binding proteins from the sequence information.
Here they integrate a lot of deep neural networks, and the output of the model detects the binding sites from the sequence. And here is a primer on deep learning in genomics. The chart for this one is very simple: you have the dataset, that is, the sequences and the labels. After that, you apply some architectures like CNN or RNN, and you evaluate the model. After that, you can get the feature importance and then the outcomes.
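The workflow just described (a dataset of sequences and labels, a model, then evaluation) can be sketched in a few lines. This is a minimal illustration and not the method of the papers mentioned: a single hand-set filter that is slid along the one-hot encoded sequence stands in for a trained CNN. The motif `TGCA`, the sequences, the labels, and the threshold are all hypothetical.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA sequence as a list of 4-element rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def max_filter_score(seq, filt):
    """Slide the filter along the sequence; return (best score, position)."""
    x = one_hot(seq)
    width = len(filt)
    best_score, best_pos = float("-inf"), -1
    for start in range(len(x) - width + 1):
        score = sum(
            x[start + i][j] * filt[i][j]
            for i in range(width)
            for j in range(len(BASES))
        )
        if score > best_score:
            best_score, best_pos = score, start
    return best_score, best_pos


# Hypothetical filter that rewards the motif TGCA (not learned from data).
FILTER = one_hot("TGCA")

# Toy dataset: (sequence, label), where 1 means "contains a binding site".
dataset = [("AATGCATT", 1), ("CCCCCCCC", 0), ("GGTGCAGG", 1), ("ATATATAT", 0)]


def predict(seq, threshold=3.5):
    score, _ = max_filter_score(seq, FILTER)
    return 1 if score >= threshold else 0


accuracy = sum(predict(seq) == label for seq, label in dataset) / len(dataset)
print(max_filter_score("AATGCATT", FILTER))  # (4.0, 2): motif found at index 2
print(accuracy)                              # 1.0 on this toy dataset
```

In a real CNN, many such filters are learned from the labeled data rather than set by hand, and the position of the best score is what points to a candidate binding site.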

Next, Dr. Khanh Le will introduce AI in Variant Calling and AI in Protein Function Prediction. He will give a few examples of deep learning applications, like tumor classification using gene expression profiles and predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning.

You can check the research mentioned in the video at the link below.

This article is from the free online

Artificial Intelligence in Bioinformatics

Created by
FutureLearn - Learning For Life
