Prediction of N6-methyladenosine sites using convolution neural network model based on distributed feature representations

Cited by: 30
Authors
Tahir, Muhammad [1, 2]
Hayat, Maqsood [1]
Chong, Kil To [2, 3]
Affiliations
[1] Abdul Wali Khan Univ Mardan, Dept Comp Sci, Mardan 23200, KP, Pakistan
[2] Chonbuk Natl Univ, Dept Elect & Informat Engn, Jeonju 54896, South Korea
[3] Chonbuk Natl Univ, Adv Elect & Informat Res Ctr, Jeonju 54896, South Korea
Funding
National Research Foundation of Singapore;
Keywords
CNN; Natural language processing; word2vec; 10-fold cross-validation; SEQUENCE-BASED PREDICTOR; MESSENGER-RNA; N-6-METHYLADENOSINE SITES; GENERAL-FORM; NUCLEAR-RNA; WEB SERVER; IDENTIFICATION; METHYLATION; PROTEINS; SPACE;
DOI
10.1016/j.neunet.2020.05.027
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
N6-methyladenosine (m6A) is a well-studied modification and the most common internal modification of messenger RNA (mRNA), and it plays an important role in cell development. m6A is found in all kingdoms of life and is involved in many cellular processes, such as RNA splicing, immune tolerance, regulatory functions, RNA processing, and cancer. Given the crucial role of m6A in cells, it has been targeted computationally, but the results obtained so far have been unsatisfactory. It is therefore imperative to develop an efficient computational model that can accurately represent m6A sites. To this end, an intelligent and highly discriminative computational model, m6A-word2vec, is introduced for the discrimination of m6A sites. Here, a concept from natural language processing, word2vec, is used to represent the motifs of the target class automatically. These motifs (numerical descriptors) are extracted automatically from the human genome without any explicit definition. The extracted feature space is then passed to a convolutional neural network model as input for prediction. The developed model obtained accuracies of 83.17%, 92.69%, and 90.50% on benchmark datasets S1, S2, and S3, respectively, using a 10-fold cross-validation test. These predictive outcomes indicate that the developed model performs better than existing computational models. It is thus anticipated that the introduced model, m6A-word2vec, may be a supportive and practical tool for basic and pharmaceutical research, such as drug design, as well as for academia. (C) 2020 Elsevier Ltd. All rights reserved.
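The abstract does not say how sequences are turned into "words" for word2vec or how the 10 folds are drawn. The following is a minimal sketch of one common approach, assuming overlapping k-mers as words and a simple unshuffled fold assignment. The function names `kmer_tokens` and `k_fold_indices`, the choice `k=3`, and the stride are hypothetical and are not taken from the paper.

```python
from typing import List


def kmer_tokens(sequence: str, k: int = 3, stride: int = 1) -> List[str]:
    """Split a nucleotide sequence into overlapping k-mers ("words").

    Assumption: k-mers act as the vocabulary for a word2vec-style model.
    DNA-style input is mapped to RNA by replacing T with U.
    """
    seq = sequence.upper().replace("T", "U")
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def k_fold_indices(n_samples: int, n_folds: int = 10) -> List[List[int]]:
    """Assign sample indices to n_folds disjoint test folds.

    Assumption: indices are interleaved without shuffling; a real
    experiment would usually shuffle and stratify by class label.
    """
    return [list(range(fold, n_samples, n_folds)) for fold in range(n_folds)]


if __name__ == "__main__":
    # Tokens that could be fed to an embedding model as one "sentence".
    print(kmer_tokens("GGACU", k=3))
    # Each fold serves once as the test set; the rest form the training set.
    for fold_id, test_idx in enumerate(k_fold_indices(25, n_folds=10)):
        print(fold_id, test_idx)
```

Each list of tokens would serve as one "sentence" for an embedding model, and each fold would be held out once while the model is trained on the remaining nine.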
Pages: 385-391
Page count: 7