EMDLP: Ensemble multiscale deep learning model for RNA methylation site prediction

Cited by: 16
Authors
Wang, Honglei [1 ,2 ,3 ]
Liu, Hui [1 ,2 ]
Huang, Tao [2 ]
Li, Gangshen [1 ,2 ]
Zhang, Lin [1 ,2 ]
Sun, Yanjing [1 ,2 ]
Affiliations
[1] China Univ Min & Technol, Engn Res Ctr Intelligent Control Underground Spac, Minist Educ, Xuzhou 221116, Jiangsu, Peoples R China
[2] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Jiangsu, Peoples R China
[3] Xuzhou Coll Ind Technol, Sch Informat Engn, Xuzhou 221400, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
RNA modification site; Deep learning; Natural language processing; Predictor; N-1-METHYLADENOSINE; N-6-METHYLADENOSINE; LANDSCAPE; RMBASE;
DOI
10.1186/s12859-022-04756-1
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: Recent research suggests that epitranscriptome regulation through post-transcriptional RNA modifications is essential for all types of RNA. Accurate identification of RNA modification sites is vital for understanding their functions and regulatory mechanisms. However, traditional experimental methods for identifying RNA modification sites are relatively complicated, time-consuming, and laborious. Machine learning approaches have been applied to extract and classify RNA sequence features computationally, which may complement experimental approaches more efficiently. Recently, convolutional neural networks (CNNs) and long short-term memory (LSTM) networks have achieved good results in modification site prediction owing to their strength in representation learning. However, a CNN can learn local responses from spatial data but cannot learn sequential correlations, whereas an LSTM is specialized for sequential modeling and can access contextual representations but lacks the spatial feature extraction of a CNN. These complementary limitations provide strong motivation to construct a prediction framework that uses natural language processing (NLP) and deep learning (DL).
Results: This study presents an ensemble multiscale deep learning predictor (EMDLP) that identifies RNA methylation sites using NLP and DL. It combines dilated convolution with bidirectional LSTM (BiLSTM) to take better advantage of both local and global information for site prediction. The first step of EMDLP is to represent the RNA sequences in an NLP manner. Three encodings are adopted to characterize sites from the viewpoints of local and global information: RNA word embedding, one-hot encoding, and RGloVe, an improved word-vector representation learning method based on GloVe. Then, a dilated convolutional bidirectional LSTM network (DCB) model is constructed, in which a dilated convolutional neural network (DCNN) is followed by a BiLSTM, to extract potentially contributing features for methylation site prediction. Finally, the predictions obtained with the three encoding methods are integrated by soft voting to obtain better predictive performance. Experimental results on m1A and m6A show that EMDLP reaches an area under the receiver operating characteristic curve (AUROC) of 95.56% and 85.24%, respectively, and outperforms the state-of-the-art models. For user convenience, a user-friendly webserver for EMDLP has been made publicly available at http://www.labiip.net/EMDLP/index.php (http://47.104.130.81/EMDLP/index.php).
Conclusions: We developed a predictor for m1A and m6A methylation sites.
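Two of the steps named in the abstract, one-hot encoding and soft voting, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The base ordering `ACGU`, the all-zero vector for unknown bases, the unweighted averaging, and the function names `one_hot` and `soft_vote` are all assumptions made for illustration; the abstract does not specify these details.

```python
from typing import Dict, List, Sequence

# Assumed base ordering for the one-hot alphabet (not stated in the abstract).
ALPHABET = "ACGU"


def one_hot(seq: str) -> List[List[int]]:
    """Encode an RNA sequence as one 4-dimensional one-hot vector per base.

    Characters outside ALPHABET (for example 'N') become all-zero vectors;
    this handling is an assumption of the sketch.
    """
    index: Dict[str, int] = {base: i for i, base in enumerate(ALPHABET)}
    encoded: List[List[int]] = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


def soft_vote(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Average positive-class probabilities across models, sample by sample.

    `probabilities[m][i]` is the probability that model m assigns to sample i.
    Equal weights are assumed here.
    """
    if not probabilities:
        raise ValueError("at least one model's probabilities are required")
    n_samples = len(probabilities[0])
    if any(len(p) != n_samples for p in probabilities):
        raise ValueError("all models must score the same number of samples")
    n_models = len(probabilities)
    return [sum(column) / n_models for column in zip(*probabilities)]


if __name__ == "__main__":
    print(one_hot("ACGU"))
    # Hypothetical probabilities from three models (one per encoding).
    scores = soft_vote([[0.9, 0.2], [0.8, 0.4], [0.7, 0.3]])
    # A 0.5 decision threshold is an assumption of this sketch.
    print([s >= 0.5 for s in scores])
```

In this sketch each inner list passed to `soft_vote` stands for the output of one model trained on one encoding, so the averaged score plays the role of the ensemble prediction.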
Pages: 22
Related papers
50 in total
  • [41] From ensemble learning to deep ensemble learning: A case study on multi-indicator prediction of pavement performance
    Wu, Yi
    Applied Soft Computing, 2024, 166
  • [42] A brief review of machine learning methods for RNA methylation sites prediction
    Wang, Hong
    Wang, Shuyu
    Zhang, Yong
    Bi, Shoudong
    Zhu, Xiaolei
    METHODS, 2022, 203 : 399 - 421
  • [43] DeepPGD: A Deep Learning Model for DNA Methylation Prediction Using Temporal Convolution, BiLSTM, and Attention Mechanism
    Teragawa, Shoryu
    Wang, Lei
    Liu, Yi
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2024, 25 (15)
  • [44] EnsDeepDP: An Ensemble Deep Learning Approach for Disease Prediction Through Metagenomics
    Shen, Yang
    Zhu, Jinlin
    Deng, Zhaohong
    Lu, Wenwei
    Wang, Hongchao
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2023, 20 (02) : 986 - 998
  • [45] A Hierarchical Feature Ensemble Deep Learning Approach for Software Defect Prediction
    Zhang, Shenggang
    Jiang, Shujuan
    Yan, Yue
    INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2023, 33 (04) : 543 - 573
  • [46] Sea surface temperature prediction by stacked generalization ensemble of deep learning
    Dai, Hao
    Lei, Famei
    Wei, Guomei
    Zhang, Xining
    Lin, Rui
    Zhang, Weijie
    Shang, Shaoping
    DEEP-SEA RESEARCH PART I-OCEANOGRAPHIC RESEARCH PAPERS, 2024, 209
  • [47] PseUdeep: RNA Pseudouridine Site Identification with Deep Learning Algorithm
    Zhuang, Jujuan
    Liu, Danyang
    Lin, Meng
    Qiu, Wenjing
    Liu, Jinyang
    Chen, Size
    FRONTIERS IN GENETICS, 2021, 12
  • [48] Accurate Prediction of Human Essential Proteins Using Ensemble Deep Learning
    Li, Yiming
    Zeng, Min
    Wu, Yifan
    Li, Yaohang
    Li, Min
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2022, 19 (06) : 3263 - 3271
  • [49] DNA Methylation Markers for Pan-Cancer Prediction by Deep Learning
    Liu, Biao
    Liu, Yulu
    Pan, Xingxin
    Li, Mengyao
    Yang, Shuang
    Li, Shuai Cheng
    GENES, 2019, 10 (10)
  • [50] Recent Deep Learning Methodology Development for RNA-RNA Interaction Prediction
    Fang, Yi
    Pan, Xiaoyong
    Shen, Hong-Bin
    SYMMETRY-BASEL, 2022, 14 (07)