6mAPred-MSFF: A Deep Learning Model for Predicting DNA N6-Methyladenine Sites across Species Based on a Multi-Scale Feature Fusion Mechanism

Cited by: 10
Authors
Zeng, Rao [1]
Liao, Minghong [1]
Affiliations
[1] Xiamen Univ, Sch Informat, Dept Software Engn, Xiamen 361005, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 16
Funding
National Natural Science Foundation of China;
Keywords
DNA N6-methyladenine; deep learning; site prediction; depthwise separable convolution; inverted residual structure; attention mechanism; feature fusion; METHYLATION; PROTEIN; IDENTIFICATION; N-6-ADENINE; NETWORK; GENOME;
DOI
10.3390/app11167731
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
DNA methylation is one of the most extensive epigenetic modifications. DNA N6-methyladenine (6mA) plays a key role in many biological regulatory processes. Accurate and reliable genome-wide identification of 6mA sites is crucial for systematically understanding its biological functions. Some machine learning tools can identify 6mA sites, but their limited prediction accuracy and lack of robustness restrict their usability in epigenetic studies, which points to a strong need for new computational methods for this problem. In this paper, we developed a novel computational predictor, 6mAPred-MSFF, a deep learning framework based on a multi-scale feature fusion mechanism that identifies 6mA sites across different species. In the predictor, we integrate the inverted residual block and a multi-scale attention mechanism to build lightweight, deep neural networks. Compared with existing predictors that use traditional machine learning, our deep learning framework needs no prior knowledge of 6mA or manually crafted sequence features and captures the characteristics of 6mA sites more fully. In benchmark comparisons, our deep learning method outperforms the state-of-the-art methods under 5-fold cross-validation on seven datasets from six species, demonstrating that the proposed 6mAPred-MSFF is more effective and generic. Specifically, 6mAPred-MSFF achieves a sensitivity of 97.88% and a specificity of 94.64% under 5-fold cross-validation on the 6mA-rice-Lv dataset. Our model trained with the rice data predicts well the 6mA sites of five other species: Arabidopsis thaliana, Fragaria vesca, Rosa chinensis, Homo sapiens, and Drosophila melanogaster, with prediction accuracies of 98.51%, 93.02%, and 91.53%, respectively. Moreover, through experimental comparison, we explored how different encoding schemes and feature descriptors affect performance when training and testing the proposed model.
Pages: 19