STM-ac4C: a hybrid model for identification of N4-acetylcytidine (ac4C) in human mRNA based on selective kernel convolution, temporal convolutional network, and multi-head self-attention

Times cited: 2
Authors
Yi, Mengyue [1]
Zhou, Fenglin [1]
Deng, Yu [1]
Affiliations
[1] Jingdezhen Ceram Univ, Sch Informat Engn, Jingdezhen, Peoples R China
Funding
National Science Foundation (United States)
Keywords
N4-acetylcytidine; selective kernel convolution; temporal convolutional network; multi-head self-attention; deep learning; DNA N4-METHYLCYTOSINE SITES; GENE-REGULATION; PREDICTION;
DOI
10.3389/fgene.2024.1408688
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
N4-acetylcytidine (ac4C) is a chemical modification of mRNA in which an acetyl group is added to the N4 position of cytidine, altering the structure and function of the mRNA. Studies have shown that ac4C is closely associated with the occurrence and development of various cancers. Accurate prediction of ac4C modification sites on human mRNA is therefore crucial for revealing its role in disease and for developing new diagnostic and therapeutic strategies. However, existing deep learning models remain limited in prediction accuracy and generalization ability, which restricts their effectiveness on complex biological sequence data. This paper introduces STM-ac4C, a deep learning model for predicting ac4C modification sites on human mRNA. The model combines selective kernel convolution, temporal convolutional networks, and multi-head self-attention to extract and integrate multi-level features of RNA sequences. On the independent test dataset, STM-ac4C improved accuracy, Matthews correlation coefficient, and area under the curve by 1.81%, 3.5%, and 0.37%, respectively, over existing state-of-the-art methods. Its performance on additional balanced and imbalanced datasets further confirmed the model's robustness and generalization ability. In summary, STM-ac4C provides a powerful new tool for predicting ac4C modification sites on human mRNA and for a deeper understanding of the biological significance of mRNA modifications and their relevance to cancer treatment. In addition, a sequence region impact analysis reveals the key sequence features that influence the prediction of ac4C sites, offering new perspectives for future research.
The source code and experimental data are available at https://github.com/ymy12341/STM-ac4C.
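The three metrics reported in the abstract (accuracy, Matthews correlation coefficient, and area under the ROC curve) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names and the toy labels and scores below are assumptions made up for illustration, and the paper's own pipeline may compute these quantities differently.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = ac4C site, 0 = non-site)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0 when the denominator is 0."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """ROC AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold of 0.5 is an assumption

counts = confusion_counts(y_true, y_pred)
print("ACC:", accuracy(*counts))
print("MCC:", mcc(*counts))
print("AUC:", auc(y_true, scores))
```

MCC is often reported alongside accuracy because it uses all four confusion-matrix cells and so stays informative on imbalanced datasets, which is relevant to the imbalanced evaluation mentioned in the abstract.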
Pages: 17