Nmix: a hybrid deep learning model for precise prediction of 2'-O-methylation sites based on multi-feature fusion and ensemble learning

Authors
Geng, Yu-Qing [1]
Lai, Fei-Liao [1]
Luo, Hao [1]
Gao, Feng [1, 2, 3, 4]
Affiliations
[1] Tianjin Univ, Sch Sci, Dept Phys, 92 Weijin Rd, Tianjin 300072, Peoples R China
[2] Tianjin Univ, Frontiers Sci Ctr Synthet Biol, 92 Weijin Rd, Tianjin 300072, Peoples R China
[3] Tianjin Univ, Key Lab Syst Bioengn, Minist Educ, 92 Weijin Rd, Tianjin 300072, Peoples R China
[4] Collaborat Innovat Ctr Chem Sci & Engn Tianjin, SynBio Res Platform, 92 Weijin Rd, Tianjin 300072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
2'-O-methylation; multi-feature fusion; deep learning; asymmetric loss; ensemble learning; HIGH-THROUGHPUT; MESSENGER-RNA; IDENTIFICATION; LANDSCAPE
DOI
10.1093/bib/bbae601
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
RNA 2'-O-methylation (Nm) is a crucial post-transcriptional modification with significant biological implications. However, experimental identification of Nm sites is challenging and resource-intensive. Although multiple computational tools have been developed to identify Nm sites, their predictive performance, particularly in precision and generalization capability, remains deficient. We introduce Nmix, a computational tool for precise prediction of Nm sites in human RNA. We constructed the largest low-redundancy dataset of experimentally verified Nm sites and employed a multi-feature fusion approach that combines one-hot, Z-curve and RNA secondary structure encodings. Nmix uses a hybrid deep learning architecture that integrates 1D/2D convolutional neural networks, a self-attention mechanism and residual connections. We implemented an asymmetric loss function and Bayesian optimization-based ensemble learning, which substantially improved predictive performance on imbalanced datasets. Rigorous testing on two benchmark datasets revealed that Nmix significantly outperforms existing state-of-the-art methods across various metrics, particularly in precision, with average improvements of 33.1% and 60.0%, and in Matthews correlation coefficient, with average improvements of 24.7% and 51.1%. Notably, Nmix demonstrated strong cross-species generalization capability, accurately predicting 93.8% of experimentally verified Nm sites in rat RNA. We also developed a user-friendly web server (https://tubic.org/Nm) and provide standalone prediction scripts to facilitate widespread adoption. We hope that a more accurate and robust tool for Nm site prediction will advance the understanding of Nm mechanisms and potentially benefit the prediction of other RNA modification sites.
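The feature fusion step named in the abstract can be illustrated with a minimal sketch that concatenates a one-hot encoding with a cumulative Z-curve encoding for each position of an RNA sequence. This is not the Nmix implementation: the function names (`one_hot`, `z_curve`, `fuse`), the cumulative form of the Z-curve, and the per-position concatenation are assumptions made for illustration, and the secondary structure encoding is omitted.

```python
from itertools import accumulate

BASES = "ACGU"  # assumed column order for the one-hot vectors

# Per-base Z-curve steps (x, y, z), following the standard Z-curve definition:
#   x: purine (A, G) = +1, pyrimidine (C, U) = -1
#   y: amino  (A, C) = +1, keto       (G, U) = -1
#   z: weak H-bond (A, U) = +1, strong H-bond (G, C) = -1
Z_STEP = {"A": (1, 1, 1), "C": (-1, 1, -1), "G": (1, -1, -1), "U": (-1, -1, 1)}


def normalize(seq):
    """Upper-case the sequence and map DNA-style T to U."""
    return seq.upper().replace("T", "U")


def one_hot(seq):
    """Return one 4-element indicator vector per base, in A, C, G, U order."""
    return [[1 if base == b else 0 for b in BASES] for base in normalize(seq)]


def z_curve(seq):
    """Return the cumulative (x, y, z) Z-curve coordinates at each position."""
    steps = [Z_STEP[base] for base in normalize(seq)]
    xs = accumulate(s[0] for s in steps)
    ys = accumulate(s[1] for s in steps)
    zs = accumulate(s[2] for s in steps)
    return [list(point) for point in zip(xs, ys, zs)]


def fuse(seq):
    """Concatenate one-hot and Z-curve features into a 7-element vector per base."""
    return [oh + zc for oh, zc in zip(one_hot(seq), z_curve(seq))]


if __name__ == "__main__":
    for base, features in zip("ACGU", fuse("ACGU")):
        print(base, features)
```

Each position yields seven numbers (four one-hot, three Z-curve), so a window of length n becomes an n × 7 matrix that a convolutional network could consume. Bases other than A, C, G, U (or T) raise a `KeyError` in this sketch.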
Pages: 14