N6-methyladenine identification using deep learning and discriminative feature integration

Authors
Khan, Salman [1]
Uddin, Islam [1]
Noor, Sumaiya [2]
Alqahtani, Salman A. [3]
Ahmad, Nijad [4]
Affiliations
[1] Abdul Wali Khan Univ, Dept Comp Sci, Mardan, Pakistan
[2] Purdue Univ, Business & Management Sci Dept, W Lafayette, IN USA
[3] King Saud Univ, Coll Comp & Informat Sci, New Emerging Technol & 5g Network & Beyond Res Cha, Dept Comp Engn, Riyadh, Saudi Arabia
[4] Khurasan Univ, Dept Comp Sci, Jalalabad, Afghanistan
Keywords
Deep Learning; DNA Modifications; N6-methyladenine (6mA); Epigenetics; Sequence Analysis; DNA Methylation Detection; Deep Neural Network; Prediction
DOI
10.1186/s12920-025-02131-6
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
N6-methyladenine (6mA) is a pivotal DNA modification that plays a crucial role in epigenetic regulation, gene expression, and various biological processes. With advancements in sequencing technologies and computational biology, there is an increasing focus on developing accurate methods for 6mA site identification to enhance early detection and understand its biological significance. Despite the rapid progress of machine learning in bioinformatics, accurately detecting 6mA sites remains a challenge due to the limited generalizability and efficiency of existing approaches. In this study, we present Deep-N6mA, a novel Deep Neural Network (DNN) model incorporating optimal hybrid features for precise 6mA site identification. The proposed framework captures complex patterns from DNA sequences through a comprehensive feature extraction process, leveraging k-mer, Dinucleotide-based Cross Covariance (DCC), Trinucleotide-based Auto Covariance (TAC), Pseudo Single Nucleotide Composition (PseSNC), Pseudo Dinucleotide Composition (PseDNC), and Pseudo Trinucleotide Composition (PseTNC). To optimize computational efficiency and eliminate irrelevant or noisy features, an unsupervised Principal Component Analysis (PCA) algorithm is employed, ensuring the selection of the most informative features. A multilayer DNN serves as the classification algorithm to identify N6-methyladenine sites accurately. The robustness and generalizability of Deep-N6mA were rigorously validated using fivefold cross-validation on two benchmark datasets. Experimental results reveal that Deep-N6mA achieves an average accuracy of 97.70% on the F. vesca dataset and 95.75% on the R. chinensis dataset, outperforming existing methods by 4.12% and 4.55%, respectively. These findings underscore the effectiveness of Deep-N6mA as a reliable tool for early 6mA site detection, contributing to epigenetic research and advancing the field of computational biology.
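The k-mer component of the hybrid feature set named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmer_frequencies` and `hybrid_kmer_vector`, the choice of k = 1, 2, 3, and the decision to skip windows containing ambiguous bases are all assumptions made for illustration. The covariance-based descriptors (DCC, TAC), the PCA step, and the DNN classifier are not shown.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed alphabet; ambiguous bases such as N are skipped


def kmer_frequencies(sequence, k=2):
    """Return normalized k-mer frequencies in lexicographic k-mer order."""
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k over the sequence and count valid k-mers.
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def hybrid_kmer_vector(sequence, ks=(1, 2, 3)):
    """Concatenate k-mer frequency vectors for several k (hypothetical helper)."""
    features = []
    for k in ks:
        features.extend(kmer_frequencies(sequence, k))
    return features
```

With k = 1, 2, 3 the concatenated vector has 4 + 16 + 64 = 84 entries per sequence; a dimensionality-reduction step such as the PCA mentioned in the abstract would then operate on a matrix of such vectors, one row per sequence.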
Pages: 13