Leveraging the attention mechanism to improve the identification of DNA N6-methyladenine sites

Cited by: 29
Authors
Zhang, Ying [1]
Liu, Yan [1]
Xu, Jian [1]
Wang, Xiaoyu [2]
Peng, Xinxin [2]
Song, Jiangning [3, 4]
Yu, Dong-Jun [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, 200 Xiaolingwei, Nanjing 210094, Peoples R China
[2] Monash Univ, Biomed Discovery Inst, Melbourne, Vic 3800, Australia
[3] Monash Univ, Dept Biochem & Mol Biol, Melbourne, Vic 3800, Australia
[4] Monash Univ, Monash Biomed Discovery Inst, Melbourne, Vic, Australia
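Funding
UK Medical Research Council; Australian Research Council; National Natural Science Foundation of China; US National Institutes of Health;
Keywords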
DNA modification; 6mA; self-attention mechanism; deep learning; LSTM; attention interpretation; RICE GENOME; METHYLATION; N-6-ADENINE; PREDICTION;
DOI
10.1093/bib/bbab351
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
DNA N6-methyladenine (6mA) is an important type of DNA modification that plays key roles in multiple biological processes. Despite recent progress in developing DNA 6mA site prediction methods, several challenges remain. For example, although hand-crafted features are interpretable, they contain redundant information that may bias model training and degrade the trained model. Furthermore, although deep learning (DL)-based models can perform feature extraction and classification automatically, the crucial features they learn are difficult to interpret. As such, considerable research effort has focused on balancing the interpretability and straightforwardness of DL neural networks. In this study, we develop two new DL-based models for improving the prediction of 6mA sites, termed LA6mA and AL6mA, which use bidirectional long short-term memory (LSTM) to capture long-range information and a self-attention mechanism to extract key position information from DNA sequences. The performance of the two proposed methods is benchmarked on two model organisms, Arabidopsis thaliana and Drosophila melanogaster. On the two benchmark datasets, LA6mA achieves area under the receiver operating characteristic curve (AUROC) values of 0.962 and 0.966, whereas AL6mA achieves AUROC values of 0.945 and 0.941, respectively. Moreover, an in-depth analysis of the attention matrix is conducted to interpret the important information that is hidden in the sequence and relevant for 6mA site prediction. The two novel pipelines developed for DNA 6mA site prediction in this work will facilitate a better understanding of the underlying principle of DL-based DNA methylation site prediction and its future applications.
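To make the idea of an interpretable attention matrix more concrete, the following is a minimal, dependency-free sketch of scaled dot-product self-attention over a one-hot encoded DNA sequence. It is an illustration under simplifying assumptions, not the LA6mA or AL6mA implementation: the models in the abstract combine attention with a bidirectional LSTM and learned parameters, whereas this sketch uses identity projections (Q = K = V = input). The function names and the toy sequence are hypothetical.

```python
import math

NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-dimensional one-hot vectors."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq.upper()]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity projections.

    Assumption: Q = K = V = x (no learned weights), purely for illustration.
    Returns (context, weights); weights[i][j] is how strongly position i
    attends to position j, and each row of weights sums to 1.
    """
    d = len(x[0])
    weights = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights.append(softmax(scores))
    context = [
        [sum(w * v[c] for w, v in zip(row, x)) for c in range(d)]
        for row in weights
    ]
    return context, weights


seq = "GATCA"  # hypothetical toy sequence, not from the benchmark datasets
context, weights = self_attention(one_hot(seq))
```

In this toy setting each position attends most strongly to positions carrying the same nucleotide. In a trained model the attention weights depend on learned parameters, and inspecting the resulting matrix is what allows position-level interpretation of the kind the abstract describes.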
Pages: 13