Leveraging the attention mechanism to improve the identification of DNA N6-methyladenine sites

Cited by: 29
Authors
Zhang, Ying [1]
Liu, Yan [1]
Xu, Jian [1]
Wang, Xiaoyu [2]
Peng, Xinxin [2]
Song, Jiangning [3, 4]
Yu, Dong-Jun [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, 200 Xiaolingwei, Nanjing 210094, Peoples R China
[2] Monash Univ, Biomed Discovery Inst, Melbourne, Vic 3800, Australia
[3] Monash Univ, Dept Biochem & Mol Biol, Melbourne, Vic 3800, Australia
[4] Monash Univ, Monash Biomed Discovery Inst, Melbourne, Vic, Australia
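Funding
UK Medical Research Council; Australian Research Council; National Natural Science Foundation of China; US National Institutes of Health;
Keywords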
DNA modification; 6mA; self-attention mechanism; deep learning; LSTM; attention interpretation; RICE GENOME; METHYLATION; N-6-ADENINE; PREDICTION;
DOI
10.1093/bib/bbab351
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
DNA N6-methyladenine (6mA) is an important type of DNA modification that plays key roles in multiple biological processes. Despite recent progress in developing DNA 6mA site prediction methods, several challenges remain. For example, although hand-crafted features are interpretable, they contain redundant information that may bias model training and degrade the trained model. Furthermore, although deep learning (DL)-based models perform feature extraction and classification automatically, the crucial features they learn are difficult to interpret. As such, considerable research effort has focused on balancing the interpretability and straightforwardness of DL neural networks. In this study, we develop two new DL-based models for improving the prediction of 6mA sites, termed LA6mA and AL6mA, which use bidirectional long short-term memory to capture long-range information and a self-attention mechanism to extract key position information from DNA sequences. The performance of the two proposed methods is benchmarked on two model organisms, Arabidopsis thaliana and Drosophila melanogaster. On the two benchmark datasets, LA6mA achieves area under the receiver operating characteristic curve (AUROC) values of 0.962 and 0.966, respectively, whereas AL6mA achieves AUROC values of 0.945 and 0.941. Moreover, an in-depth analysis of the attention matrix is conducted to interpret the information that is hidden in the sequence and relevant for 6mA site prediction. The two novel pipelines developed for DNA 6mA site prediction in this work will facilitate a better understanding of the underlying principles of DL-based DNA methylation site prediction and its future applications.
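The abstract refers to an attention matrix computed over positions in a DNA sequence. As a rough illustration only, and not the authors' LA6mA or AL6mA implementation, the sketch below applies scaled dot-product self-attention to a one-hot encoded sequence. It assumes identity projections (queries, keys, and values are all the raw one-hot vectors), whereas a trained model would use learned projections and, per the abstract, bidirectional LSTM features. The sequence `GATCA` and all function names are hypothetical.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-dimensional one-hot vectors."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with identity projections.

    Assumption: Q = K = V = x (no learned weights).
    Returns (outputs, attention_matrix); row i of the attention matrix
    holds the weights that position i assigns to every position.
    """
    d = len(x[0])
    attn = []
    for q in x:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        attn.append(softmax(scores))
    out = [[sum(w * v[j] for w, v in zip(row, x)) for j in range(d)] for row in attn]
    return out, attn

seq = "GATCA"  # hypothetical toy sequence, not taken from the benchmark data
outputs, attention = self_attention(one_hot(seq))
```

With identity projections, each position simply attends more strongly to positions carrying the same base; in a trained model, the rows of the attention matrix are what would be inspected to see which sequence positions influence a prediction.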
Pages: 13
Related papers
50 in total
  • [1] Identification of DNA N6-methyladenine sites by integration of sequence features
    Wang, Hao-Tian
    Xiao, Fu-Hui
    Li, Gong-Hua
    Kong, Qing-Peng
    EPIGENETICS & CHROMATIN, 2020, 13 (01)
  • [2] Identification of DNA N6-methyladenine sites by integration of sequence features
    Hao-Tian Wang
    Fu-Hui Xiao
    Gong-Hua Li
    Qing-Peng Kong
    Epigenetics & Chromatin, 13
  • [3] A review of methods for predicting DNA N6-methyladenine sites
    Han, Ke
    Wang, Jianchun
    Wang, Yu
    Zhang, Lei
    Yu, Mengyao
    Xie, Fang
    Zheng, Dequan
    Xu, Yaoqun
    Ding, Yijie
    Wan, Jie
    BRIEFINGS IN BIOINFORMATICS, 2023, 24 (01)
  • [4] No evidence for DNA N6-methyladenine in mammals
    Douvlataniotis, Karolos
    Bensberg, Maike
    Lentini, Antonio
    Gylemo, Bjorn
    Nestor, Colm E.
    SCIENCE ADVANCES, 2020, 6 (12)
  • [5] N6-methyladenine DNA Modification in Glioblastoma
    Xie, Qi
    Wu, Tao P.
    Gimple, Ryan C.
    Li, Zheng
    Prager, Briana C.
    Wu, Qiulian
    Yu, Yang
    Wang, Pengcheng
    Wang, Yinsheng
    Gorkin, David U.
    Zhang, Cheng
    Dowiak, Alexis V.
    Lin, Kaixuan
    Zeng, Chun
    Sui, Yinghui
    Kim, Leo J. Y.
    Miller, Tyler E.
    Jiang, Li
    Lee-Poturalski, Christine
    Huang, Zhi
    Fang, Xiaoguang
    Zhai, Kui
    Mack, Stephen C.
    Sander, Maike
    Bao, Shideng
    Kerstetter-Fogle, Amber E.
    Sloan, Andrew E.
    Xiao, Andrew Z.
    Rich, Jeremy N.
    CELL, 2018, 175 (05) : 1228 - +
  • [6] IMMUNOCHEMICAL DETECTION OF N6-METHYLADENINE IN DNA
    STORL, HJ
    SIMON, H
    BARTHELMES, H
    BIOCHIMICA ET BIOPHYSICA ACTA, 1979, 564 (01) : 23 - 30
  • [7] N6-Methyladenine DNA Modification in Drosophila
    Zhang, Guoqiang
    Huang, Hua
    Liu, Di
    Cheng, Ying
    Liu, Xiaoling
    Zhang, Wenxin
    Yin, Ruichuan
    Zhang, Dapeng
    Zhang, Peng
    Liu, Jianzhao
    Li, Chaoyi
    Liu, Baodong
    Luo, Yuewan
    Zhu, Yuanxiang
    Zhang, Ning
    He, Shunmin
    He, Chuan
    Wang, Hailin
    Chen, Dahua
    CELL, 2015, 161 (04) : 893 - 906
  • [8] DNA N6-methyladenine modification in hypertension
    Guo, Ye
    Pei, Yuqing
    Li, Kexin
    Cui, Wei
    Zhang, Donghong
    AGING-US, 2020, 12 (07) : 6276 - 6291
  • [9] N6-methyladenine: the other methylated base of DNA
    Ratel, D
    Ravanat, JL
    Berger, F
    Wion, D
    BIOESSAYS, 2006, 28 (03) : 309 - 315
  • [10] N6-Methyladenine: A Conserved and Dynamic DNA Mark
    O'Brown, Zach Klapholz
    Greer, Eric Lieberman
    DNA METHYLTRANSFERASES - ROLE AND FUNCTION, 2016, 945 : 213 - 246