Transformer Architecture and Attention Mechanisms in Genome Data Analysis: A Comprehensive Review

Cited by: 77
Authors
Choi, Sanghyuk Roy [1]
Lee, Minhyeok [1]
Affiliation
[1] Chung Ang Univ, Sch Elect & Elect Engn, Seoul 06974, South Korea
Source
BIOLOGY-BASEL | 2023 / Vol. 12 / Issue 07
Funding
National Research Foundation, Singapore;
Keywords
deep learning; transformer model; attention mechanism; genome data; transcriptome data; genomics; bioinformatics; sequence analysis; natural language processing; MIRNA-DISEASE ASSOCIATIONS; NEURAL-NETWORK; DEEP; PREDICTION; MODEL; FUSION;
DOI
10.3390/biology12071033
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Simple Summary: The rapidly advancing field of deep learning, specifically transformer-based architectures and attention mechanisms, has found substantial applicability in bioinformatics and genome data analysis. Given the analogous nature of genome sequences to language texts, these techniques initially successful in natural language processing have been applied to genomic data. This review provides an in-depth analysis of the most recent advancements and applications of these techniques to genome data, critically evaluating their advantages and limitations. By investigating studies from 2019 to 2023, this review identifies potential future research areas, thereby encouraging further advancements in the field.
The emergence and rapid development of deep learning, specifically transformer-based architectures and attention mechanisms, have had transformative implications across several domains, including bioinformatics and genome data analysis. The analogous nature of genome sequences to language texts has enabled the application of techniques that have exhibited success in fields ranging from natural language processing to genomic data. This review provides a comprehensive analysis of the most recent advancements in the application of transformer architectures and attention mechanisms to genome and transcriptome data. The focus of this review is on the critical evaluation of these techniques, discussing their advantages and limitations in the context of genome data analysis. With the swift pace of development in deep learning methodologies, it becomes vital to continually assess and reflect on the current standing and future direction of the research. Therefore, this review aims to serve as a timely resource for both seasoned researchers and newcomers, offering a panoramic view of the recent advancements and elucidating the state-of-the-art applications in the field. Furthermore, this review paper serves to highlight potential areas of future investigation by critically evaluating studies from 2019 to 2023, thereby acting as a stepping-stone for further research endeavors.
Pages: 29