REDalign: accurate RNA structural alignment using residual encoder-decoder network

Cited by: 0
Authors
Chen, Chun-Chi [1]
Chan, Yi-Ming [2]
Jeong, Hyundoo [3]
Affiliations
[1] Natl Chiayi Univ, Dept Elect Engn, 300 Xuefu Rd, Chiayi 600355, Taiwan
[2] MindtronicAI Co, 7 F,218,Sec 6,Roosevelt Rd, Taipei 11674, Taiwan
[3] Incheon Natl Univ, Biomed & Robot Engn, 119 Acad Ro, Incheon 22012, South Korea
Source
BMC BIOINFORMATICS | 2024 / Vol. 25 / Issue 01
Keywords
RNA secondary structure; Structural alignment; Pseudoknot structure; Deep learning; Residual encoder decoder network; SECONDARY STRUCTURE; EVOLUTIONARY CONSERVATION; SEQUENCE; DYNALIGN; SEARCH; COMMON;
DOI
10.1186/s12859-024-05956-7
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: RNA secondary structural alignment serves as a foundational procedure in identifying conserved structural motifs among RNA sequences, crucially advancing our understanding of novel RNAs via comparative genomic analysis. While various computational strategies for RNA structural alignment exist, they often come with high computational complexity. Specifically, when addressing a set of RNAs with unknown structures, the task of simultaneously predicting their consensus secondary structure and determining the optimal sequence alignment requires an overwhelming computational effort of $O(L^6)$ for each RNA pair. Such an extremely high computational complexity makes these methods impractical for large-scale analysis despite their accurate alignment capabilities.

Results: In this paper, we introduce REDalign, an innovative approach based on deep learning for RNA secondary structural alignment. By utilizing a residual encoder-decoder network, REDalign can efficiently capture consensus structures and optimize structural alignments. In this learning model, the encoder network leverages a hierarchical pyramid to assimilate high-level structural features. Concurrently, the decoder network, enhanced with residual skip connections, integrates multi-level encoded features to learn detailed feature hierarchies with fewer parameter sets. REDalign significantly reduces computational complexity compared to Sankoff-style algorithms and effectively handles non-nested structures, including pseudoknots, which are challenging for traditional alignment methods. Extensive evaluations demonstrate that REDalign provides superior accuracy and substantial computational efficiency.

Conclusion: REDalign presents a significant advancement in RNA secondary structural alignment, balancing high alignment accuracy with lower computational demands. Its ability to handle complex RNA structures, including pseudoknots, makes it an effective tool for large-scale RNA analysis, with potential implications for accelerating discoveries in RNA research and comparative genomics.
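The abstract refers to non-nested structures such as pseudoknots. As a minimal illustration of that term only (this is not REDalign's method; the function name `has_pseudoknot` and the example base-pair lists are hypothetical), a secondary structure given as a list of base-pair positions is non-nested when two pairs (i, j) and (k, l) cross, that is, i < k < j < l:

```python
from itertools import combinations


def has_pseudoknot(pairs):
    """Return True if any two base pairs cross (i < k < j < l).

    `pairs` is a list of (position, position) tuples; this input
    format is an assumption made for the illustration.
    """
    # Put the smaller index first in each pair, then order the pairs
    # by opening position so that i <= k in every comparison below.
    normalized = sorted(tuple(sorted(p)) for p in pairs)
    for (i, j), (k, l) in combinations(normalized, 2):
        # The second pair opens inside the first and closes outside it.
        if i < k < j < l:
            return True
    return False


# Hypothetical example structures.
nested = [(0, 9), (1, 8), (2, 7)]   # a simple stem: fully nested
crossing = [(0, 5), (3, 8)]         # two crossing pairs: a pseudoknot

print(has_pseudoknot(nested))
print(has_pseudoknot(crossing))
```

The pairwise check is quadratic in the number of base pairs, which is adequate for illustration; it only detects crossings and says nothing about how an alignment method would score or predict them.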
Pages: 16
Related papers
38 in total
[1]   Informative RNA base embedding for RNA structural alignment and clustering by deep representation learning [J].
Akiyama, Manato ;
Sakakibara, Yasubumi .
NAR GENOMICS AND BIOINFORMATICS, 2022, 4 (01)
[2]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[3]  
Batey RT, 1999, ANGEW CHEM INT EDIT, V38, P2327
[4]   RNAmountAlign: Efficient software for local, global, semiglobal pairwise and multiple RNA sequence/structure alignment [J].
Bayegan, Amir H. ;
Clote, Peter .
PLOS ONE, 2020, 15 (01)
[5]   Integrating alignment-based and alignment-free sequence similarity measures for biological sequence classification [J].
Borozan, Ivan ;
Watt, Stuart ;
Ferretti, Vincent .
BIOINFORMATICS, 2015, 31 (09) :1396-1404
[6]   REDfold: accurate RNA secondary structure prediction using residual encoder-decoder network [J].
Chen, Chun-Chi ;
Chan, Yi-Ming .
BMC BIOINFORMATICS, 2023, 24 (01)
[7]   TOPAS: network-based structural alignment of RNA sequences [J].
Chen, Chun-Chi ;
Jeong, Hyundoo ;
Qian, Xiaoning ;
Yoon, Byung-Jun .
BIOINFORMATICS, 2019, 35 (17) :2941-2948
[8]   VARNA: Interactive drawing and editing of the RNA secondary structure [J].
Darty, Kevin ;
Denise, Alain ;
Ponty, Yann .
BIOINFORMATICS, 2009, 25 (15) :1974-1975
[9]   Deep learning: new computational modelling techniques for genomics [J].
Eraslan, Gokcen ;
Avsec, Ziga ;
Gagneur, Julien ;
Theis, Fabian J. .
NATURE REVIEWS GENETICS, 2019, 20 (07) :389-403
[10]   RNA folding at elementary step resolution [J].
Flamm, C ;
Fontana, W ;
Hofacker, IL ;
Schuster, P .
RNA, 2000, 6 (03) :325-338