RNA structure prediction using deep learning — A comprehensive review

Cited by: 3
Authors
Chaturvedi, Mayank [1]
Rashid, Mahmood A. [1]
Paliwal, Kuldip K. [1]
Affiliations
[1] Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, 4111, QLD
Funding
Australian Research Council
Keywords
Deep learning; Feature extraction; Machine learning; Neural networks; RNA secondary structure prediction; Transformers
DOI
10.1016/j.compbiomed.2025.109845
Abstract
In computational biology, accurate RNA structure prediction offers several benefits, including a better understanding of RNA function and support for RNA-based drug design. Applying deep learning techniques to RNA structure prediction has led to tremendous progress in this field, resulting in significant improvements in prediction accuracy. This comprehensive review provides an overview of the diverse strategies employed in predicting RNA secondary structures, with an emphasis on deep learning methods. The article organizes the discussion along three main dimensions: feature extraction methods, existing state-of-the-art learning model architectures, and prediction approaches. We present a comparative analysis of various techniques and models, highlighting their strengths and weaknesses. Finally, we identify gaps in the literature, discuss current challenges, and suggest future approaches to enhance model performance and applicability in RNA structure prediction tasks. This review provides deeper insight into the subject and paves the way for further progress at this dynamic intersection of life sciences and artificial intelligence. © 2025 The Authors