When will RNA get its AlphaFold moment?

Cited by: 49
Authors
Schneider, Bohdan [1]
Sweeney, Blake Alexander [2]
Bateman, Alex [2]
Cerny, Jiri [1]
Zok, Tomasz [3, 4]
Szachniuk, Marta [3, 4, 5]
Affiliations
[1] Czech Acad Sci, Inst Biotechnol, Prumyslova 595, CZ-25250 Vestec, Czech Republic
[2] European Bioinformat Inst EMBL EBI, European Mol Biol Lab, Wellcome Genome Campus, Hinxton CB10 1SD, England
[3] Poznan Univ Tech, Inst Comp Sci, Piotrowo 2, PL-60965 Poznan, Poland
[4] Poznan Univ Tech, European Ctr Bioinformat & Genom, Piotrowo 2, PL-60965 Poznan, Poland
[5] Polish Acad Sci, Inst Bioorgan Chem, Noskowskiego 12-14, PL-61704 Poznan, Poland
Keywords
CONFORMATION-DEPENDENT RESTRAINTS; 3-DIMENSIONAL STRUCTURE; STRUCTURE PREDICTION; RIBOSOMAL-SUBUNIT; IDENTIFICATION; PUZZLES; TOOL; POLYNUCLEOTIDES; JUNCTIONS; ASSEMBLE;
DOI
10.1093/nar/gkad726
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The protein structure prediction problem has been solved for many types of proteins by AlphaFold. Recently, there has been considerable excitement to build off the success of AlphaFold and predict the 3D structures of RNAs. RNA prediction methods use a variety of techniques, from physics-based to machine learning approaches. We believe that there are challenges preventing the successful development of deep learning-based methods like AlphaFold for RNA in the short term. Broadly speaking, the challenges are the limited number of structures and alignments, which makes data-hungry deep learning methods unlikely to succeed. Additionally, there are several issues with the existing structure and sequence data, as they are often of insufficient quality, highly biased and missing key information. Here, we discuss these challenges in detail and suggest some steps to remedy the situation. We believe that it is possible to create an accurate RNA structure prediction method, but it will require solving several data quality and volume issues, usage of data beyond simple sequence alignments, or the development of new less data-hungry machine learning methods.
Pages: 9522-9532
Page count: 11