RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning

Cited by: 238
Authors
Singh, Jaswinder [1]
Hanson, Jack [1]
Paliwal, Kuldip [1]
Zhou, Yaoqi [2, 3]
Affiliations
[1] Griffith Univ, Sch Engn & Built Environm, Signal Proc Lab, Brisbane, Qld 4111, Australia
[2] Griffith Univ, Inst Glyc, Parklands Dr, Southport, Qld 4222, Australia
[3] Griffith Univ, Sch Informat & Commun Technol, Parklands Dr, Southport, Qld 4222, Australia
Funding
Medical Research Council (UK);
Keywords
THERMODYNAMICS; IMPLEMENTATION; GENERATION; PROTEIN;
DOI
10.1038/s41467-019-13395-9
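Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code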
07 ; 0710 ; 09 ;
Abstract
The majority of the human genome is transcribed into noncoding RNAs with unknown structures and functions. Obtaining functional clues for noncoding RNAs requires accurate base-pairing or secondary-structure prediction. However, the performance of such predictions by current folding-based algorithms has stagnated for more than a decade. Here, we propose the use of deep contextual learning for base-pair prediction, including noncanonical and non-nested (pseudoknot) base pairs stabilized by tertiary interactions. Since only <250 nonredundant, high-resolution RNA structures are available for model training, we utilize transfer learning from a model initially trained with a recent high-quality bpRNA dataset of >10,000 nonredundant RNAs made available through comparative analysis. The resulting method achieves large, statistically significant improvements in predicting all base pairs, noncanonical and non-nested base pairs in particular. The proposed method (SPOT-RNA), with a freely available server and standalone software, should be useful for improving RNA structure modeling, sequence alignment, and functional annotations.
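To make the abstract's base-pair categories concrete, the following is a minimal Python sketch and not part of SPOT-RNA. It assumes the common convention that Watson-Crick pairs plus the G-U wobble count as canonical, and it flags a pair as non-nested when it crosses another pair (i < k < j < l). The function names `is_canonical` and `crossing_pairs` are hypothetical, and tools that annotate pseudoknots typically select a smaller set of pairs to label than this simple crossing test does.

```python
from typing import List, Set, Tuple

Pair = Tuple[int, int]

# Assumption: canonical = Watson-Crick pairs plus the G-U wobble pair.
CANONICAL: Set[Tuple[str, str]] = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def is_canonical(seq: str, i: int, j: int) -> bool:
    """Return True if positions i and j (0-based) form a canonical pair."""
    return (seq[i].upper(), seq[j].upper()) in CANONICAL


def crossing_pairs(pairs: List[Pair]) -> List[Pair]:
    """Return every pair that crosses at least one other pair.

    Two pairs (i, j) and (k, l) cross when i < k < j < l, which is the
    non-nested (pseudoknot-like) arrangement. Both partners of a crossing
    are reported; this is a simplification for illustration.
    """
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    crossing: Set[Pair] = set()
    for idx, (i, j) in enumerate(ordered):
        for k, l in ordered[idx + 1:]:
            if i < k < j < l:
                crossing.add((i, j))
                crossing.add((k, l))
    return sorted(crossing)
```

For instance, `is_canonical("GAAC", 1, 2)` tests an A-A pair, which falls outside the canonical set, while `crossing_pairs` applied to a fully nested helix returns an empty list.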
Pages: 13