A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures

Cited by: 30
Authors
Jabbari, Hosna [1]
Condon, Anne [1]
Affiliations
[1] Univ British Columbia, Dept Comp Sci, Vancouver, BC V5Z 1M9, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
RNA; Secondary structure prediction; Pseudoknot; Hierarchical folding; Minimum free energy; DYNAMIC-PROGRAMMING ALGORITHM; PARTITION-FUNCTION; TRANSLATION; SERVER;
DOI
10.1186/1471-2105-15-147
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Background: Improving accuracy and efficiency of computational methods that predict pseudoknotted RNA secondary structures is an ongoing challenge. Existing methods based on free energy minimization tend to be very slow and are limited in the types of pseudoknots that they can predict. Incorporating known structural information can improve prediction accuracy; however, there are not many methods for prediction of pseudoknotted structures that can incorporate structural information as input. There is even less understanding of the relative robustness of these methods with respect to partial information.
Results: We present a new method, Iterative HFold, for pseudoknotted RNA secondary structure prediction. Iterative HFold takes as input a pseudoknot-free structure, and produces a possibly pseudoknotted structure whose energy is at least as low as that of any (density-2) pseudoknotted structure containing the input structure. Iterative HFold leverages strengths of earlier methods, namely the fast running time of HFold, a method that is based on the hierarchical folding hypothesis, and the energy parameters of HotKnots V2.0. Our experimental evaluation on a large data set shows that Iterative HFold is robust with respect to partial information, with average accuracy on pseudoknotted structures steadily increasing from roughly 54% to 79% as the user provides up to 40% of the input structure. Iterative HFold is much faster than HotKnots V2.0, while having comparable accuracy. Iterative HFold also has significantly better accuracy than IPknot on our HK-PK and IP-pk168 data sets.
Conclusions: Iterative HFold is a robust method for prediction of pseudoknotted RNA secondary structures, whose accuracy with more than 5% information about true pseudoknot-free structures is better than that of IPknot, and with about 35% information about true pseudoknot-free structures compares well with that of HotKnots V2.0 while being significantly faster.
Iterative HFold and all data used in this work are freely available at http://www.cs.ubc.ca/~hjabbari/software.php.
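To make the abstract's terms concrete, the following is a minimal Python sketch, not the Iterative HFold implementation. It assumes structures are written in extended dot-bracket notation, with `()` and `[]` as two bracket families (a notation the abstract does not specify), and uses the hypothetical helper names `parse_pairs`, `is_pseudoknotted`, and `contains_structure`. It shows what it means for a structure to be pseudoknotted (two base pairs cross) and for an output structure to contain the pseudoknot-free input structure.

```python
def parse_pairs(structure):
    """Return the sorted list of (i, j) base pairs in a dot-bracket string.

    Assumption: '()' and '[]' are the only bracket families, '.' is unpaired.
    """
    closers = {")": "(", "]": "["}
    stacks = {"(": [], "[": []}
    pairs = []
    for pos, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(pos)
        elif ch in closers:
            opener = closers[ch]
            if not stacks[opener]:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stacks[opener].pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def is_pseudoknotted(pairs):
    """True if any two base pairs (i, j) and (k, l) cross: i < k < j < l."""
    return any(i < k < j < l for (i, j) in pairs for (k, l) in pairs)


def contains_structure(output_pairs, input_pairs):
    """True if every base pair of the input structure is kept in the output."""
    return set(input_pairs) <= set(output_pairs)


# Illustrative, made-up structures (not data from the paper).
input_structure = "((......)).."      # pseudoknot-free input
output_structure = "((..[[..))..]]"   # adds a crossing helix

given = parse_pairs(input_structure + "..")   # pad to the same length
predicted = parse_pairs(output_structure)
print(is_pseudoknotted(given))                # False
print(is_pseudoknotted(predicted))            # True
print(contains_structure(predicted, given))   # True
```

The crossing test is quadratic in the number of base pairs, which is fine for illustration; it says nothing about energy minimization, which is the actual subject of the paper.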
Pages: 17