Lost in folding space? Comparing four variants of the thermodynamic model for RNA secondary structure prediction

Cited by: 18
Authors
Janssen, Stefan [2 ]
Schudoma, Christian [3 ]
Steger, Gerhard [1 ]
Giegerich, Robert [2 ]
Affiliations
[1] Univ Dusseldorf, Inst Phys Biol, D-40204 Dusseldorf, Germany
[2] Univ Bielefeld, Fac Technol, D-33615 Bielefeld, Germany
[3] Max Planck Inst Mol Plant Physiol, Bioinformat Grp, D-14476 Potsdam, Germany
Source
BMC BIOINFORMATICS | 2011 / Vol. 12
Keywords
STABILITY; PARAMETERS; STACKING; SOFTWARE;
DOI
10.1186/1471-2105-12-429
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Many bioinformatics tools for RNA secondary structure analysis are based on a thermodynamic model of RNA folding. These tools predict a single "optimal" structure by free energy minimization, enumerate near-optimal structures, compute base pair probabilities and dot plots, and report representative structures of different abstract shapes or Boltzmann probabilities of structures and shapes. Although all programs refer to the same physical model, they implement it with considerable variation for different tasks, and little is known about how the heuristic assumptions and model simplifications used by the programs affect the outcome of the analysis.

Results: We extract four different models of the thermodynamic folding space which underlie the programs RNAFOLD, RNASHAPES, and RNASUBOPT. Their differences lie in the details of the energy model and the granularity of the folding space. We implement probabilistic shape analysis for all models and introduce the shape probability shift as a robust measure of model similarity. Using four data sets derived from experimentally solved structures, we provide a quantitative evaluation of the model differences.

Conclusions: We find that search space granularity affects the computed shape probabilities less than the over- or underapproximation of free energy by a simplified energy model. Still, the approximations perform similarly enough to implementations of the full model to justify their continued use in settings where computational constraints call for simpler algorithms. As a side result, we observe that the rarely used level 2 shapes, which predict the complete arrangement of helices, multiloops, internal loops and bulges, include the "true" shape in a rather small number of predicted high probability shapes. This calls for an investigation of new strategies to extract high probability members from the (very large) level 2 shape space of an RNA sequence.

We provide implementations of all four models, written in a declarative style that makes them easy to modify. Future work on thermodynamic RNA folding can choose a model based on our empirical data and take our implementations as a starting point for further program development.
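The Boltzmann probabilities of shapes mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard Boltzmann weighting, in which each structure contributes exp(-E/RT) and the probability of a shape is the summed weight of its structures divided by the partition function. The function name `shape_probabilities`, the shape strings, and the toy energies are hypothetical and chosen only for illustration; none of them are taken from the paper or from the programs it compares.

```python
import math
from collections import defaultdict

# Gas constant in kcal/(mol*K) times 37 degrees Celsius in Kelvin.
RT = 0.0019872 * 310.15


def shape_probabilities(structures, rt=RT):
    """Return the Boltzmann probability of each shape.

    `structures` is an iterable of (shape, free_energy) pairs, with the
    free energy in kcal/mol. Each structure contributes exp(-E/RT) to
    the weight of its shape; weights are normalized by their total.
    """
    weights = defaultdict(float)
    for shape, energy in structures:
        weights[shape] += math.exp(-energy / rt)
    partition = sum(weights.values())
    return {shape: weight / partition for shape, weight in weights.items()}


# Toy folding space (made-up values): two structures share the shape "[]".
toy_space = [("[]", -3.0), ("[]", -2.5), ("[][]", -2.0), ("_", 0.0)]
probs = shape_probabilities(toy_space)
for shape, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{shape}\t{p:.3f}")
```

In a real analysis the folding space is far too large to enumerate this way, so the sums are computed by dynamic programming; the sketch only shows what quantity is being computed.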
Pages: 19