Computational approaches for RNA energy parameter estimation

Cited by: 89
Authors
Andronescu, Mirela [1 ]
Condon, Anne [2 ]
Hoos, Holger H. [2 ]
Mathews, David H. [3 ]
Murphy, Kevin P. [2 ]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] Univ British Columbia, Dept Comp Sci, Vancouver, BC V6T 1Z4, Canada
[3] Univ Rochester, Med Ctr, Dept Biochem & Biophys, Rochester, NY 14642 USA
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
RNA secondary structure prediction; RNA free-energy parameter estimation; RNA free-energy parameters; RNA thermodynamic models; RNA free-energy models; SECONDARY STRUCTURE PREDICTION; G-CENTER-DOT; HAIRPIN LOOP STABILITY; NEAREST-NEIGHBOR PARAMETERS; UNUSUALLY STABLE RNA; TERMINAL BASE-PAIRS; INTERNAL LOOPS; THERMODYNAMIC STABILITIES; SEQUENCE DEPENDENCE; TANDEM MISMATCHES;
DOI
10.1261/rna.1950510
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Methods for efficient and accurate prediction of RNA structure are increasingly valuable, given the current rapid advances in understanding the diverse functions of RNA molecules in the cell. To enhance the accuracy of secondary structure predictions, we developed and refined optimization techniques for the estimation of energy parameters. We build on two previous approaches to RNA free-energy parameter estimation: (1) the Constraint Generation (CG) method, which iteratively generates constraints that enforce known structures to have energies lower than other structures for the same molecule; and (2) the Boltzmann Likelihood (BL) method, which infers a set of RNA free-energy parameters that maximize the conditional likelihood of a set of reference RNA structures. Here, we extend these approaches in two main ways: We propose (1) a max-margin extension of CG, and (2) a novel linear Gaussian Bayesian network that models feature relationships, which effectively makes use of sparse data by sharing statistical strength between parameters. We obtain significant improvements in the accuracy of RNA minimum free-energy pseudoknot-free secondary structure prediction when measured on a comprehensive set of 2518 RNA molecules with reference structures. Our parameters can be used in conjunction with software that predicts RNA secondary structures, RNA hybridization, or ensembles of structures. Our data, software, results, and parameter sets in various formats are freely available at http://www.cs.ubc.ca/labs/beta/Projects/RNA-Params.
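The two objectives named in the abstract can be illustrated with a minimal sketch. It assumes a linear energy model (energy = parameters dotted with feature counts) and a tiny, explicitly enumerated candidate set; real implementations work over all structures of a sequence with dynamic programming. The feature vectors, parameter values, function names, and the `margin` argument are hypothetical and are not taken from the paper.

```python
import math

# Approximate R*T in kcal/mol at 37 C (R = 1.987e-3 kcal/(mol K), T = 310.15 K).
RT = 0.616

def energy(theta, features):
    """Free energy of a structure under a linear model: theta . features."""
    return sum(t * f for t, f in zip(theta, features))

def log_boltzmann_likelihood(theta, ref, candidates):
    """log P(ref | sequence) under a Boltzmann distribution restricted to
    `candidates` (assumed to include `ref`). This is a toy stand-in for the
    BL objective, which sums over all structures of the sequence."""
    scaled = [-energy(theta, c) / RT for c in candidates]
    m = max(scaled)  # log-sum-exp for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -energy(theta, ref) / RT - log_z

def violated_constraints(theta, ref, candidates, margin=0.0):
    """CG-style check: list candidates whose energy is not at least `margin`
    above the reference energy. margin=0 gives the plain constraint
    E(ref) <= E(candidate); a positive margin is a simplified nod to the
    max-margin idea, not the paper's formulation."""
    e_ref = energy(theta, ref)
    return [c for c in candidates
            if c != ref and energy(theta, c) < e_ref + margin]

# Hypothetical features: (number of stacked pairs, number of loops).
theta = [-2.0, 1.0]
ref = (2, 0)                      # energy -4.0
candidates = [ref, (1, 1), (0, 0)]  # energies -4.0, -1.0, 0.0

ll = log_boltzmann_likelihood(theta, ref, candidates)
bad = violated_constraints(theta, ref, candidates, margin=3.5)
```

In this toy setting the reference has the lowest energy, so no plain constraint is violated, while a margin of 3.5 kcal/mol flags the candidate `(1, 1)`. A parameter-estimation loop would adjust `theta` to raise `ll` or to remove such violations.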
Pages: 2304-2318
Page count: 15