A max-margin model for efficient simultaneous alignment and folding of RNA sequences

被引:56
作者
Do, Chuong B. [1 ]
Foo, Chuan-Sheng [1 ]
Batzoglou, Serafim [1 ]
机构
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
关键词
D O I
10.1093/bioinformatics/btn177
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: The need for accurate and efficient tools for computational RNA structure analysis has become increasingly apparent over the last several years: RNA folding algorithms underlie numerous applications in bioinformatics, ranging from microarray probe selection to de novo non-coding RNA gene prediction. In this work, we present RAF (RNA Alignment and Folding), an efficient algorithm for simultaneous alignment and consensus folding of unaligned RNA sequences. Algorithmically, RAF exploits sparsity in the set of likely pairing and alignment candidates for each nucleotide (as identified by the CONTRAfold or CONTRAlign programs) to achieve an effectively quadratic running time for simultaneous pairwise alignment and folding. RAFs fast sparse dynamic programming, in turn, serves as the inference engine within a discriminative machine learning algorithm for parameter estimation. Results: In cross-validated benchmark tests, RAF achieves accuracies equaling or surpassing the current best approaches for RNA multiple sequence secondary structure prediction. However, RAF requires nearly an order of magnitude less time than other simultaneous folding and alignment methods, thus making it especially appropriate for high-throughput studies.
引用
收藏
页码:I68 / I76
页数:9
相关论文
共 41 条
[1]   Efficient parameter estimation for RNA secondary structure prediction [J].
Andronescu, Mirela ;
Condon, Anne ;
Hoos, Holger H. ;
Mathews, David H. ;
Murphy, Kevin P. .
BIOINFORMATICS, 2007, 23 (13) :I19-I28
[2]  
[Anonymous], [No title captured]
[3]  
[Anonymous], 2007, ICML 07
[4]  
Bertsekas D, 2003, Convex Analysis and Optimization, V1
[5]   Hierarchy and dynamics of RNA folding [J].
Brion, P ;
Westhof, E .
ANNUAL REVIEW OF BIOPHYSICS AND BIOMOLECULAR STRUCTURE, 1997, 26 :113-137
[6]   Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints [J].
D Dowell, Robin ;
Eddy, Sean R. .
BMC BIOINFORMATICS, 2006, 7 (1)
[7]   STRAL: progressive alignment of non-coding RNA using base pairing probability vectors in quadratic time [J].
Dalli, Deniz ;
Wilm, Andreas ;
Mainz, Indra ;
Steger, Gerhard .
BIOINFORMATICS, 2006, 22 (13) :1593-1599
[8]  
DIEFFENBACH CW, 1993, PCR METH APPL, V3, pS30
[9]   ProbCons: Probabilistic consistency-based multiple sequence alignment [J].
Do, CB ;
Mahabhashyam, MSP ;
Brudno, M ;
Batzoglou, S .
GENOME RESEARCH, 2005, 15 (02) :330-340
[10]   CONTRAfold: RNA secondary structure prediction without physics-based models [J].
Do, Chuong B. ;
Woods, Daniel A. ;
Batzoglou, Serafim .
BIOINFORMATICS, 2006, 22 (14) :E90-E98