A memory-efficient dynamic programming algorithm for optimal alignment of a sequence to an RNA secondary structure

被引:154
作者
Eddy, SR [1 ]
机构
[1] Washington Univ, Sch Med, Howard Hughes Med Inst, St Louis, MO 63110 USA
[2] Washington Univ, Sch Med, Dept Genet, St Louis, MO 63110 USA
关键词
D O I
10.1186/1471-2105-3-18
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Background: Covariance models (CMs) are probabilistic models of RNA secondary structure, analogous to profile hidden Markov models of linear sequence. The dynamic programming algorithm for aligning a CM to an RNA sequence of length N is O(N-3) in memory. This is only practical for small RNAs. Results: I describe a divide and conquer variant of the alignment algorithm that is analogous to memory-efficient Myers/Miller dynamic programming algorithms for linear sequence alignment. The new algorithm has an O(N-2 log N) memory complexity, at the expense of a small constant factor in time. Conclusions: Optimal ribosomal RNA structural alignments that previously required up to 150 GB of memory now require less than 270 MB.
引用
收藏
页数:16
相关论文
共 57 条
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]  
Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI [10.1093/nar/gkp985, 10.1093/nar/gkh121, 10.1093/nar/gkr1065]
[3]   The Ribonuclease P Database [J].
Brown, JW .
NUCLEIC ACIDS RESEARCH, 1999, 27 (01) :314-314
[4]  
Brown M. P. S., 2000, Proceedings. Eighth International Conference on Intelligent Systems for Molecular Biology, P57
[5]  
CHIU DKY, 1991, COMPUT APPL BIOSCI, V7, P347
[6]  
CORPET F, 1994, COMPUT APPL BIOSCI, V10, P389
[7]   FINDING THE HAIRPIN IN THE HAYSTACK - SEARCHING FOR RNA MOTIFS [J].
DANDEKAR, T ;
HENTZE, MW .
TRENDS IN GENETICS, 1995, 11 (02) :45-50
[8]  
DERIJK P, 1994, NUCLEIC ACIDS RES, V22, P3495
[9]   Searching for patterns in genomic data [J].
Dsouza, M ;
Larsen, N ;
Overbeek, R .
TRENDS IN GENETICS, 1997, 13 (12) :497-498
[10]  
Durbin R., 1998, BIOL SEQUENCE ANAL P