Hidden Markov model speed heuristic and iterative HMM search procedure

Times cited: 828
Authors
Johnson, L. Steven [1]
Eddy, Sean R. [2]
Portugaly, Elon [3]
Affiliations
[1] Washington Univ, Sch Med, Dept Pathol & Immunol, St Louis, MO 63130 USA
[2] Howard Hughes Med Inst, Ashburn, VA USA
[3] Hebrew Univ Jerusalem, Sch Engn & Comp Sci, Jerusalem, Israel
Source
BMC BIOINFORMATICS | 2010 / Vol. 11
Keywords
ASTRAL COMPENDIUM; PROTEIN-STRUCTURE; PSI-BLAST;
DOI
10.1186/1471-2105-11-431
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Profile hidden Markov models (profile-HMMs) are sensitive tools for remote protein homology detection, but the main scoring algorithms, Viterbi or Forward, require considerable time to search large sequence databases. Results: We have designed a series of database filtering steps, HMMERHEAD, that are applied prior to the scoring algorithms, as implemented in the HMMER package, in an effort to reduce search time. Using this heuristic, we obtain a 20-fold decrease in Forward and a 6-fold decrease in Viterbi search time with a minimal loss in sensitivity relative to the unfiltered approaches. We then implemented an iterative profile-HMM search method, JackHMMER, which employs the HMMERHEAD heuristic. Due to our search heuristic, we eliminated the subdatabase creation that is common in current iterative profile-HMM approaches. On our benchmark, JackHMMER detects 14% more remote protein homologs than SAM's iterative method T2K. Conclusions: Our search heuristic, HMMERHEAD, significantly reduces the time needed to score a profile-HMM against large sequence databases. This search heuristic allowed us to implement an iterative profile-HMM search method, JackHMMER, which detects significantly more remote protein homologs than SAM's T2K and NCBI's PSI-BLAST.
Pages: 8
Related papers
16 in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] BASIC LOCAL ALIGNMENT SEARCH TOOL
    ALTSCHUL, SF
    GISH, W
    MILLER, W
    MYERS, EW
    LIPMAN, DJ
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) : 403 - 410
  • [3] [Anonymous], HMMER: biosequence analysis using profile hidden Markov models
  • [4] Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships
    Brenner, SE
    Chothia, C
    Hubbard, TJP
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (11) : 6073 - 6078
  • [5] The ASTRAL compendium for protein structure and sequence analysis
    Brenner, SE
    Koehl, P
    Levitt, M
    [J]. NUCLEIC ACIDS RESEARCH, 2000, 28 (01) : 254 - 256
  • [6] ASTRAL compendium enhancements
    Chandonia, JM
    Walker, NS
    Conte, LL
    Koehl, P
    Levitt, M
    Brenner, SE
    [J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (01) : 260 - 263
  • [7] The Pfam protein families database
    Finn, Robert D.
    Tate, John
    Mistry, Jaina
    Coggill, Penny C.
    Sammut, Stephen John
    Hotz, Hans-Rudolf
    Ceric, Goran
    Forslund, Kristoffer
    Eddy, Sean R.
    Sonnhammer, Erik L. L.
    Bateman, Alex
    [J]. NUCLEIC ACIDS RESEARCH, 2008, 36 : D281 - D288
  • [8] Homology detection via family pairwise search
    Grundy, WN
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 1998, 5 (03) : 479 - 491
  • [9] Removing near-neighbour redundancy from large protein sequence collections
    Holm, L
    Sander, C
    [J]. BIOINFORMATICS, 1998, 14 (05) : 423 - 429
  • [10] Hidden Markov models for detecting remote protein homologies
    Karplus, K
    Barrett, C
    Hughey, R
    [J]. BIOINFORMATICS, 1998, 14 (10) : 846 - 856