A comparison of scoring functions for protein sequence profile alignment

Cited by: 81
Authors
Edgar, RC [1 ]
Sjölander, K [1 ]
Affiliation
[1] Univ Calif Berkeley, Dept Bioengn, Berkeley, CA 94720 USA
DOI
10.1093/bioinformatics/bth090
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: In recent years, several methods have been proposed for aligning two protein sequence profiles, with reported improvements in alignment accuracy and homolog discrimination versus sequence-sequence methods (e.g. BLAST) and profile-sequence methods (e.g. PSI-BLAST). Profile-profile alignment is also the iterated step in progressive multiple sequence alignment algorithms such as CLUSTALW. However, little is known about the relative performance of different profile-profile scoring functions. In this work, we evaluate the alignment accuracy of 23 different profile-profile scoring functions by comparing alignments of 488 pairs of sequences with identity ≤30% against structural alignments. We optimize parameters for all scoring functions on the same training set and use profiles of alignments from both PSI-BLAST and SAM-T99. Structural alignments are constructed from a consensus between the FSSP database and CE structural aligner. We compare the results with sequence-sequence and sequence-profile methods, including BLAST and PSI-BLAST.
Results: We find that profile-profile alignment gives an average improvement over our test set of typically 2-3% over profile-sequence alignment and ~40% over sequence-sequence alignment. No statistically significant difference is seen in the relative performance of most of the scoring functions tested. Significantly better results are obtained with profiles constructed from SAM-T99 alignments than from PSI-BLAST alignments.
Pages: 1301-1308
Page count: 8