Do aligned sequences share the same fold?

Cited by: 194
Authors
Abagyan, RA
Batalov, S
Affiliation
[1] Skirball Inst. of Biomol. Medicine, Biochemistry Department, NYU Medical Center, New York, NY 10016
Keywords
bioinformatics; fold recognition; sequence alignment; modeling by homology; protein structure prediction
DOI
10.1006/jmbi.1997.1287
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Sequence comparison remains a powerful tool to assess the structural relatedness of two proteins. To develop a sensitive sequence-based procedure for fold recognition, we performed an exhaustive global alignment (with zero end gap penalties) between sequences of protein domains with known three-dimensional folds. The subset of 1.3 million alignments between sequences of structurally unrelated domains was used to derive a set of analytical functions that represent the probability of structural significance for any sequence alignment at a given sequence identity, sequence similarity and alignment score. Analysis of overlap between structurally significant and insignificant alignments shows that sequence identity and sequence similarity measures are poor indicators of structural relatedness in the "twilight zone", while the alignment score allows much better discrimination between alignments of structurally related and unrelated sequences for a wide variety of alignment settings. A fold recognition benchmark was used to compare eight different substitution matrices with eight sets of gap penalties. The best performing matrices were Gonnet and Blosum50 with normalized gap penalties of 2.4/0.15 and 2.0/0.15, respectively, while the positive matrices were the worst performers. The derived functions and parameters can be used for fold recognition via a multilink chain of probability weighted pairwise sequence alignments. © 1997 Academic Press Limited.
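As an illustration of what "global alignment with zero end gap penalties" means in practice, the following is a minimal sketch and not the authors' implementation. It assumes a flat match/mismatch score and a single linear gap penalty in place of the substitution matrices and gap penalties benchmarked in the paper; the function name `end_gap_free_score` and its default parameters are hypothetical.

```python
def end_gap_free_score(a, b, match=1.0, mismatch=-1.0, gap=2.0):
    """Score of the best global alignment of a and b with free end gaps.

    Assumed scoring (not from the paper): +match for identical residues,
    +mismatch otherwise, and -gap for every internal gap position.
    """
    # Row 0 is all zeros: leading gaps cost nothing.
    prev = [0.0] * (len(b) + 1)
    best = 0.0
    for i in range(1, len(a) + 1):
        # Column 0 is zero as well, for the same reason.
        curr = [0.0]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + s,   # align a[i-1] with b[j-1]
                            prev[j] - gap,     # gap in b
                            curr[j - 1] - gap))  # gap in a
        # Last column: the rest of a may hang over the end of b for free.
        best = max(best, curr[-1])
        prev = curr
    # Last row: the rest of b may hang over the end of a for free.
    return max(best, max(prev))


if __name__ == "__main__":
    print(end_gap_free_score("TTACGT", "ACGTGG"))
```

Because unaligned ends are not penalized, two sequences that overlap only partially can still receive the full score of their shared region, whereas an internal gap lowers the score.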
Pages: 355-368
Number of pages: 14