Combining the GOR V algorithm with evolutionary information for protein secondary structure prediction from amino acid sequence

Cited by: 114
Authors
Kloczkowski, A
Ting, KL
Jernigan, RL
Garnier, J
Affiliations
[1] NIH, Math & Stat Comp Lab, CIT, Bethesda, MD 20892 USA
[2] NCI, Lab Expt & Computat Biol, NIH, Bethesda, MD 20892 USA
Keywords
GOR algorithm; protein secondary structure; secondary structure prediction; PSI-BLAST; multiple sequence alignment; information theory;
DOI
10.1002/prot.10181
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
We have modified and improved the GOR algorithm for protein secondary structure prediction by using the evolutionary information provided by multiple sequence alignments, adding triplet statistics, and optimizing various parameters. We have expanded the database used to include the 513 non-redundant domains collected recently by Cuff and Barton (Proteins 1999;34:508-519; Proteins 2000;40:502-511). We have introduced a variable size window that allowed us to include sequences as short as 20-30 residues. A significant improvement over the previous versions of the GOR algorithm was obtained by combining the PSI-BLAST multiple sequence alignments with the GOR method. The new algorithm will form the basis for the future GOR V release on an online prediction server. The average accuracy of the prediction of secondary structure with multiple sequence alignment and full jack-knife procedure was 73.5%. The accuracy of the prediction increases to 74.2% by limiting the prediction to 375 (of 513) sequences having at least 50 PSI-BLAST alignments. The average accuracy of the prediction of the new improved program without using multiple sequence alignments was 67.5%. This is approximately a 3% improvement over the preceding GOR IV algorithm (Garnier J, Gibrat JF, Robson B. Methods Enzymol 1996;266:540-553; Kloczkowski A, Ting K-L, Jernigan RL, Garnier J. Polymer 2002;43:441-449). We have discussed alternatives to the segment overlap (Sov) coefficient proposed by Zemla et al. (Proteins 1999;34:220-223). (C) 2002 Wiley-Liss, Inc.
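The accuracy figures quoted in the abstract can be illustrated with a minimal sketch. It assumes the accuracy is the usual per-residue three-state measure (often called Q3) over helix (H), strand (E), and coil (C), averaged over proteins; this record does not spell out the exact definition, and the function names `q3_accuracy` and `mean_accuracy` are hypothetical, not taken from the GOR software.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state (H, E, or C)
    matches the observed state. Assumed definition, not from the paper."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def mean_accuracy(pairs) -> float:
    """Average the per-protein accuracy over (predicted, observed) pairs.
    In a jack-knife setting, each prediction would come from parameters
    estimated with that protein left out of the training set."""
    scores = [q3_accuracy(p, o) for p, o in pairs]
    if not scores:
        raise ValueError("at least one pair is required")
    return sum(scores) / len(scores)
```

For example, `q3_accuracy("HHEC", "HHCC")` compares four residues, three of which agree. Averaging per protein rather than pooling all residues is a design choice made here for simplicity; the paper may weight proteins differently.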
Pages: 154-166
Page count: 13
Related papers
58 in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] [Anonymous], 1989, Prediction of protein structures and the principles of protein conformation
  • [3] Bahar I, 1997, PROTEINS, V29, P172, DOI 10.1002/(SICI)1097-0134(199710)29:2<172::AID-PROT5>3.0.CO;2-F
  • [5] PROTEIN SECONDARY STRUCTURE PREDICTION
    BARTON, GJ
    [J]. CURRENT OPINION IN STRUCTURAL BIOLOGY, 1995, 5 (03) : 372 - 376
  • [6] SECONDARY STRUCTURE PREDICTION - COMBINATION OF 3 DIFFERENT METHODS
    BIOU, V
    GIBRAT, JF
    LEVIN, JM
    ROBSON, B
    GARNIER, J
    [J]. PROTEIN ENGINEERING, 1988, 2 (03) : 185 - 191
  • [7] DOES THE FOLDING TYPE OF A PROTEIN DEPEND ON ITS AMINO-ACID-COMPOSITION
    CHOU, KC
    [J]. FEBS LETTERS, 1995, 363 (1-2) : 127 - 131
  • [8] PREDICTION OF PROTEIN CONFORMATION
    CHOU, PY
    FASMAN, GD
    [J]. BIOCHEMISTRY, 1974, 13 (02) : 222 - 245
  • [9] Cuff JA, 1999, PROTEINS, V34, P508, DOI 10.1002/(SICI)1097-0134(19990301)34:4<508::AID-PROT10>3.0.CO;2-4