FORESST: fold recognition from secondary structure predictions of proteins

Times cited: 35
Authors
Di Francesco, V
Munson, PJ
Garnier, J
Affiliations
[1] NIH, Analyt Biostat Sect, Math & Stat Comp Lab, Ctr Informat Technol, Bethesda, MD 20892 USA
[2] Inst Genom Res, Blolgiza, Rockville, MD 20850 USA
[3] INRA, Biol Cellulaire & Mol Lab, F-78352 Jouy En Josas, France
DOI
10.1093/bioinformatics/15.2.131
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: A method for recognizing the three-dimensional fold from the protein amino acid sequence based on a combination of hidden Markov models (HMMs) and secondary structure prediction was recently developed for proteins in the Mainly-Alpha structural class. Here, this methodology is extended to Mainly-Beta and Alpha-Beta class proteins. Compared to other fold recognition methods based on HMMs, this approach is novel in that only secondary structure information is used. Each HMM is trained from known secondary structure sequences of proteins having a similar fold. Secondary structure prediction is performed for the amino acid sequence of a query protein. The predicted fold of a query protein is the fold described by the model that best fits the predicted secondary structure sequence.

Results: After model cross-validation, the success rate on 44 test proteins covering the three structural classes was found to be 59%. On seven fold predictions performed prior to the publication of the experimental structure, the success rate was 71%. In conclusion, this approach manages to capture important information about the fold of a protein embedded in the length and arrangement of the predicted helices, strands and coils along the polypeptide chain. When a more extensive library of HMMs representing the universe of known structural families is available (work in progress), the program will allow rapid screening of genomic databases and sequence annotation when fold similarity is not detectable from the amino acid sequence.
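The recognition step described in the abstract (score a predicted secondary structure sequence against one HMM per fold and keep the best-fitting model) can be illustrated with a minimal sketch. This is not the FORESST implementation: the two-state model layout, the probabilities, and the names `TOY_MODELS`, `forward_log_likelihood` and `predict_fold` are all hypothetical, chosen only to show the idea with a generic discrete HMM over the three-letter alphabet H (helix), E (strand), C (coil).

```python
import math

NEG_INF = float("-inf")


def _log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def _log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(seq, model):
    """Return log P(seq | model) using the forward algorithm in log space."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    start, trans, emit = model["start"], model["trans"], model["emit"]
    n = len(start)
    # Initialisation: start probability times emission of the first symbol.
    alpha = [_log(start[s]) + _log(emit[s][seq[0]]) for s in range(n)]
    # Recursion: sum over all predecessor states for each later symbol.
    for sym in seq[1:]:
        alpha = [
            _log_sum_exp([alpha[p] + _log(trans[p][s]) for p in range(n)])
            + _log(emit[s][sym])
            for s in range(n)
        ]
    # Termination: sum over all possible final states.
    return _log_sum_exp(alpha)


def predict_fold(seq, models):
    """Return the name of the model with the highest log-likelihood for seq."""
    return max(models, key=lambda name: forward_log_likelihood(seq, models[name]))


# Hypothetical toy models (assumption): state 0 is a "structured" state,
# state 1 is a "coil" state. Real fold models would be trained on the known
# secondary structure sequences of proteins sharing a fold.
_COIL = {"H": 0.1, "E": 0.1, "C": 0.8}
_TRANS = [[0.9, 0.1], [0.2, 0.8]]
TOY_MODELS = {
    "toy_alpha": {
        "start": [0.5, 0.5],
        "trans": _TRANS,
        "emit": [{"H": 0.8, "E": 0.05, "C": 0.15}, _COIL],
    },
    "toy_beta": {
        "start": [0.5, 0.5],
        "trans": _TRANS,
        "emit": [{"H": 0.05, "E": 0.8, "C": 0.15}, _COIL],
    },
}

if __name__ == "__main__":
    predicted = "HHHHHHHCCCHHHHHHHHCC"  # stand-in for a predicted H/E/C string
    print(predict_fold(predicted, TOY_MODELS))
```

In practice the input string would come from a secondary structure predictor run on the query amino acid sequence, and the model library would contain one trained HMM per fold family rather than two hand-set toy models.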
Pages: 131-140
Page count: 10