FORESST: fold recognition from secondary structure predictions of proteins

Cited by: 35
Authors
Di Francesco, V
Munson, PJ
Garnier, J
Affiliations
[1] NIH, Analyt Biostat Sect, Math & Stat Comp Lab, Ctr Informat Technol, Bethesda, MD 20892 USA
[2] Inst Genom Res, Blolgiza, Rockville, MD 20850 USA
[3] INRA, Biol Cellulaire & Mol Lab, F-78352 Jouy En Josas, France
DOI
10.1093/bioinformatics/15.2.131
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: A method for recognizing the three-dimensional fold from the protein amino acid sequence based on a combination of hidden Markov models (HMMs) and secondary structure prediction was recently developed for proteins in the Mainly-Alpha structural class. Here, this methodology is extended to Mainly-Beta and Alpha-Beta class proteins. Compared to other fold recognition methods based on HMMs, this approach is novel in that only secondary structure information is used. Each HMM is trained from known secondary structure sequences of proteins having a similar fold. Secondary structure prediction is performed for the amino acid sequence of a query protein. The predicted fold of a query protein is the fold described by the model that best fits the predicted secondary structure sequence. Results: After model cross-validation, the success rate on 44 test proteins covering the three structural classes was found to be 59%. On seven fold predictions performed prior to the publication of the experimental structure, the success rate was 71%. In conclusion, this approach manages to capture important information about the fold of a protein embedded in the length and arrangement of the predicted helices, strands and coils along the polypeptide chain. When a more extensive library of HMMs representing the universe of known structural families is available (work in progress), the program will allow rapid screening of genomic databases and sequence annotation when fold similarity is not detectable from the amino acid sequence.
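The recognition step described in the abstract (score a predicted secondary structure string against one HMM per fold, then report the best-fitting model) can be illustrated with a minimal Python sketch. This is not the FORESST implementation: the record does not give the model topology or parameters, so the two-state models, their probabilities, and the names `TOY_MODELS`, `log_forward`, and `recognize_fold` are all hypothetical, chosen only to show the idea of picking the model with the highest forward log-likelihood.

```python
import math

ALPHABET = "HEC"  # H = helix, E = strand, C = coil


def log_forward(seq, start, trans, emit):
    """Return log P(seq | model) using the scaled forward algorithm."""
    n = len(start)
    loglik = 0.0
    alpha = []
    for t, symbol in enumerate(seq):
        k = ALPHABET.index(symbol)
        if t == 0:
            alpha = [start[j] * emit[j][k] for j in range(n)]
        else:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][k]
                for j in range(n)
            ]
        # Rescale at every position to avoid numerical underflow.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


# Hypothetical two-state models (state 0 = regular element, state 1 = coil).
# All numbers are invented for illustration, not trained parameters.
_START = [0.5, 0.5]
_TRANS = [[0.9, 0.1], [0.2, 0.8]]
TOY_MODELS = {
    # (start, transitions, emissions over H, E, C)
    "toy_alpha": (_START, _TRANS, [[0.90, 0.02, 0.08], [0.10, 0.05, 0.85]]),
    "toy_beta": (_START, _TRANS, [[0.02, 0.90, 0.08], [0.05, 0.10, 0.85]]),
}


def recognize_fold(predicted_ss, models):
    """Return (best_model_name, scores) for a predicted H/E/C string."""
    scores = {
        name: log_forward(predicted_ss, *params)
        for name, params in models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

With these toy parameters, `recognize_fold("CCHHHHHHHHCCCHHHHHHHCC", TOY_MODELS)[0]` returns `"toy_alpha"`. Comparing raw log-likelihoods across models is a simplifying assumption of this sketch; a real library of fold models might need length or background normalization before the scores are comparable.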
Pages: 131-140
Number of pages: 10