Distill: a suite of web servers for the prediction of one-, two- and three-dimensional structural features of proteins

Cited by: 71
Authors
Bau, Davide [1]
Martin, Alberto J. M. [1]
Mooney, Catherine [1]
Vullo, Alessandro [1]
Walsh, Ian [1]
Pollastri, Gianluca [1]
Affiliations
[1] Univ Coll Dublin, Sch Comp & Informat Sci, Dublin 4, Ireland
DOI
10.1186/1471-2105-7-402
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: We describe Distill, a suite of servers for the prediction of protein structural features: secondary structure; relative solvent accessibility; contact density; backbone structural motifs; residue contact maps at 6, 8 and 12 Å; and coarse protein topology. The servers are based on large-scale ensembles of recursive neural networks and trained on large, up-to-date, nonredundant subsets of the Protein Data Bank (PDB). Together with structural feature predictions, Distill includes a server for prediction of Cα traces for short proteins (up to 200 amino acids). Results: The servers are state-of-the-art, with secondary structure predicted correctly for nearly 80% of residues (currently the top performance on EVA), 2-class solvent accessibility nearly 80% correct, and contact maps exceeding 50% precision on the top non-diagonal contacts. A preliminary implementation of the Cα trace predictor featured among the top 20 Novel Fold predictors at the CASP6 experiment as group Distill (ID 0348). The majority of the servers, including the Cα trace predictor, now take into account homology information from the PDB, when available, resulting in greatly improved reliability. Conclusion: All predictions are freely available through a simple joint web interface, and the results are returned by email. In a single submission the user can send protein sequences totalling up to 32k residues to all or a selection of the servers. Distill is accessible at http://distill.ucd.ie/distill/.
Pages: 8