ParCrys: a Parzen window density estimation approach to protein crystallization propensity prediction

Cited by: 54
Authors
Overton, Ian M. [1 ]
Padovani, Gianandrea [2 ]
Girolami, Mark A. [2 ]
Barton, Geoffrey J. [1 ]
Affiliations
[1] Univ Dundee, Sch Life Sci Res, Dundee DD1 5EH, Scotland
[2] Univ Glasgow, Dept Comp Sci, Glasgow GL12 8QQ, Lanark, Scotland
Funding
Engineering and Physical Sciences Research Council (UK); Biotechnology and Biological Sciences Research Council (UK)
DOI
10.1093/bioinformatics/btn055
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The ability to rank proteins by their likely success in crystallization is useful in current structural biology efforts, and in particular in high-throughput structural genomics initiatives. We present ParCrys, a Parzen window approach to estimating a protein's propensity to produce diffraction-quality crystals. The Protein Data Bank (PDB) provided training data, whilst the databases TargetDB and PepcDB were used to define feature selection data as well as test data independent of feature selection and training. ParCrys outperforms the OB-Score, SECRET and CRYSTALP on the data examined, with accuracy and Matthews correlation coefficient values of 79.1% and 0.582, respectively (74.0% and 0.227, respectively, on data with a real-world ratio of positive:negative examples). ParCrys predictions and associated data are available from www.compbio.dundee.ac.uk/parcrys.
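For readers unfamiliar with the two technical terms in the abstract, the following is a minimal sketch of a Gaussian-kernel Parzen window density estimate and of the accuracy and Matthews correlation coefficient (MCC) metrics. It is not the authors' implementation: the function names, the Gaussian kernel, the bandwidth `h` and the toy two-dimensional feature vectors are all assumptions made for illustration, and the record does not state which features or kernel ParCrys uses.

```python
import math

def parzen_density(x, samples, h):
    """Gaussian-kernel Parzen window estimate of the density at point x.

    x and each entry of samples are equal-length tuples of floats;
    h is the kernel bandwidth (a hypothetical, hand-picked value here).
    """
    d = len(x)
    norm = (2.0 * math.pi * h * h) ** (d / 2.0)  # Gaussian normalising constant
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * h * h))
    return total / (len(samples) * norm)

def accuracy(tp, tn, fp, fn):
    """Percentage of correct predictions from confusion-matrix counts."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy, made-up feature vectors standing in for "crystallizable" training proteins.
positives = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2)]
near = parzen_density((0.05, 0.05), positives, h=0.5)  # close to the training data
far = parzen_density((5.0, 5.0), positives, h=0.5)     # far from the training data
print(near > far)
```

A query point close to the positive training examples receives a higher estimated density than a distant one, which is the sense in which a density estimate can be read as a propensity score. MCC is reported alongside accuracy because accuracy alone can look favourable on data with an uneven positive:negative ratio.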
Pages: 901-907
Page count: 7