ParCrys: a Parzen window density estimation approach to protein crystallization propensity prediction

Cited by: 54
Authors
Overton, Ian M. [1]
Padovani, Gianandrea [2]
Girolami, Mark A. [2]
Barton, Geoffrey J. [1]
Affiliations
[1] Univ Dundee, Sch Life Sci Res, Dundee DD1 5EH, Scotland
[2] Univ Glasgow, Dept Comp Sci, Glasgow G12 8QQ, Lanark, Scotland
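Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC); UK Engineering and Physical Sciences Research Council (EPSRC)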
DOI
10.1093/bioinformatics/btn055
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The ability to rank proteins by their likely success in crystallization is useful in current structural biology efforts, and in particular in high-throughput structural genomics initiatives. We present ParCrys, a Parzen window approach to estimating a protein's propensity to produce diffraction-quality crystals. The Protein Data Bank (PDB) provided training data, whilst the databases TargetDB and PepcDB were used to define feature selection data as well as test data independent of feature selection and training. ParCrys outperforms the OB-Score, SECRET and CRYSTALP on the data examined, with accuracy and Matthews correlation coefficient values of 79.1% and 0.582, respectively (74.0% and 0.227, respectively, on data with a real-world ratio of positive:negative examples). ParCrys predictions and associated data are available from www.compbio.dundee.ac.uk/parcrys.
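The two techniques named in the abstract, Parzen window density estimation and the Matthews correlation coefficient (MCC), can be illustrated with a minimal sketch. The code below is not the ParCrys implementation: the Gaussian kernel, the one-dimensional feature, the bandwidth `h`, the `propensity` scoring rule (a ratio of class-conditional density estimates) and all numbers are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the window function (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def parzen_density(x, samples, h):
    """1-D Parzen window estimate: p(x) = 1/(n*h) * sum_i K((x - x_i) / h)."""
    return sum(gaussian_kernel((x - s) / h) for s in samples) / (len(samples) * h)

def propensity(x, positives, negatives, h):
    """Hypothetical score in [0, 1]: share of the positive-class density at x."""
    p_pos = parzen_density(x, positives, h)
    p_neg = parzen_density(x, negatives, h)
    total = p_pos + p_neg
    return 0.5 if total == 0 else p_pos / total

def accuracy(tp, tn, fp, fn):
    """Fraction of all examples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0 when the denominator is 0."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Made-up 1-D feature values for the two classes (not data from the paper).
crystallized = [0.9, 1.1, 1.3, 1.0]
not_crystallized = [-1.2, -0.8, -1.0, -0.6]
score = propensity(1.0, crystallized, not_crystallized, h=0.5)

# Made-up confusion matrices with the same per-class error rates (80% correct
# in each class) but different class ratios: 1:1 versus 1:9.
balanced = dict(tp=40, tn=40, fp=10, fn=10)
imbalanced = dict(tp=80, tn=720, fp=180, fn=20)
print(score, accuracy(**balanced), mcc(**balanced))
print(accuracy(**imbalanced), mcc(**imbalanced))
```

In this toy comparison the accuracy is the same for both class ratios while the MCC is lower on the imbalanced set, which is the general kind of effect to keep in mind when reading the abstract's separate figures for data with a real-world ratio of positive:negative examples.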
Pages: 901-907
Number of pages: 7