SHIFTX2: significantly improved protein chemical shift prediction

Cited by: 509
Authors
Han, Beomsoo [1]
Liu, Yifeng [1]
Ginzinger, Simon W. [4]
Wishart, David S. [1, 2, 3]
Affiliations
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
[2] Univ Alberta, Dept Biol Sci, Edmonton, AB, Canada
[3] CNR, NINT, Edmonton, AB T6G 2E8, Canada
[4] Salzburg Univ, Dept Mol Biol, Div Bioinformat, Ctr Appl Mol Engn, A-5020 Salzburg, Austria
Funding
Natural Sciences and Engineering Research Council of Canada; Austrian Science Fund
Keywords
NMR; Protein; Chemical shift; Machine learning; WEB SERVER; STRUCTURE GENERATION; SECONDARY-STRUCTURE; MAGNETIC-RESONANCE; C-ALPHA; C-13; N-15; H-1; C-13(ALPHA)
DOI
10.1007/s10858-011-9478-4
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
A new computer program, called SHIFTX2, is described which is capable of rapidly and accurately calculating diamagnetic ¹H, ¹³C and ¹⁵N chemical shifts from protein coordinate data. Compared to its predecessor (SHIFTX) and to other existing protein chemical shift prediction programs, SHIFTX2 is substantially more accurate (up to 26% better by correlation coefficient with an RMS error that is up to 3.3× smaller) than the next best performing program. It also provides significantly more coverage (up to 10% more), is significantly faster (up to 8.5×) and capable of calculating a wider variety of backbone and side chain chemical shifts (up to 6×) than many other shift predictors. In particular, SHIFTX2 is able to attain correlation coefficients between experimentally observed and predicted backbone chemical shifts of 0.9800 (¹⁵N), 0.9959 (¹³Cα), 0.9992 (¹³Cβ), 0.9676 (¹³C′), 0.9714 (¹HN), 0.9744 (¹Hα) and RMS errors of 1.1169, 0.4412, 0.5163, 0.5330, 0.1711, and 0.1231 ppm, respectively. The correlation between SHIFTX2's predicted and observed side chain chemical shifts is 0.9787 (¹³C) and 0.9482 (¹H) with RMS errors of 0.9754 and 0.1723 ppm, respectively. SHIFTX2 is able to achieve such a high level of accuracy by using a large, high quality database of training proteins (>190), by utilizing advanced machine learning techniques, by incorporating many more features (χ2 and χ3 angles, solvent accessibility, H-bond geometry, pH, temperature), and by combining sequence-based with structure-based chemical shift prediction techniques. With this substantial improvement in accuracy we believe that SHIFTX2 will open the door to many long-anticipated applications of chemical shift prediction to protein structure determination, refinement and validation. SHIFTX2 is available both as a standalone program and as a web server (http://www.shiftx2.ca).
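The abstract reports accuracy as a correlation coefficient and an RMS error between observed and predicted shifts. The sketch below shows how those two figures are commonly computed for one nucleus type. It assumes the correlation coefficient is Pearson's r, which the abstract does not state explicitly. The function names and the sample shift values are hypothetical illustrations, not SHIFTX2 code or data from the paper.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (ppm here)."""
    n = len(observed)
    if n != len(predicted) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


if __name__ == "__main__":
    # Made-up example values in ppm, for illustration only.
    observed_ca = [58.2, 56.1, 62.4, 54.9]
    predicted_ca = [58.0, 56.5, 62.1, 55.3]
    print(f"r = {pearson_r(observed_ca, predicted_ca):.4f}")
    print(f"RMSE = {rmse(observed_ca, predicted_ca):.4f} ppm")
```

Both statistics are computed per nucleus type, which is why the abstract lists one correlation coefficient and one RMS error for each backbone atom.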
Pages: 43-57
Page count: 15