ProTstab - predictor for cellular protein stability

Cited by: 25
Authors
Yang, Yang [1, 2, 3]
Ding, Xuesong [1]
Zhu, Guanchen [1]
Niroula, Abhishek [2]
Lv, Qiang [1]
Vihinen, Mauno [2]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China
[2] Lund Univ, Dept Expt Med Sci, BMC B13, Lund, Sweden
[3] Soochow Univ, Prov Key Lab Comp Informat Proc Technol, Suzhou, Peoples R China
Funding
Swedish Research Council;
Keywords
Protein stability; Prediction; Machine learning; Proteome properties; MUTATION; FLEXIBILITY; ENERGETICS; SEQUENCE; SERVER;
DOI
10.1186/s12864-019-6138-7
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Stability is one of the most fundamental intrinsic characteristics of proteins and can be determined with various methods. Characterization of protein properties has not kept pace with the increase in new sequence data, and therefore even basic properties are unknown for the vast majority of identified proteins. There have been some attempts to develop predictors for protein stabilities; however, they have suffered from small numbers of known examples.
Results: We took advantage of results from a recently developed cellular stability method, which is based on limited proteolysis and mass spectrometry, and developed a machine learning method using gradient boosting of regression trees. The ProTstab method has high performance and is well suited for large-scale prediction of protein stabilities.
Conclusions: The Pearson's correlation coefficient was 0.793 in 10-fold cross-validation and 0.763 in an independent blind test. The corresponding values for mean absolute error were 0.024 and 0.036, respectively. Comparison with a previously published method indicated that ProTstab has superior performance. We used the method to predict the stabilities of all the remaining proteins in the entire human proteome and then correlated the predicted stabilities with the protein chain lengths of isoforms and with the localizations of proteins.
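The two performance measures quoted in the abstract, Pearson's correlation coefficient and mean absolute error, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the numbers below are hypothetical, chosen only to show how the two metrics are computed from measured and predicted values.

```python
import math


def pearson_r(measured, predicted):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel in the ratio).
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_m * var_p)


def mean_absolute_error(measured, predicted):
    """Average of the absolute differences between measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two non-empty, equal-length sequences")
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


if __name__ == "__main__":
    # Hypothetical values for illustration only; they are not data from the paper.
    measured = [0.40, 0.45, 0.50, 0.55, 0.60]
    predicted = [0.42, 0.44, 0.53, 0.54, 0.63]
    print("Pearson r:", pearson_r(measured, predicted))
    print("MAE:", mean_absolute_error(measured, predicted))
```

In a 10-fold cross-validation, metrics like these would typically be computed on the held-out fold of each split and then summarized; the abstract does not say how the authors aggregated them.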
Pages: 9