Solubility-Weighted Index: fast and accurate prediction of protein solubility

Cited by: 51
Authors
Bhandari, Bikash K. [1]
Gardner, Paul P. [1,2]
Lim, Chun Shen [1]
Affiliations
[1] Univ Otago, Sch Biomed Sci, Dept Biochem, Dunedin, New Zealand
[2] Univ Canterbury, Biomol Interact Ctr, Christchurch, New Zealand
Keywords
PRODUCTION PLATFORM; FLEXIBILITY; EXPRESSION; PYTHON; TOOL; CONSTRAINTS; EVOLUTION; PEPTIDE; SURFACE
DOI
10.1093/bioinformatics/btaa578
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Recombinant protein production is a widely used technique in the biotechnology and biomedical industries, yet only a quarter of target proteins are soluble and can therefore be purified. Results: We have discovered that global structural flexibility, which can be modeled by normalized B-factors, accurately predicts the solubility of 12 216 recombinant proteins expressed in Escherichia coli. We have optimized these B-factors, and derived a new set of values for solubility scoring that further improves prediction accuracy. We call this new predictor the 'Solubility-Weighted Index' (SWI). Importantly, SWI outperforms many existing protein solubility prediction tools. Furthermore, we have developed 'SoDoPE' (Soluble Domain for Protein Expression), a web interface that allows users to choose a protein region of interest for predicting and maximizing both protein expression and solubility.
Pages: 4691-4698
Page count: 8
Related papers
76 items in total
[1]   Robotic cloning and Protein Production Platform of the Northeast Structural Genomics Consortium [J].
Acton, TB ;
Gunsalus, KC ;
Xiao, R ;
Ma, LC ;
Aramini, J ;
Baran, MC ;
Chiang, YW ;
Climent, T ;
Cooper, B ;
Denissova, NG ;
Douglas, SM ;
Everett, JK ;
Ho, CK ;
Macapagal, D ;
Rajan, PK ;
Shastry, R ;
Shih, LY ;
Swapna, GVT ;
Wilson, M ;
Wu, M ;
Gerstein, M ;
Inouye, M ;
Hunt, JF ;
Montelione, GT .
NUCLEAR MAGNETIC RESONANCE OF BIOLOGICAL MACROMOLECULES, PART C, 2005, 394 :210-243
[2]   ccSOL omics: a webserver for solubility prediction of endogenous and heterologous expression in Escherichia coli [J].
Agostini, Federico ;
Cirillo, Davide ;
Maria Livi, Carmen ;
Delli Ponti, Riccardo ;
Gaetano Tartaglia, Gian .
BIOINFORMATICS, 2014, 30 (20) :2975-2977
[3]  
[Anonymous], DATA OBJECT, DOI 10.5281/ZENODO.1313201
[4]  
Åslund F, 1999, J BACTERIOL, V181, P1375
[5]  
Bhandari BK., 2019, HIGHLY ACCESSIBLE TR
[6]  
BHASKARAN R, 1988, INT J PEPT PROT RES, V32, P241
[7]   Learning to predict expression efficacy of vectors in recombinant protein production [J].
Chan, Wen-Ching ;
Liang, Po-Huang ;
Shih, Yan-Ping ;
Yang, Ueng-Cheng ;
Lin, Wen-chang ;
Hsu, Chun-Nan .
BMC BIOINFORMATICS, 2010, 11
[8]   TargetDB: a target registration database for structural genomics projects [J].
Chen, L ;
Oughtred, R ;
Berman, HM ;
Westbrook, J .
BIOINFORMATICS, 2004, 20 (16) :2860-2862
[9]   Rationalization of the effects of mutations on peptide and protein aggregation rates [J].
Chiti, F ;
Stefani, M ;
Taddei, N ;
Ramponi, G ;
Dobson, CM .
NATURE, 2003, 424 (6950) :805-808
[10]   Biopython: freely available Python tools for computational molecular biology and bioinformatics [J].
Cock, Peter J. A. ;
Antao, Tiago ;
Chang, Jeffrey T. ;
Chapman, Brad A. ;
Cox, Cymon J. ;
Dalke, Andrew ;
Friedberg, Iddo ;
Hamelryck, Thomas ;
Kauff, Frank ;
Wilczynski, Bartek ;
de Hoon, Michiel J. L. .
BIOINFORMATICS, 2009, 25 (11) :1422-1423