Prediction of Protein Solubility in E. coli

Cited by: 0
Authors
Samak, Taghrid [1]
Gunter, Dan [1]
Wang, Zhong [1]
Affiliation
[1] Univ Calif Berkeley, Lawrence Berkeley Natl Lab, Berkeley, CA 94720 USA
Keywords
SEQUENCE-BASED PREDICTION; MACHINE-BASED METHOD; OVEREXPRESSION; PROPENSITY;
DOI
Not available
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Gene synthesis is a key step in converting digitally predicted proteins into functional proteins. However, it is a relatively expensive and labor-intensive process. About 30-50% of the synthesized proteins are not soluble, which further reduces the efficacy of gene synthesis as a method for protein function characterization. Solubility prediction from primary protein sequences holds the promise of dramatically reducing the cost of gene synthesis. This work presents a framework that creates models of solubility from sequence information. Sequence features derived from the primary protein sequences of the genes to be synthesized can be used to build computational models of solubility. In this way, biologists can focus their effort on synthesizing genes that are highly likely to generate soluble proteins. We have developed a framework that employs several machine learning algorithms to model protein solubility. The framework is used to predict protein solubility in the Escherichia coli expression system. The analysis is performed on over 1,600 quantified proteins. The approach predicted solubility with more than 80% accuracy and enabled in-depth analysis of the most important features affecting solubility. The analysis pipeline is general and can be applied to any set of sequence features to predict any binary measure. The framework also provides the biologist with a comprehensive comparison between different learning algorithms and an insightful feature analysis.
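The general idea in the abstract, turning a primary sequence into numeric features and fitting a binary classifier, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not say which features or algorithms the authors used, so the two features (hydrophobic and charged residue fractions), the hand-written logistic regression, and the four toy sequences with their labels are hypothetical and are not data or results from the paper.

```python
import math

# Assumed residue groupings, chosen only for illustration.
HYDROPHOBIC = set("AVILMFWC")
CHARGED = set("DEKR")


def sequence_features(seq):
    """Return [hydrophobic fraction, charged fraction] for a protein sequence."""
    seq = seq.upper()
    n = len(seq)
    hydrophobic = sum(aa in HYDROPHOBIC for aa in seq) / n
    charged = sum(aa in CHARGED for aa in seq) / n
    return [hydrophobic, charged]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit a logistic regression with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(model, x):
    """Return 1 (predicted soluble) or 0 (predicted insoluble)."""
    w, b = model
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def accuracy(model, X, y):
    return sum(predict(model, xi) == yi for xi, yi in zip(X, y)) / len(y)


# Hypothetical toy sequences and labels (1 = soluble, 0 = insoluble).
TOY_DATA = [
    ("DEKRDEKRGS", 1),
    ("KKEEDDRRST", 1),
    ("AVILMFWAVI", 0),
    ("LLIIVVFFGA", 0),
]

X = [sequence_features(seq) for seq, _ in TOY_DATA]
y = [label for _, label in TOY_DATA]
model = train_logistic(X, y)
train_acc = accuracy(model, X, y)
```

Because any classifier that maps a feature vector to a 0/1 label can replace `train_logistic` and `predict`, the same skeleton could be used to compare several algorithms on one feature table. Accuracy on the training data, as computed here, says nothing about generalization; a real evaluation would use held-out data or cross-validation.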
Pages: 8