OPT: Codon optimize gene sequences for E. coli protein overexpression

Times cited: 0
Authors
Wong, Daniel P. H. [1]
Wong, Kam-Ho [2]
Park, Sunjae [2]
Boel, Gregory [3]
Hunt, John F. [2]
Aalberts, Daniel P. [1]
Affiliations
[1] Williams Coll, Dept Phys, Williamstown, MA 01267 USA
[2] Columbia Univ, Dept Biol Sci, New York, NY 10027 USA
[3] Univ Paris Cite, CNRS, Inst Biol Physicochim, Express Genet Microbienne, F-75005 Paris, France
Funding
National Institutes of Health (USA);
Keywords
protein production; synonymous codons; ESCHERICHIA-COLI; RNA-POLYMERASE; USAGE BIAS; EXPRESSION; PURIFICATION; EFFICIENCY;
DOI
10.1016/j.jmb.2025.168965
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The ability to overexpress proteins is valuable for biotechnology, but not all sequences are compatible with high yield. We previously analyzed the sequence features and mRNA folding stability of a large data set of 6,384 distinct gene constructs and developed a model for protein yield. Our OPT.williams.edu server (1) predicts the probability that an input sequence will produce protein at a high level when overexpressed in E. coli, and (2) returns optimized synonymous sequences designed to boost protein expression. Here we also present experimental evidence of the high yields of our OPT constructs for eight commercially produced proteins. © 2025 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
Pages: 8