Computational codon optimization of synthetic gene for protein expression

Cited by: 52
Authors
Chung, Bevan Kai-Sheng [1, 2, 3]
Lee, Dong-Yup [1, 2, 3]
Affiliations
[1] Natl Univ Singapore, Dept Chem & Biomol Engn, Singapore 117576, Singapore
[2] Natl Univ Singapore, NUS Grad Sch Integrat Sci & Engn, Singapore 117456, Singapore
[3] ASTAR, Bioproc Technol Inst, Singapore 138668, Singapore
Keywords
ESCHERICHIA-COLI; LACTOCOCCUS-LACTIS; MESSENGER-RNA; TRANSCRIPTIONAL REGULATION; OLIGONUCLEOTIDE DESIGN; GENOME SEQUENCE; PICHIA-PASTORIS; DNA-SEQUENCES; USAGE; TRANSLATION;
DOI
10.1186/1752-0509-6-134
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Background: The construction of customized nucleic acid sequences allows greater flexibility in gene design for recombinant protein expression. Among the various parameters considered for such DNA sequence design, individual codon usage (ICU) has been implicated as one of the most crucial factors affecting mRNA translational efficiency. However, previous studies have also reported a significant influence of codon pair usage, also known as codon context (CC), on the level of protein expression.
Results: In this study, we have developed novel computational procedures for evaluating the relative importance of optimizing ICU and CC for enhancing protein expression. By formulating appropriate mathematical expressions to quantify the ICU and CC fitness of a coding sequence, optimization procedures based on a genetic algorithm were employed to maximize its ICU and/or CC fitness. Surprisingly, the in silico validation of the resultant optimized DNA sequences for Escherichia coli, Lactococcus lactis, Pichia pastoris and Saccharomyces cerevisiae suggests that CC is a more relevant design criterion than the commonly considered ICU.
Conclusions: The proposed CC optimization framework can complement and enhance the capabilities of current gene design tools, with potential applications to heterologous protein production and even vaccine development in synthetic biotechnology.
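To make the two quantities named in the abstract concrete, the following is a minimal Python sketch that tabulates individual codon frequencies and adjacent codon-pair frequencies for a coding sequence. It is only an illustration: the abstract does not give the authors' fitness expressions, so the function names (`split_codons`, `codon_usage`, `codon_pair_usage`) and the use of raw relative frequencies are assumptions, not the published method.

```python
from collections import Counter


def split_codons(cds):
    """Split a coding sequence into consecutive, non-overlapping codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def codon_usage(cds):
    """Relative frequency of each codon (assumed stand-in for ICU)."""
    codons = split_codons(cds)
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


def codon_pair_usage(cds):
    """Relative frequency of each adjacent codon pair (assumed stand-in for CC)."""
    codons = split_codons(cds)
    pairs = Counter(zip(codons, codons[1:]))
    total = len(codons) - 1
    if total < 1:
        return {}
    return {pair: n / total for pair, n in pairs.items()}


if __name__ == "__main__":
    example = "ATGGCTGCTTAA"  # hypothetical toy sequence: ATG GCT GCT TAA
    print(codon_usage(example))
    print(codon_pair_usage(example))
```

A fitness score in the spirit of the abstract would compare such distributions against those of highly expressed host genes; how that comparison is defined is left open here because the record does not specify it.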
Pages: 14