Machine learning-assisted directed protein evolution with combinatorial libraries

Cited by: 357
Authors
Wu, Zachary [1]
Kan, S. B. Jennifer [1]
Lewis, Russell D. [2]
Wittmann, Bruce J. [2]
Arnold, Frances H. [1, 2]
Affiliations
[1] CALTECH, Div Chem & Chem Engn, Pasadena, CA 91125 USA
[2] CALTECH, Div Biol & Bioengn, Pasadena, CA 91125 USA
Funding
National Science Foundation (USA);
Keywords
protein engineering; machine learning; directed evolution; enzyme; catalysis; FITNESS LANDSCAPE; OPTIMIZATION; SILICON;
DOI
10.1073/pnas.1901979116
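Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract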
To reduce experimental effort associated with directed protein evolution and to explore the sequence space encoded by mutating multiple positions simultaneously, we incorporate machine learning into the directed evolution workflow. Combinatorial sequence space can be quite expensive to sample experimentally, but machine-learning models trained on tested variants provide a fast method for testing sequence space computationally. We validated this approach on a large published empirical fitness landscape for human GB1 binding protein, demonstrating that machine learning-guided directed evolution finds variants with higher fitness than those found by other directed evolution approaches. We then provide an example application in evolving an enzyme to produce each of the two possible product enantiomers (i.e., stereodivergence) of a new-to-nature carbene Si-H insertion reaction. The approach predicted libraries enriched in functional enzymes and fixed seven mutations in two rounds of evolution to identify variants for selective catalysis with 93% and 79% ee (enantiomeric excess). By greatly increasing throughput with in silico modeling, machine learning enhances the quality and diversity of sequence solutions for a protein engineering problem.
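The workflow summarized in the abstract (measure a sample of variants, train a model on them, rank the untested combinatorial space in silico, then test only the top predictions) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the reduced four-letter alphabet, the synthetic landscape built by `make_toy_landscape`, the simple site-wise averaging model that stands in for the paper's machine-learning models, and all function names and parameter values.

```python
import itertools
import random

ALPHABET = "ACDE"  # hypothetical reduced alphabet, keeps the toy space at 4**4 = 256
N_SITES = 4        # number of positions mutated simultaneously


def make_toy_landscape(seed=0):
    """Build a synthetic fitness function (assumption, not real data)."""
    rng = random.Random(seed)
    site_effect = {(i, aa): rng.random()
                   for i in range(N_SITES) for aa in ALPHABET}
    # One pairwise term between sites 0 and 1 adds a little epistasis.
    pair_effect = {(a, b): rng.uniform(-0.3, 0.3)
                   for a in ALPHABET for b in ALPHABET}

    def fitness(variant):
        additive = sum(site_effect[(i, aa)] for i, aa in enumerate(variant))
        return additive + pair_effect[(variant[0], variant[1])]

    return fitness


def fit_sitewise_model(train):
    """Mean measured fitness of the training variants carrying each (site, residue)."""
    sums, counts = {}, {}
    for variant, y in train.items():
        for key in enumerate(variant):  # key is (site index, residue)
            sums[key] = sums.get(key, 0.0) + y
            counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def predict(model, variant, default):
    """Score a variant; unseen (site, residue) pairs fall back to `default`."""
    return sum(model.get(key, default) for key in enumerate(variant))


def ml_guided_round(fitness, n_train=40, n_test=10, seed=1):
    """One round: random training sample, in silico ranking, test top predictions."""
    rng = random.Random(seed)
    space = ["".join(v) for v in itertools.product(ALPHABET, repeat=N_SITES)]
    train = {v: fitness(v) for v in rng.sample(space, n_train)}
    model = fit_sitewise_model(train)
    default = sum(train.values()) / len(train)
    untested = [v for v in space if v not in train]
    ranked = sorted(untested, key=lambda v: predict(model, v, default),
                    reverse=True)
    tested = {v: fitness(v) for v in ranked[:n_test]}  # the "experimental" step
    results = {**train, **tested}
    best = max(results, key=results.get)
    return best, results


if __name__ == "__main__":
    best, results = ml_guided_round(make_toy_landscape())
    print(best, round(results[best], 3), "from", len(results), "measurements")
```

In this sketch only 50 of the 256 possible variants are ever "measured"; the model's ranking decides which 10 of the untested variants are worth the second batch of measurements.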
Pages: 8852-8858
Page count: 7
Related papers
47 in total
  • [1] CADEE: Computer-Aided Directed Evolution of Enzymes
    Amrein, Beat Anton
    Steffen-Munsberg, Fabian
    Szeler, Ireneusz
    Purg, Miha
    Kulkarni, Yashraj
    Kamerlin, Shina Caroline Lynn
    [J]. IUCRJ, 2017, 4 : 50 - 64
  • [2] [Anonymous], ARXIV171201815V1
  • [3] Apweiler R, 2004, NUCLEIC ACIDS RES, V32, pD115, DOI [10.1093/nar/gkw1099, 10.1093/nar/gkh131]
  • [4] Improved Descriptors for the Quantitative Structure-Activity Relationship Modeling of Peptides and Proteins
    Barley, Mark H.
    Turner, Nicholas J.
    Goodacre, Royston
    [J]. JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2018, 58 (02) : 234 - 243
  • [5] Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein
    Bershtein, Shimon
    Segal, Michal
    Bekerman, Roy
    Tokuriki, Nobuhiko
    Tawfik, Dan S.
    [J]. NATURE, 2006, 444 (7121) : 929 - 932
  • [6] Protein stability promotes evolvability
    Bloom, JD
    Labthavikul, ST
    Otey, CR
    Arnold, FH
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2006, 103 (15) : 5869 - 5874
  • [7] Mathematical expressions useful in the construction, description and evaluation of protein libraries
    Bosley, AD
    Ostermeier, M
    [J]. BIOMOLECULAR ENGINEERING, 2005, 22 (1-3) : 57 - 61
  • [8] Brookes DH, 2018, ARXIV181003714V3
  • [9] A machine learning approach for reliable prediction of amino acid interactions and its application in the directed evolution of enantioselective enzymes
    Cadet, Frederic
    Fontaine, Nicolas
    Li, Guangyue
    Sanchis, Joaquin
    Chong, Matthieu Ng Fuk
    Pandjaitan, Rudy
    Vetrivel, Iyanar
    Offmann, Bernard
    Reetz, Manfred T.
    [J]. SCIENTIFIC REPORTS, 2018, 8
  • [10] Kinetic Characterization of 100 Glycoside Hydrolase Mutants Enables the Discovery of Structural Features Correlated with Kinetic Constants
    Carlin, Dylan Alexander
    Caster, Ryan W.
    Wang, Xiaokang
    Betzenderfer, Stephanie A.
    Chen, Claire X.
    Duong, Veasna M.
    Ryklansky, Carolina V.
    Alpekin, Alp
    Beaumont, Nathan
    Kapoor, Harshul
    Kim, Nicole
    Mohabbot, Hosna
    Pang, Boyu
    Teel, Rachel
    Whithaus, Lillian
    Tagkopoulos, Ilias
    Siegel, Justin B.
    [J]. PLOS ONE, 2016, 11 (01):