Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering

Cited by: 77
Authors
Yang, Jason [1]
Li, Francesca-Zhoufan [2]
Arnold, Frances H. [1,2]
Affiliations
[1] CALTECH, Div Chem & Chem Engn, Pasadena, CA 91125 USA
[2] CALTECH, Div Biol & Biol Engn, Pasadena, CA 91125 USA
Funding
National Science Foundation (USA);
Keywords
PROTEIN DESIGN; FITNESS LANDSCAPES; PREDICTION; EVOLUTION; LANGUAGE; RECONSTRUCTION; EPISTASIS; MODEL; UNCERTAINTY; DISCOVERY;
DOI
10.1021/acscentsci.3c01275
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Enzymes can be engineered at the level of their amino acid sequences to optimize key properties such as expression, stability, substrate range, and catalytic efficiency, or even to unlock new catalytic activities not found in nature. Because the search space of possible proteins is vast, enzyme engineering usually involves discovering an enzyme starting point that has some level of the desired activity, followed by directed evolution to improve its "fitness" for a desired application. Recently, machine learning (ML) has emerged as a powerful tool to complement this empirical process. ML models can contribute to (1) starting point discovery, by functionally annotating known protein sequences or generating novel protein sequences with desired functions, and (2) navigating protein fitness landscapes for fitness optimization, by learning mappings between protein sequences and their associated fitness values. In this Outlook, we explain how ML complements enzyme engineering and discuss its future potential to unlock improved engineering outcomes.
Pages: 226-241
Page count: 16