Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering

Cited by: 77
Authors
Yang, Jason [1]
Li, Francesca-Zhoufan [2]
Arnold, Frances H. [1, 2]
Affiliations
[1] CALTECH, Div Chem & Chem Engn, Pasadena, CA 91125 USA
[2] CALTECH, Div Biol & Biol Engn, Pasadena, CA 91125 USA
Funding
National Science Foundation (USA);
Keywords
PROTEIN DESIGN; FITNESS LANDSCAPES; PREDICTION; EVOLUTION; LANGUAGE; RECONSTRUCTION; EPISTASIS; MODEL; UNCERTAINTY; DISCOVERY;
DOI
10.1021/acscentsci.3c01275
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Enzymes can be engineered at the level of their amino acid sequences to optimize key properties such as expression, stability, substrate range, and catalytic efficiency, or even to unlock new catalytic activities not found in nature. Because the search space of possible proteins is vast, enzyme engineering usually involves discovering an enzyme starting point that has some level of the desired activity, followed by directed evolution to improve its "fitness" for a desired application. Recently, machine learning (ML) has emerged as a powerful tool to complement this empirical process. ML models can contribute to (1) starting point discovery, by functionally annotating known protein sequences or generating novel protein sequences with desired functions, and (2) navigating protein fitness landscapes for fitness optimization, by learning mappings between protein sequences and their associated fitness values. In this Outlook, we explain how ML complements enzyme engineering and discuss its future potential to unlock improved engineering outcomes.
Pages: 226-241
Page count: 16