Learning deep representations of enzyme thermal adaptation

Cited by: 20
Authors
Li, Gang [1]
Buric, Filip [1, 2]
Zrimec, Jan [1]
Viknander, Sandra [1]
Nielsen, Jens [1, 3, 4]
Zelezniak, Aleksej [5]
Engqvist, Martin K. M. [1, 6, 7]
Affiliations
[1] Chalmers Univ Technol, Dept Biol & Biol Engn, Gothenburg, Sweden
[2] Natl Inst Biol, Dept Biotechnol & Syst Biol, Ljubljana, Slovenia
[3] BioInnovat Inst, Copenhagen, Denmark
[4] Vilnius Univ, Inst Biotechnol, Life Sci Ctr, Vilnius, Lithuania
[5] Kings Coll London, Randall Ctr Cell & Mol Biophys, New Hunts House Guys Campus, London SE1 1UL, England
[6] Enginzyme AB, Stockholm, Sweden
[7] Chalmers Univ Technol, Dept Biol & Biol Engn, Kemivagen 10, SE-41296 Gothenburg, Sweden
Funding
EU Horizon 2020; Swedish Research Council
Keywords
bioinformatics; deep neural networks; enzyme catalytic temperatures; optimal growth temperatures; protein thermostability; transfer learning; PROTEIN; PROKARYOTES; BIOLOGY; GROWTH;
DOI
10.1002/pro.4480
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Temperature is a fundamental environmental factor that shapes the evolution of organisms. Learning thermal determinants of protein sequences in evolution thus has profound significance for basic biology, drug discovery, and protein engineering. Here, we use a data set of over 3 million BRENDA enzymes labeled with optimal growth temperatures (OGTs) of their source organisms to train a deep neural network model (DeepET). The protein-temperature representations learned by DeepET provide a temperature-related statistical summary of protein sequences and capture structural properties that affect thermal stability. For prediction of enzyme optimal catalytic temperatures and protein melting temperatures via a transfer learning approach, our DeepET model outperforms classical regression models trained on rationally designed features and other deep-learning-based representations. DeepET thus holds promise for understanding enzyme thermal adaptation and guiding the engineering of thermostable enzymes.
Pages: 14