Reproducing Reaction Mechanisms with Machine-Learning Models Trained on a Large-Scale Mechanistic Dataset

Cited by: 0
Authors
Joung, Joonyoung F. [1]
Fong, Mun Hong [1]
Roh, Jihye [1]
Tu, Zhengkai [2]
Bradshaw, John [1]
Coley, Connor W. [1,2]
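Affiliations
[1] Massachusetts Institute of Technology, Department of Chemical Engineering, Cambridge, MA 02139, USA
[2] Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, Cambridge, MA 02139, USA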
Funding
US National Science Foundation
Keywords
Machine learning; Reaction outcome prediction; Reaction mechanisms; Organic chemistry; HYPERSPHERE SEARCH METHOD; ELASTIC BAND METHOD; AUTOMATED DISCOVERY; CHEMICAL-REACTIONS; REACTION PATHWAYS; PREDICTION; EXPLORATION; GENERATION; CHEMISTRY; NETWORKS;
DOI
10.1002/anie.202411296
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Mechanistic understanding of organic reactions can facilitate reaction development, impurity prediction, and, in principle, reaction discovery. While several machine learning models have sought to address the task of predicting reaction products, their extension to predicting reaction mechanisms has been impeded by the lack of a corresponding mechanistic dataset. In this study, we construct such a dataset by imputing intermediates between experimentally reported reactants and products using expert reaction templates, and we train several machine learning models on the resulting dataset of 5,184,184 elementary steps. We explore the performance and capabilities of these models, focusing on their ability to predict reaction pathways and recapitulate the roles of catalysts and reagents. Additionally, we demonstrate the potential of mechanistic models in predicting impurities, which are often overlooked by conventional models. We conclude by evaluating the generalizability of mechanistic models to new reaction types, revealing challenges related to dataset diversity, consecutive predictions, and violations of atom conservation. Machine learning models trained on mechanistic datasets created using expert reaction templates demonstrate the ability to successfully predict known reaction mechanisms. This study illustrates how such mechanistic models can explain how reaction outcomes are produced, recapitulate the roles of catalysts and reagents, and suggest potential side products and impurities.
Pages: 10