Predicting chemical reaction outcomes: A grammar ontology-based transformer framework

Cited by: 37
Authors
Mann, Vipul [1]
Venkatasubramanian, Venkat [1]
Affiliation
[1] Columbia Univ, Dept Chem Engn, New York, NY 10027 USA
Keywords
artificial intelligence; computational chemistry (reaction modeling); context-free grammar; computational chemistry (organic reactions); neural machine translation; DESIGN; MODELS
DOI
10.1002/aic.17190
Chinese Library Classification (CLC) number
TQ [Chemical Industry]
Discipline classification code
0817
Abstract
Discovering and designing novel materials is a challenging problem as it often requires searching through a combinatorially large space of potential candidates, typically requiring great amounts of effort, time, expertise, and money. The ability to predict reaction outcomes without performing extensive experiments is, therefore, important. Toward that goal, we report an approach that uses context-free grammar-based representations of molecules in a neural machine translation framework. This involves discovering the transformations from the source sequence (comprising the reactants and agents) to the target sequence (comprising the major product) in the reaction. The grammar ontology-based representation hierarchically incorporates rich molecular-structure information, ensures syntactic validity of predictions, and overcomes over-parameterization in complex machine learning architectures. We achieve an accuracy of 80.1% (86.3% top-2 accuracy) and 99% syntactic validity of predictions on a standard reaction dataset. Moreover, our model is characterized by only a fraction of the number of training parameters used in other similar works in this area.
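The grammar-based representation described in the abstract can be illustrated with a minimal sketch. The toy context-free grammar below is an assumption made purely for illustration: it covers only the atoms C, N, O, double and triple bonds, and parenthesized branches, and it is not the grammar used in the paper. The function `to_rules` (a hypothetical name) parses a SMILES-like string by recursive descent and returns the sequence of production rules applied, which is the kind of sequence a translation model could consume instead of raw characters.

```python
# Toy grammar (illustrative assumption, not the paper's grammar):
#   chain  -> atom tail
#   tail   -> bond chain | '(' branch ')' tail | chain | EPS
#   branch -> bond chain | chain
#   atom   -> 'C' | 'N' | 'O'
#   bond   -> '=' | '#'
ATOMS = {"C", "N", "O"}
BONDS = {"=", "#"}


def to_rules(smiles):
    """Return the leftmost-derivation rule sequence for a toy SMILES string."""
    rules = []
    pos = 0

    def peek():
        return smiles[pos] if pos < len(smiles) else None

    def expect(ch):
        nonlocal pos
        if peek() != ch:
            raise ValueError(f"expected {ch!r} at position {pos}")
        pos += 1

    def terminal(name, allowed):
        # Consume one terminal symbol and record the rule that produced it.
        nonlocal pos
        ch = peek()
        if ch not in allowed:
            raise ValueError(f"expected {name} at position {pos}")
        rules.append(f"{name} -> '{ch}'")
        pos += 1

    def chain():
        rules.append("chain -> atom tail")
        terminal("atom", ATOMS)
        tail()

    def tail():
        ch = peek()
        if ch in BONDS:
            rules.append("tail -> bond chain")
            terminal("bond", BONDS)
            chain()
        elif ch == "(":
            rules.append("tail -> '(' branch ')' tail")
            expect("(")
            branch()
            expect(")")
            tail()
        elif ch in ATOMS:
            rules.append("tail -> chain")
            chain()
        else:
            rules.append("tail -> EPS")

    def branch():
        if peek() in BONDS:
            rules.append("branch -> bond chain")
            terminal("bond", BONDS)
            chain()
        else:
            rules.append("branch -> chain")
            chain()

    chain()
    if pos != len(smiles):
        raise ValueError(f"unexpected {smiles[pos]!r} at position {pos}")
    return rules


if __name__ == "__main__":
    for rule in to_rules("CC(=O)O"):  # acetic acid, within the toy grammar
        print(rule)
```

Because every rule sequence corresponds to a derivation in the grammar, a decoder restricted to emitting rules in this way cannot produce an unbalanced parenthesis, which gives an intuition for the syntactic-validity property claimed in the abstract.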
Pages: 13