ConvPred: A deep learning-based framework for predictions of potential organic reactions

Cited by: 6
Authors
Wang, Wenlong [1]
Liu, Qilei [1,3]
Dong, Yachao [1]
Du, Jian [1]
Meng, Qingwei [1,2]
Zhang, Lei [1,3]
Affiliations
[1] Dalian Univ Technol, Sch Chem Engn, Inst Chem Proc Syst Engn, Frontiers Sci Ctr Smart Mat Oriented Chem Engn,Sta, Dalian, Peoples R China
[2] Dalian Univ Technol, Ningbo Res Inst, Ningbo, Peoples R China
[3] Dalian Univ Technol, Sch Chem Engn, Inst Chem Proc Syst Engn, Frontiers Sci Ctr Smart Mat Oriented Chem Engn,Sta, Dalian 116024, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
deep learning; molecular fingerprint; reaction fingerprint; reaction prediction; MODEL;
DOI
10.1002/aic.18019
Chinese Library Classification (CLC) number
TQ [Chemical Industry]
Discipline classification code
0817
Abstract
Traditional theory-based approaches to reaction prediction are inefficient because they depend heavily on human expertise and experience. To address this, this article proposes a framework for predicting potential organic reactions based on reaction templates and a two-dimensional convolutional neural network (2D CNN) model. The quantum mechanics-based sigma-profiles and the sub-molecular structure-based ECFP4 are each used separately to encode chemical reactions. Using 605,753 patented reactions extracted from the USPTO 1976-2016 database and their generated counterparts, the 2D CNN models are trained to evaluate the likelihood of molecular transformations by learning the feature differences between reactants and products. On reactions not used in training, the sigma-profiles-based model and the ECFP4-based model reach classification accuracies of 97.881% and 99.593%, respectively. Challenging reactions from the literature that require identifying chemo-, stereo-, and regio-selectivity are correctly predicted. Furthermore, a sigma-profiles-based visual reaction fingerprint is introduced to provide novel insights into model interpretability.
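The abstract says the models learn "feature differences between reactants and products" but gives no encoding details. The sketch below illustrates one common way to express that idea, a difference fingerprint (summed product vectors minus summed reactant vectors), reshaped into a 2D grid of the kind a 2D CNN could take as input. It is an assumption for illustration, not the authors' published procedure: the function names, the 8-element toy vectors, and the grid width are all hypothetical, and real ECFP4 or sigma-profile vectors would come from a cheminformatics or quantum-chemistry tool.

```python
from typing import List, Sequence


def sum_fingerprints(fps: Sequence[Sequence[int]]) -> List[int]:
    """Element-wise sum of equal-length fingerprint vectors (hypothetical helper)."""
    if not fps:
        raise ValueError("need at least one fingerprint")
    length = len(fps[0])
    if any(len(fp) != length for fp in fps):
        raise ValueError("all fingerprints must have the same length")
    return [sum(column) for column in zip(*fps)]


def difference_fingerprint(
    reactant_fps: Sequence[Sequence[int]],
    product_fps: Sequence[Sequence[int]],
) -> List[int]:
    """Products minus reactants: features gained are positive, features lost negative."""
    reactants = sum_fingerprints(reactant_fps)
    products = sum_fingerprints(product_fps)
    if len(reactants) != len(products):
        raise ValueError("reactant and product fingerprints differ in length")
    return [p - r for p, r in zip(products, reactants)]


def to_grid(vector: Sequence[int], width: int) -> List[List[int]]:
    """Reshape a flat vector into rows of `width` entries (assumed 2D CNN input layout)."""
    if width <= 0 or len(vector) % width != 0:
        raise ValueError("vector length must be a positive multiple of width")
    return [list(vector[i:i + width]) for i in range(0, len(vector), width)]


if __name__ == "__main__":
    # Toy 8-element vectors standing in for real molecular fingerprints.
    reactants = [[1, 1, 0, 0, 1, 0, 0, 0], [0, 0, 1, 0, 0, 0, 0, 1]]
    products = [[1, 0, 1, 0, 1, 1, 0, 0]]
    diff = difference_fingerprint(reactants, products)
    for row in to_grid(diff, width=4):
        print(row)
```

In this toy case the nonzero entries mark the features that change across the reaction, which is the signal a classifier would use to separate plausible from implausible transformations.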
Pages: 17
Related papers
45 in total
  • [1] Abadi M, 2016, PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P265
  • [2] Sigma profiles in deep learning: towards a universal molecular descriptor
    Abranches, Dinis O.
    Zhang, Yong
    Maginn, Edward J.
    Colon, Yamil J.
    [J]. CHEMICAL COMMUNICATIONS, 2022, 58 (37) : 5630 - 5633
  • [3] [Anonymous], Daylight Chemical Information Systems Inc
  • [4] [Anonymous], RDKit: Open-source cheminformatics
  • [5] [Anonymous], EPAM SYSTEMS
  • [6] Prediction of Major Regio-, Site-, and Diastereoisomers in Diels-Alder Reactions by Using Machine-Learning: The Importance of Physically Meaningful Descriptors
    Beker, Wiktor
    Gajewska, Ewa P.
    Badowski, Tomasz
    Grzybowski, Bartosz A.
    [J]. ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 2019, 58 (14) : 4515 - 4519
  • [7] A graph-convolutional neural network model for the prediction of chemical reactivity
    Coley, Connor W.
    Jin, Wengong
    Rogers, Luke
    Jamison, Timothy F.
    Jaakkola, Tommi S.
    Green, William H.
    Barzilay, Regina
    Jensen, Klavs F.
    [J]. CHEMICAL SCIENCE, 2019, 10 (02) : 370 - 377
  • [8] Machine Learning in Computer-Aided Synthesis Planning
    Coley, Connor W.
    Green, William H.
    Jensen, Klavs F.
    [J]. ACCOUNTS OF CHEMICAL RESEARCH, 2018, 51 (05) : 1281 - 1289
  • [9] Prediction of Organic Reaction Outcomes Using Machine Learning
    Coley, Connor W.
    Barzilay, Regina
    Jaakkola, Tommi S.
    Green, William H.
    Jensen, Klavs F.
    [J]. ACS CENTRAL SCIENCE, 2017, 3 (05) : 434 - 443
  • [10] COMPUTER-ASSISTED SYNTHETIC ANALYSIS - SYNTHETIC STRATEGIES BASED ON APPENDAGES AND USE OF RECONNECTIVE TRANSFORMS
    COREY, EJ
    JORGENSEN, WL
    [J]. JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, 1976, 98 (01) : 189 - 203