ConvPred: A deep learning-based framework for predictions of potential organic reactions

Times cited: 6
Authors
Wang, Wenlong [1]
Liu, Qilei [1,3]
Dong, Yachao [1]
Du, Jian [1]
Meng, Qingwei [1,2]
Zhang, Lei [1,3]
Affiliations
[1] Dalian Univ Technol, Sch Chem Engn, Inst Chem Proc Syst Engn, Frontiers Sci Ctr Smart Mat Oriented Chem Engn,Sta, Dalian, Peoples R China
[2] Dalian Univ Technol, Ningbo Res Inst, Ningbo, Peoples R China
[3] Dalian Univ Technol, Sch Chem Engn, Inst Chem Proc Syst Engn, Frontiers Sci Ctr Smart Mat Oriented Chem Engn,Sta, Dalian 116024, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
deep learning; molecular fingerprint; reaction fingerprint; reaction prediction; MODEL
DOI
10.1002/aic.18019
Chinese Library Classification (CLC) number
TQ [Chemical Industry]
Discipline classification code
0817
Abstract
Traditional theory-based approaches to reaction prediction are inefficient because they depend heavily on human expertise and experience. To address this, this article proposes a framework for predicting potential organic reactions based on reaction templates and a two-dimensional convolutional neural network (2D CNN) model. Quantum mechanics-based sigma-profiles and sub-molecular structure-based ECFP4 fingerprints are each used separately to encode chemical reactions. Using 605,753 patented reactions extracted from the USPTO 1976-2016 database, together with their generated counterparts, the 2D CNN models are trained to evaluate the likelihood of molecular transformations by learning the feature differences between reactants and products. On reactions not seen during training, the sigma-profiles-based model and the ECFP4-based model reach classification accuracies of 97.881% and 99.593%, respectively. Challenging reactions from the literature that require identifying chemo-, stereo-, and regioselectivity are correctly predicted. Furthermore, a sigma-profiles-based visual reaction fingerprint is introduced to provide novel insights into model interpretability.
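The idea of encoding a reaction as the feature difference between products and reactants, laid out as a 2D array for a 2D CNN, can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: `toy_fingerprint` is a hypothetical hashed-substring stand-in and not the real ECFP4 algorithm, and the 64-bit length and 8 x 8 grid are arbitrary choices, not values taken from the article.

```python
import hashlib

N_BITS = 64    # toy vector length (assumption; not the article's setting)
GRID = (8, 8)  # 2D layout for a 2D CNN input (assumption)


def toy_fingerprint(smiles, n_bits=N_BITS, width=3):
    """Hash overlapping SMILES substrings into a count vector.

    Hypothetical stand-in for a structural fingerprint such as ECFP4;
    it is NOT the ECFP4 algorithm.
    """
    vec = [0] * n_bits
    for i in range(max(1, len(smiles) - width + 1)):
        fragment = smiles[i:i + width]
        digest = hashlib.md5(fragment.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % n_bits] += 1
    return vec


def side_fingerprint(smiles_list):
    """Sum the fingerprints of all molecules on one side of a reaction."""
    total = [0] * N_BITS
    for smiles in smiles_list:
        for j, value in enumerate(toy_fingerprint(smiles)):
            total[j] += value
    return total


def reaction_difference(reactants, products):
    """Product features minus reactant features, element by element."""
    r = side_fingerprint(reactants)
    p = side_fingerprint(products)
    return [pv - rv for rv, pv in zip(r, p)]


def to_grid(vec, shape=GRID):
    """Reshape a flat vector into rows x cols nested lists."""
    rows, cols = shape
    if rows * cols != len(vec):
        raise ValueError("grid shape does not match vector length")
    return [vec[r * cols:(r + 1) * cols] for r in range(rows)]


# Esterification: acetic acid + ethanol -> ethyl acetate + water
diff = reaction_difference(["CC(=O)O", "CCO"], ["CC(=O)OCC", "O"])
grid = to_grid(diff)
```

In this sketch, a reaction whose products equal its reactants yields an all-zero grid, so nonzero cells mark the features that change during the transformation; a classifier would then be trained on such grids for recorded and generated reactions.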
Pages: 17