Self-Supervised Contrastive Molecular Representation Learning with a Chemical Synthesis Knowledge Graph

Citations: 5
Authors
Xie, Jiancong [1]
Wang, Yi [1]
Rao, Jiahua [1]
Zheng, Shuangjia [2]
Yang, Yuedong [1, 3]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[2] Shanghai Jiao Tong Univ, Global Inst Future Technol, Shanghai 200030, Peoples R China
[3] Sun Yat Sen Univ, Key Lab Machine Intelligence & Adv Comp, Guangzhou 510006, Peoples R China
Keywords
TRANSFORMER; PREDICTION; OUTCOMES
DOI
10.1021/acs.jcim.4c00157
Chinese Library Classification
R914 [Medicinal Chemistry]
Discipline Classification Code
100701
Abstract
Self-supervised molecular representation learning has demonstrated great promise in bridging machine learning and chemical science to accelerate the development of new drugs. Because reaction data are limited, existing methods are mostly pretrained by augmenting the intrinsic topology of molecules without effectively incorporating prior knowledge of chemical reactions, which makes it difficult for them to generalize to reaction-related tasks. To address this issue, we propose ReaKE, a reaction knowledge embedding framework that formulates chemical reactions as a knowledge graph. Specifically, we construct a chemical synthesis knowledge graph with reactants and products as nodes and reaction rules as edges. Based on this knowledge graph, we further propose a contrastive learning scheme at both the molecule and reaction levels to capture reaction-related functional group information within and between molecules. Extensive experiments demonstrate the effectiveness of ReaKE compared with state-of-the-art methods on several downstream tasks, including reaction classification, product prediction, and yield prediction.
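
The graph structure described in the abstract (reactants and products as nodes, reaction rules as edges) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy reactions, the rule labels, and the function name `build_graph` are hypothetical and chosen only for illustration, and each reaction is simplified to a single reactant and a single product.

```python
from collections import defaultdict

# Hypothetical toy data: (reactant SMILES, rule label, product SMILES).
# The rule labels are illustrative placeholders, not templates from the paper.
TOY_REACTIONS = [
    ("CC(=O)O", "esterification", "CC(=O)OC"),
    ("CC(=O)O", "amidation", "CC(=O)N"),
    ("CCO", "oxidation", "CC=O"),
    ("CC=O", "oxidation", "CC(=O)O"),
]


def build_graph(reactions):
    """Build a directed, edge-labeled graph from reaction triples.

    Returns (nodes, edges), where nodes is a set of molecules and
    edges maps each reactant to a list of (rule, product) pairs.
    """
    nodes = set()
    edges = defaultdict(list)
    for reactant, rule, product in reactions:
        nodes.update((reactant, product))
        edges[reactant].append((rule, product))
    return nodes, dict(edges)


nodes, edges = build_graph(TOY_REACTIONS)
for head, out_edges in edges.items():
    for rule, tail in out_edges:
        print(f"{head} --[{rule}]--> {tail}")
```

In this toy graph, acetic acid appears both as a reactant (two outgoing edges) and as a product (of the oxidation of acetaldehyde), which is the kind of shared-node structure a knowledge graph exposes and a list of isolated reactions does not.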
Pages: 1945-1954
Page count: 10