ReLMole: Molecular Representation Learning Based on Two-Level Graph Similarities

Cited by: 24
Authors
Ji, Zewei [1,2]
Shi, Runhan [1,2]
Lu, Jiarui [1,2]
Li, Fang [1,2]
Yang, Yang [1,2]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Key Lab Shanghai Educ Commiss Intelligent Interact, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
PREDICTION; DRUGS;
DOI
10.1021/acs.jcim.2c00798
Chinese Library Classification number
R914 [Medicinal Chemistry];
Subject classification code
100701;
Abstract
Molecular representation is critical to many prediction tasks involving the physicochemical properties of molecules and drug design. Because graph notations are commonly used to express the structural information of chemical compounds, graph neural networks (GNNs) have become the mainstream backbone model for learning molecular representations. However, the scarcity of task-specific labels in the biomedical domain limits the power of GNNs. Recently, self-supervised pretraining for GNNs has been used to address this issue, but existing pretraining methods are mainly designed for graph data in general domains and do not consider the specific properties of molecular data. In this paper, we propose a representation learning method for molecular graphs, called ReLMole, which features hierarchical graph modeling of molecules and a contrastive learning scheme based on two-level graph similarities. We assess the performance of ReLMole on two types of downstream tasks, namely, the prediction of molecular properties (MPs) and drug-drug interactions (DDIs). ReLMole achieves promising results on all tasks. It outperforms the baseline models by over 2.6% on ROC-AUC averaged across six MP prediction tasks, and it improves the F1 value by 7-18% in DDI prediction for unseen drugs compared with other self-supervised models.
Pages: 5361-5372
Number of pages: 12