ReLMole: Molecular Representation Learning Based on Two-Level Graph Similarities

Cited by: 24
Authors
Ji, Zewei [1, 2]
Shi, Runhan [1, 2]
Lu, Jiarui [1, 2]
Li, Fang [1, 2]
Yang, Yang [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Key Lab Shanghai Educ Commiss Intelligent Interact, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
PREDICTION; DRUGS
DOI
10.1021/acs.jcim.2c00798
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Molecular representation is a critical part of various prediction tasks for the physicochemical properties of molecules and for drug design. Because graph notations are commonly used to express the structural information of chemical compounds, graph neural networks (GNNs) have become the mainstream backbone model for learning molecular representations. However, the scarcity of task-specific labels in the biomedical domain limits the power of GNNs. Recently, self-supervised pretraining for GNNs has been leveraged to deal with this issue, but the existing pretraining methods are mainly designed for graph data in general domains and do not consider the specific data properties of molecules. In this paper, we propose a representation learning method for molecular graphs, called ReLMole, which features a hierarchical graph modeling of molecules and a contrastive learning scheme based on two-level graph similarities. We assess the performance of ReLMole on two types of downstream tasks, namely, the prediction of molecular properties (MPs) and of drug-drug interactions (DDIs). ReLMole achieves promising results on all the tasks. It outperforms the baseline models by over 2.6% on ROC-AUC averaged across six MP prediction tasks, and it improves the F1 value by 7-18% in DDI prediction for unseen drugs compared with other self-supervised models.
Pages: 5361-5372
Page count: 12