A Topology-Enhanced Multi-Viewed Contrastive Approach for Molecular Graph Representation Learning and Classification

Cited by: 3
Author
Pham, Phu [1]
Affiliation
[1] HUTECH Univ, Fac Informat Technol, Ho Chi Minh City, Vietnam
Keywords
graph contrastive learning; graph neural network; molecular graph learning; topological graph neural network
DOI
10.1002/minf.202400252
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
In recent years, graph representation learning has become an active research topic. Graph embeddings have diverse applications in fields such as information and social network analysis, bioinformatics and cheminformatics, natural language processing (NLP), and recommendation systems. Among the deep learning (DL) architectures used for graph representation learning, graph neural networks (GNNs) have emerged as the dominant and highly effective framework. Recent GNN-based methods have demonstrated state-of-the-art performance on complex supervised and unsupervised tasks at both the node and graph levels. To enhance multi-view and structured graph representations, contrastive learning techniques have more recently been developed, giving rise to graph contrastive learning (GCL) models. These GCL approaches use unsupervised contrastive methods to capture multi-view graph representations by comparing node and graph embeddings, yielding significant improvements in both graph-level representations and task-specific applications such as molecular embedding and classification. However, because most GCL techniques focus on the explicit graph structure through GNN-based encoders, they often overlook critical topological insights that topological data analysis (TDA) could provide. Given promising research indicating that topological features can greatly benefit various graph learning tasks, we propose a novel topology-enhanced, multi-view graph contrastive learning model called TMGCL. TMGCL is designed to capture and use both multi-scale topological information and global structural information from graphs. This enhanced representation capability allows TMGCL to directly support applications such as molecular classification with improved accuracy and robustness. Extensive experiments on two real-world datasets demonstrate the effectiveness of TMGCL, which outperforms state-of-the-art GNN- and GCL-based baselines.
Pages: 15