Self-Supervised Pre-Training via Multi-View Graph Information Bottleneck for Molecular Property Prediction

Cited by: 1
Authors
Zang, Xuan [1 ]
Zhang, Junjie [1 ]
Tang, Buzhou [2 ,3 ]
Affiliations
[1] Harbin Inst Technol Shenzhen, Sch Comp Sci & Technol, Shenzhen 518000, Peoples R China
[2] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518000, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518066, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Task analysis; Drugs; Graph neural networks; Representation learning; Perturbation methods; Message passing; Data mining; Drug analysis; graph neural networks; molecular property prediction; molecular pre-training;
DOI
10.1109/JBHI.2024.3422488
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Molecular representation learning has remarkably accelerated the development of drug analysis and discovery. It implements machine learning methods to encode molecule embeddings for diverse downstream drug-related tasks. Because labeled molecular data are scarce, self-supervised molecular pre-training is promising, as it can exploit large-scale unlabeled molecular data to promote representation learning. Although many universal graph pre-training methods have been successfully introduced into molecular learning, some limitations remain. Many graph augmentation methods, such as atom deletion and bond perturbation, tend to destroy the intrinsic properties and connections of molecules. In addition, identifying the subgraphs that are important to specific chemical properties is also challenging for molecular learning. To address these limitations, we propose the self-supervised Molecular Graph Information Bottleneck (MGIB) model for molecular pre-training. MGIB observes molecular graphs from the atom view and the motif view, deploys a learnable graph compression process to extract the core subgraphs, and extends the graph information bottleneck into the self-supervised molecular pre-training framework. Model analysis validates the contribution of the self-supervised graph information bottleneck and illustrates the interpretability of MGIB through the extracted subgraphs. Extensive experiments on molecular property prediction, comprising 7 binary classification tasks and 6 regression tasks, demonstrate the effectiveness and superiority of our proposed MGIB.
Pages: 7659-7669
Page count: 11