Interaction-Based Inductive Bias in Graph Neural Networks: Enhancing Protein-Ligand Binding Affinity Predictions From 3D Structures

Cited by: 5
Authors
Yang, Ziduo [1]
Zhong, Weihe [2]
Lv, Qiujie [1]
Dong, Tiejun [1]
Chen, Guanxing [1]
Chen, Calvin Yu-Chian [3, 4, 5, 6]
Affiliations
[1] Sun Yat Sen Univ, Sch Intelligent Syst Engn, Artificial Intelligence Med Res Ctr, Shenzhen Campus, Shenzhen 518107, Peoples R China
[2] Chinese Acad Sci, Shanghai Inst Mat Med, Zhongshan Inst Drug Discovery, Zhongshan 528400, Peoples R China
[3] Peking Univ, Shenzhen Grad Sch, AI Sci AI4S Preferred Program, Shenzhen 518055, Peoples R China
[4] Peking Univ, Shenzhen Grad Sch, Sch Elect & Comp Engn, Shenzhen 518055, Peoples R China
[5] China Med Univ Hosp, Dept Med Res, Taichung 404, Taiwan
[6] Asia Univ, Dept Bioinformat & Med Engn, Taichung 413, Taiwan
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Proteins; Programmable logic arrays; Three-dimensional displays; Predictive models; Graph neural networks; Data models; Convolution; Protein-ligand binding affinity; graph neural networks; inductive bias; drug-target interaction; structure-based virtual screening; SCORING FUNCTIONS; ACCURATE DOCKING; DATABASE; GLIDE;
DOI
10.1109/TPAMI.2024.3400515
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Inductive bias in machine learning (ML) is the set of assumptions describing how a model makes predictions. Different ML-based methods for protein-ligand binding affinity (PLA) prediction have different inductive biases, leading to different levels of generalization capability and interpretability. Intuitively, the inductive bias of an ML-based model for PLA prediction should fit in with biological mechanisms relevant for binding to achieve good predictions with meaningful reasons. To this end, we propose an interaction-based inductive bias to restrict neural networks to functions relevant for binding with two assumptions: 1) A protein-ligand complex can be naturally expressed as a heterogeneous graph with covalent and non-covalent interactions; 2) The predicted PLA is the sum of pairwise atom-atom affinities determined by non-covalent interactions. The interaction-based inductive bias is embodied by an explainable heterogeneous interaction graph neural network (EHIGN) for explicitly modeling pairwise atom-atom interactions to predict PLA from 3D structures. Extensive experiments demonstrate that EHIGN achieves better generalization capability than other state-of-the-art ML-based baselines in PLA prediction and structure-based virtual screening. More importantly, comprehensive analyses of distance-affinity, pose-affinity, and substructure-affinity relations suggest that the interaction-based inductive bias can guide the model to learn atomic interactions that are consistent with physical reality. As a case study to demonstrate practical usefulness, our method is tested for predicting the efficacy of Nirmatrelvir against SARS-CoV-2 variants. EHIGN successfully recognizes the changes in the efficacy of Nirmatrelvir for different SARS-CoV-2 variants with meaningful reasons.
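The second assumption in the abstract (predicted affinity as a sum of pairwise atom-atom terms) can be illustrated with a minimal sketch. This is not the EHIGN model: the distance cutoff of 5.0, the function names, and the toy distance-decay scoring function are all assumptions made for illustration, standing in for the learned pairwise term described in the paper.

```python
import math

def toy_pair_affinity(distance):
    """Hypothetical stand-in for a learned pairwise term (assumption):
    the contribution decays smoothly as two atoms move apart."""
    return 1.0 / (1.0 + distance ** 2)

def predict_affinity(ligand_atoms, protein_atoms, cutoff=5.0,
                     pair_fn=toy_pair_affinity):
    """Sum pairwise ligand-protein contributions within a distance cutoff.

    ligand_atoms, protein_atoms: lists of (x, y, z) coordinates.
    cutoff: assumed threshold for treating a pair as a non-covalent contact.
    Returns the total score and the per-pair contributions, so each
    atom pair's share of the prediction can be inspected.
    """
    total = 0.0
    contributions = []
    for i, lig in enumerate(ligand_atoms):
        for j, prot in enumerate(protein_atoms):
            d = math.dist(lig, prot)
            if d <= cutoff:
                score = pair_fn(d)
                contributions.append((i, j, d, score))
                total += score
    return total, contributions

# One ligand atom, one nearby protein atom, one protein atom out of range.
ligand = [(0.0, 0.0, 0.0)]
protein = [(1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
total, pairs = predict_affinity(ligand, protein)
```

Because the prediction is a plain sum, each entry in `pairs` is a directly attributable share of the total, which is the property that makes distance-affinity and substructure-affinity analyses of the kind mentioned in the abstract possible.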
Pages: 8191-8208
Page count: 18