Interaction-Based Inductive Bias in Graph Neural Networks: Enhancing Protein-Ligand Binding Affinity Predictions From 3D Structures

Cited by: 5

Authors:

Yang, Ziduo [1]

Zhong, Weihe [2]

Lv, Qiujie [1]

Dong, Tiejun [1]

Chen, Guanxing [1]

Chen, Calvin Yu-Chian [3,4,5,6]

Affiliations:

[1] Sun Yat Sen Univ, Sch Intelligent Syst Engn, Artificial Intelligence Med Res Ctr, Shenzhen Campus, Shenzhen 518107, Peoples R China

[2] Chinese Acad Sci, Shanghai Inst Mat Med, Zhongshan Inst Drug Discovery, Zhongshan 528400, Peoples R China

[3] Peking Univ, Shenzhen Grad Sch, AI Sci AI4S Preferred Program, Shenzhen 518055, Peoples R China

[4] Peking Univ, Shenzhen Grad Sch, Sch Elect & Comp Engn, Shenzhen 518055, Peoples R China

[5] China Med Univ Hosp, Dept Med Res, Taichung 404, Taiwan

[6] Asia Univ, Dept Bioinformat & Med Engn, Taichung 413, Taiwan

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2024 / Vol. 46 / No. 12

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China;

Keywords:

Proteins; Programmable logic arrays; Three-dimensional displays; Predictive models; Graph neural networks; Data models; Convolution; Protein-ligand binding affinity; graph neural networks; inductive bias; drug-target interaction; structure-based virtual screening; SCORING FUNCTIONS; ACCURATE DOCKING; DATABASE; GLIDE;

DOI:

10.1109/TPAMI.2024.3400515

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Inductive bias in machine learning (ML) is the set of assumptions describing how a model makes predictions. Different ML-based methods for protein-ligand binding affinity (PLA) prediction have different inductive biases, leading to different levels of generalization capability and interpretability. Intuitively, the inductive bias of an ML-based model for PLA prediction should fit in with biological mechanisms relevant for binding to achieve good predictions with meaningful reasons. To this end, we propose an interaction-based inductive bias to restrict neural networks to functions relevant for binding with two assumptions: 1) A protein-ligand complex can be naturally expressed as a heterogeneous graph with covalent and non-covalent interactions; 2) The predicted PLA is the sum of pairwise atom-atom affinities determined by non-covalent interactions. The interaction-based inductive bias is embodied by an explainable heterogeneous interaction graph neural network (EHIGN) for explicitly modeling pairwise atom-atom interactions to predict PLA from 3D structures. Extensive experiments demonstrate that EHIGN achieves better generalization capability than other state-of-the-art ML-based baselines in PLA prediction and structure-based virtual screening. More importantly, comprehensive analyses of distance-affinity, pose-affinity, and substructure-affinity relations suggest that the interaction-based inductive bias can guide the model to learn atomic interactions that are consistent with physical reality. As a case study to demonstrate practical usefulness, our method is tested for predicting the efficacy of Nirmatrelvir against SARS-CoV-2 variants. EHIGN successfully recognizes the changes in the efficacy of Nirmatrelvir for different SARS-CoV-2 variants with meaningful reasons.
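The second assumption in the abstract (predicted affinity as a sum of pairwise atom-atom terms over non-covalent contacts) can be illustrated with a minimal sketch. This is not the EHIGN model: the exponential `pairwise_affinity` function, its `weight` and `scale` parameters, the 5.0 distance `cutoff`, and the toy coordinates are all hypothetical stand-ins chosen for illustration, where the paper uses a learned graph neural network.

```python
import math
from itertools import product


def pairwise_affinity(distance, weight=1.0, scale=2.0):
    """Hypothetical stand-in for a learned atom-pair score.

    It simply decays with distance; the real model learns this term.
    """
    return weight * math.exp(-distance / scale)


def predict_affinity(ligand_atoms, protein_atoms, cutoff=5.0):
    """Sum pairwise scores over ligand-protein atom pairs within `cutoff`.

    Returns the total and the per-pair contributions, so each term of the
    sum can be inspected on its own.
    """
    total = 0.0
    contributions = []
    pairs = product(enumerate(ligand_atoms), enumerate(protein_atoms))
    for (i, lig_xyz), (j, prot_xyz) in pairs:
        d = math.dist(lig_xyz, prot_xyz)
        if d <= cutoff:  # treat only nearby pairs as non-covalent contacts
            score = pairwise_affinity(d)
            contributions.append((i, j, d, score))
            total += score
    return total, contributions


# Toy coordinates (assumed units: angstroms), not taken from any real complex.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein = [(0.0, 3.0, 0.0), (10.0, 0.0, 0.0)]

total, contributions = predict_affinity(ligand, protein)
for i, j, d, score in contributions:
    print(f"ligand atom {i} - protein atom {j}: d={d:.2f}, score={score:.3f}")
print(f"predicted affinity (sum of pair terms): {total:.3f}")
```

Because the prediction is an explicit sum, every atom pair's contribution is available for inspection, which is the property the abstract relies on for its distance-affinity and substructure-affinity analyses.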


Pages: 8191-8208

Page count: 18
