ExplainableFold: Understanding AlphaFold Prediction with Explainable AI

Cited by: 4
Authors
Tan, Juntao [1]
Zhang, Yongfeng [1]
Affiliations
[1] Rutgers State Univ, New Brunswick, NJ 08854 USA
Source
PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023 | 2023
Keywords
AlphaFold; Protein Structure Prediction; Explainable AI; Counterfactual Reasoning; AMINO-ACID SUBSTITUTIONS; PROTEIN-STRUCTURE; NEXT-GENERATION; MUTAGENESIS; SEQUENCE;
DOI
10.1145/3580305.3599337
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper presents ExplainableFold (xFold), which is an Explainable AI framework for protein structure prediction. Despite the success of AI-based methods such as AlphaFold (αFold) in this field, the underlying reasons for their predictions remain unclear due to the black-box nature of deep learning models. To address this, we propose a counterfactual learning framework inspired by biological principles to generate counterfactual explanations for protein structure prediction, enabling a dry-lab experimentation approach. Our experimental results demonstrate the ability of ExplainableFold to generate high-quality explanations for AlphaFold's predictions, providing near-experimental understanding of the effects of amino acids on 3D protein structure. This framework has the potential to facilitate a deeper understanding of protein structures. Source code and data of the ExplainableFold project are available at https://github.com/rutgerswiselab/ExplainableFold.
Pages: 2166-2176
Page count: 11