Interpretability research of deep learning: A literature survey

Cited by: 26
Authors
Xu, Biao [1 ]
Yang, Guanci [1 ]
Affiliations
[1] Guizhou Univ, Key Lab Adv Mfg Technol, Minist Educ, Guiyang 550025, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; Interpretability; Active explanations; Passive explanations; Explainable artificial intelligence; NEURAL-NETWORKS; EXPLANATIONS; SENSITIVITY; SYSTEMS; PREDICTION; FRAMEWORK; ACCURACY; MODELS;
DOI
10.1016/j.inffus.2024.102721
CLC number
TP18 [Theory of artificial intelligence];
Discipline code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep learning (DL) has been widely used in various fields. However, its black-box nature limits people's understanding of, and trust in, its decision-making process. It is therefore crucial to research DL interpretability, which can elucidate a model's decision-making processes and behaviors. This review provides an overview of the current status of interpretability research. First, typical DL models, their principles, and their applications are introduced. Then, the definition and significance of interpretability are clarified. Subsequently, typical interpretability algorithms are introduced in four groups: active, passive, supplementary, and integrated explanations. After that, several evaluation indicators for interpretability are briefly described, and the relationship between interpretability and model performance is explored. Next, the application of selected interpretability methods/models in real-world scenarios is presented. Finally, the challenges facing interpretability research and future development directions are discussed.
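As context for the "passive explanations" category mentioned above (post-hoc methods applied to an already-trained model), the sketch below illustrates occlusion sensitivity: masking input regions and scoring each by how much the prediction drops. The function name and toy model are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def occlusion_sensitivity(model, x, patch=2, baseline=0.0):
    """Passive (post-hoc) explanation: score each patch of a 2-D input
    by how much replacing it with a baseline changes the model output."""
    base_score = model(x)
    heatmap = np.zeros_like(x, dtype=float)
    h, w = x.shape
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = x.copy()
            occluded[i:i + patch, j:j + patch] = baseline
            # A larger drop means the region mattered more to the prediction.
            heatmap[i:i + patch, j:j + patch] = base_score - model(occluded)
    return heatmap

# Toy "model" (hypothetical): responds only to the top-left quadrant.
toy_model = lambda x: float(x[:2, :2].sum())

x = np.ones((4, 4))
heatmap = occlusion_sensitivity(toy_model, x, patch=2)
```

On this toy input, only the top-left 2x2 patch receives a nonzero attribution, matching the region the model actually uses.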
Pages: 46
Cited references
316 in total
[11]  
Alvarez-Melis D, 2018, Adv. Neural Inf. Process. Syst., V31, DOI 10.48550/arXiv.1806.07538
[12]  
Alvarez-Melis D, 2018, arXiv, DOI arXiv:1806.08049
[13]   GEnI: A framework for the generation of explanations and insights of knowledge graph embedding predictions [J].
Amador-Dominguez, Elvira ;
Serrano, Emilio ;
Manrique, Daniel .
NEUROCOMPUTING, 2023, 521 :199-212
[14]  
Nguyen A, 2015, PROC CVPR IEEE, P427, DOI 10.1109/CVPR.2015.7298640
[15]   Visualizing the effects of predictor variables in black box supervised learning models [J].
Apley, Daniel W. ;
Zhu, Jingyu .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2020, 82 (04) :1059-1086
[16]   Explainable Artificial Intelligence for Autonomous Driving: A Comprehensive Overview and Field Guide for Future Research Directions [J].
Atakishiyev, Shahin ;
Salameh, Mohammad ;
Yao, Hengshuai ;
Goebel, Randy .
IEEE ACCESS, 2024, 12 :101603-101625
[17]  
BANG S., 2021, P AAAI C ART INT, DOI 10.48550/arXiv.1902.06918
[18]   Relation between prognostics predictor evaluation metrics and local interpretability SHAP values [J].
Baptista, Marcia L. ;
Goebel, Kai ;
Henriques, Elsa M. P. .
ARTIFICIAL INTELLIGENCE, 2022, 306
[19]   Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI [J].
Barredo Arrieta, Alejandro ;
Diaz-Rodriguez, Natalia ;
Del Ser, Javier ;
Bennetot, Adrien ;
Tabik, Siham ;
Barbado, Alberto ;
Garcia, Salvador ;
Gil-Lopez, Sergio ;
Molina, Daniel ;
Benjamins, Richard ;
Chatila, Raja ;
Herrera, Francisco .
INFORMATION FUSION, 2020, 58 :82-115
[20]   Deep learning: a statistical viewpoint [J].
Bartlett, Peter L. ;
Montanari, Andrea ;
Rakhlin, Alexander .
ACTA NUMERICA, 2021, 30 :87-201