Extracting Explanations, Justification, and Uncertainty from Black-Box Deep Neural Networks

Cited by: 0
Authors
Ardis, Paul [1]
Flenner, Arjuna [2]
Affiliations
[1] GE Aerosp Res, 1 Res Circle, Niskayuna, NY 12309 USA
[2] GE Aerosp, 3290 Patterson Ave SE, Grand Rapids, MI 49512 USA
Source
ASSURANCE AND SECURITY FOR AI-ENABLED SYSTEMS | 2024 / Vol. 13054
DOI
10.1117/12.3012765
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep Neural Networks (DNNs) do not inherently compute or exhibit empirically justified task confidence. In mission-critical applications, it is important to understand both the DNN's reasoning and the evidence supporting it. In this paper, we propose a novel Bayesian approach to extract explanations, justifications, and uncertainty estimates from DNNs. Our approach is efficient in both memory and computation, and can be applied to any black-box DNN without retraining, with applications to anomaly detection and out-of-distribution detection tasks. We validate our approach on the CIFAR-10 dataset and show that it can significantly improve the interpretability and reliability of DNNs.
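The paper's specific Bayesian formulation is not reproduced in this record. As a rough, hedged illustration of the general idea of post-hoc, retraining-free uncertainty extraction from a frozen black-box classifier, the sketch below perturbs the input, queries the model repeatedly, and decomposes the predictive entropy into aleatoric and epistemic-style terms. The names `black_box`, `predictive_uncertainty`, and the noise parameters are illustrative assumptions, not the authors' method or API.

```python
import numpy as np

def predictive_uncertainty(black_box, x, n_samples=32, noise_std=0.05, rng=None):
    """Post-hoc uncertainty sketch for a frozen black-box classifier (illustrative only).

    black_box: callable mapping an input array to a softmax probability vector.
    x: a single input (e.g., a CIFAR-10 image as a float array in [0, 1]).
    Returns the mean predictive distribution and entropy-based uncertainty scores.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = []
    for _ in range(n_samples):
        # Perturb the input with small Gaussian noise; only forward queries are
        # needed, with no retraining and no access to the network's internals.
        x_noisy = np.clip(x + rng.normal(0.0, noise_std, size=x.shape), 0.0, 1.0)
        probs.append(black_box(x_noisy))
    probs = np.stack(probs)                      # shape: (n_samples, n_classes)

    mean_p = probs.mean(axis=0)
    eps = 1e-12
    total_entropy = -np.sum(mean_p * np.log(mean_p + eps))            # total uncertainty
    expected_entropy = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    mutual_info = total_entropy - expected_entropy                    # disagreement term

    return {"mean_probs": mean_p,
            "total_entropy": total_entropy,
            "expected_entropy": expected_entropy,
            "mutual_information": mutual_info}
```

In such a setup, a mutual-information score that is high relative to values calibrated on in-distribution data could serve as a simple anomaly or out-of-distribution flag, in the spirit of (but not identical to) the uncertainty estimates described in the abstract.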
Pages: 8