Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation

Cited: 0
Authors
Singla, Sahil [1 ]
Wallace, Eric [1 ]
Feng, Shi [1 ]
Feizi, Soheil [1 ]
Affiliations
[1] University of Maryland, Department of Computer Science, College Park, MD 20742, USA
Source
International Conference on Machine Learning (ICML), 2019, Vol. 97
DOI
Not available
Chinese Library Classification
TP18 (Artificial Intelligence Theory);
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Current saliency map interpretations for neural networks generally rely on two key assumptions. First, they use first-order approximations of the loss function, neglecting higher-order terms such as the loss curvature. Second, they evaluate each feature's importance in isolation, ignoring feature interdependencies. This work studies the effect of relaxing these two assumptions. First, we characterize a closed-form formula for the input Hessian matrix of a deep ReLU network. Using this formula, we show that, for classification problems with many classes, if a prediction has high probability then including the Hessian term has a small impact on the interpretation. We prove this result by demonstrating that under these conditions the Hessian matrix is approximately rank one and its leading eigenvector is almost parallel to the gradient of the loss. We empirically validate this theory by interpreting ImageNet classifiers. Second, we incorporate feature interdependencies by calculating the importance of group-features using a sparsity regularization term. We use an L0-L1 relaxation technique along with proximal gradient descent to efficiently compute group-feature importance values. Our empirical results show that our method significantly improves deep learning interpretations.
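The rank-one claim in the abstract can be illustrated numerically. The sketch below is not the paper's derivation for the input Hessian of a deep ReLU network; it is a minimal toy check in logit space, using the standard softmax cross-entropy identities (gradient `p - e_c`, Hessian `diag(p) - p p^T`), under the assumed regime of many classes and a high-confidence prediction:

```python
import numpy as np

# Toy illustration of the abstract's claim: with many classes and a
# high-confidence softmax prediction, the cross-entropy Hessian is
# approximately rank one and its leading eigenvector is nearly
# parallel to the loss gradient. (Logit-space stand-in, not the
# paper's input-Hessian formula for ReLU networks.)

K = 1000                         # many classes
c = 0                            # predicted class
p = np.full(K, 0.01 / (K - 1))   # residual mass spread uniformly
p[c] = 0.99                      # high-confidence prediction

e_c = np.zeros(K)
e_c[c] = 1.0
grad = p - e_c                        # gradient of -log p_c w.r.t. logits
hess = np.diag(p) - np.outer(p, p)    # Hessian of -log p_c w.r.t. logits

eigvals, eigvecs = np.linalg.eigh(hess)   # ascending eigenvalue order
top = eigvecs[:, -1]                      # leading eigenvector

ratio = eigvals[-1] / eigvals[-2]              # spectral gap: ~rank one
cos = abs(top @ grad) / np.linalg.norm(grad)   # alignment with gradient
print(f"eigenvalue ratio: {ratio:.1f}, |cos(top, grad)|: {cos:.4f}")
```

With these numbers the leading eigenvalue dominates the second by roughly the number of classes, and the cosine is close to 1, consistent with the abstract's argument that the Hessian term adds little to a first-order interpretation in this regime.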
Pages: 9