Transformer Interpretability Beyond Attention Visualization

Cited by: 462
Authors
Chefer, Hila [1 ]
Gur, Shir [1 ]
Wolf, Lior [1 ,2 ]
Affiliations
[1] Tel Aviv Univ, Sch Comp Sci, Tel Aviv, Israel
[2] Facebook AI Res FAIR, Tel Aviv, Israel
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
Funding
European Research Council
DOI
10.1109/CVPR46437.2021.00084
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Self-attention techniques, and specifically Transformers, are dominating the field of text processing and are becoming increasingly popular in computer vision classification tasks. In order to visualize the parts of the image that led to a certain classification, existing methods either rely on the obtained attention maps or employ heuristic propagation along the attention graph. In this work, we propose a novel way to compute relevancy for Transformer networks. The method assigns local relevance based on the Deep Taylor Decomposition principle and then propagates these relevancy scores through the layers. This propagation involves attention layers and skip connections, which challenge existing methods. Our solution is based on a specific formulation that is shown to maintain the total relevancy across layers. We benchmark our method on very recent visual Transformer networks, as well as on a text classification problem, and demonstrate a clear advantage over the existing explainability methods.
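The abstract's key claim is that the proposed propagation rule maintains the total relevancy across layers, in the spirit of Layer-Wise Relevance Propagation. A minimal NumPy sketch of that conservation idea for a single linear layer is below; it is an illustration only, not the paper's actual rule for attention layers and skip connections, and the function name and epsilon stabilizer are assumptions:

```python
import numpy as np

def lrp_linear(x, W, R_out, eps=1e-9):
    """One LRP-style step for a linear layer y = x @ W (no bias).

    Redistributes the output relevance R_out back onto the inputs in
    proportion to each input's contribution to each output, so the
    total relevance is (approximately) conserved across the layer.
    Illustrative simplification, not the paper's exact formulation.
    """
    z = x @ W                            # forward pre-activations
    s = R_out / (z + eps * np.sign(z))   # stabilized relevance/activation ratio
    return x * (s @ W.T)                 # redistribute relevance to the inputs

rng = np.random.default_rng(0)
x = rng.standard_normal(4)               # input activations
W = rng.standard_normal((4, 3))          # layer weights
R_out = np.abs(rng.standard_normal(3))   # relevance arriving at the output
R_in = lrp_linear(x, W, R_out)
# Conservation: sum of input relevances matches sum of output relevances.
```

Summing the redistributed relevances recovers the incoming total up to the epsilon stabilizer, which is the layer-wise conservation property the abstract refers to.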
Pages: 782-791 (10 pages)