C2F-Explainer: Explaining Transformers Better Through a Coarse-to-Fine Strategy

Cited by: 1
Authors
Ding, Weiping [1]
Cheng, Xiaotian [1]
Geng, Yu [1]
Huang, Jiashuang [1]
Ju, Hengrong [1]
Affiliations
[1] Nantong Univ, Sch Artificial Intelligence & Comp Sci, Nantong 226019, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Transformers; Head; Feature extraction; Computer vision; Visualization; Semantics; Computational modeling; Interpretable method; perturbation mask; self-attention mechanism; sequential three-way decision
DOI
10.1109/TKDE.2024.3443888
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Transformer interpretability is an active research topic in deep learning. Traditional interpretation methods mostly use the final-layer output of the Transformer encoder as a mask to generate an explanation map. However, these approaches overlook two crucial aspects. At the coarse-grained level, the mask may contain uncertain information, including unreliable and incomplete object-location data; at the fine-grained level, information is lost on the mask, resulting in spatial noise and loss of detail. To address these issues, this paper proposes a two-stage coarse-to-fine strategy (C2F-Explainer) for improving Transformer interpretability. First, we design a sequential three-way mask (S3WM) module to handle uncertain information at the coarse-grained level. This module applies sequential three-way decisions to the mask, preventing uncertain information on the mask from distorting the interpretation results and thus yielding coarse-grained interpretation results with accurate object positions. Second, to further reduce the impact of information loss at the fine-grained level, we devise an attention fusion (AF) module, motivated by the observation that self-attention captures global semantic information. AF aggregates the attention matrices into a cross-layer relation matrix, which is then used to refine the detailed information in the interpretation results, producing fine-grained interpretation results with clear and complete edges. Experimental results show that the proposed C2F-Explainer achieves good interpretation results on both natural and medical image datasets, improving mIoU by 2.08% on the PASCAL VOC 2012 dataset.
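The cross-layer aggregation the abstract attributes to the AF module resembles attention-rollout-style chaining of per-layer attention maps. Below is a minimal illustrative sketch under that assumption; the function name, the head-averaged inputs, and the 0.5 residual-mixing weight are all assumptions for illustration, not the paper's exact AF module:

```python
import numpy as np

def cross_layer_relation(attentions):
    """Chain per-layer self-attention maps into one cross-layer relation
    matrix (rollout-style sketch; inputs are head-averaged attention maps).

    attentions: list of (tokens, tokens) row-stochastic arrays, one per layer.
    """
    n = attentions[0].shape[0]
    relation = np.eye(n)
    for attn in attentions:
        # Mix in the identity to model residual connections, then
        # renormalize rows so the matrix stays row-stochastic.
        a = 0.5 * attn + 0.5 * np.eye(n)
        a = a / a.sum(axis=-1, keepdims=True)
        relation = a @ relation
    return relation

# Toy example: 3 layers of random attention over 4 tokens.
rng = np.random.default_rng(0)
layers = [rng.random((4, 4)) for _ in range(3)]
layers = [a / a.sum(axis=-1, keepdims=True) for a in layers]
R = cross_layer_relation(layers)
print(R.shape)
```

Because each mixed matrix is row-stochastic, their product is too, so each row of the resulting relation matrix still sums to 1 and can be read as how much every token attends to every other token across all layers.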
Pages: 7708-7724
Page count: 17