Gradient-Based Instance-Specific Visual Explanations for Object Specification and Object Discrimination

Cited by: 2
Authors
Zhao, Chenyang [1 ]
Hsiao, Janet H. [2 ]
Chan, Antoni B. [1 ]
Affiliations
[1] City University of Hong Kong, Department of Computer Science, Hong Kong, People's Republic of China
[2] Hong Kong University of Science & Technology, Division of Social Science, Hong Kong, People's Republic of China
Keywords
Detectors; Visualization; Heat maps; Task analysis; Object detection; Predictive models; Transformers; Deep learning; explainable AI; explaining object detection; gradient-based explanation; human eye gaze; instance-level explanation; knowledge distillation; non-maximum suppression; object discrimination; object specification; NMS
DOI
10.1109/TPAMI.2024.3380604
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We propose gradient-weighted Object Detector Activation Maps (ODAM), a visual explanation technique for interpreting the predictions of object detectors. Using the gradients of detector targets flowing into the intermediate feature maps, ODAM produces heat maps that show the influence of image regions on the detector's decision for each predicted attribute. In contrast to previous work on class activation maps (CAM), ODAM generates instance-specific explanations rather than class-specific ones. We show that ODAM is applicable to one-stage, two-stage, and transformer-based detectors with different types of backbones and heads, and that it produces higher-quality visual explanations than the state of the art in terms of both effectiveness and efficiency. We discuss two explanation tasks for object detection: 1) object specification: which region is important for the prediction? 2) object discrimination: which object is detected? Targeting these two aspects, we present a detailed analysis of the visual explanations of detectors and carry out extensive experiments to validate the effectiveness of the proposed ODAM. Furthermore, we investigate user trust in the explanation maps, how well the visual explanations of object detectors agree with human explanations as measured through human eye gaze, and whether this agreement is related to user trust. Finally, we propose two applications, ODAM-KD and ODAM-NMS, based on these two abilities of ODAM. ODAM-KD uses the object specification of ODAM to generate top-down attention for key predictions and to guide knowledge distillation for object detection. ODAM-NMS considers the location of the model's explanation for each prediction to distinguish duplicate detections of the same object. A training scheme, ODAM-Train, is proposed to improve the quality of object discrimination and to help ODAM-NMS.
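The abstract describes heat maps built from the gradients of a single prediction's target flowing into an intermediate feature map, yielding one map per detected instance. A minimal numpy sketch of that idea is below: the feature map is weighted element-wise by the gradient of one detection's score and summed over channels (unlike Grad-CAM, which pools each channel's gradient into a single scalar weight and therefore produces class-level maps). The function name `odam_map` and the exact formula are our reading of the abstract, not the paper's definitive implementation; consult the DOI above for the precise method.

```python
import numpy as np

def odam_map(feature, grad):
    """Instance-specific explanation map (sketch based on the abstract).

    feature, grad: arrays of shape (C, H, W) -- an intermediate feature
    map and the gradient of one detection's target score w.r.t. it.
    The gradient acts as an element-wise weight, so each predicted
    instance yields its own map rather than a per-class map.
    """
    heat = np.maximum((grad * feature).sum(axis=0), 0.0)  # ReLU of channel sum
    if heat.max() > 0:
        heat = heat / heat.max()  # normalize to [0, 1] for visualization
    return heat

# Toy example: a 2-channel 4x4 feature map with a single activation at
# the top-left, and a uniform gradient for one hypothetical detection.
feature = np.zeros((2, 4, 4))
feature[:, 0, 0] = 1.0
grad = np.ones((2, 4, 4))
print(odam_map(feature, grad))  # heat map peaks at the top-left cell
```

In practice the (C, H, W) gradient would come from backpropagating one detection's classification or box-regression target through the detector (e.g. with an autograd framework), and the resulting map would be upsampled to image resolution before overlaying.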
Pages: 5967-5985
Page count: 19