Gradient-Based Instance-Specific Visual Explanations for Object Specification and Object Discrimination

Cited by: 2
Authors
Zhao, Chenyang [1 ]
Hsiao, Janet H. [2 ]
Chan, Antoni B. [1 ]
Affiliations
[1] City University of Hong Kong, Department of Computer Science, Hong Kong, People's Republic of China
[2] Hong Kong University of Science & Technology, Division of Social Science, Hong Kong, People's Republic of China
Keywords
Detectors; Visualization; Heat maps; Task analysis; Object detection; Predictive models; Transformers; Deep learning; explainable AI; explaining object detection; gradient-based explanation; human eye gaze; instance-level explanation; knowledge distillation; non-maximum suppression; object discrimination; object specification; NMS
DOI
10.1109/TPAMI.2024.3380604
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We propose gradient-weighted Object Detector Activation Maps (ODAM), a visual explanation technique for interpreting the predictions of object detectors. Using the gradients of detector targets flowing into the intermediate feature maps, ODAM produces heat maps that show the influence of image regions on the detector's decision for each predicted attribute. In contrast to previous work on class activation maps (CAM), ODAM generates instance-specific explanations rather than class-specific ones. We show that ODAM is applicable to one-stage, two-stage, and transformer-based detectors with different types of backbones and heads, and that it produces higher-quality visual explanations than the state of the art in terms of both effectiveness and efficiency. We discuss two explanation tasks for object detection: 1) object specification: which region is important for the prediction? 2) object discrimination: which object is detected? Targeting these two aspects, we present a detailed analysis of the visual explanations of detectors and carry out extensive experiments to validate the effectiveness of the proposed ODAM. Furthermore, we investigate user trust in the explanation maps, how well the visual explanations of object detectors agree with human explanations as measured through human eye gaze, and whether this agreement is related to user trust. Finally, we propose two applications, ODAM-KD and ODAM-NMS, based on these two abilities of ODAM. ODAM-KD uses the object specification of ODAM to generate top-down attention for key predictions and to guide knowledge distillation for object detection. ODAM-NMS considers the location of the model's explanation for each prediction to distinguish duplicate detections of the same object. A training scheme, ODAM-Train, is proposed to improve the quality of object discrimination and to help ODAM-NMS.
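The abstract describes heat maps built from the gradients of a single prediction's target flowing into an intermediate feature map, yielding one map per detected instance. A minimal numpy sketch of that idea is below: the feature map is weighted element-wise by the gradient of one detection's score and summed over channels (unlike Grad-CAM, which pools each channel's gradient into a single scalar weight and therefore produces class-level maps). The function name `odam_map` and the exact formula are our reading of the abstract, not the paper's definitive implementation; consult the DOI above for the precise method.

```python
import numpy as np

def odam_map(feature, grad):
    """Instance-specific explanation map (sketch based on the abstract).

    feature, grad: arrays of shape (C, H, W) -- an intermediate feature
    map and the gradient of one detection's target score w.r.t. it.
    The gradient acts as an element-wise weight, so each predicted
    instance yields its own map rather than a per-class map.
    """
    heat = np.maximum((grad * feature).sum(axis=0), 0.0)  # ReLU of channel sum
    if heat.max() > 0:
        heat = heat / heat.max()  # normalize to [0, 1] for visualization
    return heat

# Toy example: a 2-channel 4x4 feature map with a single activation at
# the top-left, and a uniform gradient for one hypothetical detection.
feature = np.zeros((2, 4, 4))
feature[:, 0, 0] = 1.0
grad = np.ones((2, 4, 4))
print(odam_map(feature, grad))  # heat map peaks at the top-left cell
```

In practice the (C, H, W) gradient would come from backpropagating one detection's classification or box-regression target through the detector (e.g. with an autograd framework), and the resulting map would be upsampled to image resolution before overlaying.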
Pages: 5967-5985
Page count: 19