Dual Attention Feature Fusion for Visible-Infrared Object Detection

Cited by: 1

Authors:

Hu, Yuxuan [1,2]

Shi, Limin [3]

Yao, Libo [4]

Weng, Lubin [3]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China

[3] Chinese Acad Sci, Inst Automat, Res Ctr Aerosp Informat, Beijing, Peoples R China

[4] Naval Aviat Univ, Inst Informat Fus, Yantai, Peoples R China

Source:

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VII | 2023 / Vol. 14260

Funding:

National Natural Science Foundation of China;

Keywords:

Feature fusion; Visible-infrared; Object detection;

DOI:

10.1007/978-3-031-44195-0_5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Feature fusion is an essential component of multimodal object detection, exploiting the complementary and common information between multi-source images. For visible-infrared image pairs, however, visible images are sensitive to illumination and visibility conditions and may contain much interfering information and little useful information. We suggest performing common feature enhancement and spatial cross attention sequentially to address this problem. For this purpose, we propose a novel Dual Attention Transformer Feature Fusion (DATFF) module, designed for fusing intermediate feature maps. We integrate it into two-stream object detectors and achieve state-of-the-art performance on the DroneVehicle and FLIR visible-infrared object detection datasets. Our code is available at https://github.com/a21401624/DATFF.


Pages: 53-65

Page count: 13

