Dual Attention Feature Fusion for Visible-Infrared Object Detection

Cited by: 1
Authors
Hu, Yuxuan [1, 2]
Shi, Limin [3]
Yao, Libo [4]
Weng, Lubin [3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[3] Chinese Acad Sci, Inst Automat, Res Ctr Aerosp Informat, Beijing, Peoples R China
[4] Naval Aviat Univ, Inst Informat Fus, Yantai, Peoples R China
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VII | 2023 / Vol. 14260
Funding
National Natural Science Foundation of China;
Keywords
Feature fusion; Visible-infrared; Object detection;
DOI
10.1007/978-3-031-44195-0_5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature fusion is an essential component of multimodal object detection, as it exploits both the complementary and the common information in multi-source images. In visible-infrared image pairs, however, the visible images are sensitive to illumination and visibility conditions, so they may contain a great deal of interference and little useful information. To address this problem, we suggest performing common feature enhancement and spatial cross attention sequentially. For this purpose, we propose a novel Dual Attention Transformer Feature Fusion (DATFF) module designed for fusing intermediate feature maps. We integrate it into two-stream object detectors and achieve state-of-the-art performance on the DroneVehicle and FLIR visible-infrared object detection datasets. Our code is available at https://github.com/a21401624/DATFF.
Pages: 53-65
Number of pages: 13
Related papers
26 in total
  • [1] Carion Nicolas, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P213, DOI 10.1007/978-3-030-58452-8_13
  • [2] Chen K, 2019, Arxiv, DOI arXiv:1906.07155
  • [3] Multimodal Object Detection via Probabilistic Ensembling
    Chen, Yi-Ting
    Shi, Jinghao
    Ye, Zelin
    Mertz, Christoph
    Ramanan, Deva
    Kong, Shu
    [J]. COMPUTER VISION, ECCV 2022, PT IX, 2022, 13669 : 139 - 158
  • [4] Learning RoI Transformer for Oriented Object Detection in Aerial Images
    Ding, Jian
    Xue, Nan
    Long, Yang
    Xia, Gui-Song
    Lu, Qikai
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 2844 - 2853
  • [5] Dosovitskiy A., 2021, ICLR
  • [6] Fang Q., arXiv
  • [7] Cross-modality attentive feature fusion for object detection in multispectral remote sensing imagery
    Fang Qingyun
    Wang Zhaokui
    [J]. PATTERN RECOGNITION, 2022, 130
  • [8] Kailai Zhou, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12363), P787, DOI 10.1007/978-3-030-58523-5_46
  • [9] Illumination-aware faster R-CNN for robust multispectral pedestrian detection
    Li, Chengyang
    Song, Dan
    Tong, Ruofeng
    Tang, Min
    [J]. PATTERN RECOGNITION, 2019, 85 : 161 - 171
  • [10] Feature Pyramid Networks for Object Detection
    Lin, Tsung-Yi
    Dollar, Piotr
    Girshick, Ross
    He, Kaiming
    Hariharan, Bharath
    Belongie, Serge
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 936 - 944