Gated weighted normative feature fusion for multispectral object detection

Cited by: 3
Authors
Wu, Xianjun [1]
Jiang, Xian [1]
Dong, Ligang [1]
Affiliations
[1] Zhejiang Gongshang Univ, Sussex Artificial Intelligence Inst, Sch Informat & Elect Engn, Xuezheng St, Hangzhou 310018, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multispectral object detection; Feature fusion; Cross-modality; Autonomous vehicles; CNN
DOI
10.1007/s00371-023-03173-6
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Multispectral image pairs provide independent and complementary information that describes detection targets more comprehensively, thereby improving the robustness and reliability of object detectors. The performance of such a detector depends on how cross-modality features are extracted and fused. To exploit the different modalities fully, we propose a lightweight yet effective cross-modality feature fusion approach named gated weighted normative feature fusion. In the feature extraction stage, our proposed dual-input backbone network extracts richer and more useful features. In the feature fusion stage, the fusion module eliminates redundant features, dynamically weighs the importance of the features from the two images, and then normalizes the fused features. Experiments and ablation studies on several publicly available datasets demonstrate the effectiveness of our method. It achieves 80.3% mAP50 and 41.8% mAP on the FLIR dataset, and 98.0% mAP50 and 68.0% mAP on the LLVIP dataset. In particular, our method runs inference twice as fast as the current state-of-the-art method.
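The abstract names three fusion steps (removing redundant features, weighing the two modalities dynamically, normalizing the result) but gives no equations. The sketch below is only an illustration of that general pattern in plain Python, not the authors' module. Every concrete choice in it is an assumption: the function name `gated_weighted_fusion`, the sigmoid gate on channel means, the softmax over mean absolute activation, and the per-channel standardization.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate_channels(feat):
    # Assumed gate: scale each channel by sigmoid(channel mean), so that
    # weakly activated channels are attenuated.
    return [[sigmoid(sum(ch) / len(ch)) * v for v in ch] for ch in feat]


def mean_abs(feat):
    # Mean absolute activation over all channels and positions.
    values = [abs(v) for ch in feat for v in ch]
    return sum(values) / len(values)


def gated_weighted_fusion(rgb, thermal, eps=1e-5):
    """Fuse two feature maps given as lists of channels (flat float lists).

    Hypothetical illustration only; the paper's actual layers are not
    described in the abstract.
    """
    # Step 1: gate each modality independently.
    g_rgb, g_th = gate_channels(rgb), gate_channels(thermal)

    # Step 2: softmax over the two modalities' mean absolute activation
    # gives input-dependent weights that sum to 1.
    e_rgb, e_th = mean_abs(g_rgb), mean_abs(g_th)
    m = max(e_rgb, e_th)  # subtract the max for numerical stability
    x_rgb, x_th = math.exp(e_rgb - m), math.exp(e_th - m)
    w_rgb = x_rgb / (x_rgb + x_th)
    w_th = 1.0 - w_rgb
    fused = [[w_rgb * a + w_th * b for a, b in zip(cr, ct)]
             for cr, ct in zip(g_rgb, g_th)]

    # Step 3: standardize each fused channel to zero mean, unit variance.
    out = []
    for ch in fused:
        mu = sum(ch) / len(ch)
        var = sum((v - mu) ** 2 for v in ch) / len(ch)
        out.append([(v - mu) / math.sqrt(var + eps) for v in ch])
    return out, (w_rgb, w_th)


# Toy input: a dark visible-light image (weak features) paired with a
# strong thermal response. Two channels, four spatial positions each.
rgb = [[0.1, 0.2, 0.1, 0.0], [0.0, 0.1, 0.0, 0.1]]
thermal = [[1.0, 3.0, 2.0, 0.5], [0.2, 2.5, 1.5, 0.1]]
fused, weights = gated_weighted_fusion(rgb, thermal)
```

With this toy input the thermal branch receives the larger weight, which is the qualitative behavior a dynamic weighting step is meant to provide in low-light scenes. A real detector would learn the gate and weights with convolutional layers rather than compute them from fixed statistics.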
Pages: 6409-6419
Number of pages: 11