Multispectral Object Detection Based on Multilevel Feature Fusion and Dual Feature Modulation

Cited by: 4
Authors
Sun, Jin [1]
Yin, Mingfeng [1]
Wang, Zhiwei [1]
Xie, Tao [1]
Bei, Shaoyi [1]
Affiliations
[1] Jiangsu Univ Technol, Sch Automobile & Traff Engn, Changzhou 213001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
multispectral object detection; remote sensing; visible-infrared images; multilevel feature fusion; dual feature modulation; FASTER R-CNN; NETWORK;
DOI
10.3390/electronics13020443
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Multispectral object detection is a crucial technology in remote sensing image processing, particularly in low-light environments. Most current methods extract features at a single scale, which leads to the fusion of invalid features and to missed small objects. To address these issues, we propose a multispectral object detection network based on multilevel feature fusion and dual feature modulation (GMD-YOLO). First, a novel dual-channel CSPDarknet53 network extracts deep features from visible-infrared images. This network incorporates a Ghost module, which generates additional feature maps through a series of linear operations, balancing accuracy and speed. Second, the multilevel feature fusion (MLF) module exploits cross-modal information by constructing hierarchical residual connections. This strengthens the complementarity between modalities and improves the network's multiscale representation at a finer granularity. Finally, a dual feature modulation (DFM) decoupled head is introduced to enhance small object detection by meeting the distinct requirements of the classification and localization tasks. GMD-YOLO is validated on three public visible-infrared datasets: DroneVehicle, KAIST, and LLVIP. On DroneVehicle and LLVIP, it achieves mAP@0.5 of 78.0% and 98.0%, outperforming baseline methods by 3.6% and 4.4%, respectively. On KAIST, it achieves an MR of 7.73% at 61.7 FPS. Experimental results demonstrate that our method surpasses existing advanced methods and exhibits strong robustness.
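The Ghost module mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows the general idea of deriving extra ("ghost") feature maps from a small set of intrinsic maps through cheap linear operations. The function names `depthwise_3x3` and `ghost_module`, the choice of one 3x3 kernel per intrinsic map, and the zero padding are all assumptions made for illustration.

```python
from typing import List

FeatureMap = List[List[float]]  # one H x W channel
Kernel = List[List[float]]      # one 3 x 3 filter


def depthwise_3x3(fmap: FeatureMap, kernel: Kernel) -> FeatureMap:
    """Apply a single 3x3 kernel to one channel with zero padding.

    This stands in for the 'cheap linear operation' (hypothetical choice).
    It is a cross-correlation, i.e. the kernel is not flipped.
    """
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy + 1][dx + 1] * fmap[yy][xx]
            out[y][x] = acc
    return out


def ghost_module(intrinsic: List[FeatureMap], kernels: List[Kernel]) -> List[FeatureMap]:
    """Return the intrinsic maps followed by one ghost map per intrinsic map."""
    if len(intrinsic) != len(kernels):
        raise ValueError("one kernel per intrinsic map is expected")
    ghosts = [depthwise_3x3(f, k) for f, k in zip(intrinsic, kernels)]
    return intrinsic + ghosts


if __name__ == "__main__":
    maps = [[[1.0, 2.0], [3.0, 4.0]]]
    box_filter = [[1.0] * 3 for _ in range(3)]
    print(len(ghost_module(maps, [box_filter])))  # channel count doubles
```

In this sketch the output has twice as many channels as the input, while only the ghost half requires extra computation, and that computation is a per-channel filter rather than a full convolution across channels.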
Pages: 18
Related papers
53 in total
  • [1] Effectiveness Guided Cross-Modal Information Sharing for Aligned RGB-T Object Detection
    An, Zijia
    Liu, Chunlei
    Han, Yuqi
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2562 - 2566
  • [2] Dual-YOLO Architecture from Infrared and Visible Images for Object Detection
    Bao, Chun
    Cao, Jie
    Hao, Qun
    Cheng, Yang
    Ning, Yaqian
    Zhao, Tianhua
    [J]. SENSORS, 2023, 23 (06)
  • [3] Biswas M, 2023, Arxiv, DOI [arXiv:2308.06983, 10.48550/arXiv.2308.06983]
  • [4] Bochkovskiy A, 2020, Arxiv, DOI arXiv:2004.10934
  • [5] Disentangle Your Dense Object Detector
    Chen, Zehui
    Yang, Chenhongyi
    Li, Qiaofei
    Zhao, Feng
    Zha, Zheng-Jun
    Wu, Feng
    [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 4939 - 4948
  • [6] KAIST Multi-Spectral Day/Night Data Set for Autonomous and Assisted Driving
    Choi, Yukyung
    Kim, Namil
    Hwang, Soonmin
    Park, Kibaek
    Yoon, Jae Shin
    An, Kyounghwan
    Kweon, In So
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2018, 19 (03) : 934 - 948
  • [7] Dai JF, 2016, ADV NEUR IN, V29
  • [8] Wagner Jörg, 2016, ESANN
  • [9] Cross-modality attentive feature fusion for object detection in multispectral remote sensing imagery
    Fang, Qingyun
    Wang, Zhaokui
    [J]. PATTERN RECOGNITION, 2022, 130
  • [10] Convolutional Two-Stream Network Fusion for Video Action Recognition
    Feichtenhofer, Christoph
    Pinz, Axel
    Zisserman, Andrew
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 1933 - 1941