Cross-Modal Local Calibration and Global Context Modeling Network for RGB–Infrared Remote-Sensing Object Detection

Cited by: 7
Authors
Xie, Jin [1]
Nie, Jing [2]
Ding, Bonan [1]
Yu, Mingyang [1]
Cao, Jiale [3, 4]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400044, Peoples R China
[2] Chongqing Univ, Sch Microelect & Commun Engn, Chongqing 400044, Peoples R China
[3] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[4] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Multimodal fusion; object detection; remote-sensing object detection; IMAGERY;
DOI
10.1109/JSTARS.2023.3315544
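Chinese Library Classification
TM [Electrical Technology]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract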
RGB-infrared object detection in remote-sensing images is crucial for achieving around-the-clock surveillance of unmanned aerial vehicles. Deep-learning-based RGB-infrared remote-sensing object detection methods usually mine complementary information from the RGB and infrared modalities through feature aggregation to achieve robust object detection in around-the-clock applications. Most existing methods aggregate features from RGB and infrared images with elementwise operations (e.g., elementwise addition or concatenation), and their detection accuracy is limited for two main reasons: local location misalignment across modalities and insufficient extraction of nonlocal contextual information. To address these issues, we propose a cross-modal local calibration and global context modeling network (CLGNet), consisting of two novel modules: a cross-modal local calibration (CLC) module and a cross-modal global context (CGC) modeling module. The CLC module first aligns features from the different modalities and then aggregates them selectively. The CGC module is embedded into the backbone network to capture cross-modal nonlocal long-range dependencies. Experimental results on popular RGB-infrared remote-sensing object detection datasets, namely DroneVehicle and VEDAI, demonstrate the effectiveness and efficiency of our CLGNet.
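As a rough illustration of the elementwise baselines the abstract mentions, the sketch below fuses two toy feature maps by addition and by channel concatenation, and shows how a one-position offset between modalities spreads the fused response. It is not the authors' implementation and does not reproduce the CLC or CGC modules; the names `fuse_add` and `fuse_concat` and the flattened `[channel][position]` layout are assumptions made only for this example.

```python
from typing import List

# Hypothetical toy layout: feature_map[channel][position], spatial dims flattened.
FeatureMap = List[List[float]]


def _check_same_width(*maps: FeatureMap) -> None:
    """Raise if any channel differs in spatial size from the others."""
    widths = {len(channel) for fmap in maps for channel in fmap}
    if len(widths) > 1:
        raise ValueError("all channels must share the same spatial size")


def fuse_add(rgb: FeatureMap, ir: FeatureMap) -> FeatureMap:
    """Elementwise addition: the output keeps the input channel count."""
    if len(rgb) != len(ir):
        raise ValueError("addition needs the same number of channels")
    _check_same_width(rgb, ir)
    return [[a + b for a, b in zip(c_rgb, c_ir)] for c_rgb, c_ir in zip(rgb, ir)]


def fuse_concat(rgb: FeatureMap, ir: FeatureMap) -> FeatureMap:
    """Channel concatenation: the output has len(rgb) + len(ir) channels."""
    _check_same_width(rgb, ir)
    return [list(c) for c in rgb] + [list(c) for c in ir]


# Same object seen by both sensors, but the infrared response is shifted
# by one position (a toy stand-in for local misalignment).
rgb_feat = [[0.0, 1.0, 0.0, 0.0]]
ir_shifted = [[0.0, 0.0, 1.0, 0.0]]
ir_aligned = [[0.0, 1.0, 0.0, 0.0]]

misaligned_sum = fuse_add(rgb_feat, ir_shifted)  # [[0.0, 1.0, 1.0, 0.0]]
aligned_sum = fuse_add(rgb_feat, ir_aligned)     # [[0.0, 2.0, 0.0, 0.0]]
stacked = fuse_concat(rgb_feat, ir_shifted)      # two channels, offset kept
```

With the shifted input, plain addition yields two weak responses instead of one strong peak, and concatenation simply carries the offset into the next layer. Neither operation corrects the offset itself, which is the limitation the abstract attributes to elementwise aggregation.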
Pages: 8933-8942
Page count: 10
Related papers
50 in total
  • [1] Cross-Modal Adaptation for Object Detection in Infrared Remote Sensing Imagery
    Wang, Zeyu
    Li, Shuaiting
    Huang, Kejie
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2025, 22
  • [2] Cross-Modal Attentional Context Learning for RGB-D Object Detection
    Li, Guanbin
    Gan, Yukang
    Wu, Hejun
    Xiao, Nong
    Lin, Liang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (04) : 1591 - 1601
  • [3] Global Guided Cross-Modal Cross-Scale Network for RGB-D Salient Object Detection
    Wang, Shuaihui
    Jiang, Fengyi
    Xu, Boqian
    SENSORS, 2023, 23 (16)
  • [4] Dual-Dynamic Cross-Modal Interaction Network for Multimodal Remote Sensing Object Detection
    Bao, Wei
    Huang, Meiyu
    Hu, Jingjing
    Xiang, Xueshuang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [5] Cross-modal hierarchical interaction network for RGB-D salient object detection
    Bi, Hongbo
    Wu, Ranwan
    Liu, Ziqi
    Zhu, Huihui
    Zhang, Cong
    Xiang, Tian-Zhu
    PATTERN RECOGNITION, 2023, 136
  • [6] Asymmetric cross-modal activation network for RGB-T salient object detection
    Xu, Chang
    Li, Qingwu
    Zhou, Qingkai
    Jiang, Xiongbiao
    Yu, Dabing
    Zhou, Yaqin
    KNOWLEDGE-BASED SYSTEMS, 2022, 258
  • [7] Global-Local Information Soft-Alignment for Cross-Modal Remote-Sensing Image-Text Retrieval
    Hu, Gang
    Wen, Zaidao
    Lv, Yafei
    Zhang, Jianting
    Wu, Qian
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 15
  • [8] Specificity-Guided Cross-Modal Feature Reconstruction for RGB-Infrared Object Detection
    Sun, Xiaoyu
    Zhu, Yaohui
    Huang, Hua
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2025, 26 (01) : 950 - 961
  • [9] Object Detection in Multispectral Remote Sensing Images Based on Cross-Modal Cross-Attention
    Zhao, Pujie
    Ye, Xia
    Du, Ziang
    SENSORS, 2024, 24 (13)
  • [10] Cross-Modal Fusion and Progressive Decoding Network for RGB-D Salient Object Detection
    Hu, Xihang
    Sun, Fuming
    Sun, Jing
    Wang, Fasheng
    Li, Haojie
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (08) : 3067 - 3085