Cross-Modal Local Calibration and Global Context Modeling Network for RGB–Infrared Remote-Sensing Object Detection

Cited by: 7
Authors
Xie, Jin [1]
Nie, Jing [2]
Ding, Bonan [1]
Yu, Mingyang [1]
Cao, Jiale [3,4]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400044, Peoples R China
[2] Chongqing Univ, Sch Microelect & Commun Engn, Chongqing 400044, Peoples R China
[3] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[4] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Multimodal fusion; object detection; remote-sensing object detection; IMAGERY;
DOI
10.1109/JSTARS.2023.3315544
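Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification codes
0808; 0809;
Abstract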
RGB-infrared object detection in remote-sensing images is crucial for achieving around-the-clock surveillance of unmanned aerial vehicles. Deep-learning-based RGB-infrared remote-sensing object detection methods usually mine complementary information from the RGB and infrared modalities through feature aggregation to achieve robust detection in around-the-clock applications. Most existing methods aggregate features from RGB and infrared images with elementwise operations (e.g., elementwise addition or concatenation). The detection accuracy of these methods is limited for two main reasons: local location misalignment across modalities and insufficient extraction of nonlocal contextual information. To address these issues, we propose a cross-modal local calibration and global context modeling network (CLGNet), consisting of two novel modules: a cross-modal local calibration (CLC) module and a cross-modal global context (CGC) modeling module. The CLC module first aligns features from different modalities and then aggregates them selectively. The CGC module is embedded into the backbone network to capture cross-modal nonlocal long-range dependencies. Experimental results on popular RGB-infrared remote-sensing object detection datasets, namely DroneVehicle and VEDAI, demonstrate the effectiveness and efficiency of CLGNet.
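For reference, the elementwise aggregation that the abstract names as the common baseline can be sketched as follows. This is a minimal illustration in plain Python, not code from the paper: the function names `fuse_add` and `fuse_concat` and the toy feature maps are hypothetical, and feature maps are assumed to be nested lists indexed as [channel][row][column].

```python
# Illustrative baseline only; names and data are assumptions, not the paper's code.

def fuse_add(rgb, ir):
    """Elementwise addition: sums values at identical (channel, row, col) positions."""
    return [
        [[a + b for a, b in zip(row_rgb, row_ir)]
         for row_rgb, row_ir in zip(ch_rgb, ch_ir)]
        for ch_rgb, ch_ir in zip(rgb, ir)
    ]

def fuse_concat(rgb, ir):
    """Channel concatenation: stacks the infrared channels after the RGB channels."""
    return rgb + ir

# Toy single-channel 2x2 feature maps (hypothetical values).
rgb_feat = [[[1.0, 2.0], [3.0, 4.0]]]
ir_feat = [[[0.5, 0.5], [0.5, 0.5]]]

added = fuse_add(rgb_feat, ir_feat)      # still 1 channel
stacked = fuse_concat(rgb_feat, ir_feat)  # now 2 channels
```

Both operations pair values at the same spatial position in each modality, so neither compensates for a spatial offset between the RGB and infrared images, and neither looks beyond the local position. These are the two limitations the abstract attributes to such baselines.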
Pages: 8933 - 8942
Page count: 10
Related papers
50 in total
  • [31] CAE-Net: Cross-Modal Attention Enhancement Network for RGB-T Salient Object Detection
    Lv, Chengtao
    Wan, Bin
    Zhou, Xiaofei
    Sun, Yaoqi
    Hu, Ji
    Zhang, Jiyong
    Yan, Chenggang
    ELECTRONICS, 2023, 12 (04)
  • [32] Cross-Modal Feature Fusion and Interaction Strategy for CNN-Transformer-Based Object Detection in Visual and Infrared Remote Sensing Imagery
    Nie, Jinyan
    Sun, He
    Sun, Xu
    Ni, Li
    Gao, Lianru
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21 : 1 - 5
  • [33] Unified Transformer with Cross-Modal Mixture Experts for Remote-Sensing Visual Question Answering
    Liu, Gang
    He, Jinlong
    Li, Pengfei
    Zhong, Shenjun
    Li, Hongyang
    He, Genrong
    REMOTE SENSING, 2023, 15 (19)
  • [34] Disentangled Cross-Modal Transformer for RGB-D Salient Object Detection and Beyond
    Chen, Hao
    Shen, Feihong
    Ding, Ding
    Deng, Yongjian
    Li, Chao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 1699 - 1709
  • [35] Joint Cross-Modal and Unimodal Features for RGB-D Salient Object Detection
    Huang, Nianchang
    Liu, Yi
    Zhang, Qiang
    Han, Jungong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 2428 - 2441
  • [36] Attentive Cross-Modal Fusion Network for RGB-D Saliency Detection
    Liu, Di
    Zhang, Kao
    Chen, Zhenzhong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 967 - 981
  • [37] Cross-Modal Adaptive Interaction Network for RGB-D Saliency Detection
    Du, Qinsheng
    Bian, Yingxu
    Wu, Jianyu
    Zhang, Shiyan
    Zhao, Jian
    APPLIED SCIENCES-BASEL, 2024, 14 (17)
  • [38] PSGCNet: A Pyramidal Scale and Global Context Guided Network for Dense Object Counting in Remote-Sensing Images
    Gao, Guangshuai
    Liu, Qingjie
    Hu, Zhenghui
    Li, Lu
    Wen, Qi
    Wang, Yunhong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [39] Attention-aware Cross-modal Cross-level Fusion Network for RGB-D Salient Object Detection
    Chen, Hao
    Li, You-Fu
    Su, Dan
    2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2018 : 6821 - 6826
  • [40] GCWNet: A Global Context-Weaving Network for Object Detection in Remote Sensing Images
    Wu, Yulin
    Zhang, Ke
    Wang, Jingyu
    Wang, Yezi
    Wang, Qi
    Li, Xuelong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60