Capsule-inferenced Object Detection for Remote Sensing Images

Cited by: 9
Authors
Han, Yingchao [1]
Meng, Weixiao [1]
Tang, Wei [2]
Affiliations
[1] Harbin Inst Technol, Sch Elect & Informat Engn, Harbin 150001, Peoples R China
[2] Beijing Fifth Acad Aerosp, State Key Lab Space Earth Integrat Informat Techno, Beijing 100048, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
CapsNet; object detection; remote sensing image; transformer; NETWORK; CONTEXT; AWARE;
DOI
10.1109/JSTARS.2023.3266794
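Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code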
0808 ; 0809 ;
Abstract
Frequent and accurate object detection based on remote sensing images can effectively monitor dynamic objects on the Earth's surface. While the detection transformer (DETR) offers a simple encoder-decoder structure and a direct set prediction approach to object detection, it falls short in complex remote sensing scenes where entity information and relative positions between objects are critical to target reasoning. Notably, the DETR model's feedforward neural network (FFN) relies on weighted summation for target reasoning, disregarding interactive feature information, which is a major factor affecting detection effectiveness. To address these shortcomings, in this article, we propose a DETR-based detection model called CI_DETR, which uses capsule inference to improve remote sensing object detection. Our approach adds a multilevel feature fusion module to the DETR network, allowing the network to learn how to spatially alter features at different levels, preserving only beneficial information for combination. In addition, we introduce a capsule reasoning module to mine entity information during inference and more effectively model the hierarchical correlation of internal knowledge representation in the neural network, consistent with the thinking model of the human brain. Lastly, we employ a sausage model to measure the similarities and differences of capsules, projecting them onto a curved surface for nonlinear function approximation and maximum preservation of the local responsiveness of capsule entities. Our experiments on public datasets confirm the superior detection performance of our proposed algorithm relative to many current detectors.
Pages: 5260-5270
Number of pages: 11