Enhancing Representation Learning With Spatial Transformation and Early Convolution for Reinforcement Learning-Based Small Object Detection

Cited by: 5
Authors
Fang, Fen [1]
Liang, Wenyu [1]
Cheng, Yi [1]
Xu, Qianli [1]
Lim, Joo-Hwee [1, 2]
Affiliations
[1] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
Keywords
Detectors; Training; Object detection; Feature extraction; Costs; Transformers; Prediction algorithms; Small object detection; Reinforcement learning; Coarse-to-fine framework
DOI
10.1109/TCSVT.2023.3284453
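Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809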
Abstract
Although object detection has made significant progress over the past decade, small object detection remains far from satisfactory because of the high variability of object scales and complex backgrounds. The common way to improve small object detection is to use high-resolution (HR) images. However, this incurs a high computational cost that grows quadratically with image resolution. To achieve both accuracy and efficiency, we propose a novel reinforcement learning (RL) framework that employs an efficient policy network consisting of a Spatial Transformation Network, which enhances state representation learning, and a Transformer model with early convolution, which improves feature extraction. Our method has two main steps: (1) coarse location query (CLQ), where an RL agent is trained to predict the locations of small objects on low-resolution (LR) images, which are down-sampled versions of the HR images; and (2) context-sensitive object detection, where HR image patches are used to detect objects at the selected coarse locations and LR image patches are used for background areas (those containing no small objects). In this way, we obtain high detection performance on small objects while avoiding unnecessary computation on background areas. The proposed method has been tested and benchmarked on various datasets. On the Caltech Pedestrians Detection and Web Pedestrians datasets, the proposed method improves detection accuracy by 2% while reducing the number of processed pixels. On the Vision meets Drone object detection dataset and the Oil and Gas Storage Tank dataset, the proposed method outperforms state-of-the-art (SotA) methods. On the MS COCO mini-val set, our method outperforms SotA methods on small object detection while achieving comparable performance on medium and large objects.
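The pixel saving behind the two-step scheme can be illustrated with a small calculation. The sketch below is not the authors' implementation: the function names, the 4 x 4 patch grid, the 256-pixel patch size, and the down-sampling factor of 4 are all assumptions chosen for illustration, and the binary mask simply stands in for whatever the CLQ policy would select.

```python
from typing import List


def processed_pixels(mask: List[List[int]], hr_patch: int, scale: int) -> int:
    """Count the pixels fed to the detector for one image.

    mask[r][c] == 1 means a (hypothetical) policy flagged that patch as likely
    to contain small objects, so it is processed at high resolution;
    0 means the patch is treated as background and processed at low resolution.
    hr_patch is the side length of an HR patch in pixels, and scale is the
    assumed down-sampling factor between HR and LR.
    """
    lr_patch = hr_patch // scale
    total = 0
    for row in mask:
        for selected in row:
            side = hr_patch if selected else lr_patch
            total += side * side
    return total


def hr_only_pixels(mask: List[List[int]], hr_patch: int) -> int:
    """Pixel count when every patch is processed at high resolution."""
    n_patches = sum(len(row) for row in mask)
    return n_patches * hr_patch * hr_patch


# Illustrative selection: 2 of 16 patches are flagged for HR processing.
mask = [
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

mixed = processed_pixels(mask, hr_patch=256, scale=4)
full = hr_only_pixels(mask, hr_patch=256)
print(f"mixed HR/LR: {mixed} px, HR only: {full} px, ratio: {mixed / full:.3f}")
```

Because the pixel count of a patch scales with the square of its side length, each background patch processed at LR costs 1/16 of its HR cost under the assumed factor of 4, which is why selecting only a few HR patches cuts the total so sharply.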
Pages: 315-328
Page count: 14