Enhancing Representation Learning With Spatial Transformation and Early Convolution for Reinforcement Learning-Based Small Object Detection

Citations: 5
Authors
Fang, Fen [1]
Liang, Wenyu [1]
Cheng, Yi [1]
Xu, Qianli [1]
Lim, Joo-Hwee [1, 2]
Affiliations
[1] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
Keywords
Detectors; Training; Object detection; Feature extraction; Costs; Transformers; Prediction algorithms; Small object detection; reinforcement learning; coarse-to-fine framework
DOI
10.1109/TCSVT.2023.3284453
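Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract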
Although object detection has made significant progress in the past decade, the detection of small objects is still far from satisfactory because of the high variability of object scales and complex backgrounds. The common way to improve small object detection is to use high-resolution (HR) images. However, this approach incurs a high computational cost, which grows quadratically with image resolution. To achieve both accuracy and efficiency, we propose a novel reinforcement learning (RL) framework that employs an efficient policy network consisting of a Spatial Transformation Network, which enhances state representation learning, and a Transformer model with early convolution, which improves feature extraction. Our method has two main steps: (1) coarse location query (CLQ), where an RL agent is trained to predict the locations of small objects on low-resolution (LR) images, which are down-sampled versions of the HR images; (2) context-sensitive object detection, where HR image patches are used to detect objects at the selected coarse locations and LR image patches are used on background areas (those containing no small objects). In this way, we obtain high detection performance on small objects while avoiding unnecessary computation on background areas. The proposed method has been tested and benchmarked on various datasets. On the Caltech Pedestrians Detection and Web Pedestrians datasets, the proposed method improves the detection accuracy by 2% while reducing the number of processed pixels. On the Vision meets Drone object detection dataset and the Oil and Gas Storage Tank dataset, the proposed method outperforms the state-of-the-art (SotA) methods. On the MS COCO mini-val set, our method outperforms SotA methods on small object detection while achieving comparable performance on medium and large objects.
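The pixel savings described in the abstract can be illustrated with a small calculation. The sketch below is not the authors' implementation: it assumes the HR image is split into a regular grid of patches, that the `selected` list stands in for the coarse locations an RL agent would output, and that unselected patches are processed at a resolution reduced by `downsample` along each axis. The function name and all parameters are hypothetical.

```python
from typing import Iterable, Tuple


def processed_pixel_count(
    hr_height: int,
    hr_width: int,
    grid_rows: int,
    grid_cols: int,
    selected: Iterable[Tuple[int, int]],
    downsample: int,
) -> Tuple[int, float]:
    """Count pixels processed when selected patches use HR and the rest use LR.

    Hypothetical helper for illustration only. Returns the total number of
    processed pixels and its ratio to processing every patch at HR.
    """
    if downsample < 1:
        raise ValueError("downsample must be >= 1")

    # Size of one HR patch (assumes the image divides evenly into the grid).
    patch_h = hr_height // grid_rows
    patch_w = hr_width // grid_cols
    hr_patch_pixels = patch_h * patch_w

    # The same patch after down-sampling along both axes: the pixel count
    # shrinks by roughly downsample**2, which is the quadratic effect.
    lr_patch_pixels = (patch_h // downsample) * (patch_w // downsample)

    # Deduplicate and validate the selected (row, col) patch indices.
    chosen = set(selected)
    for row, col in chosen:
        if not (0 <= row < grid_rows and 0 <= col < grid_cols):
            raise ValueError(f"patch index out of range: {(row, col)}")

    n_hr = len(chosen)
    n_lr = grid_rows * grid_cols - n_hr

    total = n_hr * hr_patch_pixels + n_lr * lr_patch_pixels
    full_hr = grid_rows * grid_cols * hr_patch_pixels
    return total, total / full_hr


if __name__ == "__main__":
    # Assumed example: 1024x1024 image, 4x4 grid, two patches flagged as
    # containing small objects, background processed at 1/4 resolution.
    total, ratio = processed_pixel_count(1024, 1024, 4, 4, [(0, 1), (2, 3)], 4)
    print(total, ratio)
```

Under these assumed numbers, only two of the sixteen patches are processed at HR, so the pixel budget is a small fraction of full HR processing. The figures are illustrative and are not results from the paper.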
Pages: 315-328
Page count: 14