Enhancing Representation Learning With Spatial Transformation and Early Convolution for Reinforcement Learning-Based Small Object Detection

Cited by: 5
Authors
Fang, Fen [1]
Liang, Wenyu [1]
Cheng, Yi [1]
Xu, Qianli [1]
Lim, Joo-Hwee [1, 2]
Affiliations
[1] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
Keywords
Detectors; Training; Object detection; Feature extraction; Costs; Transformers; Prediction algorithms; Small object detection; reinforcement learning; coarse-to-fine framework
DOI
10.1109/TCSVT.2023.3284453
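Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract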
Although object detection has made significant progress over the past decade, small object detection remains far from satisfactory because of the high variability of object scales and complex backgrounds. The common way to improve small object detection is to use high-resolution (HR) images. However, this approach incurs a large computational cost, which grows quadratically with image resolution. To achieve both accuracy and efficiency, we propose a novel reinforcement learning (RL) framework that employs an efficient policy network consisting of a Spatial Transformation Network, which enhances state representation learning, and a Transformer model with early convolution, which improves feature extraction. Our method has two main steps: (1) coarse location query (CLQ), where an RL agent is trained to predict the locations of small objects on low-resolution (LR) images (down-sampled versions of the HR images); (2) context-sensitive object detection, where HR image patches are used to detect objects at the selected coarse locations and LR image patches are used on background areas (containing no small objects). In this way, we obtain high detection performance on small objects while avoiding unnecessary computation on background areas. The proposed method has been tested and benchmarked on various datasets. On the Caltech Pedestrians Detection and Web Pedestrians datasets, the proposed method improves detection accuracy by 2% while reducing the number of processed pixels. On the Vision meets Drone object detection dataset and the Oil and Gas Storage Tank dataset, the proposed method outperforms the state-of-the-art (SotA) methods. On the MS COCO mini-val set, our method outperforms SotA methods on small object detection while achieving comparable performance on medium and large objects.
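To make the efficiency argument in the abstract concrete, here is a minimal sketch of the pixel accounting behind the coarse-to-fine idea. It is not the authors' implementation: the patch grid, the `processed_pixels` helper, the down-sampling factor, and the hand-picked `selected` set (standing in for the RL agent's output) are all assumptions made for illustration.

```python
def processed_pixels(hr_size, grid, selected, downsample):
    """Count pixels processed when selected patches use HR and the rest use LR.

    hr_size:    (height, width) of the high-resolution image
    grid:       (rows, cols) of a hypothetical patch grid
    selected:   set of (row, col) patches flagged as containing small objects
    downsample: integer factor between HR and LR along each side
    """
    height, width = hr_size
    rows, cols = grid
    patch_h, patch_w = height // rows, width // cols
    hr_patch = patch_h * patch_w
    # Shrinking each side by `downsample` cuts the pixel count quadratically.
    lr_patch = (patch_h // downsample) * (patch_w // downsample)
    n_hr = len(selected)
    n_lr = rows * cols - n_hr
    return n_hr * hr_patch + n_lr * lr_patch


hr_size = (1024, 1024)
grid = (4, 4)
selected = {(0, 1), (2, 3)}  # stand-in for the agent's choice, not a trained policy

mixed = processed_pixels(hr_size, grid, selected, downsample=4)
full = hr_size[0] * hr_size[1]
print(f"processed {mixed} of {full} pixels ({mixed / full:.1%})")
```

Under these assumed numbers, running the detector at HR on only 2 of 16 patches processes well under a fifth of the pixels of a full HR pass. The actual savings reported in the paper depend on the datasets and the learned policy.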
Pages: 315-328
Number of pages: 14