Enhancing Representation Learning With Spatial Transformation and Early Convolution for Reinforcement Learning-Based Small Object Detection

Cited by: 5
Authors
Fang, Fen [1]
Liang, Wenyu [1]
Cheng, Yi [1]
Xu, Qianli [1]
Lim, Joo-Hwee [1, 2]
Affiliations
[1] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
Keywords
Detectors; Training; Object detection; Feature extraction; Costs; Transformers; Prediction algorithms; Small object detection; reinforcement learning; coarse-to-fine framework
DOI
10.1109/TCSVT.2023.3284453
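Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract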
Although object detection has made significant progress over the past decade, small object detection remains far from satisfactory because of the high variability of object scales and complex backgrounds. A common way to improve small object detection is to use high-resolution (HR) images. However, this approach incurs a heavy computational cost that grows quadratically with image resolution. To achieve both accuracy and efficiency, we propose a novel reinforcement learning (RL) framework that employs an efficient policy network consisting of a Spatial Transformation Network to enhance state representation learning and a Transformer model with early convolution to improve feature extraction. Our method has two main steps: (1) coarse location query (CLQ), where an RL agent is trained to predict the locations of small objects on low-resolution (LR) images (down-sampled versions of the HR images); (2) context-sensitive object detection, where HR image patches are used to detect objects at the selected coarse locations and LR image patches are used on background areas (containing no small objects). In this way, we obtain high detection performance on small objects while avoiding unnecessary computation on background areas. The proposed method has been tested and benchmarked on various datasets. On the Caltech Pedestrians Detection and Web Pedestrians datasets, the proposed method improves detection accuracy by 2% while reducing the number of processed pixels. On the Vision meets Drone object detection dataset and the Oil and Gas Storage Tank dataset, the proposed method outperforms state-of-the-art (SotA) methods. On the MS COCO mini-val set, our method outperforms SotA methods on small object detection while achieving comparable performance on medium and large objects.
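The pixel saving behind the two-step scheme in the abstract can be illustrated with a small calculation. This is only a sketch under stated assumptions, not the authors' implementation: the function name `processed_pixels`, the 4 x 4 patch grid, the down-sampling factor of 4, and the hand-written selection mask (standing in for the output of the RL agent) are all hypothetical.

```python
from typing import List, Tuple


def processed_pixels(
    hr_size: Tuple[int, int],
    grid: Tuple[int, int],
    selected: List[List[bool]],
    scale: int,
) -> int:
    """Count the pixels a detector would see when selected patches are
    processed at high resolution (HR) and all others at low resolution (LR).

    hr_size  -- (height, width) of the HR image, assumed divisible by the grid
    grid     -- (rows, cols) of the patch grid (hypothetical choice)
    selected -- selected[r][c] is True if the patch is marked as likely
                containing small objects (here a hand-written stand-in
                for the agent's coarse location query)
    scale    -- down-sampling factor per side between HR and LR
    """
    height, width = hr_size
    rows, cols = grid
    patch_h, patch_w = height // rows, width // cols
    hr_patch = patch_h * patch_w
    # Down-sampling each side by `scale` divides the pixel count by scale**2.
    lr_patch = (patch_h // scale) * (patch_w // scale)

    total = 0
    for r in range(rows):
        for c in range(cols):
            total += hr_patch if selected[r][c] else lr_patch
    return total


# Hypothetical example: a 1024 x 1024 image, 4 x 4 grid, 3 patches selected.
mask = [[False] * 4 for _ in range(4)]
mask[0][1] = mask[2][2] = mask[3][0] = True

mixed = processed_pixels((1024, 1024), (4, 4), mask, scale=4)
full_hr = 1024 * 1024
print(mixed, full_hr, mixed / full_hr)
```

With these assumed numbers, three HR patches plus thirteen LR patches amount to roughly a quarter of the pixels of the full HR image, which is the kind of saving the abstract refers to when it mentions reducing the number of processed pixels.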
Pages: 315-328
Page count: 14