Enhancing Representation Learning With Spatial Transformation and Early Convolution for Reinforcement Learning-Based Small Object Detection

Cited by: 5

Authors:

Fang, Fen [1]

Liang, Wenyu [1]

Cheng, Yi [1]

Xu, Qianli [1]

Lim, Joo-Hwee [1,2]

Affiliations:

[1] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore

[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Vol. 34 / No. 01

Keywords:

Detectors; Training; Object detection; Feature extraction; Costs; Transformers; Prediction algorithms; Small object detection; reinforcement learning; coarse-to-fine framework;

DOI:

10.1109/TCSVT.2023.3284453

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification code:

0808 ; 0809 ;

Abstract:

Although object detection has made significant progress over the past decade, small object detection remains far from satisfactory because of the high variability of object scales and complex backgrounds. A common way to improve small object detection is to use high-resolution (HR) images. However, this approach incurs a large computational cost that grows quadratically with image resolution. To achieve both accuracy and efficiency, we propose a novel reinforcement learning (RL) framework that employs an efficient policy network consisting of a Spatial Transformation Network, which enhances state representation learning, and a Transformer model with early convolution, which improves feature extraction. Our method has two main steps: (1) coarse location query (CLQ), where an RL agent is trained to predict the locations of small objects on low-resolution (LR) images (down-sampled versions of the HR images); and (2) context-sensitive object detection, where HR image patches are used to detect objects at the selected coarse locations and LR image patches are used on background areas (containing no small objects). In this way, we obtain high detection performance on small objects while avoiding unnecessary computation on background areas. The proposed method has been tested and benchmarked on various datasets. On the Caltech Pedestrians Detection and Web Pedestrians datasets, the proposed method improves the detection accuracy by 2% while reducing the number of processed pixels. On the Vision meets Drone object detection dataset and the Oil and Gas Storage Tank dataset, the proposed method outperforms the state-of-the-art (SotA) methods. On the MS COCO mini-val set, our method outperforms SotA methods on small object detection while achieving comparable performance on medium and large objects.
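The two-step idea in the abstract can be illustrated with a small pixel-budget sketch. This is not the authors' implementation: the per-cell scores stand in for whatever the RL agent would output in the CLQ step, and the function name `plan_patches`, the fixed threshold, the 256-pixel patch size, and the 4x down-sampling factor are all assumptions made for illustration. The sketch only shows why routing background cells to LR patches reduces the number of processed pixels, given that pixel count scales with the square of the side length.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PatchPlan:
    use_hr: List[List[bool]]  # True where a cell is sent to the HR detector
    processed_pixels: int     # pixels actually passed to the detectors
    full_hr_pixels: int       # pixels if every cell were processed at HR


def plan_patches(scores, hr_patch=256, downsample=4, threshold=0.5):
    """Pick HR or LR processing per grid cell from coarse location scores.

    `scores` is a 2D list of hypothetical per-cell values in [0, 1];
    in the paper this decision comes from a trained RL policy, which
    is replaced here by a plain threshold (an assumption).
    """
    if hr_patch % downsample != 0:
        raise ValueError("hr_patch must be divisible by downsample")
    lr_patch = hr_patch // downsample  # side length of the LR version of a cell

    use_hr = [[s >= threshold for s in row] for row in scores]
    n_cells = sum(len(row) for row in scores)
    n_hr = sum(cell for row in use_hr for cell in row)

    # Pixel count is quadratic in side length, so an LR cell costs
    # 1 / downsample**2 of an HR cell.
    processed = n_hr * hr_patch ** 2 + (n_cells - n_hr) * lr_patch ** 2
    return PatchPlan(use_hr, processed, n_cells * hr_patch ** 2)


# One of four cells is flagged as likely to contain small objects.
plan = plan_patches([[0.9, 0.1], [0.2, 0.3]])
ratio = plan.processed_pixels / plan.full_hr_pixels
```

With one HR cell out of four, the plan processes 77,824 pixels instead of 262,144, roughly 30% of the full-HR cost under these assumed sizes.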


Pages: 315-328

Number of pages: 14

