Rapid Search for Small Object in Reinforcement Learning by Combining Spatio-Temporal Contextual Information

Cited by: 0
Authors
Jiang H. [1 ]
Ma J.-J. [1 ]
Yao H.-G. [1 ]
Cheng S.-Y. [2 ]
Chen Y. [2 ]
Yu J. [1 ]
Affiliations
[1] School of Computer Science and Engineering, Xi’an Technological University, Shaanxi, Xi’an
[2] Aeronautics Engineering College, Air Force Engineering University, Shaanxi, Xi’an
Source
Tien Tzu Hsueh Pao/Acta Electronica Sinica | 2023, Vol. 51, No. 11
Keywords
human eye saccade; location context; reinforcement learning; small object detection; temporal context;
DOI
10.12263/DZXB.20220617
Abstract
When searching for an object, the human eye first scans roughly, guided by previous scanning experience, to find likely locations for the object, and then searches those locations in detail. The former can be described as scanning based on temporal contextual information, and the latter as searching based on location contextual information. Inspired by this, this paper proposes a rapid search method for small objects based on reinforcement learning that integrates spatio-temporal context information. The method builds a temporal context module on a reinforcement learning search strategy to simulate the human eye's ability to acquire and use empirical information, and constructs an adaptive multi-scale window that extracts location context information to simulate the human eye's careful search at likely locations. The two kinds of information cooperate alternately during the search process to locate the object. Experimental results show that the proposed algorithm brings a gain of about 2.9% on the MS COCO benchmark and can find an object within five search steps. © 2023 Chinese Institute of Electronics. All rights reserved.
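The alternating coarse/fine loop the abstract describes can be illustrated with a minimal sketch. This is not the paper's implementation: the learned temporal-context module is replaced by a stand-in `policy` callable that proposes coarse locations from "experience", and the adaptive multi-scale window is approximated by a fixed set of window sizes checked around each proposal; `rl_search`, `policy`, and the scale values are all hypothetical names chosen for illustration.

```python
def rl_search(target, policy, max_steps=5, scales=(32, 64, 128)):
    """Illustrative coarse-to-fine object search (conceptual sketch only).

    policy(step) -> (x, y): stand-in for the temporal-context module,
    proposing a coarse location from prior scanning experience.
    The inner loop mimics the multi-scale location-context window:
    progressively larger windows are checked around the proposal.
    Returns (number of search steps used, (x, y, window_size)) on
    success, or None if the object is not found within max_steps.
    """
    tx, ty = target
    for step in range(max_steps):
        cx, cy = policy(step)                 # coarse scan (temporal context)
        for s in scales:                      # fine search (location context)
            if abs(tx - cx) <= s and abs(ty - cy) <= s:
                return step + 1, (cx, cy, s)  # found within this window
    return None


# Usage: a toy "experience-based" policy that visits likely spots in order.
spots = [(100, 100), (300, 100), (100, 300), (300, 300), (200, 200)]
result = rl_search(target=(310, 95), policy=lambda step: spots[step])
# result -> (2, (300, 100, 32)): found on the second proposal,
# inside the smallest window, within the five-step budget.
```

The point of the sketch is the division of labor: the policy supplies cheap global proposals, and the windowed check does the expensive local verification, so the loop terminates after few proposals when the policy's "experience" is good.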
Pages: 3176-3186 (10 pages)
References (32 total)
[1]  
MEYER A F, O'KEEFE J, POORT J., Two distinct types of eye-head coupling in freely moving mice, Current Biology, 30, 11, pp. 2116-2130, (2020)
[2]  
MNIH V, KAVUKCUOGLU K, SILVER D, Et al., Human-level control through deep reinforcement learning, Nature, 518, 7540, pp. 529-533, (2015)
[3]  
LIU S, QI L, QIN H F, Et al., Path aggregation network for instance segmentation, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8759-8768, (2018)
[4]  
LENG J X, REN Y H, JIANG W X, Et al., Realize your surroundings: Exploiting context information for small object detection, Neurocomputing, 433, pp. 287-299, (2021)
[5]  
EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, Et al., The pascal visual object classes (VOC) challenge, International Journal of Computer Vision, 88, 2, pp. 303-338, (2010)
[6]  
LIN T Y, MAIRE M, BELONGIE S, Et al., Microsoft COCO: Common objects in context, European Conference on Computer Vision, pp. 740-755, (2014)
[7]  
REN S Q, HE K M, GIRSHICK R, Et al., Faster R-CNN: Towards real-time object detection with region proposal networks, Proceedings of the 28th International Conference on Neural Information Processing Systems, pp. 91-99, (2015)
[8]  
REDMON J, DIVVALA S, GIRSHICK R, Et al., You only look once: Unified, real-time object detection, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779-788, (2016)
[9]  
LIU W, ANGUELOV D, ERHAN D, Et al., SSD: Single shot MultiBox detector, European Conference on Computer Vision, pp. 21-37, (2016)
[10]  
LI B Q, HE Y Y, QIANG W, Et al., SSD with parallel additional feature extraction network for ground small target detection, Acta Electronica Sinica, 48, 1, pp. 84-91, (2020)