Pyramid attention object detection network with multi-scale feature fusion

Cited by: 8
Authors
Chen, Xiu [1]
Li, Yujie [1, 2]
Nakatoh, Yoshihisa [2]
Affiliations
[1] Yangzhou Univ, Sch Informat Engn, Yangzhou, Peoples R China
[2] Kyushu Inst Technol, Sch Engn, Kitakyushu, Japan
Keywords
Multi-scale features; Small objects; Object detection; Contextual information; Feature pyramid; Attention module;
DOI
10.1016/j.compeleceng.2022.108436
CLC number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
With the development of deep learning, object detection has made substantial progress. However, when an object in the image is small or partially occluded, detection networks often fail to detect it. We propose a multi-scale feature fusion pyramid attention module that combines global average pooling results at multiple scales with the upper features in the residual blocks of the feature extraction network to obtain more spatial context information from the original feature map. We integrate the proposed module into YOLOv3 and conduct experiments on the PASCAL VOC and MS COCO datasets. The experimental results show that the attention module helps the network detect small objects and accurately detect partially occluded objects.
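The abstract gives no layer-level details, so the following is only a minimal sketch of the general idea of pooling a feature map at several scales and using the result to reweight it. It is not the authors' module. The function names, the pooling scales (1, 2, 4), and the mean-plus-sigmoid gate (a stand-in for whatever learned layers the paper uses) are all assumptions made for illustration.

```python
import math


def adaptive_avg_pool(channel, bins):
    """Average-pool one H x W channel (list of rows) into bins x bins values."""
    h, w = len(channel), len(channel[0])
    pooled = []
    for i in range(bins):
        # Window bounds follow the usual adaptive-pooling rule: floor / ceil.
        r0, r1 = (i * h) // bins, math.ceil((i + 1) * h / bins)
        for j in range(bins):
            c0, c1 = (j * w) // bins, math.ceil((j + 1) * w / bins)
            cells = [channel[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            pooled.append(sum(cells) / len(cells))
    return pooled


def pyramid_attention(features, scales=(1, 2, 4)):
    """Reweight each channel of a C x H x W map using multi-scale pooled statistics.

    ASSUMPTION: the gate below (mean of the pooled descriptor, then sigmoid)
    replaces the learned layers of a real attention module.
    """
    out = []
    for channel in features:
        # Concatenate pooled values from every pyramid level into one descriptor.
        descriptor = []
        for bins in scales:
            descriptor.extend(adaptive_avg_pool(channel, bins))
        score = sum(descriptor) / len(descriptor)
        gate = 1.0 / (1.0 + math.exp(-score))
        # Fuse the gate back into the original features by scaling them.
        out.append([[value * gate for value in row] for row in channel])
    return out


if __name__ == "__main__":
    feature_map = [
        [[1.0] * 4 for _ in range(4)],  # channel with uniform activation
        [[0.0] * 4 for _ in range(4)],  # channel with no activation
    ]
    weighted = pyramid_attention(feature_map)
    print(weighted[0][0][0], weighted[1][0][0])
```

Pooling at a 1 x 1 grid captures a global summary, while finer grids keep coarse spatial layout; concatenating them is one common way to give an attention gate some spatial context.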
Pages: 10