FE-CSP: a fast and efficient pedestrian detector with center and scale prediction

Cited by: 5
Authors
Qin, Yugang [1]
Qian, Yurong [1, 2, 3]
Wei, Hongyang [1]
Fan, Yingying [2]
Feng, Peiyun [1]
Affiliations
[1] Xinjiang Univ, Sch Software, Urumqi 830000, Peoples R China
[2] Key Lab Signal Detect & Proc Xinjiang Uygur Auton, Urumqi 830000, Peoples R China
[3] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Pedestrian detection; Channel attention; Spatial attention; Deformable convolution; Feature pyramid network; ATTENTION;
DOI
10.1007/s11227-022-04815-7
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Pedestrian detection still faces several pressing problems: severe occlusion between pedestrians, difficulty detecting small objects, and low detection speed. In this paper, we propose FE-CSP, a fast and efficient pedestrian detector with center and scale prediction. We combine channel attention with spatial attention, replace traditional convolution with deformable convolution, and embed them in the backbone network to form CSANet (Channel and Spatial Attention Network), which efficiently extracts the semantic features of the object. We then use a feature pyramid network in place of the traditional concatenation for multi-scale feature detection, which effectively improves detection speed. On CityPersons, our method achieves 10.1%, 13.7% and 47.4% MR^{-2} at 0.21 s/img on the Reasonable, Small and Heavy settings, respectively. On Caltech, it achieves 5.2% MR^{-2} at 0.06 s/img on the Reasonable setting, further demonstrating the superiority and generalization ability of the proposed method.
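The MR-2 figures in the abstract (conventionally written MR^{-2}) refer to the log-average miss rate used by the Caltech and CityPersons benchmarks: the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space between 10^-2 and 10^0. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name and the sample curves are hypothetical, and the convention of using a miss rate of 1.0 when the curve has no point at or below a reference FPPI is an assumption borrowed from common evaluation scripts.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI references.

    fppi      -- FPPI values of the curve, sorted in ascending order
    miss_rate -- miss rate at each FPPI value (same length as fppi)
    """
    # Reference FPPI values: 10^-2, 10^-1.75, ..., 10^0 for num_points=9.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]

    sampled = []
    for ref in refs:
        # Miss rate at the last operating point whose FPPI does not exceed ref.
        below = [mr for f, mr in zip(fppi, miss_rate) if f <= ref]
        # Assumption: if the curve starts beyond ref, count it as a miss rate of 1.0.
        sampled.append(below[-1] if below else 1.0)

    # Geometric mean, computed in log space; clamp to avoid log(0).
    log_sum = sum(math.log(max(m, 1e-10)) for m in sampled)
    return math.exp(log_sum / len(sampled))


# Hypothetical curves, for illustration only.
flat_curve = log_average_miss_rate([0.001, 1.0], [0.1, 0.1])
late_curve = log_average_miss_rate([0.5], [0.25])
```

A lower value is better. A detector that reaches a low miss rate only at high FPPI is penalized, because the low-FPPI reference points fall back to high miss rates.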
Pages: 4084-4104
Page count: 21