Pedestrian Detection Using Stationary Wavelet Dilated Residual Super-Resolution

Cited by: 42
Authors
Hsu, Wei-Yen [1, 2]
Chen, Pei-Ci [3]
Affiliations
[1] Natl Chung Cheng Univ, Dept Informat Management, Adv Inst Mfg High Tech Innovat, Chiayi 62102, Taiwan
[2] Natl Chung Cheng Univ, Ctr Innovat Res Aging Soc CIRAS, Chiayi 62102, Taiwan
[3] Natl Chung Cheng Univ, Dept Informat Management, Chiayi 62102, Taiwan
Keywords
Feature extraction; Object detection; Detectors; Convolutional neural networks; Superresolution; Interpolation; Image edge detection; Low resolution (LR); pedestrian detection; small size; super resolution; wavelet transform; you only look once version 4 (YOLOv4); YOLO
DOI
10.1109/TIM.2022.3142061
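Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809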
Abstract
Pedestrian detection has become an important subject in applications such as automobile autopilot assistance and pedestrian tracking in surveillance systems. A number of powerful object detectors have been applied to pedestrian detection to handle variations in pedestrians' posture, stature, clothing, viewing angle, occlusion, and illumination; however, detecting pedestrians in small-size and low-resolution (LR) images remains a challenge. To cover a wider scene, surveillance cameras are usually mounted high above the ground, so the captured pedestrian images are too small or too low in resolution. When these blurred or noisy images are fed into an object detector, pedestrian features are extracted poorly, which leads to a relatively low detection rate. To solve this problem, this study proposes a novel stationary wavelet dilated residual super-resolution (SWDR-SR) network that greatly enhances the edge information of SR images and thereby improves pedestrian detection. The LR image is decomposed with the stationary wavelet transform (SWT) into low- and high-frequency sub-images, which are then processed to restore the structure of the low-frequency content and the details of the high-frequency information, so that the reconstructed SR image retains the edge details that pedestrian detection relies on. In addition, this study proposes a novel low-to-high frequency connection (L2HFC) to better retain edge details and further enhance pedestrian detection. Finally, you only look once version 4 (YOLOv4) is used for pedestrian detection on the SR images to validate the performance of the proposed method. Experimental results on seven common datasets indicate that the proposed SWDR-SR method is effective for pedestrian detection in small-size and LR images.
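The SWT decomposition step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single-level, unnormalized Haar filter pair with periodic boundaries, and the names `haar_swt2_level1`, `reconstruct`, and `lr_image` are hypothetical. The point it shows is that a stationary (undecimated) transform keeps every sub-image at the input resolution, which is why edge detail in the high-frequency sub-images stays spatially aligned with the low-frequency content.

```python
import numpy as np


def haar_swt2_level1(image):
    """One level of an undecimated Haar transform (illustrative assumption).

    Uses averaging/differencing filters with periodic boundaries and no
    downsampling, so all four sub-bands have the same shape as the input.
    """
    x = np.asarray(image, dtype=float)

    def low(a, axis):
        # Low-pass: average each sample with its circular neighbour.
        return (a + np.roll(a, -1, axis=axis)) / 2.0

    def high(a, axis):
        # High-pass: half-difference between each sample and its neighbour.
        return (a - np.roll(a, -1, axis=axis)) / 2.0

    lo_w, hi_w = low(x, 1), high(x, 1)  # filter along the width axis first
    return {
        "ll": low(lo_w, 0),   # low-frequency sub-image (coarse structure)
        "lh": high(lo_w, 0),  # high-frequency detail along the height axis
        "hl": low(hi_w, 0),   # high-frequency detail along the width axis
        "hh": high(hi_w, 0),  # high-frequency detail along both axes
    }


def reconstruct(bands):
    """Invert the transform above: with these filters the bands sum to the input."""
    return bands["ll"] + bands["lh"] + bands["hl"] + bands["hh"]


# Hypothetical 8x8 "LR image" used only to exercise the functions.
lr_image = np.arange(64, dtype=float).reshape(8, 8)
bands = haar_swt2_level1(lr_image)
restored = reconstruct(bands)
```

In the pipeline the abstract describes, a network would process the low- and high-frequency sub-images before reconstruction; here they are passed through unchanged, so `restored` simply equals the input. A wavelet library would normally use normalized filters and support multiple levels, which this sketch omits.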
Pages: 11