Adaptive Fusion of Multi-Scale YOLO for Pedestrian Detection

Cited by: 54
Authors
Hsu, Wei-Yen [1, 2, 3]
Lin, Wen-Yen [1]
Affiliations
[1] Natl Chung Cheng Univ, Dept Informat Management, Chiayi 62102, Taiwan
[2] Natl Chung Cheng Univ, Adv Inst Mfg High Tech Innovat, Chiayi 62102, Taiwan
[3] Natl Chung Cheng Univ, Ctr Innovat Res Aging Soc CI RAS, Chiayi 62102, Taiwan
Keywords
Pedestrian detection; multi-scale YOLO; adaptive fusion
DOI
10.1109/ACCESS.2021.3102600
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Although pedestrian detection technology is constantly improving, the task remains challenging because pedestrians vary widely in scale and are often occluded. This study follows the common framework of single-shot object detection and proposes a divide-and-rule method to address these problems. The proposed model introduces a segmentation function that splits an image whose pedestrians do not overlap into two subimages. A network then performs multiresolution adaptive fusion on the outputs for the full images and the subimages to generate the final detection result. The model was evaluated extensively on several challenging pedestrian detection data sets, and the results demonstrate its effectiveness. In particular, it achieved state-of-the-art performance on data sets from Visual Object Classes 2012 (VOC 2012), the French Institute for Research in Computer Science and Automation (INRIA), and the Swiss Federal Institute of Technology in Zurich (ETH), and it obtained highly competitive results in a triple-width VOC 2012 experiment designed for this study.
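The split-then-fuse idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the segmentation function or the fusion network, so the function names here (`find_split_column`, `fuse`) are hypothetical, the split is a simple vertical cut through a gap between boxes, and plain greedy non-maximum suppression stands in for the learned adaptive fusion.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]         # (x1, y1, x2, y2)
Det = Tuple[float, float, float, float, float]  # (x1, y1, x2, y2, score)


def find_split_column(boxes: List[Box], width: float) -> Optional[float]:
    """Hypothetical split rule: return the x position closest to the image
    centre that lies in a horizontal gap between boxes, or None if the
    boxes leave no such gap."""
    spans = sorted((x1, x2) for x1, _, x2, _ in boxes)
    gaps = []
    reach = None  # right-most edge covered so far
    for x1, x2 in spans:
        if reach is not None and x1 > reach:
            gaps.append((reach + x1) / 2.0)  # midpoint of the free gap
        reach = x2 if reach is None else max(reach, x2)
    if not gaps:
        return None
    return min(gaps, key=lambda x: abs(x - width / 2.0))


def iou(a, b) -> float:
    """Intersection over union of two boxes (extra fields are ignored)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(full: List[Det], left: List[Det], right: List[Det],
         split_x: float, iou_thr: float = 0.5) -> List[Det]:
    """Stand-in for adaptive fusion: pool detections from the full image
    and both subimages, then apply greedy NMS. Right-subimage boxes are
    shifted back into full-image coordinates first."""
    shifted = [(x1 + split_x, y1, x2 + split_x, y2, s)
               for x1, y1, x2, y2, s in right]
    pooled = sorted(list(full) + list(left) + shifted,
                    key=lambda d: d[4], reverse=True)
    kept: List[Det] = []
    for det in pooled:
        if all(iou(det, k) < iou_thr for k in kept):
            kept.append(det)
    return kept
```

In this sketch a detector would be run three times (full image, left subimage, right subimage) and `fuse` would merge the results, so that a pedestrian detected with higher confidence in a subimage replaces the weaker full-image detection of the same person.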
Pages: 110063-110073
Page count: 11