Design of Efficient Human Head Statistics System in the Large-Angle Overlooking Scene

Cited by: 6

Authors:

Wang, An [1]

Cao, Xiaohong [1]

Lu, Lei [1]

Zhou, Xinjing [2]

Sun, Xuecheng [3]

Affiliations:

[1] Shanghai Feilo Acoust Co Ltd, Shanghai 200233, Peoples R China

[2] Shanghai Univ, Microelect Res & Dev Ctr, Shanghai 200444, Peoples R China

[3] Shanghai Univ, Dept Mechatron Engn & Automat, Shanghai 200444, Peoples R China

Source:

ELECTRONICS | 2021 / Vol. 10 / Issue 15

Keywords:

YOLOv5-H; DeepSORT-FH; scene segmentation; cross-boundary counting; CIoU; overlooking scene

DOI:

10.3390/electronics10151851

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Human head statistics are widely used in the construction of smart cities and have considerable market value. To address missing pedestrian features and poor counting results in large-angle overlooking scenes, this paper proposes a human head statistics system consisting of head detection, head tracking, and head counting. The proposed You-Only-Look-Once-Head (YOLOv5-H) network, improved from YOLOv5, serves as the head detection benchmark; the DeepSORT algorithm with the Fusion-Hash algorithm for feature extraction (DeepSORT-FH) is proposed to track heads; and heads are counted by the proposed cross-boundary counting algorithm based on scene segmentation. Specifically, Complete-Intersection-over-Union (CIoU) is used as the loss function of YOLOv5-H so that the predicted boxes align more closely with the ground-truth boxes. The results show that the recall and mAP@.5 of the proposed YOLOv5-H reach up to 94.3% and 93.1%, respectively, on the SCUT_HEAD dataset. The statistics system achieves a low error rate of 3.5% on the TownCentreXVID dataset while maintaining a frame rate of 18 FPS, which meets the needs of human head statistics in monitoring scenarios and shows good application prospects.
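For readers unfamiliar with the CIoU loss named in the abstract, the following is a minimal sketch of the standard CIoU definition for a single pair of axis-aligned boxes. It is not the authors' implementation: the function name `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """Return 1 - CIoU for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; the box layout and eps
    are assumptions, not taken from the paper.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU: overlap area divided by union area.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers.
    rho2 = ((px1 + px2 - tx1 - tx2) ** 2 + (py1 + py2 - ty1 - ty2) ** 2) / 4.0

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    ciou = iou - rho2 / c2 - alpha * v
    return 1.0 - ciou
```

Compared with plain IoU, the center-distance and aspect-ratio penalties keep the loss informative even when the predicted and ground-truth boxes do not overlap, which is one common motivation for choosing CIoU in detectors.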

Pages: 10
