Detection of construction workers under varying poses and changing background in image sequences via very deep residual networks

Cited by: 102

Authors:

Son, Hyojoo [1]

Choi, Hyunchul [1]

Seong, Hyeonwoo [1]

Kim, Changwan [1]

Affiliations:

[1] Chung Ang Univ, Dept Architectural Engn, Seoul 06974, South Korea

Source:

AUTOMATION IN CONSTRUCTION | 2019 / Vol. 99

Funding:

National Research Foundation of Singapore;

Keywords:

Construction worker detection; Deep learning; Two-stage approach; Very deep residual networks; PERFORMANCE EVALUATION; PEDESTRIAN DETECTION; OBJECT DETECTION; VISION; TRACKING; RECOGNITION; EQUIPMENT; SEGMENTATION; FRAMEWORK; SYSTEM;

DOI:

10.1016/j.autcon.2018.11.033

Chinese Library Classification (CLC) code:

TU [Building Science];

Discipline classification code:

0813;

Abstract:

Analyzing the location and behavior of construction workers using construction site images has been recognized as a means of providing useful information for safety management and productivity analysis. Although effective utilization of analyzed image data requires accurate and timely detection of workers in complex, continuously changing working environments, previous methods for detecting construction workers still require improvement because of their poor detection performance. This study proposes the use of very deep residual networks to accurately and rapidly detect construction workers under varying poses and against changing backgrounds in image sequences. The architecture for construction worker detection in this study is based on convolutional neural networks (CNNs). The proposed method is divided into two stages: extracting feature maps via very deep residual networks (ResNet-152), and bounding box regression and labeling from the original image via Faster R-CNN (regions with CNN features). The experiments were conducted at actual construction sites by acquiring 1.3-megapixel and 3.1-megapixel images from a movable digital camera to verify the proposed method for images from fixed and moving cameras. On 3241 images, Faster R-CNN with ResNet-152 achieved accuracy, precision, and recall rates of 94.3%, 96.03%, and 98.13%, respectively. The proposed method required 0.2 s per frame (i.e., 5 frames per second) on average. The results show that it is possible to accurately and rapidly detect multiple workers in construction site images by employing very deep residual networks without relying on restrictive assumptions about workers' postures, appearance, and background.
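The abstract does not say how accuracy is computed. A minimal sketch, assuming accuracy is defined without true negatives as TP / (TP + FP + FN) (a common convention in object detection, but an assumption here), shows that the reported precision and recall are consistent with the reported accuracy, and that 0.2 s per frame corresponds to 5 frames per second. The function names are hypothetical and used only for illustration.

```python
def accuracy_from_precision_recall(precision: float, recall: float) -> float:
    """Accuracy under the ASSUMED definition TP / (TP + FP + FN).

    With precision = TP / (TP + FP) and recall = TP / (TP + FN):
        (TP + FP + FN) / TP = 1 / precision + 1 / recall - 1
    """
    return 1.0 / (1.0 / precision + 1.0 / recall - 1.0)


def frames_per_second(seconds_per_frame: float) -> float:
    """Convert average per-frame processing time into throughput."""
    return 1.0 / seconds_per_frame


# Values reported in the abstract.
accuracy = accuracy_from_precision_recall(precision=0.9603, recall=0.9813)
fps = frames_per_second(0.2)

print(f"accuracy under the assumed definition: {accuracy:.3f}")
print(f"throughput: {fps:.1f} frames per second")
```

Under this assumed definition the derived accuracy rounds to 0.943, which agrees with the reported 94.3%. The agreement suggests the definition is plausible, but it does not confirm that the authors used it.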


Pages: 27-38

Page count: 12

