Single- and multi-label classification of construction objects using deep transfer learning methods

Cited by: 0

Authors:

Nath N.D. [1]

Chaspari T. [1]

Behzadan A.H. [1]

Affiliations:

[1] Texas A&M University, United States

Source:

Journal of Information Technology in Construction | 2019 / Vol. 24

Keywords:

Construction photos; Convolutional neural networks; Deep learning; Multi-class classification; Multi-label classification; Transfer learning; Web mining

DOI:

10.36680/J.ITCON.2019.028


Abstract:

Digital images are extensively used to increase the accuracy and timeliness of progress reports, safety training, requests for information (RFIs), productivity monitoring, and claims and litigation. While these images can be sorted using date and time tags, the task of searching an image dataset for specific visual content is not trivial. In pattern recognition, generating metadata tags describing image contents (objects, scenes) or appearance (colors, context) is referred to as multi-label image annotation. Given the large number and diversity of construction images, it is desirable to generate image tags automatically. Previous work has applied pattern matching to synthetic images or images obtained from constrained settings. In this paper, we present deep learning (particularly, transfer learning) algorithms to annotate construction imagery from unconstrained real-world settings with high fidelity. We propose convolutional neural network (CNN)-based algorithms which take RGB values as input and output the labels of detected objects. Particularly, we have investigated two categories of classification tasks: single-label classification, i.e., a single class (among multiple predefined classes) is assigned to an image, and multi-label classification, i.e., a set of (one or more) classes is assigned to an image. For both cases, the VGG-16 model, pre-trained on the ImageNet dataset, is trained on construction images retrieved with web mining techniques and labeled by human annotators. Testing the trained model on previously unseen photos yields an accuracy of ~90% for single-label classification and ~85% for multi-label classification, indicating the high sensitivity and specificity of the designed methodology in reliably identifying the contents of construction imagery.

COPYRIGHT: © 2019 The author(s). This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
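The difference between the two tasks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the raw scores, and the 0.5 threshold are hypothetical, and the usual convention is assumed (softmax with argmax for single-label output, an independent sigmoid per class for multi-label output). In practice the scores would come from the final layer of a fine-tuned network such as VGG-16.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (classes compete)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Squash one raw score into an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def single_label(logits, classes):
    """Single-label: assign exactly one class, the most probable one."""
    probs = softmax(logits)
    best = max(range(len(classes)), key=lambda i: probs[i])
    return classes[best]


def multi_label(logits, classes, threshold=0.5):
    """Multi-label: keep every class whose own probability passes the threshold."""
    return [c for c, x in zip(classes, logits) if sigmoid(x) >= threshold]


# Hypothetical class names and raw network scores, for illustration only.
CLASSES = ["building", "equipment", "worker"]
LOGITS = [2.0, 0.5, -1.0]

print(single_label(LOGITS, CLASSES))  # building
print(multi_label(LOGITS, CLASSES))   # ['building', 'equipment']
```

With a fixed threshold the multi-label output can be empty when no class scores high enough, so a pipeline that must return at least one label per image would need a fallback, for example keeping the top-scoring class.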

Citation:

Pages: 511-526

Number of pages: 15
