Single- and multi-label classification of construction objects using deep transfer learning methods

Cited by: 0
Authors
Nath N.D. [1]
Chaspari T. [1]
Behzadan A.H. [1]
Affiliations
[1] Texas A&M University, United States
Source
Journal of Information Technology in Construction | 2019 / Vol. 24
Keywords
Construction photos; Convolutional neural networks; Deep learning; Multi-class classification; Multi-label classification; Transfer learning; Web mining
DOI
10.36680/J.ITCON.2019.028
Abstract
Digital images are extensively used to increase the accuracy and timeliness of progress reports, safety training, requests for information (RFIs), productivity monitoring, and claims and litigation. While these images can be sorted using date and time tags, searching an image dataset for specific visual content is not trivial. In pattern recognition, generating metadata tags that describe image contents (objects, scenes) or appearance (colors, context) is referred to as multi-label image annotation. Given the large number and diversity of construction images, it is desirable to generate image tags automatically. Previous work has applied pattern matching to synthetic images or to images obtained from constrained settings. In this paper, we present deep learning (in particular, transfer learning) algorithms to annotate construction imagery from unconstrained real-world settings with high fidelity. We propose convolutional neural network (CNN)-based algorithms that take RGB values as input and output the labels of detected objects. Specifically, we investigate two categories of classification tasks: single-label classification, in which a single class (among multiple predefined classes) is assigned to an image, and multi-label classification, in which a set of one or more classes is assigned to an image. In both cases, the VGG-16 model, pre-trained on the ImageNet dataset, is trained on construction images retrieved with web mining techniques and labeled by human annotators. Testing the trained model on previously unseen photos yields an accuracy of ~90% for single-label classification and ~85% for multi-label classification, indicating the high sensitivity and specificity of the designed methodology in reliably identifying the contents of construction imagery.
COPYRIGHT: © 2019 The author(s). This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
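The difference between the two tasks named in the abstract, single-label and multi-label classification, can be illustrated with a minimal sketch of how raw network scores might be turned into labels. The abstract does not state which output layers the authors use; the softmax-plus-argmax and sigmoid-plus-threshold conventions below are common choices and are assumptions here, as are the class names, the example scores, and the 0.5 threshold.

```python
import numpy as np

# Hypothetical class names, used only for illustration (not from the paper).
CLASSES = ["worker", "excavator", "truck"]


def softmax(z):
    """Numerically stable softmax over a 1-D array of scores."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def single_label(logits):
    """Single-label: classes compete, exactly one class is returned."""
    probs = softmax(np.asarray(logits, dtype=float))
    return CLASSES[int(np.argmax(probs))]


def multi_label(logits, threshold=0.5):
    """Multi-label: each class is scored independently, any subset may be returned."""
    probs = sigmoid(np.asarray(logits, dtype=float))
    return [c for c, p in zip(CLASSES, probs) if p >= threshold]


# Assumed raw scores for one image (illustrative values only).
logits = [2.0, 1.5, -3.0]
print(single_label(logits))  # worker
print(multi_label(logits))   # ['worker', 'excavator']
```

With the same scores, the single-label rule returns only the top class, while the multi-label rule returns every class whose independent probability clears the threshold.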
Pages: 511-526
Page count: 15
Related papers
35 in total
[1] Alippi C., Disabato S., Roveri M., Moving convolutional neural networks to embedded systems: The AlexNet and VGG-16 case, Proceedings of 17th ACM/IEEE International Conference on Information Processing in Sensor Networks, pp. 212-223, (2018)
[2] Bottou L., Large-scale machine learning with stochastic gradient descent, Proceedings of COMPSTAT, pp. 177-186, (2010)
[3] Brilakis I., Soibelman L., Shinagawa Y., Material-based construction site image retrieval, Journal of Computing in Civil Engineering, 19, 4, pp. 341-355, (2005)
[4] Brilakis I., Soibelman L., Shape-based retrieval of construction site photographs, Journal of Computing in Civil Engineering, 22, 1, pp. 14-20, (2008)
[5] Buja A., Stuetzle W., Shen Y., Loss Functions for Binary Class Probability Estimation and Classification: Structure and Applications, pp. 1-49, (2005)
[6] Chi S., Caldas C.H., Automated object identification using optical video cameras on construction sites, Computer-Aided Civil and Infrastructure Engineering, 26, 5, pp. 368-380, (2011)
[7] Deng J., Dong W., Socher R., Li L.-J., Li K., Fei-Fei L., ImageNet: A large-scale hierarchical image database, IEEE Conference on Computer Vision and Pattern Recognition, pp. 248-255, (2009)
[8] Dimitrov A., Golparvar-Fard M., Vision-based material recognition for automated monitoring of construction progress and generating building information modeling from unordered site image collections, Advanced Engineering Informatics, 28, 1, pp. 37-49, (2014)
[9] Ding L., Fang W., Luo H., Love P.E., Zhong B., Ouyang X., A deep hybrid learning model to detect unsafe behavior: Integrating convolution neural networks and long short-term memory, Automation in Construction, 86, pp. 118-124, (2018)
[10] Fergus R., Fei-Fei L., Perona P., Zisserman A., Learning Object Categories from Google's Image Search, pp. 1816-1823, (2005)