Semi-supervised learning approach for construction object detection by integrating super-resolution and mean teacher network

Cited by: 0
Authors
Zhang, Wen-Jie [1]
Wan, Hua-Ping [1]
Hu, Peng-Hua [1]
Ge, Hui-Bin [1]
Luo, Yaozhi [1]
Todd, Michael D. [2]
Affiliations
[1] Zhejiang Univ, Coll Civil Engn & Architecture, Hangzhou 310058, Peoples R China
[2] Univ Calif San Diego, Dept Struct Engn, 9500 Gilman Dr 0085, La Jolla, CA 92093 USA
Source
JOURNAL OF INFRASTRUCTURE INTELLIGENCE AND RESILIENCE | 2024 / Vol. 3 / Issue 04
Keywords
Construction object detection; Deep learning; Mean teacher network; Super-resolution; Semi-supervised learning; CHALLENGES;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Deep learning-based object detection methods are used for safety management at construction sites, but they require large-scale, high-quality, and well-labeled datasets for training. Existing construction datasets are relatively small because labor-intensive annotation is expensive, and the varying quality of construction images also affects the detection performance of the model. To address these dataset limitations, this study proposes a new method for construction object detection that integrates super-resolution and semi-supervised learning. The proposed method improves the quality of construction images and achieves excellent detection performance with limited labeled data. First, the Real-ESRGAN model is introduced to improve the quality of construction images and make the construction objects visible. This super-resolution step enhances the texture details of low-resolution images and thereby improves the performance of object detection models. Second, the mean teacher network is adopted to expand the training set, thus avoiding labor-intensive annotation work. To verify its effectiveness, the method is applied to the state-of-the-art YOLOv5 object detection model, and construction images from the Site Object Detection Dataset (SODA) with different proportions of labeled data (from 10% to 50% in 10% intervals, plus an extreme case of 5%) are used as the training set. A comparison with the existing supervised learning method shows that the proposed method achieves better detection performance. In particular, the method is more effective in enhancing detection performance when the proportion of labeled data is smaller, which is of great practical value in real-world engineering. The experimental results show the potential of the proposed method for improving image quality and reducing the expense of developing construction datasets.
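The two semi-supervised ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `split_labeled` and `ema_update`, the random seed, and the decay value are assumptions chosen for illustration, and the weights are plain numbers rather than YOLOv5 parameters. The sketch shows (1) holding out a fixed fraction of images as the labeled subset, and (2) the standard mean teacher rule, in which the teacher's weights are an exponential moving average (EMA) of the student's weights.

```python
import random


def split_labeled(image_ids, labeled_fraction, seed=0):
    """Randomly split image ids into a labeled and an unlabeled subset.

    `labeled_fraction` is the share of images that keep their labels
    (e.g. 0.10 for the 10% setting). The seed is an arbitrary choice.
    """
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_labeled = max(1, round(len(ids) * labeled_fraction))
    return ids[:n_labeled], ids[n_labeled:]


def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights as an EMA of the student weights.

    teacher <- decay * teacher + (1 - decay) * student
    The decay value is an assumed hyperparameter, not taken from the paper.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Hypothetical usage: 1000 images with 10% labeled.
labeled, unlabeled = split_labeled(range(1000), 0.10)

# Toy single-weight "models": the teacher drifts toward the student.
teacher = {"w": 0.0}
student = {"w": 1.0}
for _ in range(3):
    teacher = ema_update(teacher, student, decay=0.9)
```

In a full training loop the student would be updated by gradient descent on labeled images plus pseudo-labels or a consistency loss from the teacher on unlabeled images, and `ema_update` would run after each student step.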
Pages: 12