CWPR: An optimized transformer-based model for construction worker pose estimation on construction robots

Authors
Zhou, Jiakai [1]
Zhou, Wanlin [1]
Wang, Yang [2, 3]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Mech & Elect Engn, Nanjing 210000, Peoples R China
[2] Anhui Univ Technol, Sch Mech Engn, Maanshan 243000, Peoples R China
[3] Anhui Prov Key Lab Special Heavy Load Robot, Maanshan 243000, Peoples R China
Keywords
Construction worker pose; Construction robots; Transformer; Multi-human pose estimation; Surveillance videos; Recognition
DOI
10.1016/j.aei.2024.102894
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Estimating construction workers' poses is critically important for recognizing unsafe behaviors, conducting ergonomic analyses, and assessing productivity. Recently, utilizing construction robots to capture RGB images for pose estimation offers flexible monitoring perspectives and timely interventions. However, existing multi-human pose estimation (MHPE) methods struggle to balance accuracy and speed, making them unsuitable for real-time applications on construction robots. This paper introduces the Construction Worker Pose Recognizer (CWPR), an optimized Transformer-based MHPE model tailored for construction robots. Specifically, CWPR utilizes a lightweight encoder equipped with a multi-scale feature fusion module to enhance operational speed. Then, an Intersection over Union (IoU)-aware query selection strategy is employed to provide high-quality initial queries for the hybrid decoder, significantly improving performance. Besides, a decoder denoising module is used to incorporate noisy ground truth into the decoder, mitigating sample imbalance and further improving accuracy. Additionally, the Construction Worker Pose and Action (CWPA) dataset is collected from 154 videos captured in real construction scenarios. The dataset is annotated for different tasks: a pose benchmark for MHPE and an action benchmark for action recognition. Experiments demonstrate that CWPR achieves top-level accuracy and the fastest inference speed, attaining 68.1 Average Precision (AP) with a processing time of 26 ms on the COCO test set and 76.2 AP with 21 ms on the CWPA pose benchmark. Moreover, when integrated with the action recognition method ST-GCN on construction robot hardware, CWPR achieves 78.7 AP and a processing time of 19 ms on the CWPA action benchmark, validating its effectiveness for practical deployment.
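The abstract names an IoU-aware query selection strategy but does not describe how it works. As background only, the following is a minimal Python sketch of the standard Intersection over Union between two axis-aligned boxes. The function name `box_iou` and the corner format `(x_min, y_min, x_max, y_max)` are assumptions made for illustration and are not taken from the paper.

```python
from typing import Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Return the Intersection over Union of two axis-aligned boxes."""
    # Width and height of the overlap region, clamped at zero when disjoint
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    # Individual box areas, clamped so malformed boxes contribute zero
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A value near 1 means a predicted box almost coincides with its reference box; a value of 0 means the boxes do not overlap.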
Pages: 12