Graph U-Shaped Network with Mapping-Aware Local Enhancement for Single-Frame 3D Human Pose Estimation

Times cited: 0
Authors
Yu, Bing [1]
Huang, Yan [1]
Cheng, Guang [1]
Huang, Dongjin [1]
Ding, Youdong [1]
Affiliations
[1] Shanghai Univ, Shanghai Film Acad, Shanghai 200072, Peoples R China
Keywords
3D human pose estimation; graph convolutional network; multi-scale feature fusion
DOI
10.3390/electronics12194120
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
2D-to-3D approaches to monocular single-frame 3D human pose estimation are challenged by noisy inputs and by a failure to capture long-range joint correlations, both of which lead to unreasonable predictions. To this end, we propose a straightforward but effective U-shaped network, the mapping-aware U-shaped graph convolutional network (M-UGCN), for single-frame applications. This network applies skeletal pooling/unpooling operations to expand the limited convolutional receptive field. For noisy inputs, as local nodes have direct access to the subtle discrepancies between poses, we define an additional mapping-aware local-enhancement mechanism to focus on local node interactions across multiple scales. We evaluated the proposed method on the benchmark datasets Human3.6M and MPI-INF-3DHP, and the experimental results demonstrated the robustness of the M-UGCN against noisy inputs. Notably, the average error of the proposed method was 4.1% lower than that of state-of-the-art methods adopting similar multi-scale learning approaches.
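As a rough illustration of the skeletal pooling/unpooling idea mentioned in the abstract, the sketch below averages per-joint features into body-part features and then copies them back to the joints. The abstract does not specify the joint grouping or the pooling function, so the 16-joint, 5-part grouping, the use of mean pooling, and the names PARTS, skeletal_pool, and skeletal_unpool are all assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical grouping of a 16-joint skeleton into 5 body parts.
# The indices are illustrative only and are not taken from the paper.
PARTS = [
    [0, 7, 8, 9],    # torso and head
    [1, 2, 3],       # right leg
    [4, 5, 6],       # left leg
    [10, 11, 12],    # left arm
    [13, 14, 15],    # right arm
]


def skeletal_pool(features, parts=PARTS):
    """Average joint features within each part: J x C -> P x C (assumed mean pooling)."""
    dim = len(features[0])
    pooled = []
    for joint_ids in parts:
        pooled.append(
            [sum(features[j][c] for j in joint_ids) / len(joint_ids) for c in range(dim)]
        )
    return pooled


def skeletal_unpool(pooled, parts=PARTS, num_joints=16):
    """Copy each part feature back to every joint in that part: P x C -> J x C."""
    dim = len(pooled[0])
    out = [[0.0] * dim for _ in range(num_joints)]
    for part_index, joint_ids in enumerate(parts):
        for j in joint_ids:
            out[j] = list(pooled[part_index])
    return out


# Toy input: 16 joints with 2 feature channels each, joint i -> [2i, 2i + 1].
features = [[float(2 * i), float(2 * i + 1)] for i in range(16)]
pooled = skeletal_pool(features)
restored = skeletal_unpool(pooled)
```

After pooling, a graph convolution over the 5 part nodes mixes information between whole limbs in a single step, which is one way a coarser scale can widen the receptive field compared with convolving over the 16 joints directly.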
Pages: 23