Graph U-Shaped Network with Mapping-Aware Local Enhancement for Single-Frame 3D Human Pose Estimation

Cited by: 0

Authors:

Yu, Bing [1]

Huang, Yan [1]

Cheng, Guang [1]

Huang, Dongjin [1]

Ding, Youdong [1]

Affiliations:

[1] Shanghai University, Shanghai Film Academy, Shanghai 200072, People's Republic of China

Source:

Electronics | 2023, Vol. 12, Issue 19

Keywords:

3D human pose estimation; graph convolutional network; multi-scale feature fusion

DOI:

10.3390/electronics12194120

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The development of 2D-to-3D approaches for 3D monocular single-frame human pose estimation faces challenges related to noisy input and failure to capture long-range joint correlations, leading to unreasonable predictions. To this end, we propose a straightforward, but effective U-shaped network called the mapping-aware U-shaped graph convolutional network (M-UGCN) for single-frame applications. This network applies skeletal pooling/unpooling operations to expand the limited convolutional receptive field. For noisy inputs, as local nodes have direct access to the subtle discrepancies between poses, we define an additional mapping-aware local-enhancement mechanism to focus on local node interactions across multiple scales. We evaluated our proposed method on the benchmark datasets Human3.6M and MPI-INF-3DHP, and the experimental results demonstrated the robustness of the M-UGCN against noisy inputs. Notably, the average error in the proposed method was found to be 4.1% lower when compared to state-of-the-art methods adopting similar multi-scale learning approaches.
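The abstract does not spell out how skeletal pooling and unpooling work, so the following is only a minimal sketch of the general idea, not the M-UGCN implementation. It assumes a 17-joint skeleton, a hypothetical grouping of joints into five body parts (`PART_GROUPS`), mean pooling within each part, and copy-back unpooling; all of these names and choices are illustrative assumptions.

```python
from typing import Dict, List

# Hypothetical grouping of a 17-joint skeleton into 5 body parts.
# The joint indices and groups are illustrative assumptions,
# not taken from the paper.
PART_GROUPS: Dict[str, List[int]] = {
    "torso": [0, 7, 8, 9, 10],
    "right_leg": [1, 2, 3],
    "left_leg": [4, 5, 6],
    "left_arm": [11, 12, 13],
    "right_arm": [14, 15, 16],
}

Features = List[List[float]]  # one feature vector per node


def skeletal_pool(joint_feats: Features) -> Features:
    """Average the joint features inside each body-part group."""
    dim = len(joint_feats[0])
    pooled = []
    for joints in PART_GROUPS.values():
        pooled.append([
            sum(joint_feats[j][d] for j in joints) / len(joints)
            for d in range(dim)
        ])
    return pooled


def skeletal_unpool(part_feats: Features, num_joints: int = 17) -> Features:
    """Copy each part-level feature back to every joint in that part."""
    out: Features = [[] for _ in range(num_joints)]
    for part_idx, joints in enumerate(PART_GROUPS.values()):
        for j in joints:
            out[j] = list(part_feats[part_idx])
    return out


# Toy input: joint j carries the feature vector [j, 1.0].
joint_feats = [[float(j), 1.0] for j in range(17)]
part_feats = skeletal_pool(joint_feats)   # 5 part-level nodes
restored = skeletal_unpool(part_feats)    # back to 17 joint-level nodes
```

After pooling, a graph convolution over the five part nodes mixes information between, say, an arm and a leg in a single hop, which is one way a coarser scale can widen the receptive field mentioned in the abstract.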


Pages: 23

