GMDN: A lightweight graph-based mixture density network for 3D human pose regression

Cited by: 12

Authors:

Zou, Lu [1]

Huang, Zhangjin [1]

Gu, Naijie [1]

Wang, Fangjun [1]

Yang, Zhouwang [1]

Wang, Guoping [2]

Affiliations:

[1] Univ Sci & Technol China, Hefei 230026, Peoples R China

[2] Peking Univ, Beijing 100000, Peoples R China

Source:

COMPUTERS & GRAPHICS-UK | 2021 / Vol. 95

Funding:

National Natural Science Foundation of China;

Keywords:

3D human pose estimation; Graph convolutional network; Mixture density network;

DOI:

10.1016/j.cag.2021.01.010

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

3D human pose estimation from 2D detections is an ill-posed problem because multiple solutions may exist due to the inherent ambiguity and occlusion. In this paper, we propose a novel graph-based mixture density network (GMDN) to tackle the 2D-to-3D human pose estimation problem. We formulate the 2D joint locations of the human body as a graph, and thus the pose estimation task can be redefined as a graph regression problem. Additionally, we present a novel graph convolutional operation with the incorporation of structural knowledge about human body configurations to assist with reasoning of the structural relations implied in the human bodies. Furthermore, we employ mixture density networks to formulate the 3D human poses as a multimodal distribution. The presented GMDN is lightweight with only 0.30M parameters, and the experimental results demonstrate that it achieves state-of-the-art performance. © 2021 Elsevier Ltd. All rights reserved.
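To make the idea of a multimodal 3D pose distribution concrete, here is a minimal sketch of how a mixture density output is commonly scored. It is not the authors' implementation. The isotropic Gaussian components, the function names (`log_sum_exp`, `mdn_nll`, `best_hypothesis`), and the plain-list inputs are all assumptions made for this example. Each component mean stands for one flattened 3D pose hypothesis.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mdn_nll(pi_logits, mus, sigmas, target):
    """Negative log-likelihood of `target` under an isotropic Gaussian mixture.

    pi_logits: K unnormalized mixture weights (assumed network output)
    mus:       K component means, each a flattened 3D pose hypothesis
    sigmas:    K positive standard deviations, one per component
    target:    ground-truth flattened 3D pose
    """
    dim = len(target)
    log_norm = log_sum_exp(pi_logits)  # softmax normalizer in log space
    log_terms = []
    for logit, mu, sigma in zip(pi_logits, mus, sigmas):
        log_pi = logit - log_norm
        sq_dist = sum((t - m) ** 2 for t, m in zip(target, mu))
        log_gauss = (-0.5 * sq_dist / sigma ** 2
                     - dim * math.log(sigma)
                     - 0.5 * dim * math.log(2.0 * math.pi))
        log_terms.append(log_pi + log_gauss)
    return -log_sum_exp(log_terms)


def best_hypothesis(pi_logits, mus):
    """Return the mean of the component with the largest mixture weight."""
    k = max(range(len(pi_logits)), key=lambda i: pi_logits[i])
    return mus[k]


if __name__ == "__main__":
    # Two hypothetical pose hypotheses for a single joint (x, y, z).
    logits = [0.2, 1.5]
    means = [[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]
    stds = [0.5, 0.5]
    print(mdn_nll(logits, means, stds, [0.0, 0.0, -0.9]))
    print(best_hypothesis(logits, means))
```

Minimizing this loss lets several components each explain a different plausible 3D pose for the same 2D input, which is how a mixture model can address the ambiguity the abstract describes. A single point estimate can then be read off, for instance, as the mean of the highest-weight component.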

Citation

Pages: 115-122

Page count: 8
