LEARNING MONOCULAR 3D HUMAN POSE ESTIMATION WITH SKELETAL INTERPOLATION

Cited by: 2

Authors:

Chen, Ziyi ^{[1]}

Sugimoto, Akihiro ^{[2]}

Lai, Shang-Hong ^{[1]}

Affiliations:

[1] Natl Tsing Hua Univ, Hsinchu, Taiwan

[2] Natl Inst Informat, Tokyo, Japan

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022

Keywords:

Data augmentation; skeletal interpolation; transformer; 3D human pose estimation

DOI:

10.1109/ICASSP43922.2022.9746410

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Deep learning has achieved unprecedented accuracy in monocular 3D human pose estimation. However, current learning-based methods still generalize poorly. Inspired by skeletal animation, which is widely used in game development and animation production, we propose a simple, intuitive, yet effective interpolation-based data augmentation approach that synthesizes continuous and diverse 3D human body sequences to improve model generalization. A Transformer-based lifting network, trained on the augmented data, uses self-attention to perform 2D-to-3D lifting and produces high-quality predictions in the qualitative experiment. The quantitative results of the cross-dataset experiment demonstrate that the resulting model achieves superior generalization accuracy on a publicly available dataset.
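
The abstract does not say how the interpolation is carried out, so the following is only an illustrative sketch of what interpolation-based augmentation between two 3D skeleton keyframes might look like. It assumes linear blending of joint positions followed by rescaling each bone to its original length. Both choices are assumptions made here, not the authors' published method, and the names `PARENTS` and `interpolate_poses` are hypothetical.

```python
import numpy as np

# Hypothetical 5-joint chain: parent index of each joint (-1 marks the root).
PARENTS = [-1, 0, 1, 2, 3]


def interpolate_poses(pose_a, pose_b, num_frames, parents=PARENTS):
    """Blend two (J, 3) poses into a (num_frames, J, 3) sequence.

    Assumed scheme (not taken from the paper): joint positions are blended
    linearly, then each bone is rescaled to the length it has in pose_a so
    that limbs do not shrink in the middle of the transition.
    """
    pose_a = np.asarray(pose_a, dtype=float)
    pose_b = np.asarray(pose_b, dtype=float)
    frames = []
    for t in np.linspace(0.0, 1.0, num_frames):
        blended = (1.0 - t) * pose_a + t * pose_b
        fixed = blended.copy()
        # Parents must appear before their children in the joint ordering.
        for j, p in enumerate(parents):
            if p < 0:
                continue  # the root keeps its blended position
            ref_bone = pose_a[j] - pose_a[p]
            bone = blended[j] - blended[p]
            norm = np.linalg.norm(bone)
            if norm > 1e-8:
                fixed[j] = fixed[p] + np.linalg.norm(ref_bone) * bone / norm
            else:
                # Degenerate blend: fall back to the reference bone vector.
                fixed[j] = fixed[p] + ref_bone
        frames.append(fixed)
    return np.stack(frames)


# Usage: a straight chain along x is blended into a straight chain along y.
pose_a = np.array([[i, 0.0, 0.0] for i in range(5)])
pose_b = np.array([[0.0, i, 0.0] for i in range(5)])
sequence = interpolate_poses(pose_a, pose_b, num_frames=5)
```

Animation pipelines typically interpolate joint rotations rather than positions; the position-based variant is used here only because it keeps the sketch short.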


Pages: 4218-4222

Page count: 5

