LEARNING MONOCULAR 3D HUMAN POSE ESTIMATION WITH SKELETAL INTERPOLATION

Cited by: 2

Authors:

Chen, Ziyi [1]

Sugimoto, Akihiro [2]

Lai, Shang-Hong [1]

Affiliations:

[1] Natl Tsing Hua Univ, Hsinchu, Taiwan

[2] Natl Inst Informat, Tokyo, Japan

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022

Keywords:

Data augmentation; skeletal interpolation; transformer; 3D human pose estimation;

DOI:

10.1109/ICASSP43922.2022.9746410

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Deep learning has achieved unprecedented accuracy for monocular 3D human pose estimation. However, current learning-based 3D human pose estimation still suffers from poor generalization. Inspired by skeletal animation, which is popular in game development and animation production, we put forward a simple, intuitive yet effective interpolation-based data augmentation approach that synthesizes continuous and diverse 3D human body sequences to enhance model generalization. The Transformer-based lifting network, trained with the augmented data, uses the self-attention mechanism to perform 2D-to-3D lifting and infers high-quality predictions in the qualitative experiment. The quantitative results of the cross-dataset experiment demonstrate that the resulting model achieves superior generalization accuracy on the publicly available dataset.
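The interpolation-based augmentation mentioned in the abstract can be illustrated with a minimal sketch. The record does not describe the authors' actual procedure, so everything below is an assumption made for illustration: poses are arrays of shape (J, 3), the blend is a plain linear interpolation of joint positions between two key poses, and the 2D inputs for a lifting network are obtained by a simple orthographic projection. The function names `interpolate_poses` and `project_orthographic` are hypothetical.

```python
import numpy as np


def interpolate_poses(pose_a, pose_b, num_frames):
    """Blend two 3D key poses into a continuous sequence.

    Assumption: plain linear interpolation of joint positions.
    pose_a, pose_b: arrays of shape (J, 3).
    Returns an array of shape (num_frames, J, 3).
    """
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    pose_a = np.asarray(pose_a, dtype=float)
    pose_b = np.asarray(pose_b, dtype=float)
    if pose_a.shape != pose_b.shape:
        raise ValueError("both poses must have the same shape")
    # Blend weights from 0 (pose_a) to 1 (pose_b), shaped for broadcasting.
    t = np.linspace(0.0, 1.0, num_frames)[:, None, None]
    return (1.0 - t) * pose_a[None] + t * pose_b[None]


def project_orthographic(sequence):
    """Drop the depth axis to get 2D keypoints (illustrative camera model)."""
    return np.asarray(sequence)[..., :2]


# Two toy key poses with three joints each (hypothetical data).
key_a = np.zeros((3, 3))
key_b = np.full((3, 3), 2.0)

poses_3d = interpolate_poses(key_a, key_b, num_frames=5)
poses_2d = project_orthographic(poses_3d)
print(poses_3d.shape, poses_2d.shape)
```

Linear blending of joint positions does not preserve bone lengths, so a practical pipeline would more likely interpolate joint rotations, as skeletal animation systems do, and then recover positions through forward kinematics. The resulting (2D, 3D) pairs would serve as extra training samples for a 2D-to-3D lifting network.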


Pages: 4218-4222

Page count: 5
