LEARNING MONOCULAR 3D HUMAN POSE ESTIMATION WITH SKELETAL INTERPOLATION

Cited by: 2

Authors:

Chen, Ziyi [1]

Sugimoto, Akihiro [2]

Lai, Shang-Hong [1]

Affiliations:

[1] National Tsing Hua University, Hsinchu, Taiwan

[2] National Institute of Informatics, Tokyo, Japan

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022

Keywords:

Data augmentation; skeletal interpolation; transformer; 3D human pose estimation

DOI:

10.1109/ICASSP43922.2022.9746410

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Deep learning has achieved unprecedented accuracy for monocular 3D human pose estimation. However, current learning-based 3D human pose estimation still suffers from poor generalization. Inspired by skeletal animation, which is popular in game development and animation production, we put forward a simple, intuitive, yet effective interpolation-based data augmentation approach that synthesizes continuous and diverse 3D human body sequences to enhance model generalization. The Transformer-based lifting network, trained with the augmented data, uses the self-attention mechanism to perform 2D-to-3D lifting and infers high-quality predictions in the qualitative experiment. The quantitative results of the cross-dataset experiment demonstrate that our resulting model achieves superior generalization accuracy on a publicly available dataset.
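As a rough illustration of the interpolation idea mentioned in the abstract, the sketch below blends two keyframe poses into a short continuous sequence. It is only an assumption-laden toy: the abstract does not specify the interpolation scheme, so plain linear interpolation of 3D joint positions is used here as a stand-in (skeletal animation systems commonly interpolate joint rotations instead, which preserves bone lengths). The function name `interpolate_poses`, the `Pose` type, and the two-joint keyframes are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) position of one joint
Pose = List[Joint]                  # one skeleton: a list of joints


def interpolate_poses(start: Pose, end: Pose, num_frames: int) -> List[Pose]:
    """Linearly blend joint positions from `start` to `end` (both inclusive).

    Assumption: position-space linear interpolation, chosen for brevity.
    It does not preserve bone lengths and is not the paper's stated method.
    """
    if len(start) != len(end):
        raise ValueError("poses must have the same number of joints")
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")

    sequence: List[Pose] = []
    for i in range(num_frames):
        t = i / (num_frames - 1)  # blend weight, 0.0 at start and 1.0 at end
        pose = [
            tuple((1.0 - t) * a + t * b for a, b in zip(j0, j1))
            for j0, j1 in zip(start, end)
        ]
        sequence.append(pose)
    return sequence


# Hypothetical two-joint keyframes: a fixed root and one moving end joint.
key_a: Pose = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
key_b: Pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = interpolate_poses(key_a, key_b, 5)
```

Each entry of `frames` is one synthetic pose; a sequence like this could serve as additional 3D training data once projected to 2D, although the projection and training steps are outside this sketch.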


Pages: 4218-4222

Page count: 5

50 records in total

[31] Bidirectional temporal feature for 3D human pose and shape estimation from a video
Sun, Libo
Tang, Ting
Qu, Yuke
Qin, Wenhu
COMPUTER ANIMATION AND VIRTUAL WORLDS, 2023, 34 (3-4)
[32] STRFormer: Spatial-Temporal-ReTemporal Transformer for 3D human pose estimation
Liu, Xing
Tang, Hao
IMAGE AND VISION COMPUTING, 2023, 140
[33] SCGFormer: Semantic Chebyshev Graph Convolution Transformer for 3D Human Pose Estimation
Liang, Jiayao
Yin, Mengxiao
APPLIED SCIENCES-BASEL, 2024, 14 (04)
[34] Parallel-branch network for 3D human pose and shape estimation in video
Wu, Yuanhao
Wang, Chenxing
COMPUTER ANIMATION AND VIRTUAL WORLDS, 2022, 33 (3-4)
[35] A Study on 3D Human Pose Estimation with a Hybrid Algorithm of Spatio-temporal Semantic Graph Attention and Deep Learning
Lin, Shengqing
INFORMATION TECHNOLOGY AND CONTROL, 2024, 53 (04): 1042-1059
[36] HOGFormer: high-order graph convolution transformer for 3D human pose estimation
Xie, Yuhong
Hong, Chaoqun
Zhuang, Weiwei
Liu, Lijuan
Li, Jie
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2025, 16 (01): 599-610
[37] Multi-scale Feature Injection for Occluded 3D Human Pose and Shape Estimation
Shi, Yunhui
Ge, Yangyang
Wang, Jin
2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023: 4881-4886
[38] Spatio-Temporal Dynamic Interlaced Network for 3D human pose estimation in video
Xu, Feiyi
Wang, Jifan
Sun, Ying
Qi, Jin
Dong, Zhenjiang
Sun, Yanfei
COMPUTER VISION AND IMAGE UNDERSTANDING, 2025, 251
[39] Efficient Hierarchical Multi-view Fusion Transformer for 3D Human Pose Estimation
Zhou, Kangkang
Zhang, Lijun
Lu, Feng
Zhou, Xiang-Dong
Shi, Yu
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 7512-7520
[40] Innovate Spatial-Temporal Attention Network (STAN) for Accurate 3D Mice Pose Estimation with a Single Monocular RGB Camera
Gong, Liyun
Yu, Miao
Kashyap, Gautam Siddharth
Mccall, Sheldon
Thota, Mamatha
Ardakani, Saeid Pourroostaei
32ND EUROPEAN SIGNAL PROCESSING CONFERENCE, EUSIPCO 2024, 2024: 616-620
