LEARNING MONOCULAR 3D HUMAN POSE ESTIMATION WITH SKELETAL INTERPOLATION

Cited by: 2
Authors
Chen, Ziyi [1]
Sugimoto, Akihiro [2]
Lai, Shang-Hong [1]
Affiliations
[1] Natl Tsing Hua Univ, Hsinchu, Taiwan
[2] Natl Inst Informat, Tokyo, Japan
Source
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022
Keywords
Data augmentation; skeletal interpolation; transformer; 3D human pose estimation
DOI
10.1109/ICASSP43922.2022.9746410
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Deep learning has achieved unprecedented accuracy for monocular 3D human pose estimation. However, current learning-based 3D human pose estimation still suffers from poor generalization. Inspired by skeletal animation, which is popular in game development and animation production, we put forward a simple, intuitive, yet effective interpolation-based data augmentation approach to synthesize continuous and diverse 3D human body sequences to enhance model generalization. The Transformer-based lifting network, trained with the augmented data, utilizes the self-attention mechanism to perform 2D-to-3D lifting and successfully infers high-quality predictions in the qualitative experiment. The quantitative results of the cross-dataset experiment demonstrate that our resulting model achieves superior generalization accuracy on the publicly available dataset.
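As a rough illustration of what interpolation-based pose augmentation can look like, the sketch below linearly blends the joint positions of two key poses into a short sequence. This is an assumption for illustration only: the abstract does not specify the interpolation scheme, and the function name `interpolate_poses`, the two-joint key poses, and the frame count are all hypothetical. Skeletal animation systems often interpolate joint rotations instead, which preserves bone lengths; plain position blending, as shown here, does not.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]
Pose = List[Joint]


def interpolate_poses(start: Pose, end: Pose, num_frames: int) -> List[Pose]:
    """Linearly blend joint positions from `start` to `end`.

    Hypothetical helper, not the paper's method: returns `num_frames`
    poses, where the first equals `start` and the last equals `end`.
    """
    if len(start) != len(end):
        raise ValueError("poses must have the same number of joints")
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")

    sequence: List[Pose] = []
    for frame in range(num_frames):
        t = frame / (num_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        pose = [
            tuple((1.0 - t) * a + t * b for a, b in zip(j0, j1))
            for j0, j1 in zip(start, end)
        ]
        sequence.append(pose)
    return sequence


# Hypothetical two-joint key poses (root joint plus one limb endpoint).
key_a: Pose = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
key_b: Pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = interpolate_poses(key_a, key_b, 5)
```

Each synthesized frame could then be projected to 2D to form an input and target pair for a lifting network, although how the paper pairs its augmented data with 2D inputs is not stated in this record.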
Pages: 4218-4222
Page count: 5
Related papers
50 in total
  • [21] Image-Based Synthesis for Deep 3D Human Pose Estimation
    Grégory Rogez
    Cordelia Schmid
    International Journal of Computer Vision, 2018, 126 : 993 - 1008
  • [22] Exploiting Temporal Contexts With Strided Transformer for 3D Human Pose Estimation
    Li, Wenhao
    Liu, Hong
    Ding, Runwei
    Liu, Mengyuan
    Wang, Pichao
    Yang, Wenming
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 1282 - 1293
  • [23] Combination of Deep Learner Network and Transformer for 3D Human Pose Estimation
    Tien-Dat Tran
    Xuan-Thuy Vo
    Duy-Linh Nguyen
    Jo, Kang-Hyun
    2022 22ND INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2022), 2022, : 174 - 178
  • [24] PROGRESSIVE MULTI-VIEW FUSION FOR 3D HUMAN POSE ESTIMATION
    Zhang, Lijun
    Zhou, Kangkang
    Liu, Liangchen
    Li, Zhenghao
    Zhao, Xunyi
    Zhou, Xiang-Dong
    Shi, Yu
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 1600 - 1604
  • [25] LOCAL TO GLOBAL TRANSFORMER FOR VIDEO BASED 3D HUMAN POSE ESTIMATION
    Ma, Haifeng
    Ke Lu
    Xue, Jian
    Niu, Zehai
    Gao, Pengcheng
    2022 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (IEEE ICMEW 2022), 2022,
  • [26] Joint multi-scale transformers and pose equivalence constraints for 3D human pose estimation
    Wu, Yongpeng
    Kong, Dehui
    Gao, Junna
    Li, Jinghua
    Yin, Baocai
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 103
  • [27] HDPose: Post-Hierarchical Diffusion with Conditioning for 3D Human Pose Estimation
    Lee, Donghoon
    Kim, Jaeho
    SENSORS, 2024, 24 (03)
  • [28] Multi-hop graph transformer network for 3D human pose estimation
    Islam, Zaedul
    Ben Hamza, A.
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 101
  • [29] A Novel Auxiliary Task Framework in 3D Human Pose Estimation for Opera Videos
    Cai, Xingquan
    Zhang, Haoyu
    He, Shanshan
    Song, Haoyu
    Sun, Haiyan
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 202 - 210
  • [30] Global and Local Spatio-Temporal Encoder for 3D Human Pose Estimation
    Wang, Yong
    Kang, Hongbo
    Wu, Doudou
    Yang, Wenming
    Zhang, Longbin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 4039 - 4049