LEARNING MONOCULAR 3D HUMAN POSE ESTIMATION WITH SKELETAL INTERPOLATION

Cited by: 2
Authors
Chen, Ziyi [1]
Sugimoto, Akihiro [2]
Lai, Shang-Hong [1]
Affiliations
[1] Natl Tsing Hua Univ, Hsinchu, Taiwan
[2] Natl Inst Informat, Tokyo, Japan
Source
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022
Keywords
Data augmentation; skeletal interpolation; transformer; 3D human pose estimation;
DOI
10.1109/ICASSP43922.2022.9746410
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206; 082403
Abstract
Deep learning has achieved unprecedented accuracy for monocular 3D human pose estimation. However, current learning-based 3D human pose estimation still suffers from poor generalization. Inspired by skeletal animation, which is popular in game development and animation production, we put forward a simple, intuitive, yet effective interpolation-based data augmentation approach that synthesizes continuous and diverse 3D human body sequences to enhance model generalization. The Transformer-based lifting network, trained with the augmented data, utilizes the self-attention mechanism to perform 2D-to-3D lifting and successfully infers high-quality predictions in the qualitative experiment. The quantitative results of the cross-dataset experiment demonstrate that our resulting model achieves superior generalization accuracy on the publicly available dataset.
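The abstract does not spell out the interpolation scheme, so the following is only a minimal sketch of the general idea of interpolation-based pose augmentation, not the authors' method. It assumes each pose is a list of 3D joint positions and blends two keyframe poses linearly; the names `interpolate_poses`, `key_a`, and `key_b` are hypothetical.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]
Pose = List[Joint]


def interpolate_poses(start: Pose, end: Pose, num_frames: int) -> List[Pose]:
    """Linearly blend two keyframe poses into a sequence of num_frames poses.

    Assumption: plain linear interpolation of joint positions. This is an
    illustrative stand-in, not the scheme used in the paper.
    """
    if len(start) != len(end):
        raise ValueError("poses must have the same number of joints")
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")

    sequence: List[Pose] = []
    for i in range(num_frames):
        t = i / (num_frames - 1)  # blend weight, 0.0 at start and 1.0 at end
        frame = [
            tuple((1.0 - t) * a + t * b for a, b in zip(joint_a, joint_b))
            for joint_a, joint_b in zip(start, end)
        ]
        sequence.append(frame)
    return sequence


# Two toy keyframes with two joints each (a root joint and one child joint).
key_a = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
key_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = interpolate_poses(key_a, key_b, 5)
```

Blending positions linearly does not preserve bone lengths (the child joint in the middle frame sits closer to the root than in either keyframe). Skeletal animation systems usually interpolate joint rotations instead, which keeps bone lengths fixed; a rotation-based variant would be the natural next step from this sketch.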
Pages: 4218-4222
Number of pages: 5
Related papers
50 in total
  • [31] Bidirectional temporal feature for 3D human pose and shape estimation from a video
    Sun, Libo
    Tang, Ting
    Qu, Yuke
    Qin, Wenhu
    COMPUTER ANIMATION AND VIRTUAL WORLDS, 2023, 34 (3-4)
  • [32] STRFormer: Spatial-Temporal-ReTemporal Transformer for 3D human pose estimation
    Liu, Xing
    Tang, Hao
    IMAGE AND VISION COMPUTING, 2023, 140
  • [33] SCGFormer: Semantic Chebyshev Graph Convolution Transformer for 3D Human Pose Estimation
    Liang, Jiayao
    Yin, Mengxiao
    APPLIED SCIENCES-BASEL, 2024, 14 (04)
  • [34] Parallel-branch network for 3D human pose and shape estimation in video
    Wu, Yuanhao
    Wang, Chenxing
    COMPUTER ANIMATION AND VIRTUAL WORLDS, 2022, 33 (3-4)
  • [35] A Study on 3D Human Pose Estimation with a Hybrid Algorithm of Spatio-temporal Semantic Graph Attention and Deep Learning
    Lin, Shengqing
    INFORMATION TECHNOLOGY AND CONTROL, 2024, 53 (04): 1042-1059
  • [36] HOGFormer: high-order graph convolution transformer for 3D human pose estimation
    Xie, Yuhong
    Hong, Chaoqun
    Zhuang, Weiwei
    Liu, Lijuan
    Li, Jie
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2025, 16 (01): 599-610
  • [37] Multi-scale Feature Injection for Occluded 3D Human Pose and Shape Estimation
    Shi, Yunhui
    Ge, Yangyang
    Wang, Jin
    2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023: 4881-4886
  • [38] Spatio-Temporal Dynamic Interlaced Network for 3D human pose estimation in video
    Xu, Feiyi
    Wang, Jifan
    Sun, Ying
    Qi, Jin
    Dong, Zhenjiang
    Sun, Yanfei
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2025, 251
  • [39] Efficient Hierarchical Multi-view Fusion Transformer for 3D Human Pose Estimation
    Zhou, Kangkang
    Zhang, Lijun
    Lu, Feng
    Zhou, Xiang-Dong
    Shi, Yu
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 7512-7520
  • [40] Innovate Spatial-Temporal Attention Network (STAN) for Accurate 3D Mice Pose Estimation with a Single Monocular RGB Camera
    Gong, Liyun
    Yu, Miao
    Kashyap, Gautam Siddharth
    Mccall, Sheldon
    Thota, Mamatha
    Ardakani, Saeid Pourroostaei
    32ND EUROPEAN SIGNAL PROCESSING CONFERENCE, EUSIPCO 2024, 2024: 616-620