LEARNING MONOCULAR 3D HUMAN POSE ESTIMATION WITH SKELETAL INTERPOLATION

Cited by: 2
Authors
Chen, Ziyi [1]
Sugimoto, Akihiro [2]
Lai, Shang-Hong [1]
Affiliations
[1] Natl Tsing Hua Univ, Hsinchu, Taiwan
[2] Natl Inst Informat, Tokyo, Japan
Source
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022
Keywords
Data augmentation; skeletal interpolation; transformer; 3D human pose estimation;
DOI
10.1109/ICASSP43922.2022.9746410
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Deep learning has achieved unprecedented accuracy for monocular 3D human pose estimation. However, current learning-based 3D human pose estimation still suffers from poor generalization. Inspired by skeletal animation, which is popular in game development and animation production, we put forward a simple, intuitive, yet effective interpolation-based data augmentation approach that synthesizes continuous and diverse 3D human body sequences to enhance model generalization. The Transformer-based lifting network, trained with the augmented data, uses the self-attention mechanism to perform 2D-to-3D lifting and infers high-quality predictions in the qualitative experiment. The quantitative results of the cross-dataset experiment demonstrate that the resulting model achieves superior generalization accuracy on the publicly available dataset.
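The abstract does not spell out the interpolation procedure, so the following is only a minimal sketch of one way interpolation between two 3D skeleton keyframes could be done, in the spirit of skeletal animation. Everything here is an assumption made for illustration: the 5-joint toy skeleton `PARENTS`, the function names `slerp`, `interpolate_pose`, and `synthesize_sequence`, and the choice to spherically interpolate bone directions while linearly interpolating the root position and bone lengths. It should not be read as the authors' method.

```python
import numpy as np

# Hypothetical toy skeleton (an assumption): PARENTS[j] is the parent of joint j,
# -1 marks the root. Parents are listed before their children.
PARENTS = [-1, 0, 1, 0, 3]


def slerp(u, v, t):
    """Spherically interpolate between unit vectors u and v at fraction t."""
    omega = np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))
    so = np.sin(omega)
    if so < 1e-6:
        # Nearly parallel or antiparallel: fall back to the nearer endpoint.
        return u if t < 0.5 else v
    return (np.sin((1.0 - t) * omega) * u + np.sin(t * omega) * v) / so


def interpolate_pose(pose_a, pose_b, t, parents=PARENTS):
    """Blend two (J, 3) poses bone by bone, then rebuild joint positions."""
    out = np.zeros_like(pose_a, dtype=float)
    for j, p in enumerate(parents):
        if p < 0:
            # Root joint: plain linear interpolation of its position.
            out[j] = (1.0 - t) * pose_a[j] + t * pose_b[j]
            continue
        bone_a = pose_a[j] - pose_a[p]
        bone_b = pose_b[j] - pose_b[p]
        len_a = np.linalg.norm(bone_a)
        len_b = np.linalg.norm(bone_b)
        direction = slerp(bone_a / len_a, bone_b / len_b, t)
        length = (1.0 - t) * len_a + t * len_b
        # Forward pass: place the child relative to its already placed parent.
        out[j] = out[p] + length * direction
    return out


def synthesize_sequence(pose_a, pose_b, n_frames, parents=PARENTS):
    """Return an (n_frames, J, 3) sequence running from pose_a to pose_b."""
    ts = np.linspace(0.0, 1.0, n_frames)
    return np.stack([interpolate_pose(pose_a, pose_b, t, parents) for t in ts])


if __name__ == "__main__":
    # Two made-up keyframes with unit-length bones.
    pose_a = np.array([[0, 0, 0], [0, 1, 0], [0, 2, 0], [1, 0, 0], [2, 0, 0]], dtype=float)
    pose_b = np.array([[0, 0, 0], [1, 0, 0], [1, 0, 1], [0, 0, 1], [0, 1, 1]], dtype=float)
    sequence = synthesize_sequence(pose_a, pose_b, n_frames=5)
    print(sequence.shape)
```

Interpolating bone directions rather than raw joint coordinates keeps bone lengths constant when both keyframes share the same skeleton, which plain linear blending of joint positions would not guarantee.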
Pages: 4218-4222
Number of pages: 5
Related papers
50 in total
  • [1] TSwinPose: Enhanced monocular 3D human pose estimation with JointFlow
    Li, Muyu
    Hu, Henan
    Xiong, Jingjing
    Zhao, Xudong
    Yan, Hong
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [2] SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation
    Xu, Xiangyu
    Liu, Lijuan
    Yan, Shuicheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (05) : 3275 - 3289
  • [3] Monocular 3D Pose Estimation via Pose Grammar and Data Augmentation
    Xu, Yuanlu
    Wang, Wenguan
    Liu, Tengyu
    Liu, Xiaobai
    Xie, Jianwen
    Zhu, Song-Chun
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (10) : 6327 - 6344
  • [4] Frame-Padded Multiscale Transformer for Monocular 3D Human Pose Estimation
    Zhong, Yuanhong
    Yang, Guangxia
    Zhong, Daidi
    Yang, Xun
    Wang, Shanshan
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 6191 - 6201
  • [5] SlowFastFormer for 3D human pose estimation
    Zhou, Lu
    Chen, Yingying
    Wang, Jinqiao
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 243
  • [6] SCALE-Pose: Skeletal Correction and Language Knowledge-assisted for 3D Human Pose Estimation
    Ma, Xinnan
    Li, Yaochen
    Zhao, Limeng
    Zhou, ChenXu
    Xu, Yuncheng
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT XI, 2025, 15041 : 578 - 592
  • [7] Hierarchical Spatial-Temporal Adaptive Graph Fusion for Monocular 3D Human Pose Estimation
    Zhang, Lijun
    Lu, Feng
    Zhou, Kangkang
    Zhou, Xiang-Dong
    Shi, Yu
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 61 - 65
  • [8] A survey on deep 3D human pose estimation
    Neupane, Rama Bastola
    Li, Kan
    Boka, Tesfaye Fenta
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 58 (01)
  • [9] 3D Hand Pose Estimation From Monocular RGB With Feature Interaction Module
    Guo, Shaoxiang
    Rigall, Eric
    Ju, Yakun
    Dong, Junyu
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (08) : 5293 - 5306
  • [10] MixPose: 3D Human Pose Estimation with Mixed Encoder
    Cheng, Jisheng
    Cheng, Qin
    Yang, Mengjie
    Liu, Zhen
    Zhang, Qieshi
    Cheng, Jun
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VIII, 2024, 14432 : 353 - 364