Bidirectional temporal feature for 3D human pose and shape estimation from a video

Cited by: 9
Authors:
Sun, Libo [1 ,2 ]
Tang, Ting [1 ]
Qu, Yuke [1 ]
Qin, Wenhu [1 ,2 ]
Affiliations:
[1] Southeast Univ, Sch Instrument Sci & Engn, Nanjing, Peoples R China
[2] Southeast Univ, Sch Instrument Sci & Engn, Nanjing 210096, Peoples R China
Keywords:
Bi-LSTM; human pose and shape estimation; transformer
DOI:
10.1002/cav.2187
CLC Number:
TP31 [Computer Software]
Discipline Codes:
081202; 0835
Abstract:
3D human pose and shape estimation is the foundation of human motion analysis. However, estimating accurate and temporally consistent 3D human motion from video remains challenging. To date, most video-based methods for estimating 3D human pose and shape rely only on unidirectional temporal features and therefore miss more comprehensive temporal cues. To address this problem, we propose a novel model, "bidirectional temporal feature for human motion recovery" (BTMR), which consists of a human motion generator and a motion discriminator. The transformer-based generator captures both forward and reverse temporal features, which strengthens the temporal correlation between frames and reduces the loss of spatial information. The Bi-LSTM-based motion discriminator judges whether the generated pose sequences are consistent with real sequences from the AMASS dataset. Through this continual process of generation and discrimination, the model learns more realistic and accurate poses. We evaluate BTMR on the 3DPW and MPI-INF-3DHP datasets. Without using the 3DPW training set, BTMR outperforms VIBE by 2.4 mm in PA-MPJPE and 14.9 mm/s² in the Accel metric, and outperforms TCMR by 1.7 mm in PA-MPJPE on 3DPW. The results demonstrate that BTMR produces more accurate and temporally consistent 3D human motion.
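As a rough illustration of the architecture the abstract describes, the sketch below pairs a transformer encoder run over both the forward and the time-reversed feature sequence (the bidirectional temporal generator) with a Bi-LSTM sequence discriminator. It is a minimal PyTorch sketch under assumed settings: the 2048-d per-frame features, the 85-d SMPL-style output (72 pose + 10 shape + 3 camera), the layer counts, and all module names are illustrative assumptions, not the authors' implementation.

    # Minimal sketch (assumptions): BTMR-style bidirectional temporal generator and
    # Bi-LSTM motion discriminator; dimensions and module names are illustrative only.
    import torch
    import torch.nn as nn

    class BidirectionalTemporalGenerator(nn.Module):
        def __init__(self, feat_dim=2048, d_model=512, n_layers=3, n_heads=8, out_dim=85):
            super().__init__()
            self.proj = nn.Linear(feat_dim, d_model)
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)
            self.head = nn.Linear(2 * d_model, out_dim)  # fuse forward + reverse features

        def forward(self, x):                              # x: (batch, frames, feat_dim)
            x = self.proj(x)
            fwd = self.encoder(x)                          # forward temporal features
            rev = self.encoder(torch.flip(x, dims=[1]))    # reverse temporal features
            rev = torch.flip(rev, dims=[1])                # re-align with the time axis
            return self.head(torch.cat([fwd, rev], dim=-1))  # per-frame SMPL parameters

    class MotionDiscriminator(nn.Module):
        def __init__(self, pose_dim=72, hidden=512):
            super().__init__()
            self.bilstm = nn.LSTM(pose_dim, hidden, batch_first=True, bidirectional=True)
            self.fc = nn.Linear(2 * hidden, 1)

        def forward(self, poses):                          # poses: (batch, frames, pose_dim)
            h, _ = self.bilstm(poses)
            return self.fc(h[:, -1])                       # real (AMASS) vs. generated score

    # Example shapes: 16-frame clips of 2048-d features in, per-frame SMPL parameters out.
    gen, disc = BidirectionalTemporalGenerator(), MotionDiscriminator()
    params = gen(torch.randn(2, 16, 2048))                 # (2, 16, 85)
    score = disc(params[..., :72])                         # (2, 1)

The Accel metric quoted above is commonly computed as the mean discrepancy between the second finite differences (per-frame accelerations) of predicted and ground-truth 3D joints; a hypothetical helper under that assumption:

    def accel_error(pred, gt):
        # pred, gt: (frames, joints, 3) joint positions in mm; returns mm per frame^2
        a_pred = pred[:-2] - 2 * pred[1:-1] + pred[2:]
        a_gt = gt[:-2] - 2 * gt[1:-1] + gt[2:]
        return torch.linalg.norm(a_pred - a_gt, dim=-1).mean()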
Pages: 13