Bidirectional temporal feature for 3D human pose and shape estimation from a video

Cited by: 9

Authors:

Sun, Libo [1,2]

Tang, Ting [1]

Qu, Yuke [1]

Qin, Wenhu [1,2]

Affiliations:

[1] Southeast Univ, Sch Instrument Sci & Engn, Nanjing, Peoples R China

[2] Southeast Univ, Sch Instrument Sci & Engn, Nanjing 210096, Peoples R China

Source:

COMPUTER ANIMATION AND VIRTUAL WORLDS | 2023 / Vol. 34 / Issue 3-4

Keywords:

Bi-LSTM; human pose and shape estimation; transformer;

DOI:

10.1002/cav.2187

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

3D human pose and shape estimation is the foundation of human motion analysis. However, estimating accurate and temporally consistent 3D human motion from a video remains a challenge. To date, most video-based methods for estimating 3D human pose and shape rely on unidirectional temporal features and therefore lack more comprehensive information. To solve this problem, we propose a novel model, "bidirectional temporal feature for human motion recovery" (BTMR), which consists of a human motion generator and a discriminator. The transformer-based generator captures both forward and reverse temporal features to strengthen the temporal correlation between frames and to reduce the loss of spatial information. The motion discriminator, based on Bi-LSTM, distinguishes whether the generated pose sequences are consistent with realistic sequences from the AMASS dataset. Through continuous generation and discrimination, the model learns more realistic and accurate poses. We evaluate BTMR on the 3DPW and MPI-INF-3DHP datasets. Without using the 3DPW training set, BTMR outperforms VIBE by 2.4 mm in PA-MPJPE and by 14.9 mm/s^2 in Accel, and outperforms TCMR by 1.7 mm in PA-MPJPE on 3DPW. The results demonstrate that BTMR produces more accurate and more temporally consistent 3D human motion.
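The Accel metric quoted in the abstract measures temporal consistency. As a rough illustration only, the sketch below computes one common form of acceleration error: the mean distance between the finite-difference accelerations of predicted and ground-truth joints. The function names, the nested-list input layout (frames, then joints, then x/y/z in mm), and the `fps` scaling are assumptions made for this example and are not taken from the paper or its code.

```python
import math


def second_difference(seq):
    """Finite-difference acceleration x[t-1] - 2*x[t] + x[t+1] per joint."""
    return [
        [
            tuple(p - 2.0 * c + n for p, c, n in zip(prev_j, cur_j, next_j))
            for prev_j, cur_j, next_j in zip(prev_f, cur_f, next_f)
        ]
        for prev_f, cur_f, next_f in zip(seq, seq[1:], seq[2:])
    ]


def accel_error(pred, gt, fps=1.0):
    """Mean Euclidean distance between predicted and ground-truth accelerations.

    pred, gt: lists of frames; each frame is a list of (x, y, z) joints in mm.
    fps: hypothetical frame rate; the result is scaled by fps**2 so that it
    reads as mm/s^2 (with fps=1.0 it is mm per frame^2).
    """
    if len(pred) != len(gt) or len(pred) < 3:
        raise ValueError("need two sequences of equal length with >= 3 frames")
    acc_pred = second_difference(pred)
    acc_gt = second_difference(gt)
    errors = [
        math.dist(a_p, a_g)
        for frame_p, frame_g in zip(acc_pred, acc_gt)
        for a_p, a_g in zip(frame_p, frame_g)
    ]
    return fps ** 2 * sum(errors) / len(errors)


# One joint moving at constant velocity (ground truth) vs. a prediction
# that jumps ahead in the last frame.
gt_seq = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
pred_seq = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)]]
print(accel_error(pred_seq, gt_seq))  # 1.0
```

A lower value means the predicted motion is smoother relative to the ground truth; a reduction in this quantity is what the abstract's 14.9 mm/s^2 improvement over VIBE refers to.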


Pages: 13

50 items in total

[1] 3D Human Pose Estimation in Video with Temporal and Spatial Transformer
Peng, Sha
Hu, Jiwei
Proceedings of SPIE - The International Society for Optical Engineering, 2023, 12707
[2] Capturing Humans in Motion: Temporal-Attentive 3D Human Pose and Shape Estimation from Monocular Video
Wei, Wen-Li
Lin, Jen-Chun
Liu, Tyng-Luh
Liao, Hong-Yuan Mark
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2022, 2022-June : 13201 - 13210
[3] Capturing Humans in Motion: Temporal-Attentive 3D Human Pose and Shape Estimation from Monocular Video
Wei, Wen-Li
Lin, Jen-Chun
Liu, Tyng-Luh
Liao, Hong-Yuan Mark
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 13201 - 13210
[4] Capturing Humans in Motion: Temporal-Attentive 3D Human Pose and Shape Estimation from Monocular Video
Wei, Wen-Li
Lin, Jen-Chun
Liu, Tyng-Luh
Liao, Hong-Yuan Mark
arXiv, 2022,
[5] Joint Path Alignment Framework for 3D Human Pose and Shape Estimation From Video
Hong, Ji Woo
Yoon, Sunjae
Kim, Junyeong
Yoo, Chang D.
IEEE ACCESS, 2023, 11 : 43267 - 43275
[6] Parallel-branch network for 3D human pose and shape estimation in video
Wu, Yuanhao
Wang, Chenxing
COMPUTER ANIMATION AND VIRTUAL WORLDS, 2022, 33 (3-4)
[7] Multi-Person Absolute 3D Pose and Shape Estimation from Video
Zhang, Kaifu
Li, Yihui
Guan, Yisheng
Xi, Ning
INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2021, PT III, 2021, 13015 : 189 - 200
[8] Multi-scale Feature Injection for Occluded 3D Human Pose and Shape Estimation
Shi, Yunhui
Ge, Yangyang
Wang, Jin
2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 4881 - 4886
[9] Self-supervised 3D human pose estimation from video
Gholami, Mohsen
Rezaei, Ahmad
Rhodin, Helge
Ward, Rabab
Wang, Z. Jane
NEUROCOMPUTING, 2022, 488 : 97 - 106
[10] Sequential 3D Human Pose and Shape Estimation from Point Clouds
Wang, Kangkan
Xie, Jin
Zhang, Guofeng
Liu, Lei
Yang, Jian
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 7273 - 7282
