Parallel-branch network for 3D human pose and shape estimation in video

Cited by: 5
Authors
Wu, Yuanhao [1]
Wang, Chenxing [2]
Affiliations
[1] Southeast Univ, Suzhou Res Inst, Nanjing, Peoples R China
[2] Southeast Univ, Nanjing, Peoples R China
Keywords
human pose estimation; parallel networks; transformer;
DOI
10.1002/cav.2078
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Human pose and shape estimation has developed rapidly, and approaches based on the skinned multi-person linear (SMPL) model have recently performed very well. However, the prior template of the human body in the SMPL model is fixed, so the reconstructed body shape may deviate when the person makes abrupt movements, as in sports or dancing. To address this problem, we propose a parallel-branch network consisting of a purpose-designed spatial-temporal (ST) branch and an SMPL branch. The ST branch essentially performs 2D-to-3D lifting for more accurate joint prediction, using a spatial transformer and a temporal transformer. The 3D joints from the ST branch are used to supervise the 3D joints from the SMPL branch and thereby correct the deviation of the SMPL model. Experiments on popular benchmarks such as 3DPW and MPI-INF-3DHP show that our method outperforms other methods that take video input. Our code is available at
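The supervision step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which loss the authors use, so this assumes a mean per-joint Euclidean distance between the two branches' 3D joints. The function name `joint_supervision_loss` and the toy coordinates are hypothetical and are not taken from the paper or its code.

```python
import math
from typing import Sequence

Joint = Sequence[float]  # one 3D joint as (x, y, z)


def joint_supervision_loss(smpl_joints: Sequence[Joint],
                           st_joints: Sequence[Joint]) -> float:
    """Mean Euclidean distance between corresponding 3D joints.

    Assumption: the ST-branch joints act as the supervision target for
    the SMPL-branch joints; the actual loss in the paper may differ.
    """
    if len(smpl_joints) != len(st_joints) or not smpl_joints:
        raise ValueError("both branches must supply the same non-zero number of joints")
    total = sum(math.dist(p, q) for p, q in zip(smpl_joints, st_joints))
    return total / len(smpl_joints)


# Toy example with two joints (hypothetical values).
st = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]    # joints from the ST branch
smpl = [(0.0, 0.0, 0.0), (3.0, 1.0, 4.0)]  # joints from the SMPL branch
loss = joint_supervision_loss(smpl, st)
```

In a training setup, a term of this kind would be minimized with respect to the SMPL branch only, so that the lifted joints pull the SMPL joints toward them rather than the reverse. Whether the authors block gradients in this way is not stated in the abstract.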
Pages: 10