Evaluation of hybrid deep learning and optimization method for 3D human pose and shape reconstruction in simulated depth images

Cited by: 2
Authors
Wang, Xiaofang [1, 2]
Prevost, Stephanie [2]
Boukhayma, Adnane [3]
Desjardin, Eric [2]
Loscos, Celine [2]
Morisset, Benoit [1]
Multon, Franck [3, 4]
Affiliations
[1] AI Verse, Biot, France
[2] Univ Reims, Reims, France
[3] Univ Rennes, Inria, CNRS, IRISA, Rennes, France
[4] Univ Rennes, M2S, Rennes, France
Source
COMPUTERS & GRAPHICS-UK | 2023 / Vol. 115
Keywords
Human motion capture; Shape reconstruction; Deep learning; Computer vision; Depth sensor
DOI
10.1016/j.cag.2023.07.005
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
In this paper, we address the problem of capturing both the shape and the pose of a human character using a single depth sensor. Some previous works proposed fitting a generic parametric human template to the depth image, while others developed deep learning (DL) approaches to find the correspondence between depth pixels and vertices of the template. We designed a hybrid approach that combines the advantages of both methods, and conducted extensive experiments on the SURREAL (Varol et al., 2017) and DFAUST (Bogo et al., 2017) datasets and on a subset of AMASS (Mahmood et al., 2019). Results show that this hybrid approach improves pose and shape estimation compared to using DL or model fitting separately. We also evaluated the ability of the DL-based dense correspondence method to segment the background in addition to the body parts. Finally, we evaluated four different methods for performing the model fitting from a dense correspondence when the number of available 3D points differs from the number of corresponding template vertices. These two results enabled us to better understand how to combine DL and model fitting, and the potential limits of this approach when dealing with real depth images. Future work could explore taking temporal information into account, which has been shown to increase the accuracy of pose and shape reconstruction based on a single depth or RGB image. © 2023 Elsevier Ltd. All rights reserved.
Pages: 158-166
Page count: 9
Related papers
51 in total
[1]   SCAPE: Shape Completion and Animation of People [J].
Anguelov, D ;
Srinivasan, P ;
Koller, D ;
Thrun, S ;
Rodgers, J ;
Davis, J .
ACM TRANSACTIONS ON GRAPHICS, 2005, 24 (03) :408-416
[2]  
[Anonymous], VERS
[3]  
Baak A, 2011, IEEE I CONF COMP VIS, P1092, DOI 10.1109/ICCV.2011.6126356
[4]   Real-time RGBD-based Extended Body Pose Estimation [J].
Bashirov, Renat ;
Ianina, Anastasia ;
Iskakov, Karim ;
Kononenko, Yevgeniy ;
Strizhkova, Valeriya ;
Lempitsky, Victor ;
Vakhitov, Alexander .
2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, :2806-2815
[5]  
Bhatnagar Bharat Lal, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12347), P311, DOI 10.1007/978-3-030-58536-5_19
[6]   Dynamic FAUST: Registering Human Bodies in Motion [J].
Bogo, Federica ;
Romero, Javier ;
Pons-Moll, Gerard ;
Black, Michael J. .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :5573-5582
[7]   Keep It SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image [J].
Bogo, Federica ;
Kanazawa, Angjoo ;
Lassner, Christoph ;
Gehler, Peter ;
Romero, Javier ;
Black, Michael J. .
COMPUTER VISION - ECCV 2016, PT V, 2016, 9909 :561-578
[8]   Detailed Full-Body Reconstructions of Moving People from Monocular RGB-D Sequences [J].
Bogo, Federica ;
Black, Michael J. ;
Loper, Matthew ;
Romero, Javier .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :2300-2308
[9]   Monocular human pose estimation: A survey of deep learning-based methods [J].
Chen, Yucheng ;
Tian, Yingli ;
He, Mingyi .
COMPUTER VISION AND IMAGE UNDERSTANDING, 2020, 192
[10]  
Choi Hongsuk, 2020, EUROPEAN C COMPUTER