LiveCap: Real-Time Human Performance Capture From Monocular Video

Cited by: 172
Authors
Habermann, Marc [1 ]
Xu, Weipeng [1 ]
Zollhofer, Michael [2 ]
Pons-Moll, Gerard [1 ]
Theobalt, Christian [1 ]
Affiliations
[1] Max Planck Inst Informat, Campus E1 4,Stuhlsatzenhausweg, D-66123 Saarbrucken, Germany
[2] Stanford Univ, 353 Serra Mall,RM 386, Stanford, CA 94305 USA
Source
ACM TRANSACTIONS ON GRAPHICS | 2019 / Vol. 38 / Issue 02
Keywords
Monocular performance capture; 3D pose estimation; human body; non-rigid surface deformation; INTERACTING CHARACTERS; MOTION CAPTURE; SHAPE; POSE; TRACKING;
DOI
10.1145/3311970
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
We present the first real-time human performance capture approach that reconstructs dense, space-time coherent deforming geometry of entire humans in general everyday clothing from just a single RGB video. We propose a novel two-stage analysis-by-synthesis optimization whose formulation and implementation are designed for high performance. In the first stage, a skinned template model is jointly fitted to background-subtracted input video, 2D and 3D skeleton joint positions found using a deep neural network, and a set of sparse facial landmark detections. In the second stage, dense non-rigid 3D deformations of skin and even loose apparel are captured based on a novel real-time capable algorithm for non-rigid tracking using dense photometric and silhouette constraints. Our novel energy formulation leverages automatically identified material regions on the template to model the differing non-rigid deformation behavior of skin and apparel. The two resulting non-linear optimization problems per frame are solved with specially tailored data-parallel Gauss-Newton solvers. To achieve real-time performance of over 25 Hz, we design a pipelined parallel architecture using the CPU and two commodity GPUs. Our method is the first real-time monocular approach for full-body performance capture. Our method yields accuracy comparable to off-line performance capture techniques while being orders of magnitude faster.
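Note on the solver named in the abstract: Gauss-Newton minimizes a sum of squared residuals by repeatedly linearizing the residuals and solving the resulting normal equations. The sketch below only illustrates that iteration on a toy two-parameter curve fit. It is not the authors' data-parallel GPU solver or their energy formulation; the exponential model, the function names, the data, and the starting guess are all assumptions made for illustration.

```python
import math


def gauss_newton(xs, ys, a, b, iterations=20):
    """Fit y = a * exp(b * x) by Gauss-Newton (toy model, illustrative only)."""
    for _ in range(iterations):
        # Residuals r_i = model(x_i) - y_i and their partial derivatives.
        r = [a * math.exp(b * x) - y for x, y in zip(xs, ys)]
        j_a = [math.exp(b * x) for x in xs]          # dr_i / da
        j_b = [a * x * math.exp(b * x) for x in xs]  # dr_i / db

        # Normal equations: (J^T J) * delta = -J^T r, a 2x2 linear system.
        h11 = sum(u * u for u in j_a)
        h12 = sum(u * v for u, v in zip(j_a, j_b))
        h22 = sum(v * v for v in j_b)
        g1 = sum(u * ri for u, ri in zip(j_a, r))
        g2 = sum(v * ri for v, ri in zip(j_b, r))

        det = h11 * h22 - h12 * h12
        if abs(det) < 1e-12:
            break  # system is (near-)singular; stop rather than divide by ~0

        # Closed-form solution of the 2x2 system (Cramer's rule).
        a += (-g1 * h22 + g2 * h12) / det
        b += (-g2 * h11 + g1 * h12) / det
    return a, b


# Noise-free synthetic data generated from a = 2.0, b = 0.5 (assumed values).
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_hat, b_hat = gauss_newton(xs, ys, a=1.5, b=0.3)
```

Because the synthetic data is noise-free and the starting guess is close to the true parameters, the iteration should recover them to high precision. A real-time system such as the one described would apply the same update to far larger problems, which is why the abstract emphasizes data-parallel solvers.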
Pages: 17
Related papers
100 in total
[1]  
Allain B, 2015, PROC CVPR IEEE, P268, DOI 10.1109/CVPR.2015.7298623
[2]   SCAPE: Shape Completion and Animation of People [J].
Anguelov, D ;
Srinivasan, P ;
Koller, D ;
Thrun, S ;
Rodgers, J ;
Davis, J .
ACM TRANSACTIONS ON GRAPHICS, 2005, 24 (03) :408-416
[3]  
[Anonymous], 2014, ACM T GRAPHIC, DOI 10.1145/2601097.2601165
[4]  
[Anonymous], 2012, ACM TOG PROC SIGGRAP
[5]  
Balan AO, 2008, LECT NOTES COMPUT SC, V5303, P15, DOI 10.1007/978-3-540-88688-4_2
[6]  
Balan AO, 2007, IEEE I CONF COMP VIS, P1379
[7]   Shape-from-Template [J].
Bartoli, Adrien ;
Gerard, Yan ;
Chadebecq, Francois ;
Collins, Toby ;
Pizarro, Daniel .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2015, 37 (10) :2099-2118
[8]   Keep It SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image [J].
Bogo, Federica ;
Kanazawa, Angjoo ;
Lassner, Christoph ;
Gehler, Peter ;
Romero, Javier ;
Black, Michael J. .
COMPUTER VISION - ECCV 2016, PT V, 2016, 9909 :561-578
[9]   Detailed Full-Body Reconstructions of Moving People from Monocular RGB-D Sequences [J].
Bogo, Federica ;
Black, Michael J. ;
Loper, Matthew ;
Romero, Javier .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :2300-2308
[10]  
Bray M, 2006, LECT NOTES COMPUT SC, V3952, P642