LiveCap: Real-Time Human Performance Capture From Monocular Video

Cited by: 172

Authors:

Habermann, Marc ^{[1]}

Xu, Weipeng ^{[1]}

Zollhofer, Michael ^{[2]}

Pons-Moll, Gerard ^{[1]}

Theobalt, Christian ^{[1]}

Affiliations:

[1] Max Planck Inst Informat, Campus E1 4, Stuhlsatzenhausweg, D-66123 Saarbrucken, Germany

[2] Stanford Univ, 353 Serra Mall, RM 386, Stanford, CA 94305 USA

Source:

ACM TRANSACTIONS ON GRAPHICS | 2019 / Vol. 38 / No. 02

Keywords:

Monocular performance capture; 3D pose estimation; human body; non-rigid surface deformation; INTERACTING CHARACTERS; MOTION CAPTURE; SHAPE; POSE; TRACKING;

DOI:

10.1145/3311970

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

We present the first real-time human performance capture approach that reconstructs dense, space-time coherent deforming geometry of entire humans in general everyday clothing from just a single RGB video. We propose a novel two-stage analysis-by-synthesis optimization whose formulation and implementation are designed for high performance. In the first stage, a skinned template model is jointly fitted to background subtracted input video, 2D and 3D skeleton joint positions found using a deep neural network, and a set of sparse facial landmark detections. In the second stage, dense non-rigid 3D deformations of skin and even loose apparel are captured based on a novel real-time capable algorithm for non-rigid tracking using dense photometric and silhouette constraints. Our novel energy formulation leverages automatically identified material regions on the template to model the differing non-rigid deformation behavior of skin and apparel. The two resulting non-linear optimization problems per frame are solved with specially tailored data-parallel Gauss-Newton solvers. To achieve real-time performance of over 25Hz, we design a pipelined parallel architecture using the CPU and two commodity GPUs. Our method is the first real-time monocular approach for full-body performance capture. Our method yields comparable accuracy with off-line performance capture techniques while being orders of magnitude faster.


Pages: 17

100 references in total

