Towards Accurate Marker-less Human Shape and Pose Estimation over Time

Cited by: 156

Authors:

Huang, Yinghao ^{[1]}

Bogo, Federica ^{[2]}

Lassner, Christoph ^{[3,7]}

Kanazawa, Angjoo ^{[4]}

Gehler, Peter V. ^{[5,7]}

Romero, Javier ^{[3]}

Akhter, Ijaz ^{[6]}

Black, Michael J. ^{[1]}

Affiliations:

[1] Max Planck Inst Intelligent Syst, Tubingen, Germany

[2] Microsoft, Redmond, WA USA

[3] Body Labs Inc, New York, NY 10003 USA

[4] Univ Calif Berkeley, Berkeley, CA USA

[5] Univ Wurzburg, Wurzburg, Germany

[6] Australian Natl Univ, Canberra, ACT, Australia

[7] MPI Intelligent Syst, Stuttgart, Germany

Source:

PROCEEDINGS 2017 INTERNATIONAL CONFERENCE ON 3D VISION (3DV) | 2017

Keywords:

3D reconstruction; shape and pose estimation; multi-view; marker-less; body model; body motion capture

DOI:

10.1109/3DV.2017.00055

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Existing markerless motion capture methods often assume known backgrounds, static cameras, and sequence-specific motion priors, limiting their application scenarios. Here we present a fully automatic method that, given multi-view videos, estimates 3D human pose and body shape. We take the recently proposed SMPLify method [12] as the base method and extend it in several ways. First, we fit a 3D human body model to 2D features detected in multi-view images. Second, we use a CNN method to segment the person in each image and fit the 3D body model to the contours, further improving accuracy. Third, we utilize a generic and robust DCT temporal prior to handle the left and right side swapping issue sometimes introduced by the 2D pose estimator. Validation on standard benchmarks shows that our results are comparable to the state of the art and also provide a realistic 3D shape avatar. We also demonstrate accurate results on HumanEva and on challenging monocular sequences of dancing from YouTube.

Citation

Pages: 421-430

Page count: 10

References (52 in total)

[1] Bilinear Spatiotemporal Basis Models [J]. Akhter, Ijaz; Simon, Tomas; Khan, Sohaib; Matthews, Iain; Sheikh, Yaser. ACM TRANSACTIONS ON GRAPHICS, 2012, 31 (02): 1-12

[2] Multi-view Pictorial Structures for 3D Human Pose Estimation [J]. Amin, Sikandar; Andriluka, Mykhaylo; Rohrbach, Marcus; Schiele, Bernt. PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE 2013, 2013

[3] SCAPE: Shape Completion and Animation of People [J]. Anguelov, D; Srinivasan, P; Koller, D; Thrun, S; Rodgers, J; Davis, J. ACM TRANSACTIONS ON GRAPHICS, 2005, 24 (03): 408-416

[4] [Anonymous], 2005, P ACM S VIRT REAL SO, DOI 10.1145/1101616.1101668

[5] [Anonymous], 2016, ARXIV161109010

[6] [Anonymous], PROC CVPR IEEE, DOI 10.1109/CVPR.2000.854758

[7] Balan A.O., 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition, P1, DOI 10.1109/CVPR.2007.383340

[8] Balan AO, 2008, LECT NOTES COMPUT SC, V5303, P15, DOI 10.1007/978-3-540-88688-4_2

[9] Ballan L., 2008, 3DPVT

[10] 3D Pictorial Structures for Multiple Human Pose Estimation [J]. Belagiannis, Vasileios; Amin, Sikandar; Andriluka, Mykhaylo; Schiele, Bernt; Navab, Nassir; Ilic, Slobodan. 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014: 1669-1676
