Appearance Consensus Driven Self-supervised Human Mesh Recovery

Cited by: 14
Authors
Kundu, Jogendra Nath [1]
Rakesh, Mugalodi [1]
Jampani, Varun [2]
Venkatesh, Rahul Mysore [1]
Babu, R. Venkatesh [1]
Affiliations
[1] Indian Inst Sci, Bangalore, Karnataka, India
[2] Google Res, Cambridge, MA USA
Source
COMPUTER VISION - ECCV 2020, PT I | 2020 / Vol. 12346
Keywords
HUMAN POSE ESTIMATION; MODEL; SHAPE
DOI
10.1007/978-3-030-58452-8_46
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a self-supervised human mesh recovery framework to infer human pose and shape from monocular images in the absence of any paired supervision. Recent advances have shifted the interest towards directly regressing parameters of a parametric human model by supervising them on large-scale datasets with 2D landmark annotations. This limits the generalizability of such approaches to operate on images from unlabeled wild environments. Acknowledging this, we propose a novel appearance consensus driven self-supervised objective. To effectively disentangle the foreground (FG) human, we rely on image pairs depicting the same person (consistent FG) in varied pose and background (BG), which are obtained from unlabeled wild videos. The proposed FG appearance consistency objective makes use of a novel, differentiable Color-recovery module to obtain vertex colors without the need for any appearance network, via efficient realization of color-picking and reflectional symmetry. We achieve state-of-the-art results on the standard model-based 3D pose estimation benchmarks at comparable supervision levels. Furthermore, the resulting colored mesh prediction opens up the usage of our framework for a variety of appearance-related tasks beyond the pose and shape estimation, thus establishing our superior generalizability.
Pages: 794-812
Number of pages: 19