VoteHMR: Occlusion-Aware Voting Network for Robust 3D Human Mesh Recovery from Partial Point Clouds

Cited by: 16
Authors
Liu, Guanze [1]
Rong, Yu [2]
Sheng, Lu [1]
Affiliations
[1] Beihang Univ, Coll Software, Beijing, Peoples R China
[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China
Source
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021
Funding
National Natural Science Foundation of China;
Keywords
3D human shape reconstruction; occlusion handling; Hough voting; in point clouds; HUMAN POSE; SHAPE;
DOI
10.1145/3474085.3475309
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D human mesh recovery from point clouds is essential for various tasks, including AR/VR and human behavior understanding. Previous works in this field either require high-quality 3D human scans or sequential point clouds, which cannot be easily applied to low-quality 3D scans captured by consumer-level depth sensors. In this paper, we make the first attempt to reconstruct reliable 3D human shapes from single-frame partial point clouds. To achieve this, we propose an end-to-end learnable method, named VoteHMR. The core of VoteHMR is a novel occlusion-aware voting network that can first reliably produce visible joint-level features from the input partial point clouds, and then complete the joint-level features through the kinematic tree of the human skeleton. Compared with holistic features used by previous works, the joint-level features can not only effectively encode the human geometry information but also be robust to noisy inputs with self-occlusions and missing areas. By exploiting the rich complementary clues from the joint-level features and global features from the input point clouds, the proposed method encourages reliable and disentangled parameter predictions for statistical 3D human models, such as SMPL. The proposed method achieves state-of-the-art performance on two large-scale datasets, namely SURREAL and DFAUST. Furthermore, VoteHMR also demonstrates superior generalization ability on real-world datasets, such as Berkeley MHAD.
Pages: 955-964
Number of pages: 10
Related papers
59 in total
[1]   SCAPE: Shape Completion and Animation of People [J].
Anguelov, D ;
Srinivasan, P ;
Koller, D ;
Thrun, S ;
Rodgers, J ;
Davis, J .
ACM TRANSACTIONS ON GRAPHICS, 2005, 24 (03) :408-416
[2]   Dynamic FAUST: Registering Human Bodies in Motion [J].
Bogo, Federica ;
Romero, Javier ;
Pons-Moll, Gerard ;
Black, Michael J. .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :5573-5582
[3]   Keep It SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image [J].
Bogo, Federica ;
Kanazawa, Angjoo ;
Lassner, Christoph ;
Gehler, Peter ;
Romero, Javier ;
Black, Michael J. .
COMPUTER VISION - ECCV 2016, PT V, 2016, 9909 :561-578
[4]   Detailed Full-Body Reconstructions of Moving People from Monocular RGB-D Sequences [J].
Bogo, Federica ;
Black, Michael J. ;
Loper, Matthew ;
Romero, Javier .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :2300-2308
[5]   Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds [J].
Cheng, Bowen ;
Sheng, Lu ;
Shi, Shaoshuai ;
Yang, Ming ;
Xu, Dong .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :8959-8968
[6]   Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis [J].
Dai, Angela ;
Qi, Charles Ruizhongtai ;
Niessner, Matthias .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :6545-6554
[7]   Fusion4D: Real-time Performance Capture of Challenging Scenes [J].
Dou, Mingsong ;
Kowdle, Adarsh ;
Khamis, Sameh ;
Escolano, Sergio Orts ;
Kohli, Pushmeet ;
Degtyarev, Yury ;
Rhemann, Christoph ;
Tankovich, Vladimir ;
Davidson, Philip ;
Kim, David ;
Izadi, Shahram ;
Fanello, Sean Ryan ;
Taylor, Jonathan .
ACM TRANSACTIONS ON GRAPHICS, 2016, 35 (04)
[8]  
Dou MS, 2015, PROC CVPR IEEE, P493, DOI 10.1109/CVPR.2015.7298647
[9]   Point-to-Point Regression PointNet for 3D Hand Pose Estimation [J].
Ge, Liuhao ;
Ren, Zhou ;
Yuan, Junsong .
COMPUTER VISION - ECCV 2018, PT XIII, 2018, 11217 :489-505
[10]   3D-CODED: 3D Correspondences by Deep Deformation [J].
Groueix, Thibault ;
Fisher, Matthew ;
Kim, Vladimir G. ;
Russell, Bryan C. ;
Aubry, Mathieu .
COMPUTER VISION - ECCV 2018, PT II, 2018, 11206 :235-251