Depth-Based 3D Face Reconstruction and Pose Estimation Using Shape-Preserving Domain Adaptation

Cited by: 6
Authors
Zhong Y. [1]
Pei Y. [1]
Li P. [1]
Guo Y. [2]
Ma G. [1]
Liu M. [3]
Bai W. [4]
Wu W. [4]
Zha H. [1]
Affiliations
[1] Department of Machine Intelligence, Key Laboratory of Machine Perception (MOE), Peking University, Beijing
[2] Department of Computer Science, Luoyang Institute of Science and Technology, Luoyang
[3] Department of IT, USens Incorporation, San Jose, 95110, CA
[4] Department of IT, Huawei Technologies Company Ltd., Beijing
Source
Pei, Yuru (peiyuru@cis.pku.edu.cn) | Year 1600 / Institute of Electrical and Electronics Engineers Inc., Volume / Issue 03
Keywords
Depth-based face reconstruction; pose estimation; shape code regression; shape-preserving domain adaptation
DOI
10.1109/TBIOM.2020.3025466
Abstract
Depth images are widely used in 3D head pose estimation and face reconstruction. The device-specific noise and the lack of textural constraints pose a major problem for estimating a nonrigid deformable face from a single noisy depth image. In this article, we present a deep neural network-based framework to infer a 3D face consistent with a single depth image captured by a consumer depth camera (Kinect). Confronted with a lack of annotated depth images with facial parameters, we utilize a bidirectional CycleGAN-based generator for denoising and noisy image simulation, which helps to generalize the model learned from synthetic depth images to real noisy ones. We generate the code regressors in the source (synthetic) and the target (noisy) depth image domains and present a fusion scheme in the parametric space for 3D face inference. The proposed multi-level shape consistency constraint, concerning the embedded features, depth maps, and 3D surfaces, couples the code regressor and the domain adaptation, avoiding shape distortions in the CycleGAN-based generators. Experiments demonstrate that the proposed method is effective in depth-based 3D head pose estimation and expressive face reconstruction compared with state-of-the-art methods. © 2019 IEEE.
Pages: 6 / 15
Page count: 9