Real-Time 3D Facial Tracking via Cascaded Compositional Learning

Cited by: 5
Authors
Lou, Jianwen [1]
Cai, Xiaoxu [1]
Dong, Junyu [2]
Yu, Hui [1]
Affiliations
[1] Univ Portsmouth, Sch Creat Technol, Portsmouth PO1 2DJ, Hants, England
[2] Ocean Univ China, Sch Informat Sci & Engn, Qingdao 266100, Peoples R China
Funding
National Natural Science Foundation of China; Engineering and Physical Sciences Research Council (UK)
Keywords
Training; Three-dimensional displays; Faces; Videos; Biological system modeling; Real-time systems; Geometry; 3D facial tracking; compositional learning; boosted ferns; synthetic training imagery; FACE ALIGNMENT;
DOI
10.1109/TIP.2021.3065819
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose to learn a cascade of globally-optimized modular boosted ferns (GoMBF) to solve multi-modal facial motion regression for real-time 3D facial tracking from a monocular RGB camera. GoMBF is a deep composition of multiple regression models, each of which is a boosted ferns model initially trained to predict partial motion parameters of the same modality; these models are then concatenated via a global optimization step to form a single strong boosted ferns model that can effectively handle the whole regression target. It explicitly copes with the variety of modalities in the output variables, while showing greater fitting power and faster learning than conventional boosted ferns. By further cascading a sequence of GoMBFs (GoMBF-Cascade) to regress facial motion parameters, we achieve tracking performance on a variety of in-the-wild videos that is competitive with state-of-the-art methods, which either have higher computational complexity or require much more training data. GoMBF-Cascade provides a robust and elegant solution to real-time 3D facial tracking using a small set of training data, which makes it more practical in real-world applications. We further investigate in depth the effect of synthesized facial images on training non-deep-learning methods such as GoMBF-Cascade for 3D facial tracking. We apply three types of synthetic images with various levels of naturalness to train two different tracking methods, and compare the performance of tracking models trained on real data, on synthetic data, and on a mixture of the two. The experimental results indicate that i) a model trained purely on synthetic facial imagery can hardly generalize to unconstrained real-world data, and ii) including synthetic faces in training benefits tracking in certain scenarios but degrades the tracking model's generalization ability. These two insights could benefit a range of non-deep-learning facial image analysis tasks where labelled real data is difficult to acquire.
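For readers unfamiliar with the "conventional boosted ferns" that the abstract uses as its baseline, the following is a minimal sketch of that general technique on toy data. It is not the authors' GoMBF: the per-modality modules, the global optimization step, and the cascade are not reproduced here. The class and function names (`Fern`, `fit_boosted_ferns`, `predict_boosted_ferns`), the random feature and threshold selection, the shrinkage value, and the toy regression targets are all assumptions chosen for illustration.

```python
import numpy as np


class Fern:
    """One random fern: n_tests binary threshold tests index 2**n_tests bins."""

    def __init__(self, n_tests, rng):
        self.n_tests = n_tests
        self.rng = rng

    def fit(self, X, residual, shrinkage=10.0):
        d = X.shape[1]
        # Assumption: each test compares one randomly chosen feature
        # against a random threshold drawn from that feature's range.
        self.features = self.rng.integers(0, d, size=self.n_tests)
        lo = X[:, self.features].min(axis=0)
        hi = X[:, self.features].max(axis=0)
        self.thresholds = self.rng.uniform(lo, hi)
        bins = self._bin_index(X)
        self.outputs = np.zeros((2 ** self.n_tests, residual.shape[1]))
        for b in range(2 ** self.n_tests):
            members = residual[bins == b]
            if len(members):
                # Shrunken mean damps the output of sparsely populated bins.
                self.outputs[b] = members.sum(axis=0) / (len(members) + shrinkage)
        return self

    def _bin_index(self, X):
        # Pack the n_tests binary test results into one integer bin index.
        bits = (X[:, self.features] > self.thresholds).astype(int)
        return bits @ (1 << np.arange(self.n_tests))

    def predict(self, X):
        return self.outputs[self._bin_index(X)]


def fit_boosted_ferns(X, Y, n_ferns=200, n_tests=4, seed=0):
    """Fit ferns sequentially, each one to the residual left by the previous ones."""
    rng = np.random.default_rng(seed)
    ferns, residual = [], Y.copy()
    for _ in range(n_ferns):
        fern = Fern(n_tests, rng).fit(X, residual)
        residual = residual - fern.predict(X)
        ferns.append(fern)
    return ferns


def predict_boosted_ferns(ferns, X):
    """The boosted prediction is the sum of all fern outputs."""
    return sum(f.predict(X) for f in ferns)


# Toy usage: regress a two-dimensional target from five input features.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(300, 5))
Y = np.column_stack([np.sin(3 * X[:, 0]), X[:, 1] * X[:, 2]])
ferns = fit_boosted_ferns(X, Y)
mse = np.mean((Y - predict_boosted_ferns(ferns, X)) ** 2)
print(f"training MSE: {mse:.4f}")
```

Each fern here regresses all output dimensions jointly. The abstract's modular design would instead train separate boosted ferns per output modality before combining them; how that combination is optimized is not specified in this record.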
Pages: 3844-3857
Number of pages: 14