Real-Time 3D Facial Tracking via Cascaded Compositional Learning

Cited by: 5
Authors
Lou, Jianwen [1]
Cai, Xiaoxu [1]
Dong, Junyu [2]
Yu, Hui [1]
Affiliations
[1] Univ Portsmouth, Sch Creat Technol, Portsmouth PO1 2DJ, Hants, England
[2] Ocean Univ China, Sch Informat Sci & Engn, Qingdao 266100, Peoples R China
Funding
National Natural Science Foundation of China; Engineering and Physical Sciences Research Council (UK)
Keywords
Training; Three-dimensional displays; Faces; Videos; Biological system modeling; Real-time systems; Geometry; 3D facial tracking; compositional learning; boosted ferns; synthetic training imagery; FACE ALIGNMENT;
DOI
10.1109/TIP.2021.3065819
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose to learn a cascade of globally-optimized modular boosted ferns (GoMBF) to solve multi-modal facial motion regression for real-time 3D facial tracking from a monocular RGB camera. GoMBF is a deep composition of multiple regression models, each of which is a boosted ferns model initially trained to predict partial motion parameters of the same modality. These models are then concatenated via a global optimization step to form a single strong boosted ferns model that can effectively handle the whole regression target. GoMBF explicitly copes with the modality variety in the output variables, while showing increased fitting power and a faster learning speed compared with conventional boosted ferns. By further cascading a sequence of GoMBFs (GoMBF-Cascade) to regress facial motion parameters, we achieve competitive tracking performance on a variety of in-the-wild videos compared with state-of-the-art methods that either have higher computational complexity or require much more training data. The approach provides a robust and elegant solution to real-time 3D facial tracking using a small set of training data, which makes it more practical in real-world applications. We further investigate in depth the effect of synthesized facial images on training non-deep learning methods such as GoMBF-Cascade for 3D facial tracking. We apply three types of synthetic images with various naturalness levels to train two different tracking methods, and compare the performance of the tracking models trained on real data, on synthetic data and on a mixture of both. The experimental results indicate that i) a model trained purely on synthetic facial imagery can hardly generalize to unconstrained real-world data, and ii) including synthetic faces in training benefits tracking in certain scenarios but degrades the tracking model's generalization ability. These two insights could benefit a range of non-deep learning facial image analysis tasks where labelled real data is difficult to acquire.
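The composition described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract gives no implementation details, so everything below is an assumption made for illustration. In particular, the names Fern, fit_boosted_ferns, fit_gombf_like and fit_cascade are hypothetical, the fern tests use random features and random thresholds, the "global optimization step" is stood in for by a ridge least-squares refit of all leaf outputs against the full target, and the input features are kept fixed across cascade stages (a real tracker would normally re-extract features around the current estimate).

```python
import numpy as np


class Fern:
    """One random fern: `depth` threshold tests select one of 2**depth bins."""

    def __init__(self, n_features, depth, rng):
        self.depth = depth
        self.idx = rng.integers(0, n_features, size=depth)  # features to test
        self.thr = rng.normal(size=depth)                    # random thresholds
        self.out = None

    def bins(self, X):
        bits = (X[:, self.idx] > self.thr).astype(int)
        return bits @ (2 ** np.arange(self.depth))           # bin index per row

    def fit(self, X, R, shrink=10.0):
        b = self.bins(X)
        self.out = np.zeros((2 ** self.depth, R.shape[1]))
        for k in range(2 ** self.depth):
            mask = b == k
            if mask.any():                                   # shrunken bin mean
                self.out[k] = R[mask].sum(axis=0) / (mask.sum() + shrink)
        return self

    def predict(self, X):
        return self.out[self.bins(X)]


def fit_boosted_ferns(X, Y, n_ferns, depth, rng, lr=0.5):
    """Conventional boosting: each fern fits the residual left by the others."""
    ferns, residual = [], Y.copy()
    for _ in range(n_ferns):
        fern = Fern(X.shape[1], depth, rng).fit(X, residual)
        fern.out *= lr
        residual = residual - fern.predict(X)
        ferns.append(fern)
    return ferns


def indicator_matrix(ferns, X):
    """One-hot encoding of the bin each sample falls into, for every fern."""
    n_bins = 2 ** ferns[0].depth
    phi = np.zeros((X.shape[0], len(ferns) * n_bins))
    rows = np.arange(X.shape[0])
    for j, fern in enumerate(ferns):
        phi[rows, j * n_bins + fern.bins(X)] = 1.0
    return phi


def fit_gombf_like(X, Y, groups, n_ferns, depth, rng, lam=1.0):
    """Train one boosted-ferns module per output group, then refit all leaf
    outputs jointly on the full target (assumed stand-in for the global step)."""
    ferns = []
    for cols in groups:                                      # one module per modality
        ferns += fit_boosted_ferns(X, Y[:, cols], n_ferns, depth, rng)
    phi = indicator_matrix(ferns, X)
    gram = phi.T @ phi + lam * np.eye(phi.shape[1])
    weights = np.linalg.solve(gram, phi.T @ Y)               # ridge solution
    return ferns, weights


def predict_gombf_like(model, X):
    ferns, weights = model
    return indicator_matrix(ferns, X) @ weights


def fit_cascade(X, Y, groups, n_stages, n_ferns, depth, rng):
    """Each stage regresses what the previous stages left unexplained."""
    stages, estimate = [], np.zeros_like(Y)
    for _ in range(n_stages):
        model = fit_gombf_like(X, Y - estimate, groups, n_ferns, depth, rng)
        estimate = estimate + predict_gombf_like(model, X)
        stages.append(model)
    return stages


def predict_cascade(stages, X):
    return sum(predict_gombf_like(model, X) for model in stages)
```

In this sketch, groups lists the output columns belonging to each modality, for example [[0, 1], [2]] for a three-dimensional target whose first two components share a modality. The modules only choose the fern structure (which features and thresholds are tested); the joint refit then lets every fern contribute to every output dimension, which is one plausible reading of how separately trained modules could be merged into a single regressor.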
Pages: 3844-3857
Number of pages: 14