Real-Time 3D Facial Tracking via Cascaded Compositional Learning

Cited by: 5

Authors:

Lou, Jianwen [1]

Cai, Xiaoxu [1]

Dong, Junyu [2]

Yu, Hui [1]

Affiliations:

[1] Univ Portsmouth, Sch Creat Technol, Portsmouth PO1 2DJ, Hants, England

[2] Ocean Univ China, Sch Informat Sci & Engn, Qingdao 266100, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2021 / Vol. 30

Funding:

National Natural Science Foundation of China; Engineering and Physical Sciences Research Council (UK);

Keywords:

Training; Three-dimensional displays; Faces; Videos; Biological system modeling; Real-time systems; Geometry; 3D facial tracking; compositional learning; boosted ferns; synthetic training imagery; FACE ALIGNMENT;

DOI:

10.1109/TIP.2021.3065819

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose to learn a cascade of globally-optimized modular boosted ferns (GoMBF) to solve multi-modal facial motion regression for real-time 3D facial tracking from a monocular RGB camera. GoMBF is a deep composition of multiple regression models, each of which is a boosted ferns model initially trained to predict partial motion parameters of the same modality; these models are then concatenated via a global optimization step to form a single strong boosted ferns model that can effectively handle the whole regression target. It explicitly copes with the modality variety in the output variables while showing greater fitting power and faster learning than conventional boosted ferns. By further cascading a sequence of GoMBFs (GoMBF-Cascade) to regress facial motion parameters, we achieve tracking performance on a variety of in-the-wild videos that is competitive with state-of-the-art methods, which either have higher computational complexity or require much more training data. It provides a robust and elegant solution to real-time 3D facial tracking using a small set of training data, which makes it more practical in real-world applications. We further investigate in depth the effect of synthesized facial images on training non-deep-learning methods such as GoMBF-Cascade for 3D facial tracking. We apply three types of synthetic images with various naturalness levels to train two different tracking methods, and compare the performance of tracking models trained on real data, on synthetic data, and on a mixture of the two. The experimental results indicate that i) a model trained purely on synthetic facial imagery can hardly generalize to unconstrained real-world data, and ii) including synthetic faces in training benefits tracking in certain scenarios but degrades the tracking model's generalization ability. These two insights could benefit a range of non-deep-learning facial image analysis tasks where labelled real data is difficult to acquire.
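
The abstract gives no implementation details, so the following is only a minimal Python sketch of the general idea it describes: one boosted-ferns module is trained per output group (modality), and the modules are then concatenated and refit jointly on the whole target. Every name here (`fit_fern`, `fit_modular`, `groups`, `shrink`, `ridge`) is hypothetical, the ferns use random features and thresholds, and the joint refit is assumed to be a ridge regression over one-hot bin indicators. None of this is confirmed to match the authors' formulation.

```python
import numpy as np

def fern_bins(X, feats, thresh):
    """Map each sample to a bin index from len(feats) binary threshold tests."""
    bits = (X[:, feats] > thresh).astype(int)
    return bits @ (1 << np.arange(len(feats)))

def fit_fern(X, R, depth, rng):
    """Fit one random fern to residuals R: random features, random thresholds."""
    feats = rng.integers(0, X.shape[1], size=depth)
    lo, hi = X[:, feats].min(axis=0), X[:, feats].max(axis=0)
    thresh = rng.uniform(lo, hi)
    bins = fern_bins(X, feats, thresh)
    out = np.zeros((2 ** depth, R.shape[1]))
    for b in np.unique(bins):
        out[b] = R[bins == b].mean(axis=0)  # bin output = mean residual
    return feats, thresh, out

def fit_boosted_ferns(X, Y, n_ferns, depth, rng, shrink=0.5):
    """Plain boosting: each fern fits the residual left by the previous ones."""
    ferns, R = [], Y.copy()
    for _ in range(n_ferns):
        feats, thresh, out = fit_fern(X, R, depth, rng)
        out *= shrink
        R = R - out[fern_bins(X, feats, thresh)]
        ferns.append((feats, thresh, out))
    return ferns

def indicator_matrix(X, ferns, depth):
    """One-hot encode the bin each sample falls into, for every fern."""
    n, n_bins = X.shape[0], 2 ** depth
    Phi = np.zeros((n, len(ferns) * n_bins))
    for k, (feats, thresh, _) in enumerate(ferns):
        Phi[np.arange(n), k * n_bins + fern_bins(X, feats, thresh)] = 1.0
    return Phi

def fit_modular(X, Y, groups, n_ferns, depth, rng, ridge=1e-2):
    """Train one module per output group, then refit all bin outputs jointly.

    Assumption of this sketch: the global step is a ridge regression from
    the concatenated bin indicators to the full target Y.
    """
    ferns = []
    for cols in groups:
        ferns += fit_boosted_ferns(X, Y[:, cols], n_ferns, depth, rng)
    Phi = indicator_matrix(X, ferns, depth)
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    W = np.linalg.solve(A, Phi.T @ Y)
    return ferns, W

def predict(X, ferns, W, depth):
    return indicator_matrix(X, ferns, depth) @ W

# Toy usage on synthetic data with two output groups (stand-ins for modalities).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
Y = np.column_stack([X[:, 0] > 0, X[:, 1] > 0, X[:, 2] + X[:, 3]]).astype(float)
groups = [[0, 1], [2]]
ferns, W = fit_modular(X, Y, groups, n_ferns=20, depth=3, rng=rng)
pred = predict(X, ferns, W, depth=3)
```

In this sketch each module only sees its own output columns during boosting, while the joint refit lets every fern contribute to every output. A cascade in the sense of the abstract would repeat this fit stage by stage on the remaining residual, which is omitted here.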


Pages: 3844-3857

Page count: 14

