Real-Time 3D Facial Tracking via Cascaded Compositional Learning

Cited by: 5

Authors:

Lou, Jianwen [1]

Cai, Xiaoxu [1]

Dong, Junyu [2]

Yu, Hui [1]

Affiliations:

[1] Univ Portsmouth, Sch Creat Technol, Portsmouth PO1 2DJ, Hants, England

[2] Ocean Univ China, Sch Informat Sci & Engn, Qingdao 266100, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2021 / Vol. 30

Funding:

National Natural Science Foundation of China; UK Engineering and Physical Sciences Research Council;

Keywords:

Training; Three-dimensional displays; Faces; Videos; Biological system modeling; Real-time systems; Geometry; 3D facial tracking; compositional learning; boosted ferns; synthetic training imagery; FACE ALIGNMENT;

DOI:

10.1109/TIP.2021.3065819

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose to learn a cascade of globally-optimized modular boosted ferns (GoMBF) to solve multi-modal facial motion regression for real-time 3D facial tracking from a monocular RGB camera. GoMBF is a deep composition of multiple regression models, each of which is a boosted ferns model initially trained to predict partial motion parameters of the same modality; these models are then concatenated via a global optimization step to form a single strong boosted ferns model that can effectively handle the whole regression target. It explicitly copes with the modality variety in the output variables, while exhibiting increased fitting power and faster learning than conventional boosted ferns. By further cascading a sequence of GoMBFs (GoMBF-Cascade) to regress facial motion parameters, we achieve tracking performance on a variety of in-the-wild videos that is competitive with state-of-the-art methods, which either have higher computational complexity or require much more training data. It provides a robust and elegant solution to real-time 3D facial tracking using a small set of training data, which makes it more practical in real-world applications. We further investigate in depth the effect of synthesized facial images on training non-deep-learning methods such as GoMBF-Cascade for 3D facial tracking. We apply three types of synthetic images with various naturalness levels to train two different tracking methods, and compare the performance of tracking models trained on real data, on synthetic data, and on a mixture of the two. The experimental results indicate that (i) a model trained purely on synthetic facial imagery can hardly generalize to unconstrained real-world data, and (ii) including synthetic faces in training benefits tracking in certain scenarios but degrades the tracking model's generalization ability. These two insights could benefit a range of non-deep-learning facial image analysis tasks where labelled real data is difficult to acquire.


Pages: 3844-3857

Page count: 14
