Tree-Gated Deep Mixture-of-Experts for Pose-Robust Face Alignment

Cited by: 2
Authors
Arnaud E. [1]
Dapogny A. [2]
Bailly K. [1]
Affiliations
[1] ISIR, Sorbonne Université, CNRS, Institut des Systèmes Intelligents et de Robotique, Paris
[2] Datakalab, Paris
Source
IEEE Transactions on Biometrics, Behavior, and Identity Science | 2020 / Vol. 2 / No. 2
Keywords
cascaded regression; deep mixture-of-experts; ensemble methods; face alignment; head pose estimation
DOI
10.1109/TBIOM.2019.2950032
Abstract
Face alignment consists of aligning a shape model on a face image. It is an active domain in computer vision, as it is a preprocessing step for a number of face analysis and synthesis applications. Current state-of-the-art methods already perform well on 'easy' datasets with moderate head pose variations, but may not be robust on 'in-the-wild' data with poses up to 90°. To increase robustness to a set of factors of variation (e.g., head pose or occlusions), a given layer (e.g., a regressor or an upstream CNN layer) can be replaced by a Mixture of Experts (MoE) layer that uses an ensemble of experts instead of a single one. The weights of this mixture can be modeled as gating functions, so that the experts and their corresponding weights are learned jointly. In this paper, we propose to use tree-structured gates, which allow a hierarchical weighting of the experts (Tree-MoE). We investigate the use of Tree-MoE layers in different contexts within the framework of face alignment with cascaded regression: first, to emphasize relevant, more specialized feature extractors depending on high-level semantic information such as head pose (Pose-Tree-MoE), and second, as an overall more robust regression layer. We perform extensive experiments on several challenging face alignment datasets, demonstrating that our approach outperforms state-of-the-art methods. © 2019 IEEE.
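The hierarchical weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a full binary tree whose inner nodes are sigmoid splits on a linear function of the input, and linear experts at the leaves; the abstract specifies neither. All names (tree_gates, tree_moe, node_weights, experts) are hypothetical. Each expert's gate is the product of the branch probabilities along the path from the root to its leaf, so the gates are non-negative and sum to one.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def tree_gates(x, node_weights, depth):
    """Return 2**depth leaf weights from a soft binary tree.

    node_weights[i] is the (assumed) linear split vector of inner node i,
    stored in heap order: level 0 is node 0, level 1 is nodes 1-2, etc.
    """
    probs = [1.0]  # probability of reaching each node of the current level
    first = 0      # heap index of the first node of the current level
    for _ in range(depth):
        nxt = []
        for k, p in enumerate(probs):
            go_left = sigmoid(dot(node_weights[first + k], x))
            nxt.append(p * go_left)          # left child
            nxt.append(p * (1.0 - go_left))  # right child
        first += len(probs)
        probs = nxt
    return probs


def tree_moe(x, node_weights, experts, depth):
    """Gate-weighted sum of (assumed) linear expert outputs."""
    gates = tree_gates(x, node_weights, depth)
    outputs = [dot(w, x) for w in experts]
    return sum(g * o for g, o in zip(gates, outputs)), gates, outputs


# Toy usage with random parameters: 3 inner nodes, 4 experts.
random.seed(0)
dim, depth = 3, 2
node_weights = [[random.uniform(-1, 1) for _ in range(dim)]
                for _ in range(2 ** depth - 1)]
experts = [[random.uniform(-1, 1) for _ in range(dim)]
           for _ in range(2 ** depth)]
x = [0.5, -1.0, 2.0]
y, gates, outputs = tree_moe(x, node_weights, experts, depth)
```

In a trained model the split and expert parameters would be learned jointly by gradient descent, which is possible because every step above is differentiable. Here they are random and serve only to show the data flow.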
Pages: 122-132
Page count: 10
Related papers
31 in total
[1] Xiong X., De La Torre F., Supervised descent method and its applications to face alignment, Proc. CVPR, Portland, OR, USA, pp. 532-539, (2013)
[2] Ren S., Cao X., Wei Y., Sun J., Face alignment at 3000 FPS via regressing local binary features, Proc. CVPR, Columbus, OH, USA, pp. 1685-1692, (2014)
[3] Sun Y., Wang X., Tang X., Deep convolutional network cascade for facial point detection, Proc. CVPR, Portland, OR, USA, pp. 3476-3483, (2013)
[4] Trigeorgis G., Snape P., Nicolaou M.A., Antonakos E., Zafeiriou S.P., Mnemonic descent method: A recurrent process applied for end-to-end face alignment, Proc. CVPR, Las Vegas, NV, USA, pp. 4177-4187, (2016)
[5] Kontschieder P., Fiterau M., Criminisi A., Bulo S.R., Deep neural decision forests, Proc. ICCV, pp. 1467-1475, (2015)
[6] Zhang Z., Luo P., Loy C.C., Tang X., Learning deep representation for face alignment with auxiliary attributes, IEEE Trans. Pattern Anal. Mach. Intell., 38, 5, pp. 918-930, (2016)
[7] Honari S., Molchanov P., Tyree S., Vincent P., Pal C., Kautz J., Improving landmark localization with semi-supervised learning, Proc. CVPR, pp. 1546-1555, (2018)
[8] Burgos-Artizzu X.P., Perona P., Dollar P., Robust face landmark estimation under occlusion, Proc. ICCV, Sydney, NSW, Australia, pp. 1513-1520, (2013)
[9] Ghiasi G., Fowlkes C.C., Occlusion coherence: Localizing occluded faces with a hierarchical deformable part model, Proc. CVPR, pp. 1899-1906, (2014)
[10] Yu X., Lin Z.L., Brandt J., Metaxas D.N., Consensus of regression for occlusion-robust facial feature localization, Proc. ECCV, pp. 105-118, (2014)