Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks

Cited by: 302
Authors
Feng, Zhen-Hua [1]
Kittler, Josef [1]
Awais, Muhammad [1]
Huber, Patrik [1]
Wu, Xiao-Jun [2]
Affiliations
[1] Univ Surrey, Ctr Vis Speech & Signal Proc, Guildford GU2 7XH, Surrey, England
[2] Jiangnan Univ, Sch IoT Engn, Wuxi 214122, Peoples R China
Source
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018
Funding
Engineering and Physical Sciences Research Council (UK); National Natural Science Foundation of China
Keywords
FACE; REGRESSION; CASCADE
DOI
10.1109/CVPR.2018.00238
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a new loss function, namely Wing loss, for robust facial landmark localisation with Convolutional Neural Networks (CNNs). We first compare and analyse different loss functions including L2, L1 and smooth L1. The analysis of these loss functions suggests that, for the training of a CNN-based localisation model, more attention should be paid to small and medium range errors. To this end, we design a piece-wise loss function. The new loss amplifies the impact of errors from the interval (-w, w) by switching from L1 loss to a modified logarithm function. To address the problem of under-representation of samples with large out-of-plane head rotations in the training set, we propose a simple but effective boosting strategy, referred to as pose-based data balancing. In particular, we deal with the data imbalance problem by duplicating the minority training samples and perturbing them by injecting random image rotation, bounding box translation and other data augmentation approaches. Last, the proposed approach is extended to create a two-stage framework for robust facial landmark localisation. The experimental results obtained on AFLW and 300W demonstrate the merits of the Wing loss function, and prove the superiority of the proposed method over the state-of-the-art approaches.
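The piece-wise loss described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract only names the interval (-w, w) and a "modified logarithm function", so the parameter `epsilon`, the constant `c`, the default values and the helper name `mean_wing_loss` are assumptions made here for the sake of a runnable example.

```python
import math


def wing_loss(error, w=10.0, epsilon=2.0):
    """Wing-style loss for one scalar residual (prediction minus target).

    Assumed form: w * ln(1 + |x| / epsilon) inside (-w, w), |x| - c outside.
    The defaults for w and epsilon are illustrative assumptions.
    """
    abs_err = abs(error)
    # Constant chosen so the two pieces meet continuously at |error| == w.
    c = w - w * math.log(1.0 + w / epsilon)
    if abs_err < w:
        # Logarithmic piece: steeper gradient for small and medium errors.
        return w * math.log(1.0 + abs_err / epsilon)
    # Linear (L1-like) piece: slope 1 for large errors.
    return abs_err - c


def mean_wing_loss(predicted, target, w=10.0, epsilon=2.0):
    """Average the loss over flat sequences of landmark coordinates."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    total = sum(wing_loss(p - t, w, epsilon) for p, t in zip(predicted, target))
    return total / len(predicted)
```

Near zero the logarithmic piece has a slope of about w / epsilon, which is larger than the unit slope of L1, so small and medium residuals contribute more to the gradient, while large residuals are penalised linearly as in L1.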
Pages: 2235-2245
Number of pages: 11