Rectified Wing Loss for Efficient and Robust Facial Landmark Localisation with Convolutional Neural Networks

Cited by: 23

Authors:

Feng, Zhen-Hua [1]

Kittler, Josef [1]

Awais, Muhammad [1]

Wu, Xiao-Jun [2]

Affiliations:

[1] Univ Surrey, Ctr Vis Speech & Signal Proc, Guildford GU2 7XH, Surrey, England

[2] Jiangnan Univ, Sch Internet Things Engn, Wuxi 214122, Jiangsu, Peoples R China

Source:

INTERNATIONAL JOURNAL OF COMPUTER VISION | 2020, Vol. 128, Issue 8-9

Funding:

Engineering and Physical Sciences Research Council (UK);

Keywords:

Facial landmark localisation; Deep convolutional neural networks; Rectified Wing Loss; Pose-based data balancing; Coarse-to-fine networks; FACE ALIGNMENT; REGRESSION; CASCADE;

DOI:

10.1007/s11263-019-01275-0

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405;

Abstract:

Efficient and robust facial landmark localisation is crucial for the deployment of real-time face analysis systems. This paper presents a new loss function, namely Rectified Wing (RWing) loss, for regression-based facial landmark localisation with Convolutional Neural Networks (CNNs). We first systematically analyse different loss functions, including L2, L1 and smooth L1. The analysis suggests that the training of a network should pay more attention to small-medium errors. Motivated by this finding, we design a piece-wise loss that amplifies the impact of the samples with small-medium errors. Besides, we rectify the loss function for very small errors to mitigate the impact of inaccuracy of manual annotation. The use of our RWing loss boosts the performance significantly for regression-based CNNs in facial landmarking, especially for lightweight network architectures. To address the problem of under-representation of samples with large pose variations, we propose a simple but effective boosting strategy, referred to as pose-based data balancing. In particular, we deal with the data imbalance problem by duplicating the minority training samples and perturbing them by injecting random image rotation, bounding box translation and other data augmentation strategies. Last, the proposed approach is extended to create a coarse-to-fine framework for robust and efficient landmark localisation. Moreover, the proposed coarse-to-fine framework is able to deal with the small sample size problem effectively. The experimental results obtained on several well-known benchmarking datasets demonstrate the merits of our RWing loss and prove the superiority of the proposed method over the state-of-the-art approaches.
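
The abstract describes the loss only in words, so the following is a minimal sketch of one piece-wise form that matches that description, not the paper's reference implementation. It assumes a loss that is zero below a small threshold `r` (the rectification for annotation noise), logarithmic between `r` and `w` (which steepens the gradient for small-medium errors), and linear beyond `w`. The function names, the parameter names `r`, `w`, `eps` and their default values are illustrative assumptions, not values taken from this record.

```python
import math


def rectified_wing(x, r=0.5, w=10.0, eps=2.0):
    """Illustrative rectified piece-wise loss for one residual x.

    r, w and eps are assumed hyperparameters (not from the abstract):
    r   - errors smaller than this are ignored (treated as label noise)
    w   - boundary between the logarithmic and the linear region
    eps - controls the curvature of the logarithmic region
    """
    if not (0.0 <= r < w) or eps <= 0.0:
        raise ValueError("expected 0 <= r < w and eps > 0")
    a = abs(x)
    if a < r:
        # Very small errors contribute nothing.
        return 0.0
    if a < w:
        # Small-medium errors: logarithmic, steep near r.
        return w * math.log(1.0 + (a - r) / eps)
    # Large errors: linear, shifted by c so the pieces join at |x| = w.
    c = w - w * math.log(1.0 + (w - r) / eps)
    return a - c


def mean_rectified_wing(pred, target, **kwargs):
    """Average the loss over paired landmark coordinates."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equal length")
    total = sum(rectified_wing(p - t, **kwargs) for p, t in zip(pred, target))
    return total / len(pred)
```

The constant `c` is chosen so that the logarithmic and linear pieces meet at `|x| = w`; without it the loss would jump at the boundary. In a training framework the same three branches would be written with tensor operations so that gradients flow through them.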

Citation

Pages: 2126-2145

Page count: 20
